Author manuscript; available in PMC: 2016 Nov 1.
Published in final edited form as: Prev Med. 2016 Feb 21;92:110–117. doi: 10.1016/j.ypmed.2016.02.025

Co-occurring risk factors for current cigarette smoking in a U.S. nationally representative sample

Stephen T Higgins a,*, Allison N Kurti a, Ryan Redner a, Thomas J White a, Diana R Keith a, Diann E Gaalema a, Brian L Sprague a, Cassandra A Stanton b, Megan E Roberts c, Nathan J Doogan c, Jeff S Priest a
PMCID: PMC4992654  NIHMSID: NIHMS782601  PMID: 26902875

Abstract

Introduction

Relatively little has been reported characterizing cumulative risk associated with co-occurring risk factors for cigarette smoking. The purpose of the present study was to address that knowledge gap in a U.S. nationally representative sample.

Methods

Data were obtained from 114,426 adults (≥ 18 years) in the U.S. National Survey on Drug Use and Health (years 2011–13). Multiple logistic regression and classification and regression tree (CART) modeling were used to examine risk of current smoking associated with eight co-occurring risk factors (age, gender, race/ethnicity, educational attainment, poverty, drug abuse/dependence, alcohol abuse/dependence, mental illness).

Results

Each of these eight risk factors was independently associated with significant increases in the odds of smoking when concurrently present in a multiple logistic regression model. Effects of risk-factor combinations were typically summative. Exceptions to that pattern were in the direction of less-than-summative effects when one of the combined risk factors was associated with generally high or low rates of smoking (e.g., drug abuse/dependence, age ≥65). CART modeling identified subpopulation risk profiles wherein smoking prevalence varied from a low of 11% to a high of 74% depending on particular risk factor combinations. Being a college graduate was the strongest independent predictor of smoking status, classifying 30% of the adult population.

Conclusions

These results offer strong evidence that the effects associated with common risk factors for cigarette smoking are independent, cumulative, and generally summative. The results also offer potentially useful insights into national population risk profiles around which U.S. tobacco policies can be developed or refined.

Keywords: Risk factors, Co-occurring risk factors, Adults, U.S. nationally representative sample, Cigarette smoking, Current smokers, Multiple logistic regression, Classification and regression tree (CART), Educational attainment

1. Introduction

There have been substantial decreases in U.S. national smoking prevalence since the mid-1960s, but unfortunately these decreases have been unevenly distributed in the general population (Schroeder and Koh, 2014). Substantial reductions have been noted in some sub-populations (e.g., more affluent non-Hispanic Whites), but relatively little change has occurred in others (e.g., those with substance use disorders), and increases have occurred in still others (e.g., economically disadvantaged women) (Chilcoat, 2009; Fiore et al., 2008; Schroeder and Koh, 2014). These uneven changes in smoking prevalence underpin considerable current interest in understanding individual differences in risk for cigarette smoking and other forms of tobacco and nicotine use.

Monitoring the extent to which prevalence of smoking or use of other tobacco products differs by risk factors is now recognized to be an important element of tobacco control (Fiore et al., 2008) and regulatory science (Ashley et al., 2014). The overarching purpose of the present study is to begin characterizing the effects of co-occurring risk factors for cigarette smoking. We focus on cigarette smoking because it remains the most prevalent, toxic, and costly form of tobacco and nicotine use (U.S. DHHS, 2014). We know of no exhaustive set of risk factors for cigarette smoking, although gender, age, race/ethnicity, educational attainment, poverty status, substance use disorders, and mental illness are each well documented in the literature and are examined in the present study (Fiore et al., 2008; Higgins et al., 2015; Higgins and Chilcoat, 2009; Schroeder and Koh, 2014). While each of these risk factors inevitably co-occurs with some arrangement of the others (i.e., gender always co-occurs with chronological age, educational attainment, and race/ethnicity), there has been relatively little research reported explicitly characterizing the combined effects of co-occurring risk factors for cigarette smoking. Knowing whether effects of co-occurring risk factors are independent of each other (i.e., summative), or whether some may offset risks associated with others (i.e., less-than-summative/antagonistic), or perhaps increase risk in a multiplicative (i.e., synergistic) manner is important to the development of evidence-based tobacco policy and is the overarching purpose of the present study. Also of interest is empirically examining the relative strength of these common risk factors and how cumulative risk varies across particular risk factor profiles.

In a prior literature review on gender differences in prevalence of cigarette smoking and use of other nicotine and delivery products in the U.S., gender differences in risk were noted to generally act independently of the influence of other co-occurring risk factors, that is, gender and the other risk factors appeared to act in a cumulative and summative manner (Higgins et al., 2015). However, these were qualitative observations regarding patterns in previously published articles, none of which was explicitly designed to characterize the combined effects of co-occurring risk factors.

The present study was designed to build upon the initial findings described above by (a) examining a broader range of co-occurring risk factors than those involving gender, (b) statistically examining the independent and combined effects of common risk factors for smoking, and (c) identifying particularly low- and high-risk profiles. We know of no prior studies specifically on this topic regarding risk for current cigarette smoking, although studies characterizing effects of co-occurring risk factors are common in other areas of health research (e.g., Park et al., 2009; Schnohr et al., 2002). The present study was conducted using what at the time of study initiation was the most recent three years (2011–2013) of the National Survey on Drug Use and Health (NSDUH) (SAMHSA, 2012, 2013, 2014), a cross-sectional survey that has been used effectively in prior studies examining prevalence of cigarette smoking across various socio-demographic and psychiatric risk factors (e.g., Gfroerer et al., 2013; Redner et al., 2014a,b; White et al., 2015).

2. Methods

2.1. Data source

The NSDUH is a nationally representative survey of the U.S. non-institutionalized population aged ≥12 years that measures prevalence and correlates of drug use (Center for Behavioral Health Statistics and Quality, 2014). Detailed descriptions of survey procedures have been provided for each of the survey years (SAMHSA, 2012, 2013, 2014). Only individuals aged ≥ 18 years were included in the present study so that all participants were of legal age to purchase cigarettes. Across each of the survey years, NSDUH recruitment was completed using a multistage area probability sample design in which a predetermined number of participants were randomly recruited by address within each state. Respondents completed computer- and audio-assisted structured interviews. Respondents were selected from the civilian non-institutionalized population, including group homes, shelters, and college dormitories. Individuals on active military duty, in residential drug treatment programs, in jail, or homeless without residence were excluded. The present study included 114,426 adult respondents interviewed during 2011 (N = 39,133), 2012 (N = 37,869), and 2013 (N = 37,424). Annual weighted response rates were 74.4%, 73.0%, and 71.7% for the 2011, 2012, and 2013 surveys, respectively. Data were weighted during analysis to adjust for the differential probability of both selection and response.

Current smoker status was defined as smoking all or part of a cigarette within the 30 days preceding the interview and ≥100 cigarettes lifetime. The six racial/ethnic categories used in the survey were mutually exclusive. Persons identified as Hispanic might be of any race while persons identified as White, Black, Asian, American Indian/Alaska Native, or Other were all non-Hispanic. The category “Other” included Native Hawaiians or Other Pacific Islanders and individuals endorsing two or more races. Poverty status (living below or at/above the federal poverty line) was defined using poverty thresholds published by the U.S. Census Bureau. Any mental illness was defined as having a mental, behavioral, or emotional disorder in the past 12 months, excluding developmental or substance use disorders. To assess the presence of any mental illness, respondents aged ≥18 years answered a series of 14 questions that made up two scales measuring psychological distress (Kessler-6) and disability (World Health Organization Disability Assessment Schedule). Scores on these two scales were used to determine any mental illness status based on a statistical model developed from clinical interviews that assessed disorders based on criteria in the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV). Past year alcohol and illicit drug abuse/dependence diagnoses were also based on DSM-IV criteria.

2.2. Statistical methods

Sample-adjusted frequencies and confidence intervals (CIs) were generated across all respondents ≥18 years of age. Variables of interest were determined based on previously identified demographic and socioeconomic predictors of smoking, including age, sex, education, race/ethnicity, poverty status, past year mental illness, alcohol abuse/dependence, and illicit drug abuse/dependence.
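As a minimal sketch of what a sample-adjusted (weighted) frequency means, the snippet below computes a weighted prevalence from toy data. The responses and weights are invented for illustration and are not NSDUH values, the function name `weighted_prevalence` is hypothetical, and design-based confidence intervals are not attempted.

```python
def weighted_prevalence(smoker_flags, weights):
    """Weighted proportion of respondents flagged as current smokers.

    smoker_flags: iterable of 0/1 values (1 = current smoker).
    weights: survey weights, one per respondent (hypothetical here).
    """
    total_weight = sum(weights)
    smoker_weight = sum(w for flag, w in zip(smoker_flags, weights) if flag)
    return smoker_weight / total_weight


# Toy data, NOT from the NSDUH: four respondents with unequal weights.
flags = [1, 0, 0, 1]
weights = [1.0, 3.0, 2.0, 2.0]

unweighted = sum(flags) / len(flags)          # simple sample proportion
weighted = weighted_prevalence(flags, weights)
print(f"unweighted = {unweighted:.3f}, weighted = {weighted:.3f}")
```

In this toy case the non-smokers carry more weight, so the weighted estimate falls below the raw sample proportion.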

Associations of risk factors with current smoking status were examined in separate analyses. For each risk factor, weighted, univariate logistic regression was used to determine which variables would be included in subsequent multivariable models and to conduct an initial comparison of odds ratios (with CIs) across variable levels predicting current smoker status. PROC SURVEYLOGISTIC in SAS (SAS Institute, Cary, NC) was used to conduct the analyses, relying on maximum likelihood estimation and the Fisher scoring algorithm. Variances were estimated using Taylor series linearization.
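For a two-level risk factor, the univariate odds ratio can be approximated directly from the two prevalence estimates. The sketch below does that with the rounded prevalences reported in Table 1, so it only approximates the published ORs; it is not the SAS procedure the authors ran, and the helper names are hypothetical.

```python
def odds(p):
    """Convert a proportion (0 < p < 1) to odds."""
    return p / (1.0 - p)


def odds_ratio(p_exposed, p_reference):
    """Unadjusted odds ratio from two prevalence estimates."""
    return odds(p_exposed) / odds(p_reference)


# Rounded prevalences taken from Table 1 (risk level, reference level).
table1_prevalences = {
    "Gender (male vs. female)": (0.243, 0.190),
    "Poverty (below vs. at/above)": (0.328, 0.196),
    "Alcohol abuse/dependence (yes vs. no)": (0.443, 0.198),
    "Illicit drug abuse/dependence (yes vs. no)": (0.637, 0.205),
}

for label, (p_risk, p_ref) in table1_prevalences.items():
    print(f"{label}: OR = {odds_ratio(p_risk, p_ref):.1f}")
```

Because the inputs are rounded to one decimal place, small discrepancies from the table are to be expected for other risk factors.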

Multivariable logistic regression analyses predicting current smoker status were conducted using all variables from univariate logistic regression analyses that significantly predicted smoking status, initially without including any interaction terms. Again, odds ratios (with CIs) were generated. Analyses were repeated examining all possible two-way interactions across the eight risk factors (maximum of 28 risk-factor combinations each across analyses predicting smoking status). Interactions were first tested individually. Interactions that contributed significantly to predicting current smoker status were added to the multivariable model. Interactions that remained significant predictors were retained in the final regression model. These analyses were conducted using SAS 9.4 software (SAS Institute, Cary, NC). Across all tests, statistical significance was defined as p < .001 (2-tailed) to correct for multiple comparisons corresponding to investigating interactions.
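The count of 28 two-way interactions follows from choosing 2 of the 8 risk factors. A short sketch that enumerates the pairs (the label strings are shorthand introduced here, not variable names from the NSDUH files):

```python
from itertools import combinations

# Shorthand labels for the eight risk factors named in the text.
RISK_FACTORS = [
    "age",
    "gender",
    "race_ethnicity",
    "education",
    "poverty",
    "mental_illness",
    "alcohol_abuse_dependence",
    "drug_abuse_dependence",
]

# Every unordered pair of distinct risk factors is one candidate interaction.
interaction_pairs = list(combinations(RISK_FACTORS, 2))

print(len(interaction_pairs))  # number of two-way interactions to test
for first, second in interaction_pairs[:3]:
    print(f"{first} * {second}")
```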

Interaction results from the final regression model were used to initially characterize the combined effects of the risk factors. Where an interaction term was non-significant between two risk factors that each produced significant main effects, combined effects were categorized as summative (i.e., sum of the independent effects of the respective risk factors). Where significant interactions were noted between predictors, they were categorized as representing a deviation from summation. Further determination of whether an interaction represented a less-than- or greater-than-summative effect was determined by reviewing graphic displays of the relevant odds ratios (see Fig. 2).
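One way to read "summative" in a logistic model, offered here as an interpretation rather than as the authors' stated computation: when no interaction term is present, the log-odds contributions of two risk factors add, which means their adjusted odds ratios multiply. The sketch below applies that to two main-effects AORs from Table 1 (male, 1.3; below poverty level, 1.6), a pair whose interaction was not significant. The function name is hypothetical.

```python
import math


def combined_aor_no_interaction(*aors):
    """Combined odds ratio when effects add on the log-odds scale.

    Assumes a logistic model with no interaction terms among the
    supplied factors; summing log(AOR) values and exponentiating is
    equivalent to multiplying the AORs.
    """
    log_odds_sum = sum(math.log(a) for a in aors)
    return math.exp(log_odds_sum)


# AORs from Table 1 (main-effects model): male = 1.3, below poverty = 1.6.
male_and_poor = combined_aor_no_interaction(1.3, 1.6)
print(f"{male_and_poor:.2f}")  # relative to females at/above poverty level
```

A significant interaction term would mean the observed combined odds ratio departs from this product.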

Fig. 2. Three illustrative examples of significant two-way interactions of risk factors for current smoker status; data points represent odds ratios.

Lastly, a classification and regression tree (CART) analysis (Breiman et al., 1984) was conducted to supplement multiple logistic regression modeling, using the same eight independent variables to predict current smoking. CART analysis is a nonparametric procedure for dividing a population of interest into mutually exclusive subgroups based on a dependent variable of interest such as current smoking status in the present study (Lemon et al., 2003) and, in the process, identifying independent variables with the most explanatory power in accounting for that dependent variable. The process begins by identifying the single most important independent variable for dividing the total sample (parent node) into two groups (child nodes), using a predetermined branching criterion. Nodes are split based on their purity using the Gini impurity function (Breiman et al., 1984). A “pure” node has no variability in the dependent variable. A completely “impure” node has a conditional probability of p(k|t) = 0.5, where k refers to the dependent variable and t refers to the node (Lei et al., 2015). A splitting or branching criterion “selects the split that has the largest difference between the impurity of the parent node and a weighted average of the impurity of the two child nodes” (Lemon et al., 2003, p. 174). Given the dependent variable was binary, we used the Gini impurity function to split nodes, repeating the process recursively with every subsample, until the subsample reached a minimum size or no further splits could be made. The tree was built using R’s rpart package (R Core Team, 2013; Therneau et al., 2013). We used the classification method in R, given the dependent variable was binary, and included survey weights, given the multi-stage sampling procedures of the NSDUH. A fully saturated tree was produced initially, and then pruned by selecting the complexity parameter that minimized cross-validation error and setting a minimum sample size in terminal nodes of n = 1000.
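The splitting criterion quoted above can be sketched for a binary outcome as follows. This is an illustrative calculation, not rpart's implementation; the function names are hypothetical, and the example numbers are the rounded values for the first split as described in the Results (college graduates: 30% of the population, 11% smokers; everyone else: 70%, 26% smokers).

```python
def gini_impurity(p):
    """Gini impurity of a binary node; p is the proportion of smokers.

    Equals 1 - p**2 - (1 - p)**2, which simplifies to 2 * p * (1 - p).
    It is 0 for a pure node and peaks at 0.5 when p = 0.5.
    """
    return 2.0 * p * (1.0 - p)


def impurity_decrease(children):
    """Parent impurity minus the weighted average child impurity.

    children: list of (share_of_parent, proportion_smokers) pairs whose
    shares sum to 1. The parent proportion is derived from the children
    so that the three nodes are mutually consistent.
    """
    parent_p = sum(share * p for share, p in children)
    weighted_child = sum(share * gini_impurity(p) for share, p in children)
    return gini_impurity(parent_p) - weighted_child


# Rounded values for the first split (college graduate vs. not).
first_split = [(0.30, 0.11), (0.70, 0.26)]
print(f"impurity decrease = {impurity_decrease(first_split):.5f}")
```

A tree-growing routine would evaluate this quantity for every candidate split and keep the one with the largest decrease.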

3. Results

3.1. Logistic regression analyses

Overall prevalence of current smokers in this adult sample was 21.6% (Table 1, left-most column). Each of the eight risk factors significantly increased the odds of being a current smoker in univariate logistic regression (Table 1, center-most columns). Each of those risk factors also remained significant in a multivariate logistic regression model adjusting for the influence of the others, demonstrating significant independent associations with smoking status (Table 1, right-most columns). The largest increase in the adjusted odds of smoking was seen with educational attainment, followed by age, past year illicit drug abuse/dependence, race/ethnicity, past year alcohol abuse/dependence, income below federal poverty level, mental illness, and gender.

Table 1. Prevalence of current smoking and results from univariate and multivariate logistic regressions predicting current cigarette smoking^a among adults (aged ≥18 years) across eight potential risk factors (n = 114,426). National Survey on Drug Use and Health (NSDUH), United States, 2011–2013.

| Risk factor | Prevalence of current smoking, % (95% CI) | Univariate logistic regression, main effects, OR (95% CI) | Multivariate logistic regression, main effects, AOR (95% CI) |
|---|---|---|---|
| Overall | 21.6 (21.1, 22.1) | | |
| Gender | | | |
| Male | 24.3 (23.6, 25.0) | 1.4** (1.3, 1.4) | 1.3** (1.3, 1.4) |
| Female | 19.0 (18.5, 19.5) | Ref. group | Ref. group |
| Age group (years) | | | |
| 18–25 | 24.7 (24.2, 25.1) | 3.1** (2.8, 3.5) | 2.5** (2.2, 2.8) |
| 26–44 | 27.0 (26.2, 27.7) | 3.5** (3.1, 3.9) | 4.1** (3.6, 4.6) |
| 45–64 | 21.3 (20.4, 22.1) | 2.6** (2.3, 2.9) | 2.8** (2.5, 3.2) |
| ≥65 | 9.6 (8.6, 10.5) | Ref. group | Ref. group |
| Race/Ethnicity^b | | | |
| White | 23.4 (22.8, 24.0) | 3.2** (2.8, 3.7) | 2.8** (2.4, 3.3) |
| Black | 22.1 (20.9, 23.3) | 3.0** (2.5, 3.5) | 1.8** (1.5, 2.2) |
| Hispanic | 15.4 (14.6, 16.3) | 1.9** (1.7, 2.2) | 0.9 (0.8, 1.1) |
| American Indian/Alaska Native | 37.2 (31.2, 43.1) | 6.2** (4.6, 8.4) | 3.0** (2.2, 4.2) |
| Asian | 8.7 (7.5, 9.8) | Ref. group | Ref. group |
| Other | 31.0 (27.9, 34.0) | 4.7** (3.9, 5.7) | 3.3** (2.7, 4.1) |
| Education level | | | |
| < High school | 30.8 (29.6, 32.0) | 3.7** (3.4, 4.0) | 4.6** (4.2, 5.1) |
| High school graduate | 26.7 (25.9, 27.5) | 3.0** (2.8, 3.2) | 3.4** (3.1, 3.6) |
| Some college | 23.0 (22.1, 23.8) | 2.5** (2.3, 2.7) | 2.5** (2.3, 2.7) |
| College graduate | 10.8 (10.1, 11.5) | Ref. group | Ref. group |
| Poverty status^c | | | |
| Below poverty level | 32.8 (31.4, 34.1) | 2.0** (1.9, 2.1) | 1.6** (1.5, 1.7) |
| At or above poverty level | 19.6 (19.1, 20.2) | Ref. group | Ref. group |
| Any mental illness^d | | | |
| Yes | 31.7 (30.7, 32.7) | 1.9** (1.9, 2.0) | 1.5** (1.4, 1.6) |
| No | 19.2 (18.7, 19.7) | Ref. group | Ref. group |
| Alcohol abuse/dependence^e | | | |
| Yes | 44.3 (42.8, 45.9) | 3.2** (3.0, 3.4) | 2.3** (2.1, 2.5) |
| No | 19.8 (19.3, 20.3) | Ref. group | Ref. group |
| Illicit drug abuse/dependence^e | | | |
| Yes | 63.7 (61.4, 66.0) | 6.8** (6.1, 7.6) | 3.7** (3.2, 4.2) |
| No | 20.5 (20.0, 21.0) | Ref. group | Ref. group |

Notes. OR = Odds ratio, AOR = Adjusted odds ratio, CI = Confidence interval, Ref. group = Reference group.

** p < 0.001.

^a Persons who reported smoking all or part of a cigarette in the 30 days preceding the interview AND smoking ≥100 cigarettes in their lifetime.

^b The six racial/ethnicity categories (White, Black, Hispanic, American Indian/Alaska Native, Asian, Other) are mutually exclusive; “Other” includes Native Hawaiians or Other Pacific Islanders and persons of two or more races. Persons identified as Hispanic might be of any race.

^c Based on reported family income and poverty thresholds published by the U.S. Census Bureau.

^d Any mental illness is defined by the NSDUH as a diagnosable mental, behavioral, or emotional disorder, other than a developmental or substance use disorder, that met the criteria found in the 4th edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV). For details on the methodology, see Section B.4.3 in Appendix B of the Results from the 2011 NSDUH: Mental Health Findings.

^e Drug and alcohol abuse and dependence criteria used in the NSDUH were defined based upon the criteria listed in the DSM-IV. Illicit substances included marijuana, hallucinogens, heroin, inhalants, tranquilizers, cocaine, pain relievers, stimulants, and sedatives.

Results from including the two-way interaction terms in the final multivariate logistic regression model were used for an initial characterization of combined effects of these eight significant risk factors (Table 2). Across all of the possible two-way interactions, 20 of 28 (71%) were not significant, meaning that the combined effects of these risk factors were summative (i.e., they did not deviate significantly from a summation of the effects of the two risk factors when considered alone) (Fig. 1).

Table 2. Results from multiple logistic regression predicting current cigarette smoking^a examining significant main effects and interactions across eight risk factors considered in combination (n = 114,426). National Survey on Drug Use and Health (NSDUH), United States, 2011–2013.

| Term | Wald χ2 | p value |
|---|---|---|
| Main effects | | |
| Gender | 38.22 | <0.0001 |
| Age group (years)^b | 3.62 | 0.306 |
| Race/Ethnicity^c | 17.42 | 0.004 |
| Education level^d | 17.66 | 0.001 |
| Poverty status^e | 1.59 | 0.208 |
| Any mental illness^f | 15.15 | <0.0001 |
| Alcohol abuse/dependence^g | 543.57 | <0.0001 |
| Drug abuse/dependence^g | 353.84 | <0.0001 |
| Interactions | | |
| Race/Ethnicity^c × age group (years)^b | 138.24 | <0.0001 |
| Race/Ethnicity^c × education level^d | 292.01 | <0.0001 |
| Race/Ethnicity^c × poverty status^e | 33.58 | <0.0001 |
| Race/Ethnicity^c × any mental illness^f | 31.71 | <0.0001 |
| Race/Ethnicity^c × gender | 125.45 | <0.0001 |
| Age group (years)^b × education level^d | 113.52 | <0.0001 |
| Age group (years)^b × poverty status^e | 41.92 | <0.0001 |
| Drug abuse/dependence^g × alcohol abuse/dependence^g | 30.85 | <0.0001 |
^a Persons who reported smoking all or part of a cigarette in the 30 days preceding the interview AND smoking ≥100 cigarettes in their lifetime.

^b Among persons aged ≥18 years (18–25, 26–44, 45–64, ≥65 years).

^c The six racial/ethnicity categories (White, Black, Hispanic, American Indian/Alaska Native, Asian, Other) are mutually exclusive; “Other” includes Native Hawaiians or Other Pacific Islanders and persons of two or more races. Persons identified as Hispanic might be of any race.

^d <HS, HS, some college, college graduate.

^e Based on reported family income and poverty thresholds published by the U.S. Census Bureau.

^f Any mental illness is defined by the NSDUH as a diagnosable mental, behavioral, or emotional disorder, other than a developmental or substance use disorder, that met the criteria found in the 4th edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV). For details on the methodology, see Section B.4.3 in Appendix B of the Results from the 2011 NSDUH: Mental Health Findings.

^g Drug and alcohol abuse and dependence criteria used in the NSDUH were defined based upon the criteria listed in the DSM-IV. Illicit substances included marijuana, hallucinogens, heroin, inhalants, tranquilizers, cocaine, pain relievers, stimulants, and sedatives.

Fig. 1. Outcomes of two-way interaction testing among significant risk factors for current smoker status in the multiple logistic regression analysis; x and − symbols indicate risk-factor combinations where there was and was not a significant interaction, respectively.

The eight (29%) significant interactions involved each of the eight risk factors, but most often race/ethnicity, which has six levels in the present study and was involved in five of the eight significant interactions (Table 2). Age was involved in three interactions, education level and poverty status in two each, and the remaining four risk factors (gender, mental illness, alcohol abuse/dependence, and drug abuse/dependence) in one each (Table 2).

As is illustrated in Fig. 2, these interactions largely involved less-than-summative effects where one of the two risk factors was associated with generally high or low smoking prevalence. As shown in Panel A, for example, the typical pattern of Blacks having lower odds than Whites for being a current smoker was present in the 18–25 years but not the ≥65 years age bracket where smoking rates are generally low. Similarly, as shown in Panel B, the typical pattern of Hispanics having lower smoking rates than Whites was discernible among those with less than a high school education but not among college graduates where smoking rates are generally low. In Panel C we see a similar example where the effects associated with having less than a high school education are discernible among those in the 18–25 years but not the ≥65 years age bracket where smoking rates are generally low. As a final example, in Panel D we see increases in the odds of smoking associated with alcohol abuse/dependence among those without but not those with illicit drug abuse/dependence where smoking rates are generally high. We saw no evidence of interactions related to multiplicative effects of combining risk factors.

Use of logistic regression to characterize more than two-way interactions when dealing with 8 risk factors becomes impractical in terms of interpretation, and thus we used the CART analysis for that task and also for comparing the relative strength of the eight risk factors. Regression trees are adept at illustrating important risk factor combinations or profiles (Austin, 2007).

3.2. Classification and regression tree (CART) analyses

The CART analysis identified educational attainment as the strongest risk factor followed by age, race/ethnicity, drug abuse/dependence, alcohol abuse/dependence, poverty level, mental illness, and gender. Fig. 3 shows a pruned classification tree modeling changes in smoking prevalence associated with the various risk-factor combinations. The graphic is designed to represent an inverted tree.

Fig. 3. A pruned, weighted classification and regression tree (CART) model of associations between current (past 30 days) smoking status and the following eight risk factors in the U.S. adult (≥18 years of age) population: educational attainment, age, race/ethnicity, past year drug abuse/dependence, past year alcohol abuse/dependence, annual income below federal poverty level, past year mental illness, and gender. Results from a saturated model were “pruned” using CART analytic software to reduce complexity (R Core Team, 2013). Rectangles (nodes) represent smoking prevalence rates for the entire population (top-most node) or population subgroups (all other nodes). Nodes also list the proportion of the adult population represented. Using the root node as an example, 78% of the population are non-smokers, 22% smokers, and this node represents 100% of the U.S. non-institutionalized adult population. Lines below nodes represent the binary “yes”–“no” branching around particular risk factors and risk-factor levels, with subgroups in whom the risk factor/level is present moving leftward and downward and those in whom it is absent moving rightward and downward for further potential partitioning based on additional risk factors/levels. The bottom row comprises terminal nodes (i.e., final partitioning for a particular subgroup). Note that minimal terminal node size was set to ≥1000 individuals. Terminal nodes contain the same information as the other nodes plus the percent of all adult current smokers represented by that node. Percent of current smokers represented is calculated by the following equation: % total population represented by a node × smoking prevalence in that node/smoking prevalence in the entire study sample × 100. Tallying % current smokers represented across all terminal nodes should equal 100% of smokers in the U.S. adult population save possible rounding error.

The rectangle shown at the top of Fig. 3 is referred to as the root node. It represents 100% of the U.S. adult non-institutionalized population as indicated in the bottom row of information displayed in the node, with 78% non-smokers and 22% smokers as indicated in the top row of information displayed in the node. The 1st split of the entire population was based on whether someone was or was not a college graduate (dashed lines immediately below root node). College graduates branched leftward and downward to a terminal node (no further splitting/classification possible) with a smoking prevalence of only 11%, one half that seen in the overall population. Note that terminal nodes also display a 3rd piece of information not shown in the root or child nodes, which is the percentage of the current smoking population that is represented by that node (bottom-most row). This terminal node representing college graduates includes prevalence rates of 11% current smokers and 89% non-smokers, and represents 30% of the U.S. adult non-institutionalized population and 15% of adult current smokers within that population. The 70% of the overall population (smokers & non-smokers) that had less than a college education branched rightward and downward to a child node (further splitting/classification possible) where prevalence of current smoking increased to 26% corresponding to removal of the relatively low-risk college graduates from the sample.
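The share of current smokers represented by a node follows from the node's population share and its smoking prevalence relative to the overall prevalence. A minimal sketch using the rounded figures quoted for the college-graduate node (30% of the population, 11% prevalence, 22% overall prevalence); the function name is hypothetical.

```python
def share_of_smokers(population_share, node_prevalence, overall_prevalence):
    """Fraction of all current smokers who fall in a given node.

    All three arguments are proportions between 0 and 1.
    """
    return population_share * node_prevalence / overall_prevalence


# College-graduate terminal node, rounded values from the text.
college_graduates = share_of_smokers(
    population_share=0.30,
    node_prevalence=0.11,
    overall_prevalence=0.22,
)
print(f"{college_graduates:.0%} of adult current smokers")
```

This reproduces the 15% figure reported for that node, up to rounding in the inputs.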

The 2nd branching was based on chronological age, dividing all those in the adult U.S. non-institutionalized population with less than a college education by whether they were ≥65 vs. 18–64 years of age. Those ≥65 years branched leftward and downward to a terminal node. Note that smoking prevalence in those ≥65 years is lowest among all age brackets. As such, the 11% smoking prevalence in this node was equal to that seen among college graduates despite this subgroup having lower educational attainment. Those whose age was below 65 years branched rightward and downward to a child node where smoking prevalence rates increased further to 30% corresponding to having removed those in the oldest age bracket.

The next branching of this subgroup with less than a college education and ages between 18 and 64 years was based on the absence vs. presence of past year drug abuse/dependence. Those without drug abuse/dependence branched leftward and downward to a child node where smoking prevalence decreased to 28% corresponding to having removed the relatively small subgroup with drug abuse/dependence. The subgroup with past year drug abuse/dependence moved rightward and downward to a child node where smoking prevalence increased to 67%, more than three-fold above the overall 22% smoking prevalence rate seen in the entire adult population. Further branching of this subgroup was based on additional age levels followed by race/ethnicity levels resulting in three separate terminal nodes. Note that all individuals represented in those three terminal nodes had less than a college education and past year drug abuse/dependence, but smoking prevalence nevertheless varied between 46% and 74% depending on the particular age and race/ethnicity levels with which those risk factors were combined. Those terminal nodes each represented 1% or less of the entire population and thus a relatively small (1–3%) overall proportion of current smokers.

This same general branching process was repeated until the entire study population was classified in risk profiles across the 13 terminal nodes in the bottom row. The four left-most nodes represent ~90% of the U.S. adult non-institutionalized population and the 9 right-most nodes ~10%. Note that the majority of all current smokers (74%) are represented in the four left-most nodes even though smoking prevalence rates are much lower in those nodes. An alternative way of characterizing this distribution is that smoking is strikingly overrepresented in the 9 rightmost terminal nodes such that 26% of all current smokers are represented among only 10% of the total population.
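The overrepresentation described above can be expressed as the ratio of a group's share of smokers to its share of the population. A small sketch with the approximate shares quoted in the text; the function name is hypothetical.

```python
def representation_ratio(smoker_share, population_share):
    """Share of all smokers divided by share of the population.

    Values above 1 mean smokers are overrepresented in the group,
    values below 1 mean they are underrepresented.
    """
    return smoker_share / population_share


# Approximate shares quoted in the text for the two groups of nodes.
four_leftmost = representation_ratio(smoker_share=0.74, population_share=0.90)
nine_rightmost = representation_ratio(smoker_share=0.26, population_share=0.10)

print(f"four left-most nodes:  {four_leftmost:.2f}")
print(f"nine right-most nodes: {nine_rightmost:.2f}")
```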

Balancing smoking prevalence against proportion of the population represented, the terminal node or risk profile that represents the largest proportion of all current smokers is the fourth from left that comprises individuals with high school or some college education, ages 18–64 years, all race/ethnicity levels except Asian & Hispanic, and no past year alcohol or drug abuse/dependence. That node represents 34% of the entire population and 43% of all adult current smokers.

4. Discussion

The present study was conducted to follow up on observations reported as part of a literature review on gender differences where risk for cigarette smoking appeared to change in a cumulative and summative manner when gender was considered in combination with other co-occurring risk factors (Higgins et al., 2015). The present results confirm those earlier observations and extend them to additional risk-factor combinations beyond gender. Results from the multiple logistic regression analyses provide clear evidence that each of the eight risk factors examined acts as an independent predictor. Results from testing all possible two-way interactions among those eight independent predictors indicated that they usually did not interact significantly (i.e., 20 of 28 tests were non-significant). That is, these risk factors generally acted in a cumulative and summative manner where effects of the risk factors in combination did not change significantly from those observed when each was examined as a single predictor in the adjusted model. In those instances where there were significant interactions, the general pattern was that one of the combined risk factors was associated with diminished effects compared to when examined alone (less-than-summative effects) (Fig. 2). The CART analysis provided numerous opportunities to observe orderly upward and downward changes in smoking prevalence across population subgroups corresponding to changes in risk-factor combinations. Considered together, these results provide an empirical framework for understanding the striking differences in smoking prevalence observed across population subgroups and for making predictions about the likely effects of novel risk-factor combinations.

Three more specific points about the risk profiles from the CART analysis merit comment. First, there was only one instance where a single risk factor (actually a single risk-factor level) acted as a stand-alone risk profile. That was being a college graduate. Moreover, this single risk-factor level classified 30% of the U.S. adult population, demonstrating considerable reach. Educational attainment was also associated with the largest changes in the odds of smoking in the multiple logistic regression. These results, which underscore the importance of a college education as a protective factor and identify educational attainment more generally as the strongest risk factor among the eight examined, are consistent with the emphasis that has been previously placed on educational attainment in discussing prevention of smoking among youth and young adults (e.g., see Chapter 2, U.S. DHHS, 2012) as well as addressing smoking risk among women and other vulnerable populations (Chilcoat, 2009; Graham et al., 2007; Higgins et al., 2009; Hiscock et al., 2012; Kandel et al., 2009). Also important to underscore is that educational attainment is a modifiable risk factor. Considering the pervasive and robust associations between educational attainment and smoking risk, there are grounds for a broader approach to tobacco control policy that encompasses more distal risk factors like general educational attainment in addition to the more proximal and conventional tobacco-control foci (Graham et al., 2007; Higgins et al., 2009; Kandel et al., 2009).

Second, the risk profile corresponding to the 4th terminal node in Fig. 3 (counting from left to right) warrants further comment. That risk profile represents adults with a high school or some college education level, chronological age in the 18–64 years bracket, Black, White, American Indian/Native Alaskan, or Other race/ethnicity, and no past year alcohol or drug abuse/dependence. By a factor of 2.9- to 43-fold, the profile included more current smokers (43%) than any of the 12 other risk profiles identified in the CART analysis. This profile represents a large segment of the U.S. adult population who educationally fall into the unskilled and skilled labor socioeconomic class and who, for reasons that are not well understood, have been relatively less responsive to efforts to reduce smoking (e.g., Asfar et al., 2015). Prior reports underscoring the relatively high smoking rates in blue-collar and service occupations have directed attention to the workplace setting as an important potential focal point for tobacco control and regulatory efforts that would have the potential to reach many within this subgroup (CDC, 2011). However, additional and complementary campaigns specifically targeting this risk-profile subgroup through other channels and contexts (social media, health care providers, community/civic organizations) are likely to be needed as well. This is a risk-profile subgroup with which tobacco reduction efforts will need to improve considerably if the smoking-related goals of Healthy People 2020 are to be realized (Office of Disease Prevention and Health Promotion, 2015).

Third, the subgroups represented across the nine rightmost terminal nodes in the bottom row of Fig. 3 merit further comment as well. Smoking prevalence rates ranged from 37% to 74% across those terminal nodes, which correspond to a relatively larger number of risk factors and higher risk-factor levels than the other nodes. Collectively, those nine nodes accounted for only 10% of the population but 26% of all current smokers. While not explicitly examined in this study, smokers with these profiles are more likely to be nicotine dependent and are at increased risk for adverse health impacts of smoking (Higgins and Chilcoat, 2009; Hiscock et al., 2012). The lower educational attainment levels and higher rates of alcohol and drug use disorders represented in those nodes also make it likely that other medical co-morbidities will be present, further increasing vulnerability to the adverse health impacts of smoking (Cutler and Lleras-Muney, 2010; Gaalema et al., 2015; Hser et al., 2001; Niaura et al., 2012; Rowa-Dewar et al., 2015; Schroeder, 2007). These are patterns that contribute directly to the unsettling problem of health disparities (Higgins, 2014; Schroeder, 2007). If considered only in terms of absolute numbers of smokers, those nine nodes may appear to warrant less attention or resources. However, when considered in terms of the overall potential morbidity, mortality, and healthcare cost impacts involved, those nodes represent risk profiles where more effective strategies for reducing smoking are sorely needed. For potential behavioral economic and pharmacological strategies for doing so, interested readers may want to see reports by Davis et al. (under review) and Tidey (under review).
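One simple way to compare the reach of these risk profiles is to divide each profile's share of all current smokers by its share of the adult population. The sketch below does that for the two sets of figures reported in the text (the nine rightmost nodes, and the fourth terminal node); the function name is hypothetical and the ratio is offered only as an illustrative summary, not as a statistic computed in the study.

```python
def representation_ratio(share_of_smokers, share_of_population):
    """Share of all current smokers divided by share of the population.

    Values above 1 mean the subgroup is over-represented among smokers.
    Both arguments are percentages taken from the text.
    """
    return share_of_smokers / share_of_population


# Nine rightmost terminal nodes: 26% of smokers, 10% of the population.
nine_high_risk_nodes = representation_ratio(26, 10)

# Fourth terminal node: 43% of smokers, 34% of the population.
fourth_node = representation_ratio(43, 34)

print(round(nine_high_risk_nodes, 2), round(fourth_node, 2))
```

Read this way, the nine high-risk nodes hold relatively few smokers in absolute terms but are heavily over-represented relative to their population share, whereas the fourth node holds the most smokers while being only modestly over-represented.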

There are several limitations of the present study that should be acknowledged. First, the observational research design used in the present study cannot support causal inferences. Second, the NSDUH excludes several groups with relatively high smoking prevalence rates, including individuals who are on active military duty, incarcerated, or homeless, which may limit generalizability of the results to those subgroups. Moreover, we excluded adolescents, thereby potentially limiting generalizability to that important subgroup. Third, the NSDUH is a cross-sectional survey and thus does not permit examination of associations within individuals over time. Extending this research on co-occurring risk factors to a longitudinal survey such as the Population Assessment of Tobacco and Health (PATH, 2016) will be an important future research direction to assess whether cumulative risk is generally summative when examined prospectively over time. Fourth, the present study did not include tobacco products other than cigarettes, which will be an important gap to address in future research. Those limitations notwithstanding, we believe that the present study provides important new knowledge regarding effects associated with co-occurring risk factors for current cigarette smoking that has the potential to inform and advance evidence-based tobacco control and regulatory policy efforts.

Acknowledgments

Funding

This project was supported in part by Tobacco Centers of Regulatory Science (TCORS) award P50DA036114 from the National Institute on Drug Abuse (NIDA) and Food and Drug Administration (FDA), TCORS award P50CA180908 from the National Cancer Institute (NCI) and FDA, Center for Evaluation and Coordination of Training and Research award U54CA189222 from NCI and FDA, Institutional Training Grant award T32DA07242 from NIDA, and Centers of Biomedical Research Excellence award P20GM103644 from the National Institute of General Medical Sciences. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or the Food and Drug Administration.

Footnotes

Competing interests

The authors have no conflicts of interest to report.

References

  1. Asfar T, Arheart KL, Dietz NA, Caban-Martinez AJ, Fleming LE, Lee DJ. Changes in cigarette smoking behavior among US young workers 2005–2010: the role of occupation. Nicotine Tob Res. 2015 Oct 26; doi: 10.1093/ntr/ntv240. pii: ntv240. [Epub ahead of print]
  2. Ashley DL, Backinger CL, van Bemmel DM, Neveleff DJ. Tobacco regulatory science: research to inform regulatory action at the Food and Drug Administration’s Center for Tobacco Products. Nicotine Tob Res. 2014;16(8):1045–1049. doi: 10.1093/ntr/ntu038. http://dx.doi.org/10.1093/ntr/ntu038 (Epub 2014 Mar 17)
  3. Austin PC. A comparison of regression trees, logistic regression, generalized additive models, and multivariate adaptive regression splines for predicting AMI mortality. Stat Med. 2007;26:2937–2957. doi: 10.1002/sim.2770.
  4. Breiman L, Friedman JH, Olshen RA, Stone CJ. Classification and Regression Trees. Wadsworth; Belmont, California: 1984.
  5. Center for Behavioral Health Statistics and Quality. 2013 National Survey on Drug Use and Health Public Use File Codebook. Substance Abuse and Mental Health Services Administration; Rockville, MD: 2014. [Accessed June 16, 2015]. http://www.icpsr.umich.edu/cgi-bin/file?comp=none&study=35509&ds=1&file_id=1166336&path=SAMHDA.
  6. Centers for Disease Control and Prevention. Current cigarette smoking prevalence among working adults—United States, 2004–2010. MMWR. 2011;60(38):1305–1309.
  7. Chilcoat HD. An overview of the emergence of disparities in smoking prevalence, cessation, and adverse consequences among women. Drug Alcohol Depend. 2009;104(Suppl 1):S17–S23. doi: 10.1016/j.drugalcdep.2009.06.002. http://dx.doi.org/10.1016/j.drugalcdep.2009.06.002 (Epub 2009 Jul 24)
  8. Cutler DM, Lleras-Muney A. Understanding differences in health behaviors by education. J Health Econ. 2010 Jan;29(1):1–28. doi: 10.1016/j.jhealeco.2009.10.003. http://dx.doi.org/10.1016/j.jhealeco.2009.10.003. Epub 2009 Oct 31.
  9. Davis DR, Kurti AN, Redner R, White TJ, Higgins ST. A review of the literature on contingency management in the treatment of substance use disorders, 2009–2015. Prev Med. 2016 doi: 10.1016/j.ypmed.2016.08.008. under review.
  10. Fiore M, et al. A clinical practice guideline for treating tobacco use and dependence: 2008 update. A U.S. public health service report. Am J Prev Med. 2008;35:158–176. doi: 10.1016/j.amepre.2008.04.009. http://dx.doi.org/10.1016/j.amepre.2008.04.009.
  11. Gaalema DE, Cutler AY, Higgins ST, Ades PA. Smoking and cardiac rehabilitation participation: associations with referral, attendance, and adherence. Prev Med. 2015 Nov;80:67–74. doi: 10.1016/j.ypmed.2015.04.009. http://dx.doi.org/10.1016/j.ypmed.2015.04.009. Epub 2015 Apr 18.
  12. Gfroerer J, King BA, Garrett BE, Babb S, McAfee T. Vital signs: current cigarette smoking among adults aged ≥18 years with mental illness—United States, 2009–2011. MMWR. 2013;62:81–87.
  13. Graham H, Inskip HM, Francis B, Marman J. Pathways of disadvantage and smoking careers: evidence and policy implications. J Epidemiol Community Health. 2007;60(Suppl II):ii7–ii12. doi: 10.1136/jech.2005.045583.
  14. Higgins ST. Behavior change, health, and health disparities: an introduction. Prev Med. 2014 Nov;80:1–4. doi: 10.1016/j.ypmed.2014.10.007. http://dx.doi.org/10.1016/j.ypmed.2015.07.020. Epub 2015 Aug 6.
  15. Higgins ST, Chilcoat HD. Women and smoking: an interdisciplinary examination of socioeconomic influences. Drug Alcohol Depend. 2009;104(Suppl 1):S1–S5. doi: 10.1016/j.drugalcdep.2009.06.006. http://dx.doi.org/10.1016/j.drugalcdep.2009.06.006 (Epub 2009 Jul 8)
  16. Higgins ST, Heil SH, Badger GJ, Skelly JM, Solomon LJ, Bernstein IM. Educational disadvantage and cigarette smoking during pregnancy. Drug Alcohol Depend. 2009 Oct 1;104(Suppl 1):S100–S105. doi: 10.1016/j.drugalcdep.2009.03.013. http://dx.doi.org/10.1016/j.drugalcdep.2009.03.013.
  17. Higgins ST, Kurti AN, Redner R, et al. A review of the literature on prevalence of gender differences and intersections with other vulnerabilities to tobacco use in the United States, 2004–2014. Prev Med. 2015 Jun 26; doi: 10.1016/j.ypmed.2015.06.009. http://dx.doi.org/10.1016/j.ypmed.2015.06.009. pii: S0091-7435(15)00206-6. [Epub ahead of print]
  18. Hiscock R, Bauld L, Amos A, Fidler JA, Munafo M. Socioeconomic status and smoking: a review. Ann N Y Acad Sci. 2012 Feb;1248:107–123. doi: 10.1111/j.1749-6632.2011.06202.x. http://dx.doi.org/10.1111/j.1749-6632.2011.06202.x. Epub 2011 Nov 17.
  19. Hser YI, Hoffman V, Grella CE, Anglin MD. A 33-year follow-up of narcotics addicts. Arch Gen Psychiatry. 2001 May;58(5):503–508. doi: 10.1001/archpsyc.58.5.503.
  20. Kandel DB, Griesler PC, Schaffran C. Educational attainment and smoking among women: risk factors and consequence for offspring. Drug Alcohol Depend. 2009 Oct 1;104(Suppl 1):S24–S33. doi: 10.1016/j.drugalcdep.2008.12.005. http://dx.doi.org/10.1016/j.drugalcdep.2008.12.005. Epub 2009 Jan 28.
  21. Lei Y, Nollen N, Ahluwahlia JS, Yu Q, Mayo MS. An application in identifying high-risk populations in alternative tobacco product use utilizing logistic regression and CART: a heuristic comparison. BMC Public Health. 2015;15:341. doi: 10.1186/s12889-015-1582-z. http://dx.doi.org/10.1186/s12889-015-1582-z.
  22. Lemon SC, Roy J, Clark MA, Friedmann PD, Rakowski W. Classification and regression tree analysis in public health: methodological review and comparison with logistic regression. Ann Behav Med. 2003;26(3):172–181. doi: 10.1207/S15324796ABM2603_02.
  23. Niaura R, Chander G, Hutton H, Stanton C. Interventions to address chronic disease and HIV: strategies to promote smoking cessation among HIV-infected individuals. Curr HIV/AIDS Rep. 2012 Dec;9(4):375–384. doi: 10.1007/s11904-012-0138-4. http://dx.doi.org/10.1007/s11904-012-0138-4.
  24. Office of Disease Prevention and Health Promotion. Healthy People 2020. U.S. Department of Health and Human Services; Washington, DC: 2015. [accessed November 23rd, 2015]. http://www.healthypeople.gov.
  25. Park Y, Freedman AN, Gail MH, et al. A colorectal cancer risk prediction tool for white men and women without known susceptibility. J Clin Oncol. 2009;27(5):694–698. doi: 10.1200/JCO.2008.17.4797. http://dx.doi.org/10.1200/JCO.2008.17.4813 (Epub 2008 Dec 29)
  26. Population Assessment of Tobacco and Health (PATH) [Accessed February 13, 2016];2016 https://pathstudyinfo.nih.gov/UI/HomeMobile.aspx.
  27. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing; Vienna, Austria: 2013. [Accessed 11/7/2015]. URL http://www.R-project.org/
  28. Redner R, White TJ, Harder VS, Higgins ST. Vulnerability to smokeless tobacco use among those dependent on alcohol or illicit drugs. Nicotine Tob Res. 2014a;16:216–223. doi: 10.1093/ntr/ntt150. http://dx.doi.org/10.1093/ntr/ntt150 (Epub 2013 Sep 30)
  29. Redner R, White TJ, Harder VS, Higgins ST. Examining vulnerability to smokeless tobacco use among adolescents and adults meeting diagnostic criteria for major depressive disorder. Exp Clin Psychopharmacol. 2014b;22:316–322. doi: 10.1037/a0037291. http://dx.doi.org/10.1037/a0037291 (Epub 2014 Jun 30)
  30. Rowa-Dewar N, Lumsdaine C, Amos A. Protecting children from smoke exposure in disadvantaged homes. Nicotine Tob Res. 2015;17(4):496–501. doi: 10.1093/ntr/ntu217. http://dx.doi.org/10.1093/ntr/ntu217.
  31. Schnohr P, Scharling H, Nordestgaard BG. Coronary heart disease risk factors ranked by importance for the individual and community. Eur Heart J. 2002;23(8):620–626. doi: 10.1053/euhj.2001.2842.
  32. Schroeder SA. Shattuck Lecture: we can do better—improving the health of the American people. N Engl J Med. 2007 Sep 20;357(12):1221–1228. doi: 10.1056/NEJMsa073350.
  33. Schroeder SA, Koh HK. Tobacco control 50 years after the 1964 surgeon general’s report. JAMA. 2014 Jan 8;311(2):141–143. doi: 10.1001/jama.2013.285243. http://dx.doi.org/10.1001/jama.2013.285243.
  34. Substance Abuse and Mental Health Services Administration (SAMHSA) Results from the 2011 National Survey on Drug Use and Health: Summary of the National Findings. Rockville, MD: 2012. NSDUH Series H-44 (HHS Publication No. (SMA) 12–4713)
  35. Substance Abuse and Mental Health Services Administration (SAMHSA) Results from the 2012 National Survey on Drug Use and Health: Summary of the National Findings. Rockville, MD: 2013. NSDUH Series H-46 (HHS Publication No. (SMA) 13–4795)
  36. Substance Abuse and Mental Health Services Administration (SAMHSA) Results from the 2013 National Survey on Drug Use and Health: Summary of the National Findings. Rockville, MD: 2014. NSDUH Series H-46 (HHS Publication No. (SMA) 14–4863)
  37. Therneau T, Atkinson B, Ripley B. rpart: recursive partitioning. R Package Version 4.1–3. 2013 (http://CRAN.R-project.org/package=rpart)
  38. Tidey JW. A behavioral economic perspective on smoking persistence in serious mental illness. Prev Med. 2016 doi: 10.1016/j.ypmed.2016.05.015. under review.
  39. U.S. Department of Health and Human Services. A Report of the Surgeon General. U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Chronic Disease Prevention and Health Promotion, Office on Smoking and Health; 2012. The health consequences of smoking: preventing tobacco use among youth and young adults.
  40. U.S. Department of Health and Human Services. The health consequences of smoking—50 years of progress: a report of the surgeon general. Atlanta, GA: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Chronic Disease Prevention and Health Promotion, Office on Smoking and Health; 2014.
  41. White TJ, Redner R, Bunn JY, Higgins ST. Do socioeconomic risk factors for cigarette smoking extend to smokeless tobacco use? Nicotine Tob Res. 2015 Oct 26; doi: 10.1093/ntr/ntv199. pii: ntv199. Epub ahead of print.
