Abstract
Cigarette smoking has been associated with epigenetic alterations that may be reversible upon cessation. As the most-studied epigenetic modification, DNA methylation is strongly associated with smoking exposure, providing a potential mechanism that links smoking to adverse health outcomes. Here, we reviewed the reversibility of DNA methylation in accessible peripheral tissues, mainly blood, in relation to cigarette smoking cessation and the utility of DNA methylation as a biomarker signature to differentiate current, former, and never smokers and to quantify time since cessation. We summarized thousands of differentially methylated Cytosine-Guanine (CpG) dinucleotides and regions associated with smoking cessation from candidate gene and epigenome-wide association studies, as well as the prediction accuracy of the multi-CpG predictors for smoking status. Overall, there is robust evidence for DNA methylation signature of cigarette smoking cessation. However, there are still gaps to fill, including (1) cell-type heterogeneity in measuring blood DNA methylation; (2) underrepresentation of non-European ancestry populations; (3) limited longitudinal data to quantitatively measure DNA methylation after smoking cessation over time; and (4) limited data to study the impact of smoking cessation on other epigenetic features, noncoding RNAs, and histone modifications. Epigenetic machinery provides promising biomarkers that can improve success in smoking cessation in the clinical setting. To achieve this goal, larger and more-diverse samples with longitudinal measures of a broader spectrum of epigenetic marks will be essential to developing a robust DNA methylation biomarker assay, followed by meeting validation requirements for the assay before being implemented as a clinically useful tool.
Keywords: Cigarette smoking, Cessation, Epigenetics, DNA methylation, Biomarker
1. Introduction
Cigarette smoking has steadily declined in the United States over the decades, yet large portions of the general population are current or former smokers who face increased risks for smoking-related diseases. As of 2020, 12.5% of adults in the United States were current smokers [1], and as of 2018, 21% were former smokers [2]. Further progress in preventing the initiation of smoking and treating those currently afflicted may depend on improvements in our ability to assess the presence and severity of tobacco consumption.
Current assessments performed in clinical and research settings largely rely on self-report, which is subject to underestimating use. Biomarkers would provide an ideal alternative by objectively distinguishing current from former smoking status and quantifying the decreasing frequency of smoking and time since last cigarette. Quitting smoking often takes multiple quit attempts, and in the clinic, biomarkers could give medical personnel reliable patient information to evaluate progress towards successfully quitting smoking and to inform treatment strategies. In the research setting, biomarkers could be a more-reliable measure than self-reported smoking habits when studying smoking behaviors over the lifetime and related health outcomes. Biomarkers could facilitate the development of personalized smoking cessation treatment strategies that are more effective than today’s standard practices, which often involve patients going through trial and error among various treatment options.
Epigenetics hold great promise as biomarker candidates for smoking cessation. Epigenetics—“epi” meaning “over and above”—is the study of covalent modifications of DNA, RNA, or protein molecules that regulate gene expression without altering the primary DNA sequence. Epigenetic modifications can be dynamic or stable over time, as they upregulate or downregulate gene expression during development, throughout the aging process, and in response to lifestyle behaviors and environmental exposures. Of the different types of epigenetic modifications—DNA methylation (DNAm), noncoding RNAs, and histone modifications—DNAm is the most commonly studied. DNAm—a chemical modification, most often involving the addition of methyl groups to Cytosine-Guanine (CpG) dinucleotides in the DNA sequence [3]—can provide a nucleic acid-based assessment of cigarette smoking history and cessation that can be measured using easily accessible biospecimens like peripheral blood or saliva.
This paper reviews the state of the field for identifying and using peripheral-based DNAm biomarkers to accurately quantify behavior patterns related to smoking cessation and discusses their potential future use for informing treatment and research strategies aimed at improving success rates for smoking cessation and curbing smoking-related diseases. Studies of epigenetic features for smoking behaviors more broadly have been reviewed elsewhere [4]. We focus the current review specifically on smoking cessation as an outcome and on DNAm as an epigenetic biomarker, given that the literature on other epigenetic features (non-coding RNAs and histone modifications) is limited for cessation.
2. The promise of DNAm as an epigenetic biomarker of smoking cessation
The classic description of an ideal biomarker stipulates that it be both sensitive and specific for the condition it is predicting (i.e., not indicative of comorbid exposures or disease conditions). However, to be clinically implemented in the real world, smoking cessation assays must be affordable and robust against attempts, both deliberate and unintentional, to confound signal detection. Currently, the U.S. Food and Drug Administration approves two approaches for detecting smoking [5]. The first assesses cotinine, the primary metabolite of nicotine, and the second assesses exhaled carbon monoxide (CO). Both are included in the PhenX Toolkit catalog of expert-recommended protocols [6]; see (1) “Biomarker of exposure to nicotine-containing products” in the Respiratory Domain for cotinine as measured in saliva, serum, or urine, and (2) “Expired Carbon Monoxide” in the Tobacco Regulatory Research Collection. Additional biomarkers, some of which are routinely used by the research community, are useful in predicting smoking status, as reviewed by Bierut et al. [7].
DNAm assessments of blood or saliva may offer yet another valuable method for determining smoking status and smoking cessation in research and clinical settings. Like any biomarker, the value and performance (critically, the sensitivity and specificity) of DNAm assessments depend on the context and type of question being asked. For example, currently available catabolite-based indices (e.g., cotinine or expired CO) have generally not been sensitive enough to smoking intake to reflect the decline in smoking heaviness in the population over the past decades, likely due to increased smoking efficiency (i.e., cigarette components biochemically delivered in the body per cigarette smoked) [8]. DNAm biomarkers may provide a more sensitive measure of smoke intake that reflects the efficiency of cigarettes smoked.
As opposed to catabolite-based indices, the performance characteristics of DNAm biomarkers of smoking and smoking cessation are complex. The rationale for this complexity is perhaps best conceptualized by viewing white blood cells (WBCs), which also contribute approximately 75% of human saliva DNA [9–11], as slowly responsive biosensors of the internal vascular environment. WBCs slowly develop from myeloid or lymphoid stem cells in the bone marrow [12,13]. Their original epigenetic signature is “inherited” from their progenitor cells and slowly modified as they assume the cell fate/transcriptional signature of mature WBCs. Clinicians then observe their signature as these cells transit the vascular system, and the cells are harvested via phlebotomy. Although different cell types have different DNAm patterns, the default transcriptional and epigenetic signature of each type of WBC in the absence of disease or toxins is relatively invariant. However, this default signature may be modified by the environmental biases previously experienced by the precursor cell, changes in the stem cell niche during WBC maturation, or changes as the WBC migrates from the stem cell niche into the vasculature. Nevertheless, because the WBCs experience the vasculature on the order of days to weeks [14,15], it is likely that most of the DNAm signal assessed in specimens of blood or saliva is encoded before migration from the bone marrow.
For WBCs and their precursors, modifications to the default epigenetic signature are slowly “written” in response to smoking and slowly “erased” in response to smoking cessation. Because of this sluggish dynamic response, DNAm is not useful on a genome-wide scale for detecting smoking initiation [16]. However, at specific loci (e.g., AHRR-cg05575921), DNAm can be very sensitive and offer a powerful approach for detecting smoking in adolescents currently in the experimentation phase of smoking or for detecting the current or prior presence of a substantial daily smoking habit in the past several years in adults. Using a DNAm biomarker for smoking is analogous to using hemoglobin A1c (HbA1c) as a blood test to measure average blood sugar levels in recent months [9,10,16]. Like an HbA1c test, which does not detect acute spikes in glucose, DNAm cannot detect transient uses of tobacco. Still, just as an HbA1c test is a powerful measure of average blood sugar, DNAm can offer a powerful tool for understanding trends in combustible tobacco consumption.
3. Technologies for measuring DNAm
The sensitivity and specificity of a DNAm test for smoking is dependent on the method used to quantify DNAm and the locus (or loci) being interrogated. In 2016, the BluePrint consortium provided a rigorous assessment of the techniques that were commonly being used to assess DNAm [17]. Overall, they concluded that amplicon-specific sequencing and pyrosequencing had the best overall performance. They also highlighted the power of arrays for assessing genome-wide patterns, while noting their high error rates.
Since 2016, those methods have been joined by DNAm-sensitive digital polymerase chain reaction (dPCR) techniques and whole bisulfitome sequencing (Table 1). Before reviewing these technical methods, a few caveats are in order. First, all of these methods, with the exception of certain single molecule sequencing methods [18], use bisulfite conversion as part of the analytic process. Because sodium bisulfite does not react with either hydroxy-methylated or methylated cytosines, the total methylated fraction quantified by bisulfite conversion-based methods is a sum of both modified forms of cytosine [19]. However, because the fraction of hydroxy-methylated cytosine in peripheral WBCs is approximately two orders of magnitude lower than that of methylcytosine (<1%), this caveat generally is not an issue [20]. A second caveat is that the DNAm signature at a given locus may be affected by cellular heterogeneity. If that is the case, correction of the signature using either genome-wide or single-locus methods is recommended [21,22].
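The first caveat can be illustrated with a minimal Python sketch. The function name, the read counts, and the split between methylcytosine and hydroxy-methylcytosine are hypothetical values chosen only for illustration; the sketch simply shows that a bisulfite-based readout reports the sum of the two modified forms and cannot separate them.

```python
def bisulfite_methylated_fraction(c_reads: int, t_reads: int) -> float:
    """Fraction of reads that retain C after bisulfite conversion.

    Unmodified cytosines are read as T, whereas methylated and
    hydroxy-methylated cytosines both remain C, so this fraction is the
    sum of the two modified forms.
    """
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no reads cover this CpG")
    return c_reads / total


# Hypothetical counts at one CpG: 800 reads retain C, 200 convert to T.
observed = bisulfite_methylated_fraction(800, 200)

# Hypothetical underlying composition that would produce the same readout.
true_5mc, true_5hmc = 0.795, 0.005
assert abs(observed - (true_5mc + true_5hmc)) < 1e-12
```

Because the hydroxy-methylated share in peripheral WBCs is small, the observed fraction in this example is dominated by methylcytosine, which is why the caveat is generally not an issue in blood.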
Table 1.
A Summary of the Pros and Cons for Each of the Major Types of Methylation Measurement Methods.
| Method | Speed* | Precision | Accuracy | Breadth | Cost | Ease | Refs. |
|---|---|---|---|---|---|---|---|
| Targeted sequencing | Slow | Fair | Fair | Good | $$$ | Difficult | [23] |
| Hybridization array | Slow | Fair | Fair | Excellent | $$$ | Difficult | [24] |
| Pyrosequencing | Slow | Good | Good to Excellent | Poor | $$ | Easy | [27] |
| Single Molecule (SMRT) | Slow | Fair | Fair | Excellent | $$$$ | Difficult | [97] |
| Mass Array | Slow | Fair | Fair to Good | Good | $$$ | Difficult | [29] |
| qPCR | Quick | Fair to Good | Fair to Good | Poor | $ | Easy | [25] |
| dPCR | Quick | Excellent | Excellent | Poor | $$ | Easy | [98] |
| Whole Bisulfitome Sequencing | Slow | Fair | Fair | Excellent | $$$$ | Difficult | [99] |
*Speed is defined by an estimated lab time of 1 day (quick) or 2 or more days (slow). The hybridization array is the slowest method, with an estimated lab time of 5 days.
The choice of method used to quantify DNAm will depend on a number of factors, including type of sample being analyzed, loci of interest, budget, and “need for speed.” Table 1 summarizes the current pros and cons of the methods commonly used to assess DNAm for research and clinical purposes and includes a reference to an illustrative article using the technique. The exact properties of each technique, in particular the precision (i.e., the repeatability of the results) and accuracy (i.e., the result in comparison with the “true value”), will vary with respect to the specific application and practitioner.
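The distinction between precision and accuracy can be made concrete with a short Python sketch. The replicate values, the reference value, and the function names below are hypothetical and do not come from any of the cited studies; standard deviation and mean bias are only one reasonable way to summarize the two properties.

```python
from statistics import mean, stdev


def precision_sd(replicates):
    """Repeatability: sample standard deviation of replicate measurements."""
    return stdev(replicates)


def accuracy_bias(replicates, true_value):
    """Accuracy: difference between the replicate mean and the reference value."""
    return mean(replicates) - true_value


# Hypothetical replicate DNAm fractions for a standard of known 50% methylation.
reference = 0.50
assay_a = [0.58, 0.59, 0.58, 0.59]  # tightly clustered but offset from the reference
assay_b = [0.44, 0.56, 0.47, 0.53]  # centered on the reference but widely scattered

for name, reps in (("A", assay_a), ("B", assay_b)):
    print(name, round(precision_sd(reps), 3), round(accuracy_bias(reps, reference), 3))
```

In this illustration, assay A is the more precise of the two and assay B is the more accurate, which is why Table 1 rates the two properties separately.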
Targeted sequencing approaches, using either reduced representation or whole bisulfitome approaches, have been a traditionally popular method for assessing DNAm [23]. Advantages of targeted sequencing include providing information on thousands of CpG residues focused on only the regions of interest and reducing data for the analytical pipeline. However, like nearly all sequencing-based approaches, targeted approaches tend to use complicated sequencing by synthesis platforms that necessitate complex analytic pipelines on runs of large batches, and their precision and accuracy are affected by amplification biases in the preparatory phase of the technique.
Genome-wide hybridization arrays, such as those produced by Illumina, have been a staple of the DNAm literature for many years. As a laboratory technique, they are difficult and expensive to conduct, challenging to analyze, and require at least 1 μg of high-quality DNA [24,25]. As a result, as with targeted sequencing approaches, researchers generally outsource this time- and resource-intensive work to genome centers to optimize performance of the assays. Despite the intuitive appeal of this “one-size-fits-all” method, DNA samples analyzed via this approach need to be batched, and the relatively lower accuracy limits their clinical utility [17,26]. In addition, like more-complex sequencing-based approaches, the “real cost” of conducting these assays should include the extensive need for bioinformatic support. However, for studies of smoking seeking to use epigenetics to determine other health outcomes, such as epigenetic aging, the breadth of assessment that arrays provide is a major advantage over other approaches, and the resulting data are useful for any number of secondary analyses.
Pyrosequencing is a relatively precise, affordable, and easy-to-interpret method for quantifying DNAm at discrete loci. However, careful attention must be paid to the potential of amplification bias in the template preparation phase, and the low throughput limits its clinical utility [25,27].
Single Molecule Real-Time (SMRT) technology has the distinct advantage of simultaneously capturing genetic and epigenetic variation, as first theoretically described for the capacity of single pores to detect differences in the DNAm status of native cytosine molecules [28]. The third generation of this technology, from companies such as Oxford Nanopore Technologies and Pacific Biosciences, is currently in research implementation [29]. To date, though, the practical utility of this approach in everyday research has been limited.
Mass array approaches use mass spectrometry to quantify C to T ratios of PCR products produced from batteries of targeted PCR amplifications of bisulfite-converted DNA. The strength of this family of approaches rests upon the theoretical precision and throughput afforded by the individual mass spectrometry technique [29]. However, in practice, the accuracy and precision of the method are often compromised by amplification bias introduced when producing the PCR products. As a result, mass arrays have not been widely used in the past several years.
Quantitative PCR (qPCR) methods, despite their assessments being limited to single CpG residues, are the easiest and least expensive to perform. As a result, they have become the current preferred method for most clinical applications, with all but one of the in vitro cancer diagnostics reviewed recently by Tarym-Lesniak et al. using DNAm-sensitive qPCR (MSqPCR) to assess epigenetic status [30]. However, the precision and accuracy of these assays vary greatly, with larger amplicon size and increasing cycle count adversely affecting assay performance.
DNAm-sensitive digital PCR (MSdPCR) methods that promise to provide greater precision and accuracy than qPCR-based approaches have recently been introduced. Unlike qPCR methods, dPCR-based methods offer the potential for reference-free assessments of DNAm and, when designed properly, have minimal amplification bias [22,31]. However, not all loci can be targeted using this method, and like many of the current sequencing approaches, the need for costly reagents and instrumentation can be a barrier to clinical implementation.
Finally, increases in the availability of sequencing resources have made whole-genome bisulfitome sequencing approaches more tractable. Even now, these reference-independent approaches are extremely expensive to conduct and present unique data processing and management challenges. Yet their unparalleled breadth of assessment is state-of-the-art, and with lower costs in the future, they could become a mainstay in research.
In summary, DNAm status can be assessed through a variety of technological methods. As a general rule, these methods vary with respect to the cost, breadth, accuracy, and precision of their assessments. However, at a practical level, their performance can vary substantially depending on the exact details of the procedures being used. As a result, careful review of the specific protocol being employed is encouraged to ensure that it meets the need of the research or clinical question being considered.
The choice of DNAm technology will also depend on the type of sample being analyzed. Theoretically, because smoking can affect the DNAm signature of other tissues, it should be possible to assess smoking status using DNA from tissues other than blood or saliva when conducting research studies to better understand disease processes using pertinent tissues. Still, for reasons of practicality, the two most-commonly used sources of DNA for research and clinical studies of smoking are blood and saliva. Blood-based studies are the most common, with the potential for integration of data with other forms of medical assessments enriching the scope and certainty of investigations of health outcomes. However, the use of blood generally necessitates phlebotomy, which is expensive, potentially painful, and incompatible with most telemedicine-based investigations and interventions. The use of saliva DNA circumvents the need for phlebotomy and is compatible with telemedicine-based approaches. However, DNA quality issues are much more common with saliva-based approaches, with the greater cellular heterogeneity of saliva posing potentially greater rates of error to the ensuing DNAm analyses, particularly for array-based approaches. All things being equal, as the transition of methylomic studies of smoking from the laboratory to the clinic proceeds, the use of qPCR and dPCR techniques to assess smoking status will likely become increasingly common. DNAm assessment is still considerably more costly than the existing catabolite-based methods for smoking assessments, given that the most cost-efficient qPCR method costs $19.60 per sample [25], whereas urine cotinine “dipsticks” cost less than $1 each [32].
In summary, each technology for assaying DNAm comes with strengths and limitations to take into account when implemented in research and interpreting resulting associations with measures of smoking intake. Weighing the technologies’ strengths and limitations is further exemplified in considering the clinical setting, where a singular, easily measurable, and reliable predictor is ideal.
4. Literature search to review genetic loci with DNAm associations with smoking cessation
To identify relevant studies, we used the following key words in PubMed: “Smoking cessation + Epigenetics”, “Smoking cessation + DNA methylation”, “Smoking + Epigenetics”, and “Smoking + DNA methylation”. The PubMed search for primary references was conducted on February 4, 2022. We reviewed any secondary references cited by the primary references to retrieve those articles not captured by the key word search. All studies included in this review reported DNAm changes related to a smoking cessation phenotype, including DNAm changes observed in cord blood when mothers quit smoking.
5. Single site DNAm associations with smoking cessation
Before the introduction of epigenome-wide arrays (the most commonly used being the Illumina 450K array), only one study examined DNAm status as a function of cessation. This study from 2010 focused on monoamine oxidase A (MAOA) as a candidate gene. MAOA is located on chromosome X, and it encodes mitochondrial enzymes responsible for catalyzing the oxidative deamination of amines, including dopamine. Philibert et al. tested 74 CpGs spanning the MAOA gene and showed that DNAm was associated with cessation and further demonstrated genetically contextual DNAm effects [33].
Array technology brought about epigenome-wide association study (EWAS) analyses, beginning in 2012, to conduct agnostic genome-wide searches for CpGs associated with smoking cessation. Between 2012 and February 2022, 10 publications reported EWASs for smoking cessation [34–43]. See Table 2 for a summary of their study designs and major findings. All studies focused on adult populations, with study-specific sample sizes ranging from 70 [38] to 15,907 (the latter in a meta-analysis of 16 independent cohorts) [40]. Collectively, these studies included a non-overlapping total sample size of 24,656 individuals: 84.4% of European ancestry, 15.2% of African ancestry, and 0.4% of Asian ancestry. Two of the EWASs included only women [37,39], and the other 8 EWASs included both women and men. Most of the EWAS analyses were conducted cross-sectionally with a single time point for DNAm measurement and smoking status defined as current, former, or never. Two studies leveraged longitudinal data for both DNAm and smoking status, either for agnostic EWAS [43] or follow-up analyses of cross-sectional findings; however, sample sizes with longitudinal data were relatively limited (largest N = 1344).
Table 2.
Significantly Associated CpGs Reported from Epigenome-Wide Association Studies of Smoking Cessation Phenotypes. Statistical significance is based on Bonferroni correction for the number of tests conducted: p < 2 × 10⁻⁶ for the 27K array and p < 10⁻⁷ for the 450K array. Studies, sorted chronologically, were analyzed with cross-sectional smoking cessation phenotype and DNA methylation data, except where indicated by an asterisk (*) for the single study with longitudinal data available. Within a row, semicolon-separated entries for sample size, phenotype, and number of CpGs correspond to one another in order.
| Cohort(s) | Blood Sample Type(s) | Array | Total N, by Ancestry | Cessation Phenotype(s) | N, Significantly Associated CpGs | Covariates | Results in EWAS Catalog [44] | Year and Refs. |
|---|---|---|---|---|---|---|---|---|
| International Chronic Obstructive Pulmonary Disease (COPD) Genetics Network and Boston Early-Onset COPD study | Leukocytes | 27K | 1085 EUR; 1085 EUR; 369 EUR | Current vs. former (discovery); Time since quitting; Current vs. former and never (replication) | 5; 2; 2 | Age, sex, surrogate variables | No | 2012 [34] |
| Kooperative Gesundheitsforschung in der Region Augsburg (KORA) | Whole blood | 450K | 1011 EUR; 468 EUR; 1531 EUR | Current vs. never (discovery); Current vs. never (replication); Former vs. never | 972; 79; 14 | Age, sex, body mass index, alcohol consumption, cell composition | Yes | 2013 [35] |
| CARDIOGENICS | Whole blood | 450K | 464 EUR; 201 EUR; 442 EUR; 285 EUR | Current vs. former; Current vs. never; Former vs. never; Current vs. former | 32; 56; 20; 8 | Age, sex, coronary artery disease status, recruitment site, batch, bisulfite-treated DNA input; Batch effects, cell composition (reference free) | Yes | 2014 [36] 20,221, [4,4] |
| European Prospective Investigation into Cancer and Nutrition (EPIC-Italy) and Norwegian Women and Cancer Study Cohort (NOWAC) | Whole blood | 450K | 582 EUR; 568 EUR | Current vs. never; Former vs. never | 448; 3 | Cell composition, batch | Yes | 2015 [37] |
| Korean COPD cohort | Blood | 450K | 70 ASN; 100 ASN | Current vs. never; Current and former vs. never | 9; 7 | COPD status, age, sex, BMI, cell type composition | No | 2016 [38] |
| EPIC | Whole blood | 450K | 721 EUR; 382 EUR; 717 EUR; 189 EUR | Current vs. never; Current vs. former; Former vs. never; Time since smoking cessation | 196; 62; 4; 4 | Age, breast cancer status, menopausal status, use of contraceptive pill, use of hormone replacement therapy, batch | Yes No | 2016 [39] |
| Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium (16 cohorts) | Whole blood, buffy coat, CD4+ T cells, monocytes | 450K | 2639 AFR, 6750 EUR; 2825 AFR, 10,649 EUR | Current vs. never; Former vs. never | 2622; 158 | Age, sex, cell composition, batch | Yes | 2016 [40] |
| KORA | Whole blood | 450K | 895 EUR | Current vs. never | 590 | Age, sex, alcohol consumption, body mass index, white blood cell count and estimated white blood cell proportions | No | 2017 [41] |
| Melbourne Collaborative Cohort Study | Whole blood | 450K | 3034 EUR; 4389 EUR | Current vs. never; Former vs. never | 1851; 156 | Sex, country of birth, age, alcohol consumption, cell composition, batch effects | Yes | 2020 [42] |
| * Framingham Heart Study (FHS) and Atherosclerosis Risk in Communities (ARIC) Study | Whole blood | 450K | 110 AFR, 149 EUR; 138 AFR, 124 EUR; 222 AFR, 199 EUR | Current-current vs. interim quitters; Interim quitters vs. former-former; Current-current vs. all non-smokers | 4; 0; 6 | Age, sex, imputed blood count fraction, assay-specific technical covariates (batch, row, and column in FHS and genotype principal components in ARIC) | No | 2021 [43] |
All of the published EWASs reported statistically significant associations of single CpGs with a smoking cessation phenotype, based on a Bonferroni correction for the number of tests conducted (p < 2 × 10⁻⁶ for the single 27K array study or p < 10⁻⁷ for the other 9 studies that used the 450K array; Table 2). Across the studies, 4267 unique CpGs were significantly associated with smoking cessation, as we have summarized in Supplemental Table 1. Results were compiled using the EWAS catalog [44] and original publications for those not available in the catalog.
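The two thresholds can be approximately reproduced by dividing a family-wise error rate by the number of tests. The following Python sketch does so under two assumptions that are not stated in the text: an error rate of 0.05 and the nominal probe content of each array (27,578 and 485,577 probes). Individual studies test fewer probes after quality control, so their exact thresholds may differ slightly.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that controls family-wise error at alpha."""
    return alpha / n_tests


ALPHA = 0.05  # assumed family-wise error rate
# Nominal probe counts, assumed here; studies test fewer probes after quality control.
N_PROBES = {"27K": 27_578, "450K": 485_577}

for array, n_probes in N_PROBES.items():
    print(f"{array}: p < {bonferroni_threshold(ALPHA, n_probes):.2e}")
```

Under these assumptions the computed values are close to the rounded thresholds used in Table 2.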
The significantly associated CpGs are distributed across the genome (Fig. 1). The genetic regions with the most statistically significant associations are chr2q37 (alkaline phosphatase genes; ALPP, ALPPL2, and ALPI), chr3q11 (G protein-coupled receptor 15 gene, GPR15), chr5p15 (aryl hydrocarbon receptor repressor gene, AHRR), chr6p21, and chr19p13 (F2R like thrombin or trypsin receptor 3 gene, F2RL3), all of which have CpGs associated at p < 10⁻¹⁰⁰. These loci include CpGs that were reported across EWASs of independent cohorts.
Fig. 1.

Circular Manhattan Plots Summarizing Epigenome-Wide Significant CpGs Associated with Four Smoking Cessation Outcomes. From the inner to outer circles, the panels show significantly associated CpGs (p < 10⁻⁶ for 27K array data or p < 10⁻⁷ for 450K array data) for (a) time since quitting; (b) current vs. former smokers; (c) former vs. never smokers; and (d) current vs. never smokers. The five genetic loci with the most statistically significant associations (p < 10⁻¹⁰⁰) are annotated with the gene/region names on chromosomes 2, 3, 5, 6, and 19.
Fig. 1 also shows that most CpGs with a significant association were observed when comparing current smokers to never smokers (4236 unique CpGs). Only 241 unique CpGs showed significant association when comparing former to never smokers and 80 when comparing current to former smokers. This pattern is not caused by sample size differences, as studies with comparable sample sizes across the phenotypes have observed that effect sizes are generally attenuated in former smokers, and thus, fewer CpGs surpass the epigenome-wide statistical significance threshold [35,37,39,40,42]. Instead, this pattern is expected, as DNAm at CpG sites often, but not always, reverts to physiologically normal levels over time with successful quitting as the effects of smoking exposure dissipate. For example, in a follow-up analysis of the Framingham Heart Study with smoking data collected prospectively over 30 years, the DNAm levels of most cessation-associated CpGs returned to levels comparable with never smokers within 5 years of smoking cessation; however, the levels of some CpGs, including ones annotated to AHRR and F2RL3, did not return to never smoker levels even up to 30 years after cessation [40].
Some EWAS-identified loci have been further studied for their dose-response effects and longitudinal change over time since quitting. For example, Zhang et al. showed that DNAm levels in a targeted region at F2RL3 have an inverse relationship with both current intensity (average number of cigarettes smoked per day) and lifetime pack-years of smoking, and that DNAm levels in former smokers were gradually restored over time, coming close to the levels observed in never smokers among those who had quit more than 20 years earlier [45]. With a larger set of 90 EWAS-identified CpGs, McCartney et al. supported the reversion of DNAm levels upon cessation and further showed that reversion is dependent on time since quitting and dose before quitting, as the reversion of DNAm occurs more swiftly in former low-dose smokers than in former high-dose smokers [46]. Thus, DNAm levels can accurately quantify the time and dose of smoking and serve as an efficient assessment in smoking cessation trials.
In summary, rigorous EWASs have been conducted via array-based technology to identify blood-based DNAm biomarkers related to smoking cessation. The replicable CpGs across independent studies show great potential for serving in clinical practice to quantify smoking exposure during the cessation attempt.
6. Regional DNAm associations with smoking cessation
Beginning in 2016, in addition to single CpG association testing, three EWASs examined associations of differentially methylated regions (DMRs) with smoking cessation [38,39,43]. DMRs are genomic regions that contain multiple adjacent CpGs that show differential DNAm with the phenotype of interest. DMRs can vary in length and occur in different contexts (for example, specific to a tissue, cell type, or phenotype) [47], and concentration of differential DNAm is thought to enhance the plausibility that the region influences regulation of gene transcription [48]. DMR testing also has a practical advantage of reducing the multiple testing burden for declaring statistically significant association by aggregating correlated individual CpG sites into fewer independent regions across the genome. Statistically significant DMRs can be found even in the absence of single genome-wide significant CpGs, as the aggregation of multiple, nominally significant, adjacent CpGs can together lead to a genome-wide significant DMR [48].
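The way that several nominally significant CpGs can jointly yield a significant region can be sketched in Python. The example below uses Fisher's method, which is a swapped-in illustration and not the procedure used by the cited EWASs; it assumes independent tests, whereas adjacent CpGs are correlated, so dedicated DMR methods adjust for that correlation. The p-values are hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine p-values with Fisher's method (assumes independent tests).

    The statistic -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom, whose survival function has a closed form.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals statistic / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i  # accumulates (statistic / 2) ** i / i!
        total += term
    return math.exp(-half_stat) * total


# Hypothetical region of five adjacent CpGs, none individually below 1e-7.
region_p_values = [1e-3] * 5
region_p = fisher_combined_p(region_p_values)
print(min(region_p_values) < 1e-7, region_p < 1e-7)
```

In this illustration no single CpG passes the array-wide threshold, yet the combined regional p-value does. Treating correlated CpGs as independent overstates the evidence, so the sketch should be read only as a demonstration of the aggregation principle.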
For the EWASs that tested DMR associations with smoking cessation, phenotypes included current vs. never smokers [38,39], interim quitters vs. current smokers (or former smokers) [43], and current smokers vs. nonsmokers (interim quitters, former smokers, and never smokers) [43]. Many DMRs were identified, including AHRR [38,39,43], ALPPL2 [38,39], growth factor independent 1 transcriptional repressor gene (GFI1) [39], myosin IG gene (MYO1G) [39], and more than 50 small nucleolar RNA genes [43].
AHRR is the most-consistently observed gene locus across studies, at both single CpGs and DMRs, and often with the most statistically significant support for its association with smoking cessation. At the DMR level, AHRR was shown to be hypomethylated in current smokers vs. former or never smokers [39], and this hypomethylation is thought to inactivate the aryl hydrocarbon receptor (AhR) pathway and plausibly influence risk of lung cancer or cardiovascular disease [43]. As observed for the AHRR probe cg05575921, hypomethylation can be observed after the consumption of several cartons of cigarettes [16,49,50]. This initial signature can be transient, showing reversion within a few months after quitting [32,51], and those who stop relatively early in their smoking trajectory renormalize their DNAm level within a year of cessation. As the intensity and duration of smoking increase, the breadth and magnitude of the remodeling steadily increase [32,52]. Once smoking has stopped, this smoking-induced DNAm signature reverts over a period of years [36,41,53]. Takeuchi et al. examined cg05575921 methylation over a 30-year window and suggested that full reversion of smoking-induced changes may take up to 20 years for heavy smokers [53]. With its consistent pattern of lower DNAm with higher smoking levels and current smoking status and its reversion to higher DNAm soon after quitting (for example, DNAm differences observed in current smokers vs. interim quitters [43]), AHRR is a strong candidate for a robust biomarker for cessation.
7. Multiple CpG predictors of smoking cessation
Building on the large number of reported associations between DNAm and smoking cessation, Elliott et al. [54] were the first to calculate a smoking score by adding up the products of DNAm levels and corresponding effect sizes of the CpGs reported to be significantly associated with smoking in a previous study [35]. Elliott et al. also used absolute thresholds to classify never, former, and current smokers. Since then, many DNAm-based smoking predictors have been developed with different feature selection techniques [40,55–61]. The various smoking predictors can distinguish current smokers from never smokers almost perfectly [62]. However, former smokers tend to be misclassified because smoking-induced DNAm changes revert as a function of time since cessation [37,39,54].
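As a rough illustration of the weighted-sum idea described above, the following Python sketch computes a score as the sum of DNAm level times effect size and then applies two absolute cutoffs. It is not the published implementation: apart from cg05575921, the CpG identifiers, the weights, and the cutoff values are hypothetical placeholders chosen only to make the arithmetic easy to follow.

```python
from typing import Dict


def smoking_score(betas: Dict[str, float], weights: Dict[str, float]) -> float:
    """Sum of (DNAm level x effect size) over CpGs present in both inputs."""
    shared = betas.keys() & weights.keys()
    return sum(betas[cpg] * weights[cpg] for cpg in shared)


def classify(score: float, former_cutoff: float, current_cutoff: float) -> str:
    """Map a score to a smoking category using two absolute thresholds."""
    if score >= current_cutoff:
        return "current"
    if score >= former_cutoff:
        return "former"
    return "never"


# Hypothetical effect sizes: a negative weight means lower DNAm in smokers.
weights = {"cg05575921": -10.0, "cg_hypothetical_1": -5.0, "cg_hypothetical_2": 4.0}

# Hypothetical DNAm levels (beta values between 0 and 1) for one sample.
sample = {"cg05575921": 0.60, "cg_hypothetical_1": 0.70, "cg_hypothetical_2": 0.20}

score = smoking_score(sample, weights)
label = classify(score, former_cutoff=-10.5, current_cutoff=-9.0)
print(round(score, 2), label)
```

In practice, the weights would come from a discovery EWAS and the cutoffs would be calibrated in an independent sample; former smokers are the group most likely to fall near a cutoff.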
Langdon et al. [62] compared the performance of four smoking predictors based on DNAm: (1) DNAm at AHRR-cg05575921; (2) 9 candidate smoking CpGs from the literature that were trained by a least absolute shrinkage and selection operator (LASSO) model; (3) weighted scores from 13 CpGs created by Maas et al. [56]; and (4) weighted scores from a LASSO model supplied with genome-wide 450K array data (agnostic LASSO model, 29 CpGs). The results showed that DNAm at AHRR-cg05575921 alone achieved an overall accuracy of 0.815 for predicting ever versus never smokers in an external independent validation dataset, and the accuracy was reduced to 0.612 for ternary classification of smoking status (current, former, and never smoking). The last model, trained from all 450K CpGs using LASSO, achieved a slightly better performance: 0.822 for binary classification (ever vs. never) and 0.637 for ternary classification. The results suggest AHRR-cg05575921 captures most of the variance in the current and former smoking class and thus can predict ever versus never smokers with high accuracy. Adding new CpG sites provides limited improvement in the prediction. However, the multi-CpG predictors capture additional variance between current and former smokers and can distinguish them more accurately. Other largely untested smoking outcomes, such as time since quitting, may also benefit from cumulative evidence from different loci rather than sole reliance on AHRR.
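To make the gap between binary and ternary accuracy concrete, the sketch below scores the same set of predictions two ways: once against the three-class labels and once after collapsing current and former into a single "ever" class. The labels are invented for illustration and are unrelated to the data in [62]; the point is only that confusions between current and former smokers count as errors in the ternary task but disappear in the binary one.

```python
from typing import List


def accuracy(true: List[str], pred: List[str]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if len(true) != len(pred) or not true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(true, pred)) / len(true)


def collapse_to_ever_never(labels: List[str]) -> List[str]:
    """Merge 'current' and 'former' into 'ever'; keep 'never' unchanged."""
    return ["never" if label == "never" else "ever" for label in labels]


# Hypothetical true and predicted smoking status for eight samples.
true_labels = ["current", "current", "former", "former",
               "former", "never", "never", "never"]
pred_labels = ["current", "former", "current", "former",
               "never", "never", "never", "former"]

ternary_acc = accuracy(true_labels, pred_labels)
binary_acc = accuracy(collapse_to_ever_never(true_labels),
                      collapse_to_ever_never(pred_labels))
print(ternary_acc, binary_acc)
```

Here two of the four ternary errors are current/former confusions, so they vanish once the classes are merged.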
8. Epigenetic influence of mothers’ smoking cessation in newborn cord blood
Cigarette smoking during pregnancy is a major risk factor for adverse birthing, infant, and childhood outcomes, yet maternal smoking is common. One in 14 women who gave birth in the United States in 2016 reported smoking during pregnancy [63]. Although the mechanisms underlying the smoking-related adverse outcomes are poorly understood, it is plausible that smoking-related DNAm changes may play a causal role. Quitting smoking as early as possible is strongly advised to mitigate the consequences of smoking on the mother and child.
A significant body of evidence has accumulated suggesting that maternal smoking during pregnancy induces changes of DNAm patterns in the offspring, usually examined in newborn cord blood [64–68]. Joubert et al. [65] investigated the influence of quitting smoking during pregnancy on DNAm in the newborn's cord blood; the effects of maternal smoking on the peripheral WBC DNAm signatures in newborn cord blood were not significant if the mothers quit smoking during early pregnancy, defined as quitting before 18 gestational weeks. A later study [69] confirmed that smoking cessation during early pregnancy restored DNAm patterns in newborn cord blood.
9. Current research limitations, technical challenges, and ethical considerations
Our understanding of epigenetics continues to evolve. One major challenge is the tissue- and cell type–specific variability that occurs across the epigenetic machinery. Most studies reviewed here measured DNAm in peripheral blood samples, which consist of a mixture of different cell types, including monocytes, lymphocytes, neutrophils, eosinophils, macrophages, and others. Different cell types have different life cycles; for example, leukocytes (WBCs) have a cycle of 6 to 20 days, whereas erythrocytes (red blood cells) have a cycle of approximately 120 days. Epigenetic signatures that reflect exposures that extend far beyond cell type life cycles (e.g., former smoking) must rely on other cell types or tissues that drive long-lived signals [70], yet the cell type specificity of DNAm or other epigenetic alterations and their underlying drivers in relation to smoking cessation are largely unknown. The DNAm patterns observed are averaged across cell compositions, the heterogeneity of which potentially confounds true DNAm differences between current, former, and never smoker groups. Further, peripheral biomarkers may help infer biology underlying disease processes for epigenetic marks that are shared across tissues, as seen in adipose tissue for metabolic disease [71], but this potential use is also limited by the incompleteness of the map of tissue- and cell type–specific epigenetic changes related to smoking across peripheral and other disease-relevant tissues and cell types.
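The confounding described above can be illustrated with a small calculation. In the sketch below, the bulk DNAm level at a CpG is modeled as the proportion-weighted average of cell type–specific levels; this mixture model and all of the numbers are simplifying assumptions for illustration, not measured values. Two groups share identical cell type–specific methylation yet differ in bulk DNAm purely because their cell compositions differ.

```python
from typing import Dict


def bulk_methylation(proportions: Dict[str, float],
                     cell_betas: Dict[str, float]) -> float:
    """Proportion-weighted average of cell type-specific DNAm levels."""
    if abs(sum(proportions.values()) - 1.0) > 1e-9:
        raise ValueError("cell type proportions must sum to 1")
    return sum(proportions[cell] * cell_betas[cell] for cell in proportions)


# Hypothetical DNAm levels at one CpG, identical in both groups.
cell_betas = {"neutrophil": 0.80, "lymphocyte": 0.60, "monocyte": 0.70}

# Hypothetical cell compositions that differ between the two groups.
group_a = {"neutrophil": 0.60, "lymphocyte": 0.30, "monocyte": 0.10}
group_b = {"neutrophil": 0.70, "lymphocyte": 0.20, "monocyte": 0.10}

bulk_a = bulk_methylation(group_a, cell_betas)
bulk_b = bulk_methylation(group_b, cell_betas)
print(round(bulk_a, 3), round(bulk_b, 3), round(bulk_b - bulk_a, 3))
```

The apparent difference of about 0.02 arises with no change in methylation within any cell type, which is why estimated cell proportions are commonly included as covariates in blood-based analyses.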
Unlike genetic variants, which are fixed, epigenetic elements exhibit dynamic changes in response to genetic background, age, sex, ancestry group, other environmental exposure, clinical co-morbidities, and so on. Thus, epigenome analyses are sensitive to confounding factors [4].
The measurement of DNAm, whether by array or sequencing methods, relies extensively on bisulfite conversion, which converts unmethylated cytosines to uracils (read as thymines after amplification) but leaves methylated cytosines unchanged. However, such conversion does not distinguish between the two cytosine modifications: 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) [72,73]. Thus, the observed DNAm patterns are the combination of the two cytosine modifications, and the two types of modifications are indistinguishable in most studies of DNAm differences.
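The following toy simulation shows why the two modifications cannot be told apart from a bisulfite readout alone. It is a deliberately simplified sketch: the sequence, the positions, and the function name are hypothetical, and conversion is assumed to be complete and error-free. An unmodified cytosine is read as T, whereas a cytosine carrying either 5mC or 5hmC is read as C.

```python
from typing import Dict


def bisulfite_read(sequence: str, modifications: Dict[int, str]) -> str:
    """Return the sequence as read after bisulfite conversion.

    `modifications` maps 0-based cytosine positions to "5mC" or "5hmC".
    Unmodified cytosines are read as T; modified cytosines are read as C.
    """
    protected = ("5mC", "5hmC")
    read = []
    for position, base in enumerate(sequence):
        if base == "C" and modifications.get(position) not in protected:
            read.append("T")
        else:
            read.append(base)
    return "".join(read)


sequence = "ACGTCGAC"  # hypothetical fragment with cytosines at positions 1, 4, 7

read_5mc = bisulfite_read(sequence, {1: "5mC"})
read_5hmc = bisulfite_read(sequence, {1: "5hmC"})
read_unmodified = bisulfite_read(sequence, {})
print(read_5mc, read_5hmc, read_unmodified)
```

The first two reads are identical, so a methylation call at position 1 reflects the sum of 5mC and 5hmC rather than either mark on its own.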
The promise of epigenetic biomarkers comes with important ethical considerations to ensure their fair and equitable use. First, many findings reviewed here have focused predominantly on populations of European ancestry, reflecting their widely recognized disproportionate representation in research. It is necessary to expand these studies into other populations so epigenetic biomarkers do not unintentionally widen health disparities, with their best prediction occurring in European ancestry populations and their predictive ability decreasing in populations that are underrepresented in research. Second, although there is evidence in plants and model organisms that epigenetic modifications have a small heritable contribution [74–76], studying transgenerational epigenetic inheritance in humans is challenging. Indirect evidence from epidemiologic studies is subject to potential confounding by nongenetic factors, and study designs are limited for directly assessing evidence in the germline (i.e., egg and sperm cells) [77]. To date, the preponderance of evidence suggests that the transmission of epigenetic modifications associated with smoking occurs through intrauterine and childhood exposures [75,77]. These transgenerational effects of epigenetic marks (i.e., transmission of risk to future generations) could shift the focus from the general population to targeting disadvantaged individuals with an accumulation of exposures. Researchers can help address this concern by following the principle of equity, as reviewed elsewhere [78], to ensure that epigenetic signatures are not used to disproportionately target future generations. Third, epigenetic biomarkers are designed to reveal information more reliably than self-reporting. This information about current and past behaviors and environmental exposures may be sensitive, however, and legal and other protections may be inadequate.
10. Conclusions and future directions
Blood-based epigenetic marks show great promise for serving as a robust biomarker to differentiate current, former, and never smokers and to objectively quantify time since quitting. At present, thousands of DNAm sites (CpGs) have been identified, and multisite prediction algorithms have been developed with the top CpGs to differentiate smoking groups. For time since cessation, initial epigenome-wide studies and follow-up studies of well-established smoking-associated CpGs have provided convincing evidence that DNAm responds to the presence (or absence) of cigarette smoking exposure over time, but more data and research are needed.
Larger samples with longitudinal data on smoking phenotypes and repeat methylation measures are needed to gain a better understanding of the reversion of DNAm levels at specific CpG sites and to improve algorithms that combine information from multiple CpGs into a singular predictor. A singular, easily measurable, and reliable predictor is ideal for clinical use. If applied today, however, a multisite predictor would prove more useful in populations of European ancestry than in other ancestries, given the preponderance of European ancestry-based data used to discover and characterize DNAm as a function of cigarette smoking. This disproportionate representation within the United States and globally is analogous to the historical trend for genome-wide association studies [79]. The current DNAm profile related to smoking is largely driven by the European ancestry group and is thus most applicable to this group. There are studies in other race/ethnic groups reporting replicated and novel DNAm biomarkers [38,80,81]. The replicated DNAm changes around the genes AHRR, F2RL3, and GPR15 reflect the common associations with smoking exposure, and novel DNAm changes identified in different race/ethnic groups indicate the interactions between smoking exposure and different genetic/environmental factors. It is imperative that future studies expand the available data in non-European ancestry populations and combine these data with the existing data to identify novel CpGs that may respond differently to cigarette smoking by ancestry and to refine predictors that account for ancestral diversity. Without a concentrated effort to improve the diversity in basic science and clinical research, the use of epigenetic biomarkers for smoking cessation could exacerbate disparities in underrepresented populations.
In this review, we focused on DNAm as the most common epigenetic data type used to study peripheral biomarkers for smoking cessation. Three main types of epigenetic mechanisms can alter gene regulation at progressive stages from RNA transcription to protein translation: DNAm (transcriptional regulation), noncoding RNAs (translational regulation), and histone modification (post-translational regulation). Kaur et al. conducted a systematic review of all three epigenetic types for smoking phenotypes more broadly [4]. No reports of histone modifications and only a few studies of noncoding RNAs were found using blood or other noninvasive peripheral tissues to study smoking cessation in humans [4]. Noncoding RNAs represent a large family of RNAs, and they are categorized into at least 17 different types according to their size (200 nucleotides used as the threshold to classify RNAs as small or long) or function [82]. Among them, for example, miRNAs—short RNA transcripts that regulate expression of many genes by binding to target protein-coding transcripts to block their translation or destabilize them—have garnered attention as strong biomarker and drug discovery candidates [83]. Other types of epigenetic biomarkers, beyond DNAm, merit attention for future research to improve the prediction with a more-holistic epigenetic signature and to inform treatment and research strategies for smoking cessation.
Smoking exposure has been linked to various types of cancer, and many studies have revealed the underlying genetic mechanisms linking smoking exposure to cancer risks [84]. However, in most of these studies, smoking status is largely dependent on self-report, and smoking is heavily underreported by cancer patients. Using DNAm biomarkers could provide a more precisely defined smoking status phenotype and further strengthen the evidence for genetic associations between smoking exposure and cancer risks. Recently, two independent studies have used the DNAm level at cg05575921-AHRR as part of an algorithm for guiding low-dose computed tomography for lung cancer screening [85,86], and the improved specificity suggests a potential clinical use of DNAm biomarkers in cancer screening. Relying on a DNAm biomarker to evaluate smoking status rather than self-report would also provide medical personnel with a more accurate smoking assessment for individuals undergoing cancer treatment.
In the setting of clinical trials, one of the most difficult barriers to implementing smoking cessation interventions is accurately quantifying decreases in smoking and the success of cessation therapy. The two most widely used assessment tools, cotinine and exhaled CO levels, can generally detect recent smoking exposure in a window of hours or a few days. Use of cotinine levels to assess smoking is confounded by other sources of nicotine exposure, including secondhand smoking, e-cigarettes, and nicotine replacement therapy [87]. Thus, its usage is limited in smoking cessation treatment. Assessment with exhaled CO levels is relatively insensitive to light-to-moderate smoking [88,89], which largely reduces its efficiency in guiding smoking cessation. In contrast, DNAm as a biomarker can better archive historical smoking exposure over a span of years and accurately track the progress of smoking reduction and cessation [90]. Other nicotine-containing products may have effects on DNAm profiles. A recent study [91] found that the DNAm profiles of e-cigarette users are distinct from those of cigarette smokers and similar to those of non-smokers. In contrast, the heat-not-burn consumption of tobacco shows similar DNAm patterns to smokers [92], but non-combustible forms of tobacco consumption do not have an effect on DNAm levels at AHRR-cg05575921 [60]. In recent clinical trials, methylation status at AHRR-cg05575921 has been employed as a biomarker to quantify smoking status and the decrease in smoking during cessation with high accuracy [51,60], suggesting that periodic assessment of changes in methylation can help guide smoking cessation therapy. It is important to note that continuous measures of cotinine levels and exhaled CO were taken to provide a more accurate measure of smoking than relying on self-report and thus more objective verification of the DNAm changes observed. Future research would benefit from more widespread adoption of this approach of taking continuous measures to corroborate self-reporting.
DNAm biomarkers show high potential in clinical practice for quantifying smoking behaviors (or lack thereof) over time among smokers undergoing cessation treatment, as they offer a marked advantage over currently available detection methods (such as cotinine) that have only a narrow window of detection of current smoking status. The clinical trials leveraging DNAm biomarkers were limited by small sample sizes and few time points (baseline and 1, 3, and 6 months), yet they still showed that the reversion of DNAm level at cg05575921 was not simply linear and may offer a personalized profile for evaluating quit attempts. More intense investigation is needed to fully understand the DNAm changes as a function of time since cessation, former smoking behaviors, and changes in smoking heaviness. Given the variety of nicotine-containing products on the market, the popularity of cannabis, and the co-morbidity of other forms of substance use, we recommend that the best-designed studies include robust, comprehensive laboratory examinations of multiple substances at many time points. Moreover, the importance and utility of biosample collection have been highlighted in many clinical trials for smoking cessation therapy [90,93]. The availability of a reservoir of biosamples may facilitate the usage of DNAm assessment in clinical trials at all stages.
Before clinical implementation in markets that are regulated by the FDA or the European Union, in accordance with their regulatory guidelines, any epigenetic biomarker assessments will require analytical and clinical validation. Sestakova et al. provided an overview of validation methods for DNAm biomarker assays [25], and Cowley et al. summarized the methods for developing and validating clinical prediction algorithms in general [94]. Because of the clinical testing regulations, at a practical level, any implemented assay must have clinical verification of the performance standards as described by the Centers for Medicare and Medicaid Services or newly described for the European Union [95]. In brief, each assay must have documented accuracy, precision, reportable range, and reference values for the intended testing population before first clinical use. Finally, because these assays will be classified by regulatory bodies as in vitro diagnostics, they will need to adhere to the extensive network of regulations guiding clinical laboratory testing. The guidelines carefully describe the conditions under which the components of the assay, including instrumentation and reagents, are produced, and the qualifications individuals need to conduct the assays for clinical purposes [96]. The net effect of these standards and regulations is that any smoking cessation biomarker assay with promise for clinical use will need to be produced by a manufacturer compliant with FDA and ISO 13485:2016 standards and conducted in standard clinical molecular biology laboratories that already conduct nucleic acid-based clinical tests.
Although great progress has been made over the past decade in identifying epigenetic biomarkers of smoking cessation, the present state of the field still largely relies on subjective self-report, which is subject to differential misclassification (i.e., smokers are more likely to misclassify themselves than nonsmokers) and recall bias. A future state is desired in which granular information on past and current smoking behaviors can be objectively assessed in research and clinical settings using noninvasive biospecimens, prompt and accurate measurements, and results that are reliable and interpretable for researchers, clinicians, and patients. Much work remains to meet this goal of tailoring smoking cessation treatment based on personalized signatures (i.e., precision medicine) to improve success in quitting smoking and curb smoking’s adverse health effects. Achieving this goal will necessitate studies with larger and more-diverse samples, longitudinal data to track the impact of smoking exposure and cessation on the epigenome over time, and a broader capture of epigenetic biomarkers.
Acknowledgements
This work was supported by the National Institutes of Health, National Institute on Drug Abuse (Grant Nos. R01DA042090, R01DA051913, and R01DA048824), National Institute on Alcohol Abuse and Alcoholism (grant number R21AA029435), and National Cancer Institute (R01CA220254 and R43CA257372), and by the Fellow Program at RTI International.
Footnotes
Declaration of Competing Interest
Dr. Philibert is the Chief Executive Officer of Behavioral Diagnostics and the Chief Medical Officer of Cardio Diagnostics. The use of cg05575921 to assess smoking status is covered by existing and pending patents, including U.S. Patents 8,637,652 and 9,273,358. The use of DMR11 to impute cell heterogeneity is covered by pending patents assigned to Behavioral Diagnostics. The other authors have no competing interests to declare.
Supplementary materials
Supplementary material associated with this article can be found, in the online version, at doi: 10.1016/j.addicn.2023.100079.
Data availability
No data was used for the research described in the article.
References
- [1].Cornelius ME, Loretan CG, Wang TW, Jamal A, Homa DM, Tobacco product use among adults - United States, 2020, MMWR Morb. Mortal. Wkly. Rep 71 (11) (2022) 397–405 [DOI] [PMC free article] [PubMed] [Google Scholar]
- [2].United States Public Health Service Office of the Surgeon General, National Center for Chronic Disease Prevention and Health Promotion (US) Office on Smoking and Health, Chapter 2, patterns of smoking cessation among U.S. adults, young adults, and youthSmoking Cessation: A Report of the Surgeon General, US Department of Health and Human Services, Washington (DC), 2020. [Google Scholar]
- [3].Bird A, Perceptions of epigenetics, Nature 447 (7143) (2007) 396–398. [DOI] [PubMed] [Google Scholar]
- [4].Kaur G, Begum R, Thota S, S Batra A systematic review of smoking-related epigenetic alterations, Arch. Toxicol 93 (10) (2019) 2715–2740 [DOI] [PubMed] [Google Scholar]
- [5].Florescu A, Ferrence R, Einarson T, Selby P, Soldin O, Koren G, Methods for quantification of exposure to cigarette smoking and environmental tobacco smoke: focus on developmental toxicology, Ther. Drug Monit 31 (1) (2009) 14–30 [DOI] [PMC free article] [PubMed] [Google Scholar]
- [6].Hamilton CM, Strader LC, Pratt JG, Maiese D, Hendershot T, Kwok RK, et al. , The PhenX Toolkit: get the most from your measures, Am. J. Epidemiol 174 (3) (2011) 253–260 [DOI] [PMC free article] [PubMed] [Google Scholar]
- [7].Bierut LJ, Hendershot T, Benowitz NL, Cummings M, Mermelstein RJ, Piper ME, et al. Smoking cessation, harm reduction and biomarkers protocols in the PhenX toolkit: tools for standardized data collection. Addict. Neurosci. Submitted [DOI] [PMC free article] [PubMed] [Google Scholar]
- [8].Chang CM, Edwards SH, Arab A, Del Valle-Pinero AY, Yang L, Hatsukami DK, Biomarkers of tobacco exposure: summary of an FDA-sponsored public workshop, Cancer Epidemiol. Biomark. Prev 26 (3) (2017) 291–302. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [9].Dawes K, Andersen A, Reimer R, Mills JA, Hoffman E, Long JD, et al. , The relationship of smoking to cg05575921 methylation in blood and saliva DNA samples from several studies, Sci. Rep 11 (1) (2021) 21627. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [10].Dawes K, Andersen A, Vercande K, Papworth E, Philibert W, Beach SRH, et al. , Saliva DNA methylation detects nascent smoking in adolescents, J. Child Adolesc. Psychopharmacol 29 (7) (2019) 535–544 [DOI] [PMC free article] [PubMed] [Google Scholar]
- [11].Smith AK, Kilaru V, Klengel T, Mercer KB, Bradley B, Conneely KN, et al. , DNA extracted from saliva for methylation studies of psychiatric traits: evidence tissue specificity and relatedness to brain, Am. J. Med. Genet. Part B Neuropsychiatr. Genet. Off. Publ. Int. Soc. Psychiatr. Genet 168B (1) (2015) 36–44 [DOI] [PMC free article] [PubMed] [Google Scholar]
- [12].Zhao E, Xu H, Wang L, Kryczek I, Wu K, Hu Y, et al. , Bone marrow and the control of immunity, Cell. Mol. Immunol 9 (1) (2012) 11–19 [DOI] [PMC free article] [PubMed] [Google Scholar]
- [13].Cheng H, Zheng Z, Cheng T, New paradigms on hematopoietic stem cell differentiation, Protein Cell 11 (1) (2020) 34–44 [DOI] [PMC free article] [PubMed] [Google Scholar]
- [14].Blumenreich MS, The white blood cell and differential count, in: Walker HK, Hall WD, Hurst JW (Eds.), Clinical Methods: The History, Physical, and Laboratory Examinations, Butterworths, Boston, 1990 [PubMed] [Google Scholar]
- [15].Baliu-Piqué M, Verheij MW, Drylewicz J, Ravesloot L, de Boer RJ, Koets A, et al. , Short lifespans of memory t-cells in bone marrow, blood, and lymph nodes suggest that T-cell memory is maintained by continuous self-renewal of recirculating cells, Front. Immunol 9 (2018) [DOI] [PMC free article] [PubMed] [Google Scholar]
- [16].Dawes K, Andersen A, Papworth E, Hundley B, Hutchens N, El Manawy H, et al. , Refinement of cg05575921 demethylation response in nascent smoking, Clin. Epigenet 12 (1) (2020) 1–11 [DOI] [PMC free article] [PubMed] [Google Scholar]
- [17].The Blueprint Consortium, Bock C, Halbritter F, Carmona FJ, Tierling S, Datlinger P, et al. , Quantitative comparison of DNA methylation assays for biomarker development and clinical applications, Nat. Biotechnol 34 (7) (2016) 726. [DOI] [PubMed] [Google Scholar]
- [18].Rhoads A, Au KF, PacBio sequencing and its applications, Genom. Proteom. Bioinf 13 (5) (2015) 278–289 [DOI] [PMC free article] [PubMed] [Google Scholar]
- [19].Li S, Tollefsbol TO, DNA methylation methods: global DNA methylation and methylomic analyses, Methods 187 (2021) 28–43 [DOI] [PMC free article] [PubMed] [Google Scholar]
- [20].Godderis L, Schouteden C, Tabish A, Poels K, Hoet P, Baccarelli AA, et al. , Global methylation and hydroxymethylation in DNA from blood and saliva in healthy volunteers, Biomed. Res. Int 2015 (2015) 845041. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [21].Houseman EA, Kile ML, Christiani DC, Ince TA, Kelsey KT, Marsit CJ, Reference-free deconvolution of DNA methylation data and mediation by cell composition effects, BMC Bioinf 17 (1) (2016) 259. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [22].Dawes K, Andersen A, Reimer R, Mills JA, Hoffman E, Long JD, et al. , The relationship of smoking to cg05575921 methylation in blood and saliva DNA samples from several studies, Sci. Rep 11 (1) (2021) 21627. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [23].Moser DA, Muller S, Hummel EM, Limberg AS, Dieckmann L, Frach L, et al. , Targeted bisulfite sequencing: a novel tool for the assessment of DNA methylation with high sensitivity and increased coverage, Psychoneuroendocrinology 120 (2020) 104784. [DOI] [PubMed] [Google Scholar]
- [24].Pidsley R, Zotenko E, Peters TJ, Lawrence MG, Risbridger GP, Molloy P, et al. , Critical evaluation of the Illumina MethylationEPIC BeadChip microarray for whole-genome DNA methylation profiling, Genome Biol. 17 (1) (2016) 208. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [25].Sestakova S, Salek C, Remesova H, DNA methylation validation methods: a coherent review with practical comparison, Biol. Proced Online 21 (2019) 19. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [26].Kruppa J, Sieg M, Richter G, Pohrt A, Estimands in epigenome-wide association studies, Clin. Epigenet 13 (1) (2021) 98. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [27].Daunay A, Baudrin LG, Deleuze JF, How-Kit A Evaluation of six blood-based age prediction models using DNA methylation analysis by pyrosequencing, Sci. Rep 9 (1) (2019) 8862. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [28].Clarke J, Wu HC, Jayasinghe L, Patel A, Reid S, Bayley H, Continuous base identification for single-molecule nanopore DNA sequencing, Nat. Nanotechnol 4 (4) (2009) 265–270 [DOI] [PubMed] [Google Scholar]
- [29].Kunze S, Quantitative region-specific DNA methylation analysis by the EpiTYPER technology, Methods Mol. Biol 1708 (2018) 515–535 [DOI] [PubMed] [Google Scholar]
- [30].Taryma-Le ś niak O, Sokolowska KE, Wojdacz TK, Current status of development of methylation biomarkers for in vitro diagnostic IVD applications, Clin. Epigenet 12 (1) (2020) 1–16 [DOI] [PMC free article] [PubMed] [Google Scholar]
- [31].Pinheiro L, Emslie KR, Basic concepts and validation of digital PCR measurements, in: Karlin-Neumann G, Bizouarn F (Eds.), Digital PCR: Methods and Protocols, Springer New York, New York, NY, 2018, pp. 11–24 [DOI] [PubMed] [Google Scholar]
- [32].Philibert R, Mills JA, Long JD, Salisbury SE, Comellas A, Gerke A, et al. , The reversion of cg05575921 methylation in smoking cessation: a potential tool for incentivizing healthy aging, Genes (Basel) 11 (12) (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [33].Philibert RA, Beach SRH, Gunter TD, Brody GH, Madan A, Gerrard M, The effect of smoking on MAOA promoter methylation in DNA prepared from lymphoblasts and whole blood, Am. J. Med. Genet. Part B Neuropsychiatr. Genet. Off. Publ. Int. Soc. Psychiatr. Genet 153B (2) (2010) 619–628 [DOI] [PMC free article] [PubMed] [Google Scholar]
- [34].Wan ES, Qiu W, Baccarelli A, Carey VJ, Bacherman H, Rennard SI, et al. , Cigarette smoking behaviors and time since quitting are associated with differential DNA methylation across the human genome, Hum. Mol. Genet 21 (13) (2012) 3073–3082 [DOI] [PMC free article] [PubMed] [Google Scholar]
- [35].Zeilinger S, Kuhnel B, Klopp N, Baurecht H, Kleinschmidt A, Gieger C, et al. , Tobacco smoking leads to extensive genome-wide changes in DNA methylation, PLoS One 8 (5) (2013) e63812. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [36].Tsaprouni LG, Yang TP, Bell J, Dick KJ, Kanoni S, Nisbet J, et al. , Cigarette smoking reduces DNA methylation levels at multiple genomic loci but the effect is partially reversible upon cessation, Epigenetics 9 (10) (2014) 1382–1396. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [37].Guida F, Sandanger TM, Castagne R, Campanella G, Polidoro S, Palli D, et al. , Dynamics of smoking-induced genome-wide methylation changes with time since smoking cessation, Hum. Mol. Genet 24 (8) (2015) 2349–2359 [DOI] [PMC free article] [PubMed] [Google Scholar]
- [38].Lee MK, Hong Y, Kim SY, London SJ, Kim WJ, DNA methylation and smoking in Korean adults: epigenome-wide association study, Clin. Epigenet 8 (2016) 103. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [39].Ambatipudi S, Cuenin C, Hernandez-Vargas H, Ghantous A, Le Calvez-Kelm F, Kaaks R, et al. , Tobacco smoking-associated genome-wide DNA methylation changes in the EPIC study, Epigenomics 8 (5) (2016) 599–618 [DOI] [PubMed] [Google Scholar]
- [40].Joehanes R, Just AC, Marioni RE, Pilling LC, Reynolds LM, Mandaviya PR, et al. , Epigenetic signatures of cigarette smoking, Circ. Cardiovasc. Genet 9 (5) (2016) 436–447 [DOI] [PMC free article] [PubMed] [Google Scholar]
- [41].Wilson R, Wahl S, Pfeiffer L, Ward-Caviness CK, Kunze S, Kretschmer A, et al. , The dynamics of smoking-related disturbed methylation: a two time-point study of methylation change in smokers, non-smokers and former smokers, BMC Genom. Electron. Resour 18 (1) (2017) 805. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [42].Dugue PA, Jung CH, Joo JE, Wang X, Wong EM, Makalic E, et al., Smoking and blood DNA methylation: an epigenome-wide association study and assessment of reversibility, Epigenetics 15 (4) (2020) 358–368.
- [43].Keshawarz A, Joehanes R, Guan W, Huan T, DeMeo DL, Grove ML, et al., Longitudinal change in blood DNA epigenetic signature after smoking cessation, Epigenetics (2021) 1–12.
- [44].Battram T, Yousefi P, Crawford G, Prince C, Sheikhali Babaei M, Sharp G, et al., The EWAS Catalog: a database of epigenome-wide association studies, Wellcome Open Res. 7 (2022) 41.
- [45].Zhang Y, Yang R, Burwinkel B, Breitling LP, Brenner H, F2RL3 methylation as a biomarker of current and lifetime smoking exposures, Environ. Health Perspect 122 (2) (2014) 131–137.
- [46].McCartney DL, Stevenson AJ, Hillary RF, Walker RM, Bermingham ML, Morris SW, et al., Epigenetic signatures of starting and stopping smoking, EBioMedicine 37 (2018) 214–220.
- [47].Rakyan VK, Down TA, Balding DJ, Beck S, Epigenome-wide association studies for common human diseases, Nat. Rev. Genet 12 (8) (2011) 529–541.
- [48].Lister R, Pelizzola M, Dowen RH, Hawkins RD, Hon G, Tonti-Filippini J, et al., Human DNA methylomes at base resolution show widespread epigenomic differences, Nature 462 (7271) (2009) 315–322.
- [49].Philibert RA, Beach SR, Brody GH, Demethylation of the aryl hydrocarbon receptor repressor as a biomarker for nascent smokers, Epigenetics 7 (11) (2012).
- [50].Philibert RA, Beach SR, Lei MK, Brody GH, Changes in DNA methylation at the aryl hydrocarbon receptor repressor may be a new biomarker for smoking, Clin. Epigenet 5 (1) (2013) 1–8.
- [51].Philibert R, Hollenbeck N, Andersen E, McElroy S, Wilson S, Vercande K, et al., Reversion of AHRR demethylation is a quantitative biomarker of smoking cessation, Front. Psychiatry 7 (2016) 55.
- [52].Dogan MV, Beach SRH, Philibert RA, Genetically contextual effects of smoking on genome wide DNA methylation, Am. J. Med. Genet. B Neuropsychiatr. Genet 174 (6) (2017) 595–607.
- [53].Takeuchi F, Takano K, Yamamoto M, Isono M, Miyake W, Mori K, et al., Clinical implication of smoking-related aryl-hydrocarbon receptor repressor (AHRR) hypomethylation in Japanese adults, Circ. J (2022) CJ–21–0958.
- [54].Elliott HR, Tillin T, McArdle WL, Ho K, Duggirala A, Frayling TM, et al., Differences in smoking associated DNA methylation patterns in South Asians and Europeans, Clin. Epigenet 6 (1) (2014) 4.
- [55].Andersen AM, Philibert RA, Gibbons FX, Simons RL, Long J, Accuracy and utility of an epigenetic biomarker for smoking in populations with varying rates of false self-report, Am. J. Med. Genet. B Neuropsychiatr. Genet 174 (6) (2017) 641–650.
- [56].Maas SCE, Vidaki A, Wilson R, Teumer A, Liu F, van Meurs JBJ, et al., Validated inference of smoking habits from blood with a finite DNA methylation marker set, Eur. J. Epidemiol 34 (11) (2019) 1055–1074.
- [57].Philibert R, Dogan M, Noel A, Miller S, Krukow B, Papworth E, et al., Dose response and prediction characteristics of a methylation sensitive digital PCR assay for cigarette consumption in adults, Front. Genet 9 (2018) 137.
- [58].McCartney DL, Hillary RF, Stevenson AJ, Ritchie SJ, Walker RM, Zhang Q, et al., Epigenetic prediction of complex traits and death, Genome Biol 19 (1) (2018) 136.
- [59].Corley J, Cox SR, Harris SE, Hernandez MV, Maniega SM, Bastin ME, et al., Epigenetic signatures of smoking associate with cognitive function, brain structure, and mental and physical health outcomes in the Lothian Birth Cohort 1936, Transl. Psychiatry 9 (1) (2019) 248.
- [60].Philibert R, Hollenbeck N, Andersen E, Osborn T, Gerrard M, Gibbons FX, et al., A quantitative epigenetic approach for the assessment of cigarette consumption, Front. Psychol 6 (2015) 656.
- [61].Bollepalli S, Korhonen T, Kaprio J, Anders S, Ollikainen M, EpiSmokEr: a robust classifier to determine smoking status from DNA methylation data, Epigenomics 11 (13) (2019) 1469–1486.
- [62].Langdon RJ, Yousefi P, Relton CL, Suderman MJ, Epigenetic modelling of former, current and never smokers, Clin. Epigenet 13 (1) (2021) 206.
- [63].Drake P, Driscoll AK, Mathews TJ, Cigarette smoking during pregnancy: United States, 2016, NCHS Data Brief (305) (2018) 1–8.
- [64].Joubert BR, Felix JF, Yousefi P, Bakulski KM, Just AC, Breton C, et al., DNA methylation in newborns and maternal smoking in pregnancy: genome-wide consortium meta-analysis, Am. J. Hum. Genet 98 (4) (2016) 680–696.
- [65].Joubert BR, Haberg SE, Bell DA, Nilsen RM, Vollset SE, Midttun O, et al., Maternal smoking and DNA methylation in newborns: in utero effect or epigenetic inheritance? Cancer Epidemiol. Biomark. Prev 23 (6) (2014) 1007–1017.
- [66].Joubert BR, Haberg SE, Nilsen RM, Wang X, Vollset SE, Murphy SK, et al., 450K epigenome-wide scan identifies differential DNA methylation in newborns related to maternal smoking during pregnancy, Environ. Health Perspect 120 (10) (2012) 1425–1431.
- [67].Bauer T, Trump S, Ishaque N, Thurmann L, Gu L, Bauer M, et al., Environment-induced epigenetic reprogramming in genomic regulatory elements in smoking mothers and their children, Mol. Syst. Biol 12 (3) (2016) 861.
- [68].Howe CG, Zhou M, Wang X, Pittman GS, Thompson IJ, Campbell MR, et al., Associations between maternal tobacco smoke exposure and the cord blood CD4+ DNA methylome, Environ. Health Perspect 127 (4) (2019) 47009.
- [69].Miyake K, Kawaguchi A, Miura R, Kobayashi S, Tran NQV, Kobayashi S, et al., Association between DNA methylation in cord blood and maternal smoking: the Hokkaido study on environment and children’s health, Sci. Rep 8 (1) (2018) 5654.
- [70].Pollock JD, Lossie AC, Phenotype and environment matter: discovering the genetic and epigenetic architecture of alcohol use disorders, Am. J. Psychiatry 176 (2) (2019) 92–95.
- [71].Tsai PC, Glastonbury CA, Eliot MN, Bollepalli S, Yet I, Castillo-Fernandez JE, et al., Smoking induces coordinated DNA methylation and gene expression changes in adipose tissue with consequences for metabolic health, Clin. Epigenet 10 (1) (2018) 126.
- [72].Booth MJ, Ost TW, Beraldi D, Bell NM, Branco MR, Reik W, et al., Oxidative bisulfite sequencing of 5-methylcytosine and 5-hydroxymethylcytosine, Nat. Protoc 8 (10) (2013) 1841–1851.
- [73].Yu M, Hon GC, Szulwach KE, Song CX, Jin P, Ren B, et al., Tet-assisted bisulfite sequencing of 5-hydroxymethylcytosine, Nat. Protoc 7 (12) (2012) 2159–2170.
- [74].Heard E, Martienssen RA, Transgenerational epigenetic inheritance: myths and mechanisms, Cell 157 (1) (2014) 95–109.
- [75].Perez MF, Lehner B, Intergenerational and transgenerational epigenetic inheritance in animals, Nat. Cell Biol 21 (2) (2019) 143–151.
- [76].Feiner N, Radersma R, Vasquez L, Ringner M, Nystedt B, Raine A, et al., Environmentally induced DNA methylation is inherited across generations in an aquatic keystone species, iScience 25 (5) (2022) 104303.
- [77].van Otterdijk SD, Michels KB, Transgenerational epigenetic inheritance in mammals: how good is the evidence? FASEB J. 30 (7) (2016) 2457–2465.
- [78].Rothstein MA, Cai Y, Marchant GE, The ghost in our genes: legal and ethical implications of epigenetics, Health Matrix 19 (1) (2009) 1–62.
- [79].Mills MC, Rahal C, The GWAS Diversity Monitor tracks diversity by disease in real time, Nat. Genet 52 (3) (2020) 242–243.
- [80].Barcelona V, Huang Y, Brown K, Liu J, Zhao W, Yu M, et al., Novel DNA methylation sites associated with cigarette smoking among African Americans, Epigenetics 14 (4) (2019) 383–391.
- [81].Zhu X, Li J, Deng S, Yu K, Liu X, Deng Q, et al., Genome-wide analysis of DNA methylation and cigarette smoking in a Chinese population, Environ. Health Perspect 124 (7) (2016) 966–973.
- [82].Ratti M, Lampis A, Ghidini M, Salati M, Mirchev MB, Valeri N, et al., MicroRNAs (miRNAs) and long non-coding RNAs (lncRNAs) as new tools for cancer therapy: first steps from bench to bedside, Target Oncol. 15 (3) (2020) 261–278.
- [83].Smith ACW, Kenny PJ, MicroRNAs regulate synaptic plasticity underlying drug addiction, Genes Brain Behav. 17 (3) (2018) e12424.
- [84].Al-Obaide MAI, Ibrahim BA, Al-Humaish S, Abdel-Salam AG, Genomic and bioinformatics approaches for analysis of genes associated with cancer risks following exposure to tobacco smoking, Front. Public Health 6 (2018) 84.
- [85].Jacobsen KK, Schnohr P, Jensen GB, Bojesen SE, AHRR (cg05575921) methylation safely improves specificity of lung cancer screening eligibility criteria: a cohort study, Cancer Epidemiol. Biomark. Prev 31 (4) (2022) 758–765.
- [86].Philibert R, Dawes K, Moody J, Hoffman R, Sieren J, Long J, Using Cg05575921 methylation to predict lung cancer risk: a potentially bias-free precision epigenetics approach, Epigenetics 17 (13) (2022) 2096–2108.
- [87].Hajek P, McRobbie H, Gillison F, Dependence potential of nicotine replacement treatments: effects of product type, patient characteristics, and cost to user, Prev. Med 44 (3) (2007) 230–234.
- [88].Chatkin G, Chatkin JM, Aued G, Petersen GO, Jeremias ET, Thiesen FV, Evaluation of the exhaled carbon monoxide levels in smokers with COPD, J. Bras. Pneumol 36 (3) (2010) 332–338.
- [89].Sato S, Nishimura K, Koyama H, Tsukino M, Oga T, Hajiro T, et al., Optimal cutoff level of breath carbon monoxide for assessing smoking status in patients with asthma and COPD, Chest 124 (5) (2003) 1749–1754.
- [90].Saccone NL, Baurley JW, Bergen AW, David SP, Elliott HR, Foreman MG, et al., The value of biosamples in smoking cessation trials: a review of genetic, metabolomic, and epigenetic findings, Nicotine Tob. Res 20 (4) (2018) 403–413.
- [91].Richmond RC, Sillero-Rejon C, Khouja JN, Prince C, Board A, Sharp G, et al., Investigating the DNA methylation profile of e-cigarette use, Clin. Epigenet 13 (1) (2021) 183.
- [92].Ohmomo H, Harada S, Komaki S, Ono K, Sutoh Y, Otomo R, et al., DNA methylation abnormalities and altered whole transcriptome profiles after switching from combustible tobacco smoking to heated tobacco products, Cancer Epidemiol. Biomark. Prev 31 (1) (2022) 269–279.
- [93].Chen LS, Zawertailo L, Piasecki TM, Kaprio J, Foreman M, Elliott HR, et al., Leveraging genomic data in smoking cessation trials in the era of precision medicine: why and how, Nicotine Tob. Res 20 (4) (2018) 414–424.
- [94].Cowley LE, Farewell DM, Maguire S, Kemp AM, Methodological standards for the development and evaluation of clinical prediction rules: a review of the literature, Diagn. Progn. Res 3 (2019) 16.
- [95].Bank PCD, Jacobs LHJ, van den Berg SAA, van Deutekom HWM, Hamann D, Molenkamp R, et al., The end of the laboratory developed test as we know it? Recommendations from a national multidisciplinary taskforce of laboratory specialists on the interpretation of the IVDR and its complications, Clin. Chem. Lab. Med (2020).
- [96].Graden KC, Bennett SA, Delaney SR, Gill HE, Willrich MAV, A high-level overview of the regulations surrounding a clinical laboratory and upcoming regulatory challenges for laboratory developed tests, Lab. Med 52 (4) (2021) 315–328.
- [97].Tse OYO, Jiang P, Cheng SH, Peng W, Shang H, Wong J, et al., Genome-wide detection of cytosine methylation by single molecule real-time sequencing, Proc. Natl. Acad. Sci. U. S. A 118 (5) (2021).
- [98].Philibert R, Miller S, Noel A, Dawes K, Papworth E, Black DW, et al., A four marker digital PCR toolkit for detecting heavy alcohol consumption and the effectiveness of its treatment, J. Insur. Med 48 (1) (2019) 90–102.
- [99].Ziller MJ, Hansen KD, Meissner A, Aryee MJ, Coverage recommendations for methylation analysis by whole-genome bisulfite sequencing, Nat. Methods 12 (3) (2015) 230–232.
Data Availability Statement
No data was used for the research described in the article.
