PLOS Medicine. 2020 Jul 23;17(7):e1003178. doi: 10.1371/journal.pmed.1003178

Smoking, alcohol consumption, and cancer: A mendelian randomisation study in UK Biobank and international genetic consortia participants

Susanna C Larsson 1,2,*, Paul Carter 3, Siddhartha Kar 4, Mathew Vithayathil 5, Amy M Mason 6,7, Karl Michaëlsson 1, Stephen Burgess 3,8
Editor: Konstantinos K Tsilidis9
PMCID: PMC7377370  PMID: 32701947

Abstract

Background

Smoking is a well-established cause of lung cancer and there is strong evidence that smoking also increases the risk of several other cancers. Alcohol consumption has been inconsistently associated with cancer risk in observational studies. This mendelian randomisation (MR) study sought to investigate associations in support of a causal relationship between smoking and alcohol consumption and 19 site-specific cancers.

Methods and findings

We used summary-level data for genetic variants associated with smoking initiation (ever smoked regularly) and alcohol consumption, and the corresponding associations with lung, breast, ovarian, and prostate cancer from genome-wide association studies consortia, including participants of European ancestry. We additionally estimated genetic associations with 19 site-specific cancers among 367,643 individuals of European descent in UK Biobank who were 37 to 73 years of age when recruited from 2006 to 2010. Associations were considered statistically significant at a Bonferroni-corrected p-value below 0.0013. Genetic predisposition to smoking initiation was associated with statistically significantly higher odds of lung cancer in the International Lung Cancer Consortium (odds ratio [OR] 1.80; 95% confidence interval [CI] 1.59–2.03; p = 2.26 × 10⁻²¹) and UK Biobank (OR 2.26; 95% CI 1.92–2.65; p = 1.17 × 10⁻²²). Additionally, genetic predisposition to smoking was associated with statistically significantly higher odds of cancer of the oesophagus (OR 1.83; 95% CI 1.34–2.49; p = 1.31 × 10⁻⁴), cervix (OR 1.55; 95% CI 1.27–1.88; p = 1.24 × 10⁻⁵), and bladder (OR 1.40; 95% CI 1.92–2.65; p = 9.40 × 10⁻⁵) and with statistically nonsignificant higher odds of head and neck (OR 1.40; 95% CI 1.13–1.74; p = 0.002) and stomach cancer (OR 1.46; 95% CI 1.05–2.03; p = 0.024). In contrast, there was an inverse association between genetic predisposition to smoking and prostate cancer in the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome consortium (OR 0.90; 95% CI 0.83–0.98; p = 0.011) and in UK Biobank (OR 0.90; 95% CI 0.80–1.02; p = 0.104), but the associations did not reach statistical significance. We found no statistically significant association between genetically predicted alcohol consumption and overall cancer (n = 75,037 cases; OR 0.95; 95% CI 0.84–1.07; p = 0.376). Genetically predicted alcohol consumption was statistically significantly associated with lung cancer in the International Lung Cancer Consortium (OR 1.94; 95% CI 1.41–2.68; p = 4.68 × 10⁻⁵) but not in UK Biobank (OR 1.12; 95% CI 0.65–1.93; p = 0.686). There was no statistically significant association between alcohol consumption and any other site-specific cancer. The main limitation of this study is that precision was low in some analyses, particularly for analyses of alcohol consumption and site-specific cancers.

Conclusions

Our findings support the well-established relationship between smoking and lung cancer and suggest that smoking may also be a risk factor for cancer of the head and neck, oesophagus, stomach, cervix, and bladder. We found no evidence supporting a relationship between alcohol consumption and overall or site-specific cancer risk.


In a mendelian randomization study, Susanna Larsson and colleagues identify genetic factors contributing to relationships between smoking and alcohol consumption and risk of overall and site-specific cancers among individuals in the UK Biobank and several international genome-wide association studies consortia.

Author summary

Why was this study done?

  • Tobacco smoking and alcoholic beverage consumption are common addictive behaviours and important risk factors for mortality.

  • Observational evidence has shown that smoking is a risk factor for cancers of the lung, bladder, kidney, gastrointestinal tract, and cervix, but uncertainty persists about the causal role of smoking for the development of other cancers.

  • The causal role of alcohol consumption for site-specific cancers is uncertain, as available evidence originates from observational studies which are susceptible to confounding and reverse causation bias.

What did the researchers do and find?

  • Using the mendelian randomisation (MR) design, we found that genetic predisposition to smoking is associated with a statistically significant increased risk of cancer of the lung, oesophagus, cervix, and bladder and with a statistically nonsignificant increased risk of head and neck and stomach cancer at the Bonferroni-corrected significance threshold.

  • Genetically predicted alcohol consumption was statistically significantly positively associated with lung cancer but not with any other site-specific cancer or overall cancer.

What do these findings mean?

  • In this study, we observed a relationship between smoking and lung cancer, as well as evidence that smoking may also be a risk factor for cancer of the head and neck, oesophagus, stomach, cervix, and bladder.

  • We found no evidence supporting a relationship between alcohol consumption and overall or site-specific cancer risk.

Introduction

Tobacco smoking and alcoholic beverage consumption are common addictive behaviours and important causes of mortality [1–3]. The causal link between smoking and risk of lung cancer is well established [4,5]. Strong experimental and observational evidence also indicates that smoking is a risk factor for cancers of the bladder, kidney, gastrointestinal tract (head and neck, oesophagus, stomach, colorectum, pancreas, and liver) and cervix [4–8], but uncertainty persists about the causal role of smoking for the development of other cancers.

Observational data on alcohol consumption in relation to various cancers are contrasting [7–14]. Alcohol consumption has been reported to be positively associated with risk of cancers of the head and neck, oesophagus, stomach, liver, and breast [7,9,12–14] but inversely associated with kidney cancer [9,11] and non-Hodgkin lymphoma [9,12]. With regard to colorectal cancer, a recent meta-analysis of 16 cohort studies showed that heavy alcohol consumption was associated with an increased risk of colorectal cancer, whereas light-to-moderate drinking was associated with a decreased risk [10]. Given that much of the current evidence originates from observational epidemiological studies, the causal nature of these findings needs assessment. This is particularly important in the context of these addictive health behaviours, which can be open to confounding by factors such as other destructive health behaviours and socioeconomic status.

A recent meta-analysis of genome-wide association studies identified a number of single-nucleotide polymorphisms (SNPs) associated with smoking and alcohol consumption [15]. Those SNPs can be used as instruments for these exposures in mendelian randomisation (MR) analyses to infer causality. The MR technique is based on Mendel’s law of independent assortment, which states that the inheritance of one characteristic is independent of the inheritance of another. Thus, levels of the exposure predicted by the SNPs are usually independent of other exposures, thereby reducing confounding in MR studies. In addition, the MR study design avoids reverse causality because disease development cannot affect genotype.
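To make the instrument idea concrete, the following is a minimal sketch of the per-SNP Wald ratio, in which the SNP-outcome association (on the log-odds scale) is divided by the SNP-exposure association. The function name and all numbers are illustrative assumptions and are not taken from this study.

```python
import math

def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-SNP causal estimate: change in outcome log-odds per unit of exposure.

    The standard error is the first-order approximation, which ignores
    uncertainty in the SNP-exposure association.
    """
    if beta_exposure == 0:
        raise ValueError("SNP is not associated with the exposure")
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se

# Hypothetical SNP, illustrative numbers only.
est, se = wald_ratio(beta_exposure=0.02, beta_outcome=0.012, se_outcome=0.004)
odds_ratio = math.exp(est)  # odds ratio per unit increase in the exposure
```

Because genotype is fixed at conception, the ratio cannot be distorted by the outcome influencing the exposure, which is the reverse-causality argument made above.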

The primary aim of this study was to use MR to examine the associations of smoking and alcohol consumption with 19 site-specific cancers using data from four large-scale genome-wide association studies consortia and UK Biobank. In complementary analyses, we assessed the associations of genetically predicted smoking and alcohol consumption with overall cancer in UK Biobank to determine the overall impact of smoking and alcohol from a public health perspective.

Methods

Outcome data sources

Publicly available summary-level data for lung, breast, ovarian, and prostate cancer were obtained respectively from the International Lung Cancer Consortium (11,348 cases and 15,861 controls) [16], the Breast Cancer Association Consortium (122,977 cases and 105,974 controls) [17], the Ovarian Cancer Association Consortium (25,509 cases and 40,941 controls) [18], and the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome consortium (79,148 cases and 61,106 controls) [19]. All participants included in the consortia were of European ancestry and came from European and North American countries and Australia. The lung cancer consortium included both women and men, whereas the breast and ovarian cancer consortia included women only and the prostate cancer consortium men only. Data from the consortia were extracted through the MR-Base platform [20].

In addition, we estimated genetic associations with 19 site-specific cancers (with at least 400 cases) and overall cancer in UK Biobank, a cohort study of about 500,000 adults (37 to 73 years of age at baseline) recruited between 2006 and 2010 [21]. In the current analyses, we included 367,643 UK Biobank participants of European descent after exclusion of participants with other ethnicities (to reduce population stratification bias), those with relatedness of third degree or higher, excess heterozygosity, and low genotype call rate. Cancer cases were ascertained until March 31, 2017, and were defined based on data from national registries (International Classification of Diseases, 9th and 10th revision codes) and self-reported information verified by interview with a nurse (S1 Table). We calculated beta coefficients and standard errors of the genetic associations with cancer using logistic regression, with adjustment for age, sex, and 10 genetic principal components. The UK Biobank study was approved by the North West Multicenter Research Ethics Committee. All participants provided written informed consent. The present analyses were approved by the Swedish Ethical Review Authority.

Genetic instruments

Instrumental variables for the exposures were selected from a meta-analysis of genome-wide association studies with a total of 1,232,091 individuals of European descent in the analysis of smoking initiation (i.e., probability of ever having smoked regularly) and 941,280 individuals of European descent in the analysis of alcohol consumption [15]. A total of 378 and 99 conditionally independent SNPs associated with smoking initiation and alcohol consumption (log-transformed alcoholic drinks per week), respectively, at the genome-wide significance threshold (p < 5 × 10⁻⁸) were identified [15]. Linkage disequilibrium (defined as R² > 0.1) between SNPs was assessed using LDlink [22] and was detected among 16 SNP pairs for smoking initiation and among four SNP pairs for alcohol consumption. Within each such pair, the SNP with the weakest association with the exposure was removed. SNPs that were unavailable in the outcome datasets were replaced by a suitable proxy (minimum linkage disequilibrium R² = 0.8) where available. Our genetic instruments comprised between 346 and 361 independent SNPs for smoking initiation and between 89 and 94 independent SNPs for alcohol consumption. The SNPs explained 2.3% and 0.2%–0.3% of the variation in smoking initiation and alcohol consumption, respectively [15]. The genetic instruments were strongly associated with the exposures in UK Biobank, with an F statistic from regression of the exposure on the variants of about 75 for smoking initiation and 19–29 for alcohol consumption. The two exposures were moderately genetically correlated (rg = 0.36) [15]. The effect sizes are expressed per standard deviation increase in the exposure [15]. For smoking, this was calculated from the weighted average prevalence of ever smokers across the studies included in the meta-analysis [15].
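The pruning step described above can be sketched as follows. This is an illustration only: the SNP identifiers and p-values are hypothetical, and ranking association strength by p-value is an assumption, since the text does not state which measure was used.

```python
def prune_ld_pairs(pvalues, ld_pairs):
    """Drop, from each correlated pair, the SNP with the weaker exposure association.

    pvalues:  dict mapping SNP id -> p-value for the SNP-exposure association
    ld_pairs: iterable of (snp_a, snp_b) pairs whose R^2 exceeds the cutoff
    """
    removed = set()
    for snp_a, snp_b in ld_pairs:
        if snp_a in removed or snp_b in removed:
            continue  # pair already resolved by an earlier removal
        # A larger p-value means a weaker association, so that SNP is dropped.
        weaker = snp_a if pvalues[snp_a] > pvalues[snp_b] else snp_b
        removed.add(weaker)
    # Preserve the original ordering of the retained SNPs.
    return [snp for snp in pvalues if snp not in removed]

# Hypothetical identifiers and p-values, for illustration only.
pvalues = {"snpA": 1e-12, "snpB": 3e-9, "snpC": 2e-10, "snpD": 1e-8}
ld_pairs = [("snpA", "snpB"), ("snpC", "snpD")]
kept = prune_ld_pairs(pvalues, ld_pairs)
```

In this toy input, one SNP is removed from each of the two correlated pairs, leaving two approximately independent instruments.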

Statistical analysis

The principal analyses were conducted using the inverse-variance weighted approach (under a multiplicative random-effects model), which provides the most precise estimates but assumes that all SNPs are valid instrumental variables [23]. In sensitivity analyses, the following approaches were applied: (1) multivariable MR analysis (inverse-variance weighted method) with smoking adjusted for alcohol consumption and vice versa; (2) weighted median method, which provides a causal estimate if at least 50% of the weight in the analysis comes from valid instrumental variables [23]; (3) contamination mixture method, which performs MR robustly and efficiently in the presence of invalid instrumental variables [24]; (4) MR pleiotropy residual sum and outlier (MR-PRESSO) method, which can detect and adjust for horizontal pleiotropy by outlier removal [25]; and (5) MR-Egger regression method, which can detect and adjust for directional pleiotropy but has low precision [23]. The mrrobust [26], MendelianRandomization [27], and MRPRESSO [25] packages were used for the statistical analyses. The reported odds ratios (OR) are per one standard deviation increase in the prevalence of smoking initiation and per standard deviation increase in log-transformed alcoholic drinks per week. We calculated the power at different ORs for each cancer site [28]. All statistical tests were 2-tailed. Associations were considered statistically significant at a Bonferroni corrected p-value below 0.0013 (correcting for 2 exposures and 19 outcomes).
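As a rough illustration of the principal analysis, the sketch below computes an inverse-variance weighted estimate from per-SNP summary statistics, inflates its standard error under a multiplicative random-effects model, and derives the Bonferroni threshold. It is a simplified stand-in written for this example, not the code of the packages cited above; the function names and input values are assumptions.

```python
import math

def ivw_multiplicative_random_effects(beta_exposure, beta_outcome, se_outcome):
    """Inverse-variance weighted MR estimate from per-SNP summary statistics."""
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    # First-order inverse-variance weight of each per-SNP ratio estimate.
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    se_fixed = math.sqrt(1.0 / total_weight)
    # Cochran's Q measures heterogeneity among the per-SNP estimates.
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    df = len(ratios) - 1
    # Multiplicative random effects: inflate the SE when Q exceeds its
    # degrees of freedom, but never shrink it below the fixed-effect SE.
    scale = max(1.0, math.sqrt(q / df)) if df > 0 else 1.0
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return estimate, se_fixed * scale, i_squared

def bonferroni_threshold(alpha, n_exposures, n_outcomes):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / (n_exposures * n_outcomes)

# Hypothetical summary statistics for three SNPs (illustration only).
estimate, se, i2 = ivw_multiplicative_random_effects(
    beta_exposure=[0.1, 0.2, 0.1],
    beta_outcome=[0.05, 0.10, 0.05],
    se_outcome=[0.01, 0.02, 0.01],
)
threshold = bonferroni_threshold(0.05, n_exposures=2, n_outcomes=19)  # 0.05 / 38
```

With 2 exposures and 19 outcomes, the threshold rounds to the 0.0013 used in this study. The I² value returned here corresponds to the heterogeneity statistic reported in the figure legends.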

Protocol

There was no formal predefined protocol or prospective analysis plan for this study. The MR analyses of genetic predisposition to smoking and alcohol consumption in relation to overall and site-specific cancers in UK Biobank were initiated in August 2019. Following peer review, we (i) conducted MR analyses of smoking and alcohol consumption in relation to lung, breast, ovarian, and prostate cancer using publicly available data from international consortia for these cancers; (ii) defined the analysis of site-specific cancer as the primary analysis and applied Bonferroni correction to site-specific cancers; (iii) calculated statistical power for each site-specific cancer; and (iv) performed a complementary analysis using the contamination mixture method.

This study is reported as per the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guideline (S1 Checklist).

Results

In analyses using consortia data, we had 80% power at a significance level of 0.05 to detect an OR ranging from 1.05 (for breast cancer) to 1.25 (for lung cancer) for smoking initiation, and a corresponding OR ranging from 1.28 to 1.90 for alcohol consumption (S2 Table). In site-specific cancer analyses based on UK Biobank data, only associations of relatively strong magnitude could be detected as statistically significant, particularly for alcohol consumption (S2 Table).

Genetic predisposition to smoking initiation was associated with statistically significantly higher odds of lung cancer in the International Lung Cancer Consortium (OR 1.80; 95% confidence interval [CI] 1.59–2.03; p = 2.26 × 10⁻²¹) and UK Biobank (OR 2.26; 95% CI 1.92–2.65; p = 1.17 × 10⁻²²) (Fig 1). Additionally, genetic predisposition to smoking was associated with statistically significantly higher odds of cancer of the oesophagus (OR 1.83; 95% CI 1.34–2.49; p = 1.31 × 10⁻⁴), cervix (OR 1.55; 95% CI 1.27–1.88; p = 1.24 × 10⁻⁵), and bladder (OR 1.40; 95% CI 1.92–2.65; p = 9.40 × 10⁻⁵) and with statistically nonsignificant higher odds of head and neck (OR 1.40; 95% CI 1.13–1.74; p = 0.002) and stomach cancer (OR 1.46; 95% CI 1.05–2.03; p = 0.024) (Fig 1). In contrast, there was a statistically nonsignificant inverse association between genetic predisposition to smoking and prostate cancer in the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome consortium (OR 0.90; 95% CI 0.83–0.98; p = 0.011) and in UK Biobank (OR 0.90; 95% CI 0.80–1.02; p = 0.104) (Fig 1). Results were consistent in sensitivity analyses (S3 Table). Genetic predisposition to smoking initiation was associated with statistically significantly higher odds of both lung adenocarcinoma and squamous cell lung cancer but was not associated with significantly higher odds of oestrogen receptor positive or negative breast tumours or with any subtype of non-Hodgkin lymphoma or leukaemia (S4 Table).
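For readers who want to relate the reported ORs, confidence intervals, and p-values to one another, the sketch below recovers an approximate standard error, z statistic, and two-sided p-value from an OR and its 95% CI. It assumes a normal (Wald) approximation on the log-odds scale, so results only approximately match the published p-values because the inputs are rounded. The helper names are introduced here for illustration.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def wald_summary(odds_ratio, ci_lower, ci_upper):
    """Recover the log-OR standard error, z statistic, and two-sided p-value."""
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * Z_95)
    z = math.log(odds_ratio) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return se, z, p_value

def interval_contains_estimate(odds_ratio, ci_lower, ci_upper):
    """Basic sanity check: a point estimate should lie inside its own interval."""
    return ci_lower <= odds_ratio <= ci_upper

# Lung cancer in the International Lung Cancer Consortium, as reported above.
se, z, p = wald_summary(1.80, 1.59, 2.03)
below_threshold = p < 0.0013  # Bonferroni-corrected threshold used in this study

# Head and neck cancer in UK Biobank, as reported above.
head_neck_ok = interval_contains_estimate(1.40, 1.13, 1.74)
```

For the lung cancer estimate this gives a z statistic of roughly 9.4 and a p-value far below the Bonferroni-corrected threshold, in line with the reported result.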

Fig 1. Associations of genetic predisposition to smoking initiation with site-specific cancers.


ORs are per one standard deviation increase in probability of smoking initiation (ever smoked regularly). Results are obtained from the random-effects inverse-variance weighted method. The I² statistic quantifies the amount of heterogeneity among estimates based on individual SNPs. BCAC, Breast Cancer Association Consortium; CI, confidence interval; ILCCO, International Lung Cancer Consortium; OCAC, Ovarian Cancer Association Consortium; OR, odds ratio; PRACTICAL, Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome consortium; SNP, single-nucleotide polymorphism.

Genetically predicted alcohol consumption was statistically significantly positively associated with lung cancer in the International Lung Cancer Consortium (OR 1.94; 95% CI 1.41–2.68; p = 4.68 × 10⁻⁵) but not in UK Biobank (OR 1.12; 95% CI 0.65–1.93; p = 0.686) (Fig 2). After adjustment for genetic predisposition to smoking using multivariable MR, the association between genetically predicted alcohol consumption and lung cancer in the International Lung Cancer Consortium was attenuated and no longer significant at the Bonferroni-corrected threshold (OR 1.75; 95% CI 1.23–2.49; p = 0.002) (S5 Table). There was no statistically significant association between genetically predicted alcohol consumption and any other site-specific cancer (Fig 2). However, the precision of the estimates was generally low and the ORs were above 1.50 for testicular, head and neck, and oesophageal cancers and below 0.70 for leukaemia, non-Hodgkin lymphoma, melanoma, and brain cancer (Fig 2). The ORs were consistently above 2 for testicular cancer in all sensitivity analyses, but none of the associations reached statistical significance (S5 Table). Genetically predicted alcohol consumption was associated with statistically nonsignificant higher odds of both lung adenocarcinoma and squamous cell lung cancer in the International Lung Cancer Consortium but was not associated with any lung cancer subtype in UK Biobank, oestrogen receptor positive or negative breast tumours, or any subtype of non-Hodgkin lymphoma or leukaemia (S6 Table).

Fig 2. Associations of genetically predicted alcohol consumption with site-specific cancers.


ORs are per one standard deviation increase of log-transformed alcoholic drinks per week. Results are obtained from the multiplicative random-effects inverse-variance weighted method. The I² statistic quantifies the amount of heterogeneity among estimates based on individual SNPs. BCAC, Breast Cancer Association Consortium; CI, confidence interval; ILCCO, International Lung Cancer Consortium; OCAC, Ovarian Cancer Association Consortium; OR, odds ratio; PRACTICAL, Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome consortium; SNP, single-nucleotide polymorphism.

For overall cancer (n = 75,037 cases, also including cancers with <400 cases, in UK Biobank) a statistically nonsignificant positive association was observed with genetic predisposition to smoking in the inverse-variance weighted analysis (OR 1.06; 95% CI 1.02–1.10; p = 0.007), but the association was attenuated in the alcohol-adjusted analysis (OR 1.03; 95% CI 0.98–1.08; p = 0.304) (S3 Table). Genetically predicted alcohol consumption was not associated with overall cancer (OR 0.95; 95% CI 0.84–1.07; p = 0.376) (S5 Table).

Discussion

In the present MR study, we systematically assessed the associations of genetically predicted smoking and alcohol consumption with a broad range of cancer types. Our results showed that genetic predisposition to smoking was associated with an increased risk of several cancers, supporting a causal relationship consistent with previous observational studies. Genetically predicted alcohol consumption was positively associated with lung cancer in the International Lung Cancer Consortium but was not statistically significantly associated with any other site-specific cancer or overall cancer. However, the precision was low in the site-specific cancer analyses and there were relatively strong but statistically nonsignificant positive associations of genetically predicted alcohol consumption (ORs above 1.5) with testicular, head and neck, and oesophageal cancer, as well as inverse associations (ORs below 0.7) with leukaemia, non-Hodgkin lymphoma, melanoma, and brain cancer.

Both tobacco smoking and alcohol consumption are biologically plausible tumour-promoting behaviours. Tobacco smoke can increase the risk of cancer through its content of carcinogens, such as nitrosamines, polycyclic aromatic hydrocarbons, acrylamines, volatile organics, and cadmium [7]. The carcinogenic effect of smoking on lung cancer is well established, and our MR findings corroborate the observational evidence that smoking is also a risk factor for cancers of the head and neck, oesophagus, stomach, cervix, and bladder [4–7]. Smoking is the major risk factor for bladder cancer, and it has been estimated that ever smoking accounts for about two thirds and one third of all bladder cancer cases in men and women, respectively [6]. Conventional observational studies have further shown that smoking increases the risk of kidney, pancreatic, and colorectal cancer [4,5,7]. Although we failed to detect significant associations of genetic predisposition to smoking with those cancers, the associations were positive, with an OR of 1.28 for kidney cancer, 1.15 for pancreatic cancer, and 1.10 for colorectal cancer. Similarly, smoking has been found to have a protective effect on melanoma risk in multiple meta-analyses, and although our estimate was in this direction (OR 0.92), the association did not reach statistical significance. Conventional observational studies have reported inconclusive results or no association of smoking with breast, ovarian, and brain cancers, melanoma, non-Hodgkin lymphoma, leukaemia, and multiple myeloma [4,5,7,29,30]. This MR study provided no evidence of a causal association between smoking and those cancers. Furthermore, we found no support of an inverse association between smoking and uterine cancer, consistent with findings from most observational studies [5,7]. However, we found a statistically nonsignificant inverse association between smoking and prostate cancer. It is worth interpreting this result in light of the fact that only 20% of the prostate cancer cases included in the prostate cancer consortium were of advanced or highly aggressive disease [19]. Genetic association estimates specific to advanced or highly aggressive prostate cancer were not publicly available. Our MR results for prostate cancer, based largely on non-advanced disease, are consistent with the inverse association between smoking and prostate cancer risk that has been reported specifically for localised and low-grade disease in observational studies [31,32]. The inverse association may reflect detection bias, such that smokers may be less likely to undergo prostate-specific antigen screening and therefore are not diagnosed with prostate cancer until a late stage or not at all.

Alcohol consumption may increase the risk of cancer through its oxidised metabolite acetaldehyde, which is carcinogenic to humans [7]. Alcohol consumption might also lower cancer risk by other mechanisms, such as increased insulin sensitivity through increased adiponectin levels [33]. Red wine in particular has been suggested as potentially anticarcinogenic due to its flavonoid content, which is both anti-inflammatory and antioxidative [34,35]. The present MR findings are consistent with those of observational studies [6,7,9,12–14] and previous MR studies [36,37] indicating that alcohol consumption increases the risk of cancers of the head and neck and oesophagus, although our estimates had low precision and did not reach statistical significance. Alcohol drinking has also been reported to be associated with breast cancer risk in a dose-response manner, with an 8% to 12% increase in risk per 10 g/day increase of alcohol consumption [9,13,14]. This MR study was underpowered to detect such a relatively modest association. A harmful effect of heavy drinking on colorectal cancer has been suggested by observational studies [10,12] as well as an MR study using the ALDH2 genotype as a marker of alcohol exposure [38]. In contrast, light alcohol consumption has been shown to be unrelated to [12,14] or inversely associated with [10] colorectal cancer risk. If anything, our findings indicated a positive association of alcohol consumption with colorectal cancer risk, but we were unable to investigate a potential nonlinear relation. This study could not confirm an inverse association of alcohol consumption with colorectal [10] and kidney [9,11] cancer and non-Hodgkin lymphoma [9,12], but the association with non-Hodgkin lymphoma was in the same direction. Our results agree with observational studies, which have shown no consistent relation between alcohol consumption and risk of leukaemia, melanoma, and brain cancer [9,12,39], and with a previous MR study demonstrating no association between alcohol exposure and prostate cancer incidence [40]. Data on alcohol consumption in relation to testicular cancer risk are limited. The strong though nonsignificant positive association between genetically predicted alcohol consumption and testicular cancer needs confirmation by larger MR studies.

The inconsistent results for alcohol consumption and lung cancer between the International Lung Cancer Consortium and UK Biobank may simply be a chance finding. Another possibility is that the association of genetically predicted alcohol consumption with lung cancer may be due to smoking rather than alcohol. While we were able to adjust for individual smoking behaviour through multivariable MR analysis, increased alcohol consumption may lead to greater exposure to cigarette smoke via passive smoking. This mechanism would be less represented in UK Biobank due to the relatively low smoking rate in UK Biobank, and the smoking ban in the UK.

Key strengths of this study include the MR design, which mitigated bias due to confounding and reverse causality, and the use of multiple instrumental variables for smoking and alcohol consumption, which allowed sensitivity analyses to identify and adjust for pleiotropy. Another strength is that the associations of smoking and alcohol consumption with cancer at many sites could be assessed in a single large population of individuals of European descent. This enabled comparison of the magnitude of the adverse impact of smoking and alcohol consumption on different cancers while minimising population stratification bias.

A major limitation of this study is that the precision was low in some analyses. Analyses of alcohol consumption in particular had low precision owing to low variance explained by the SNPs, albeit with an F statistic above the conventional cutoff of 10. Low power may explain the lack of statistically significant associations of genetically predicted smoking and alcohol consumption with certain site-specific cancers. For overall analyses, while overall cancer is a combination of different malignancies that may have different underlying aetiologies, all cancers are known to share common underlying “hallmark” molecular and cellular aberrations. Combining these malignancies together means that our endpoint is dependent on the characteristics of the analytic sample and the relative prevalence of different cancer types. In particular, cancers with greater survival chances will be overrepresented in the case sample. However, for most individuals, the choice to smoke cigarettes or drink alcohol is likely to be guided by the impact on the overall risk of cancer, not any particular site-specific cancer. Hence, we believe the results for overall cancer have direct relevance for public health. Another shortcoming is that we were unable to assess possible U- or J-shaped relations between alcohol consumption and cancer, or differential effects of specific alcoholic beverage types or drinking patterns. A further potential limitation is that UK Biobank participants were included in both the exposure and outcome datasets, which might have introduced some bias in the MR estimates in the direction of the estimates of observational studies. Nonetheless, only around one third of participants in the genome-wide association studies of smoking and alcohol consumption came from the UK Biobank study and the genetic instruments were relatively strongly related to the exposures (F statistic >10), implying that bias from participant overlap is relatively small [41]. An additional limitation is that smoking initiation is a binary exposure. An MR estimate with a binary exposure and binary outcome is difficult to interpret as a specific causal effect [42]. Furthermore, the MR estimate cannot simply be compared with the association between self-reported smoking and cancer risk estimated by observational studies. However, even in this setting, MR estimates are unbiased under the null [43] and thus provide a valid test of the causal null hypothesis even if the estimates do not reflect meaningful causal parameters. Here, we expressed all estimates in terms of the association between genetically predicted levels of smoking initiation and the outcome, rather than making claims about the numerical magnitude of the causal effect. Our analyses included individuals of European descent and therefore might not be generalisable to other populations.

These analyses broaden the evidence base for the harmful effect of cigarette smoking, which has previously been demonstrated for most cardiovascular diseases [44,45], type 2 diabetes [46], and bone fracture [47]. Those findings along with results of the present study provide further evidential support for public health interventions to reduce cigarette smoking initiation in the population, and suggest that these strategies will have an important impact on lessening the burden of major diseases, and of cancer in particular.

Conclusions

The results of this study support the well-established relationship between smoking and lung cancer, and suggest that smoking may also be a risk factor for cancer of the head and neck, oesophagus, stomach, cervix, and bladder. We found no evidence in support of a relationship between alcohol consumption and overall cancer risk, but associations between alcohol consumption and risk of site-specific cancer should be further investigated.

Supporting information

S1 STROBE Checklist. STROBE, Strengthening the Reporting of Observational Studies in Epidemiology.

(DOCX)

S1 Table. Sources and definition of cancers in UK Biobank.

(XLSX)

S2 Table. Power calculation for the associations of smoking and alcohol consumption with cancer.

(XLSX)

S3 Table. Associations of genetic predisposition to smoking initiation with site-specific cancers in the primary inverse-variance weighted analysis and in sensitivity analyses using other MR methods.

MR, mendelian randomisation.

(XLSX)

S4 Table. Associations of genetic predisposition to smoking initiation with subtypes of lung cancer, non-Hodgkin lymphoma, and leukaemia.

(XLSX)

S5 Table. Associations of genetically predicted alcohol consumption with site-specific cancers in the primary inverse-variance weighted analysis and in sensitivity analyses using other MR methods.

MR, mendelian randomisation.

(XLSX)

S6 Table. Associations of genetically predicted alcohol consumption with subtypes of lung cancer, breast cancer, non-Hodgkin lymphoma, and leukaemia in MR analyses.

MR, mendelian randomisation.

(XLSX)

S1 Data. Derived summary statistics data supporting the result of this study.

(XLSX)

Acknowledgments

This research has been conducted using the UK Biobank Resource under Application number 29202.

Disclaimer: The views expressed are those of the authors and not necessarily those of the NHS, the NIHR, or the Department of Health and Social Care.

Abbreviations

CI, confidence interval

MR, mendelian randomisation

MR-PRESSO, mendelian randomisation pleiotropy residual sum and outlier

OR, odds ratio

SNP, single-nucleotide polymorphism

STROBE, Strengthening the Reporting of Observational Studies in Epidemiology

Data Availability

Primary data from the UK Biobank resource are accessible upon application (https://www.ukbiobank.ac.uk/). Derived data supporting the results of this study are available in the S1 Data file.

Funding Statement

SCL reports support from the Swedish Research Council for Health, Working Life and Welfare (2018-00123), the Swedish Research Council (2019-00977), and the Swedish Heart-Lung Foundation (20190247). SK reports support from a Cancer Research UK programme grant, the Integrative Cancer Epidemiology Programme (C18281/A19169) and a Junior Research Fellowship from Homerton College, Cambridge. AMM is funded by the National Institute for Health Research [Cambridge Biomedical Research Centre at the Cambridge University Hospitals NHS Foundation Trust]. SB reports support from a Sir Henry Dale Fellowship jointly funded by the Wellcome Trust and the Royal Society (204623/Z/16/Z). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Collaborators GBDA. Alcohol use and burden for 195 countries and territories, 1990–2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet. 2018;392: 1015–35. doi:10.1016/S0140-6736(18)31310-2
2. Inoue-Choi M, Liao LM, Reyes-Guzman C, Hartge P, Caporaso N, Freedman ND. Association of long-term, low-intensity smoking with all-cause and cause-specific mortality in the National Institutes of Health-AARP Diet and Health Study. JAMA Intern Med. 2017;177: 87–95. doi:10.1001/jamainternmed.2016.7511
3. World Health Organization. WHO global report on mortality attributable to tobacco. Geneva: World Health Organization, 2012. Available from: https://apps.who.int/iris/handle/10665/44815.
4. Gandini S, Botteri E, Iodice S, Boniol M, Lowenfels AB, Maisonneuve P, et al. Tobacco smoking and cancer: a meta-analysis. Int J Cancer. 2008;122: 155–64. doi:10.1002/ijc.23033
5. Pirie K, Peto R, Reeves GK, Green J, Beral V, Million Women Study C. The 21st century hazards of smoking and benefits of stopping: a prospective study of one million women in the UK. Lancet. 2013;381: 133–41. doi:10.1016/S0140-6736(12)61720-6
6. IARC Working Group on the Evaluation of Carcinogenic Risks to Humans. Tobacco smoke and involuntary smoking. IARC Monogr Eval Carcinog Risks Hum. 2004;83: 1–1438.
7. Secretan B, Straif K, Baan R, Grosse Y, El Ghissassi F, Bouvard V, et al. A review of human carcinogens—Part E: tobacco, areca nut, alcohol, coal smoke, and salted fish. Lancet Oncol. 2009;10: 1033–4. doi:10.1016/s1470-2045(09)70326-2
8. McGee EE, Jackson SS, Petrick JL, Van Dyke AL, Adami HO, Albanes D, et al. Smoking, alcohol, and biliary tract cancer risk: a Pooling Project of 26 prospective studies. J Natl Cancer Inst. 2020;111: 1263–78. doi:10.1093/jnci/djz103
9. Allen NE, Beral V, Casabonne D, Kan SW, Reeves GK, Brown A, et al. Moderate alcohol intake and cancer incidence in women. J Natl Cancer Inst. 2009;101: 296–305. doi:10.1093/jnci/djn514
10. McNabb S, Harrison TA, Albanes D, Berndt SI, Brenner H, Caan BJ, et al. Meta-analysis of 16 studies of the association of alcohol with colorectal cancer. Int J Cancer. 2020;146: 861–73. doi:10.1002/ijc.32377
11. Lee JE, Hunter DJ, Spiegelman D, Adami HO, Albanes D, Bernstein L, et al. Alcohol intake and renal cell cancer in a pooled analysis of 12 prospective studies. J Natl Cancer Inst. 2007;99: 801–10. doi:10.1093/jnci/djk181
12. Bagnardi V, Rota M, Botteri E, Tramacere I, Islami F, Fedirko V, et al. Alcohol consumption and site-specific cancer risk: a comprehensive dose-response meta-analysis. Br J Cancer. 2015;112: 580–93. doi:10.1038/bjc.2014.579
13. Jung S, Wang M, Anderson K, Baglietto L, Bergkvist L, Bernstein L, et al. Alcohol consumption and breast cancer risk by estrogen receptor status: in a pooled analysis of 20 studies. Int J Epidemiol. 2016;45: 916–28. doi:10.1093/ije/dyv156
14. Bagnardi V, Rota M, Botteri E, Tramacere I, Islami F, Fedirko V, et al. Light alcohol drinking and cancer: a meta-analysis. Ann Oncol. 2013;24: 301–8. doi:10.1093/annonc/mds337
15. Liu M, Jiang Y, Wedow R, Li Y, Brazel DM, Chen F, et al. Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nat Genet. 2019;51: 237–44. doi:10.1038/s41588-018-0307-5
16. Wang Y, McKay JD, Rafnar T, Wang Z, Timofeeva MN, Broderick P, et al. Rare variants of large effect in BRCA2 and CHEK2 affect risk of lung cancer. Nat Genet. 2014;46: 736–41. doi:10.1038/ng.3002
17. Michailidou K, Lindstrom S, Dennis J, Beesley J, Hui S, Kar S, et al. Association analysis identifies 65 new breast cancer risk loci. Nature. 2017;551: 92–4. doi:10.1038/nature24284
18. Phelan CM, Kuchenbaecker KB, Tyrer JP, Kar SP, Lawrenson K, Winham SJ, et al. Identification of 12 new susceptibility loci for different histotypes of epithelial ovarian cancer. Nat Genet. 2017;49: 680–91. doi:10.1038/ng.3826
19. Schumacher FR, Al Olama AA, Berndt SI, Benlloch S, Ahmed M, Saunders EJ, et al. Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci. Nat Genet. 2018;50: 928–36. doi:10.1038/s41588-018-0142-8
20. Hemani G, Zheng J, Elsworth B, Wade KH, Haberland V, Baird D, et al. The MR-Base platform supports systematic causal inference across the human phenome. Elife. 2018;7: e34408. doi:10.7554/eLife.34408
21. Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12: e1001779. doi:10.1371/journal.pmed.1001779
22. Machiela MJ, Chanock SJ. LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants. Bioinformatics. 2015;31: 3555–7. doi:10.1093/bioinformatics/btv402
23. Burgess S, Bowden J, Fall T, Ingelsson E, Thompson SG. Sensitivity analyses for robust causal inference from Mendelian randomization analyses with multiple genetic variants. Epidemiology. 2017;28: 30–42. doi:10.1097/EDE.0000000000000559
24. Burgess S, Foley CN, Allara E, Staley JR, Howson JMM. A robust and efficient method for Mendelian randomization with hundreds of genetic variants. Nat Commun. 2020;11: 376. doi:10.1038/s41467-019-14156-4
25. Verbanck M, Chen CY, Neale B, Do R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet. 2018;50: 693–8. doi:10.1038/s41588-018-0099-7
26. Spiller W, Davies NM, Palmer TM. Software application profile: mrrobust—a tool for performing two-sample summary Mendelian randomization analyses. Int J Epidemiol. 2019;48: 684–90. doi:10.1093/ije/dyy195
27. Yavorska OO, Burgess S. MendelianRandomization: an R package for performing Mendelian randomization analyses using summarized data. Int J Epidemiol. 2017;46: 1734–9. doi:10.1093/ije/dyx034
28. Burgess S. Sample size and power calculations in Mendelian randomization with a single instrumental variable and a binary outcome. Int J Epidemiol. 2014;43: 922–9. doi:10.1093/ije/dyu005
29. Fircanis S, Merriam P, Khan N, Castillo JJ. The relation between cigarette smoking and risk of acute myeloid leukemia: an updated meta-analysis of epidemiological studies. Am J Hematol. 2014;89: E125–32. doi:10.1002/ajh.23744
30. Huncharek M, Haddock KS, Reid R, Kupelnick B. Smoking as a risk factor for prostate cancer: a meta-analysis of 24 prospective cohort studies. Am J Public Health. 2010;100: 693–701. doi:10.2105/AJPH.2008.150508
31. Rohrmann S, Linseisen J, Allen N, Bueno-de-Mesquita HB, Johnsen NF, Tjonneland A, et al. Smoking and the risk of prostate cancer in the European Prospective Investigation into Cancer and Nutrition. Br J Cancer. 2013;108: 708–14. doi:10.1038/bjc.2012.520
32. Watters JL, Park Y, Hollenbeck A, Schatzkin A, Albanes D. Cigarette smoking and prostate cancer in a prospective US cohort study. Cancer Epidemiol Biomarkers Prev. 2009;18: 2427–35. doi:10.1158/1055-9965.EPI-09-0252
33. Brien SE, Ronksley PE, Turner BJ, Mukamal KJ, Ghali WA. Effect of alcohol consumption on biological markers associated with risk of coronary heart disease: systematic review and meta-analysis of interventional studies. BMJ. 2011;342: d636. doi:10.1136/bmj.d636
34. Udenigwe CC, Ramprasath VR, Aluko RE, Jones PJ. Potential of resveratrol in anticancer and anti-inflammatory therapy. Nutr Rev. 2008;66: 445–54. doi:10.1111/j.1753-4887.2008.00076.x
35. Fernandez-Panchon MS, Villano D, Troncoso AM, Garcia-Parrilla MC. Antioxidant activity of phenolic compounds: from in vitro results to in vivo evidence. Crit Rev Food Sci Nutr. 2008;48: 649–71. doi:10.1080/10408390701761845
36. Boccia S, Hashibe M, Galli P, De Feo E, Asakage T, Hashimoto T, et al. Aldehyde dehydrogenase 2 and head and neck cancer: a meta-analysis implementing a Mendelian randomization approach. Cancer Epidemiol Biomarkers Prev. 2009;18: 248–54. doi:10.1158/1055-9965.EPI-08-0462
37. Lewis SJ, Smith GD. Alcohol, ALDH2, and esophageal cancer: a meta-analysis which illustrates the potentials and limitations of a Mendelian randomization approach. Cancer Epidemiol Biomarkers Prev. 2005;14: 1967–71. doi:10.1158/1055-9965.EPI-05-0196
38. Wang J, Wang H, Chen Y, Hao P, Zhang Y. Alcohol ingestion and colorectal neoplasia: a meta-analysis based on a Mendelian randomization approach. Colorectal Dis. 2011;13: e71–8. doi:10.1111/j.1463-1318.2010.02530.x
39. Rota M, Porta L, Pelucchi C, Negri E, Bagnardi V, Bellocco R, et al. Alcohol drinking and risk of leukemia-a systematic review and meta-analysis of the dose-risk relation. Cancer Epidemiol. 2014;38: 339–45. doi:10.1016/j.canep.2014.06.001
40. Brunner C, Davies NM, Martin RM, Eeles R, Easton D, Kote-Jarai Z, et al. Alcohol consumption and prostate cancer incidence and progression: A Mendelian randomisation study. Int J Cancer. 2017;140: 75–85. doi:10.1002/ijc.30436
41. Burgess S, Davies NM, Thompson SG. Bias due to participant overlap in two-sample Mendelian randomization. Genet Epidemiol. 2016;40: 597–608. doi:10.1002/gepi.21998
42. Palmer TM, Sterne JA, Harbord RM, Lawlor DA, Sheehan NA, Meng S, et al. Instrumental variable estimation of causal risk ratios and causal odds ratios in Mendelian randomization analyses. Am J Epidemiol. 2011;173: 1392–403. doi:10.1093/aje/kwr026
43. Vansteelandt S, Bowden J, Babanezhad M, Goetghebeur E. On instrumental variables estimation of causal odds ratios. Stat Sci. 2011;26: 403–22.
44. Larsson SC, Mason AM, Back M, Klarin D, Damrauer SM, Million Veteran P, et al. Genetic predisposition to smoking in relation to 14 cardiovascular diseases. Eur Heart J. 2020. doi:10.1093/eurheartj/ehaa193
45. Larsson SC, Burgess S, Michaëlsson K. Smoking and stroke: a Mendelian randomization study. Ann Neurol. 2019;86: 468–71. doi:10.1002/ana.25534
46. Yuan S, Larsson SC. A causal relationship between cigarette smoking and type 2 diabetes mellitus: A Mendelian randomization study. Sci Rep. 2019;9: 19342. doi:10.1038/s41598-019-56014-9
47. Yuan S, Michaelsson K, Wan Z, Larsson SC. Associations of Smoking and Alcohol and Coffee Intake with Fracture and Bone Mineral Density: A Mendelian Randomization Study. Calcif Tissue Int. 2019;105: 582–8. doi:10.1007/s00223-019-00606-0

Decision Letter 0

Caitlin Moyer

12 Mar 2020

Dear Dr. Larsson,

Thank you very much for submitting your manuscript "Smoking, alcohol consumption and cancer in UK Biobank: A Mendelian randomisation study" (PMEDICINE-D-19-03758) for consideration at PLOS Medicine.

Your paper was evaluated by a senior editor and discussed among all the editors here. It was also discussed with an academic editor with relevant expertise, and sent to independent reviewers, including a statistical reviewer. The reviews are appended at the bottom of this email and any accompanying reviewer attachments can be seen via the link below:

[LINK]

In light of these reviews, I am afraid that we will not be able to accept the manuscript for publication in the journal in its current form, but we would like to consider a revised version that addresses the reviewers' and editors' comments. Obviously we cannot make any decision about publication until we have seen the revised manuscript and your response, and we plan to seek re-review by one or more of the reviewers.

In revising the manuscript for further consideration, your revisions should address the specific points made by each reviewer and the editors. Please also check the guidelines for revised papers at http://journals.plos.org/plosmedicine/s/revising-your-manuscript for any that apply to your paper. In your rebuttal letter you should indicate your response to the reviewers' and editors' comments, the changes you have made in the manuscript, and include either an excerpt of the revised text or the location (eg: page and line number) where each change can be found. Please submit a clean version of the paper as the main article file; a version with changes marked should be uploaded as a marked up manuscript.

In addition, we request that you upload any figures associated with your paper as individual TIF or EPS files with 300dpi resolution at resubmission; please read our figure guidelines for more information on our requirements: http://journals.plos.org/plosmedicine/s/figures. While revising your submission, please upload your figure files to the PACE digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at PLOSMedicine@plos.org.

We expect to receive your revised manuscript by Apr 02 2020 11:59PM. Please email us (plosmedicine@plos.org) if you have any questions or concerns.

***Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.***

We ask every co-author listed on the manuscript to fill in a contributing author statement, making sure to declare all competing interests. If any of the co-authors have not filled in the statement, we will remind them to do so when the paper is revised. If all statements are not completed in a timely fashion this could hold up the re-review process. If new competing interests are declared later in the revision process, this may also hold up the submission. Should there be a problem getting one of your co-authors to fill in a statement we will be in contact. YOU MUST NOT ADD OR REMOVE AUTHORS UNLESS YOU HAVE ALERTED THE EDITOR HANDLING THE MANUSCRIPT TO THE CHANGE AND THEY SPECIFICALLY HAVE AGREED TO IT. You can see our competing interests policy here: http://journals.plos.org/plosmedicine/s/competing-interests.

Please use the following link to submit the revised manuscript:

https://www.editorialmanager.com/pmedicine/

Your article can be found in the "Submissions Needing Revision" folder.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see http://journals.plos.org/plosmedicine/s/submission-guidelines#loc-methods.

Please ensure that the paper adheres to the PLOS Data Availability Policy (see http://journals.plos.org/plosmedicine/s/data-availability), which requires that all data underlying the study's findings be provided in a repository or as Supporting Information. For data residing with a third party, authors are required to provide instructions with contact information for obtaining the data. PLOS journals do not allow statements supported by "data not shown" or "unpublished results." For such statements, authors must provide supporting data or cite public sources that include it.

We look forward to receiving your revised manuscript.

Sincerely,

Caitlin Moyer, Ph.D.

Associate Editor

PLOS Medicine

plosmedicine.org

-----------------------------------------------------------

Requests from the editors:

Ref 1 raises important points, which we think are especially important to address.

General point: please tone down sentences such as this in the abstract: “These findings indicate that smoking is causally associated with an increased risk of cancer”. While an MR study is indicative, it cannot show causality, so please change this sentence and other instances in the text (including the Author Summary and the conclusion section).

Please provide p values in the abstract with 95% CIs. In addition, is it possible to include some summary demographic information in the abstract?

Please remove spaces in between multiple refs in square brackets.

Please use sections and paragraphs rather than page numbers in the STROBE checklist, as pages change during revision and formatting.

Comments from the reviewers:

Reviewer #1: Larsson et al. use Mendelian randomisation (MR) to examine the relationship between smoking and alcohol consumption, and 19 site-specific cancers. The manuscript is concise and well written, and the methodologies applied are appropriate. The key limitation of the study is its low power to identify causal relationships with many of the cancers. This is especially true for alcohol consumption, which the authors causally relate to no site-specific cancer, despite strong prior evidence from other studies. This low power prevents the study from providing further information on cancers for which evidence of a relationship with smoking and alcohol consumption has been so far mixed.

Major comments:

1) Low power to detect causal relationships occurs due to the authors' use of UK BioBank data for the site-specific cancers. I think this is the main issue with this study. For many cancers, much larger datasets have been published and are publicly available (for example the BCAC breast cancer GWAS contains >100,000 breast cancer cases, in contrast to the ~14,000 breast cancer cases in UK BioBank). Where possible, these larger cancer GWAS should be used in place of the UK BioBank data. I appreciate that in some instances data access limitations and scientific politics would prevent the authors of this manuscript from accessing these larger data sets. However, use of larger datasets where possible would substantially improve this analysis. Currently, the analysis of alcohol consumption with site-specific cancer risk feels like a missed opportunity, given that for 13/19 site-specific cancers, the study has less than 80% power to detect ORs of 3.

2) Where UK BioBank data cannot be replaced with data from a larger cancer GWAS, further information should be provided about how cancer association statistics were computed using UK BioBank data. Did the authors compute these statistics themselves, or were they obtained from another source, such as the Neale Lab (who have computed association statistics for UK BioBank data: http://www.nealelab.is/uk-biobank)? Were any samples excluded for QC reasons?

3) The authors note that sample overlap between the exposure and outcome datasets is likely biasing their results. Ideally, the authors should recompute association statistics for smoking and alcohol consumption excluding UK BioBank participants. If this is not possible, then this caveat should be more prominent, preferably in the abstract. The potential magnitude of this bias should also be estimated, as per Burgess et al. (https://www.ncbi.nlm.nih.gov/pubmed/27625185).

4) We have previously observed that Wald-type ratio estimators do not provide accurate estimates of the causal OR when both the exposure and outcome traits are binary (https://www.ncbi.nlm.nih.gov/pubmed/29540232, full disclosure: I am an author on this paper). This has also been noted by others (e.g. Palmer et al. https://www.ncbi.nlm.nih.gov/pubmed/21555716). Are the causal estimates between smoking initiation and cancer risk reported here potentially affected by such bias? If so, this should be noted.
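The power concern in major comment 1 can be illustrated with a rough calculation. The sketch below uses a common asymptotic approximation for MR power with a binary outcome, in the spirit of the approach described in reference 28; it is an illustration under stated assumptions and does not reproduce S2 Table. The sample size of 367,643 comes from the published abstract, the 0.3% of variance in alcohol consumption explained by the instruments comes from Reviewer #2's remarks, and the odds ratio of 3 comes from the comment above. The count of 1,000 cases and the function name `mr_power_binary_outcome` are hypothetical, and the odds ratio is assumed to be per standard deviation increase in the exposure.

```python
import math
from statistics import NormalDist


def mr_power_binary_outcome(n, n_cases, r_squared, odds_ratio, alpha=0.05):
    """Approximate power of an MR analysis with a binary outcome.

    n          total sample size (cases plus controls)
    n_cases    number of cases
    r_squared  proportion of exposure variance explained by the instruments
    odds_ratio assumed causal odds ratio per SD increase in the exposure
    """
    case_fraction = n_cases / n
    log_or = math.log(odds_ratio)
    # Non-centrality: effect size times the square root of the effective information.
    ncp = log_or * math.sqrt(n * r_squared * case_fraction * (1 - case_fraction))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(ncp - z_crit)


# Hypothetical scenario: a cancer site with 1,000 cases in UK Biobank.
power_rare = mr_power_binary_outcome(367_643, 1_000, 0.003, 3.0)
# Same assumptions, but with ten times as many cases.
power_common = mr_power_binary_outcome(367_643, 10_000, 0.003, 3.0)
print(f"power with 1,000 cases: {power_rare:.2f}")
print(f"power with 10,000 cases: {power_common:.2f}")
```

Under these assumptions, power depends on the number of cases almost as strongly as on the variance explained, which is why larger consortium datasets would help most for the rarer cancers.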

Minor comments:

1) Supplementary Figure 1 should be annotated with the site-specific cancers, to make it easier for readers to identify the cancers for which the study has suitable power.

2) The authors consider their analysis of overall cancer risk to be their "primary" analysis, whilst the analyses of site-specific cancers are "secondary" analyses. How smoking affects cancer risk differs substantially between sites (as seen in Figure 2), and I am therefore unsure how useful the analysis of overall cancer risk is. The analysis of site-specific cancers is potentially more interesting, and therefore this should be the "primary" analysis.

3) The conclusion currently states: "These MR findings indicate smoking is causally associated with an increased risk of cancer, particularly of the lung...". The causal association between smoking and lung cancer is very well understood and more focus should therefore be placed on the site-specific cancers over which there is more debate.

4) Further information should be provided about how the lifetime smoking IV takes into account duration, heaviness etc.

5) The authors apply Bonferroni correction to overall cancer risk, but not the site-specific cancers. Multiple testing should be corrected for in both analyses.

Alex Cornish (ICR, London) - following Stephen Burgess' lead of open peer review.
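Reviewer #1's minor comment 5 concerns multiple testing. As a worked illustration, a Bonferroni threshold divides the nominal significance level by the number of tests. The count of 38 tests used below (19 cancers times two exposures) is an assumption; it is chosen because it reproduces the threshold of 0.0013 reported in the published abstract. The p-values are those reported in the abstract for smoking initiation in UK Biobank, and the function name `bonferroni_threshold` is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Assumption: 19 site-specific cancers tested against two exposures.
N_TESTS = 19 * 2
threshold = bonferroni_threshold(0.05, N_TESTS)

# p-values reported in the published abstract (smoking initiation, UK Biobank).
reported_p = {
    "oesophagus": 1.31e-4,
    "cervix": 1.24e-5,
    "bladder": 9.40e-5,
    "head and neck": 0.002,
    "stomach": 0.024,
}

significant = {site: p < threshold for site, p in reported_p.items()}
for site, flag in significant.items():
    label = "significant" if flag else "not significant"
    print(f"{site}: p = {reported_p[site]:.2e} -> {label}")
```

With this threshold, the oesophagus, cervix, and bladder associations pass and the head and neck and stomach associations do not, matching the wording of the abstract.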

Reviewer #2: This is a two-sample MR analysis that investigates the association of smoking initiation, lifetime smoking and alcohol consumption with several cancers in the UK Biobank. This is the largest MR study investigating smoking and alcohol in relation to cancer risk. It has verified several positive associations observed in the observational literature for smoking and risk of several cancers, but failed to do so for alcohol consumption; the authors have correctly attributed this finding to the low power of the alcohol consumption analysis, as only 0.3% of the variation in alcohol consumption is explained by the known GWAS-identified genetic variants. The analysis is straightforward and the paper is well-written. I have only a few minor comments:

1. In the Introduction, second paragraph, where the authors describe the alcohol and cancer observational literature, they should add that the evidence is also strong for high alcohol consumption and risk of stomach and colorectal cancer.

2. The authors performed an analysis for lifetime smoking using a UK Biobank GWAS to define relevant instruments. They have already commented in the Discussion about potential bias due to overlap of the exposure and outcome datasets. It would be nice to extend the discussion to situations where the overlap is complete.

3. Cancer is a heterogeneous set of diseases, and it doesn't make a lot of sense to conduct analyses for all cancer sites combined.

4. The contamination mixture method developed by one of the study authors is another method that the investigators could use for smoking initiation, both to investigate potential pleiotropy and to assess whether the 361 SNP IV could be subgrouped into more than one potential mechanism.

Reviewer #3: This study uses the approach of Mendelian randomization (MR) to assess causality for associations of cancer with smoking and alcohol consumption, respectively. While this is an interesting and worthwhile study and the methodology seems to be sound, I feel that the authors somewhat overestimate the explanatory power of their study. Given that causality has already been established for many of the associations based on comprehensive reviews of experimental and observational evidence by renowned institutions (International Agency for Research on Cancer, US Surgeon General, World Cancer Research Fund…), it seems weird when the authors question causality of associations to justify their MR approach, even though they can only provide low precision estimates due to the small amount of variance explained by their genetic instruments despite high sample sizes and thus cannot exclude false-negative findings. Having said that, their approach is indeed worthwhile and could be used to further underpin available evidence. But I nevertheless would recommend authors to tone down some of their statements. For example on page 9 of the discussion, where authors state that their findings "extend the observational evidence that smoking is also a risk factor for cancers of the head and neck, oesophagus, stomach, pancreas, cervix, bladder, and kidney." The causality of these associations has been already well-established for years and decades. The authors go on to say that their results offer "strong support for smoking to be considered as a risk factor for this wider range of cancers in clinical practice". I think we are already way past this point - the long available evidence should have already been translated into clinical practice years ago.

The study in its current form is missing one important piece of the puzzle: there is no data supporting whether in this specific sample the genetic instruments are actually associated with higher smoking and alcohol consumption, respectively. This would be further confirmation for the suitability of the genetic instruments, but it would also allow for better assessment of the statistical power to detect effects. Some information on the amount of variance explained by the genetic instruments is reported in the manuscript, but my understanding is that those stem from published meta-analyses. I would thus recommend to add some data on the associations between the genetic instruments and different indicators of smoking and alcohol exposure in the sample.

Any attachments provided with reviews can be seen via the following link:

[LINK]

Decision Letter 1

Caitlin Moyer

18 May 2020

Dear Dr. Larsson,

Thank you very much for re-submitting your manuscript "Smoking, alcohol consumption and cancer: A Mendelian randomisation study" (PMEDICINE-D-19-03758R1) for review by PLOS Medicine.

I have discussed the paper with my colleagues and the academic editor and the revised version was seen by two reviewers. There are remaining editorial and production issues that need to be dealt with before we would be able to accept the paper for publication in the journal. In particular, we require that you please temper instances of strong causal language throughout the manuscript, provide sufficient data access information in the data availability statement, and include participant summary demographic information.

The remaining issues that need to be addressed are listed at the end of this email. Any accompanying reviewer attachments can be seen via the link below. Please take these into account before resubmitting your manuscript:

[LINK]

Our publications team (plosmedicine@plos.org) will be in touch shortly about the production requirements for your paper, and the link and deadline for resubmission. DO NOT RESUBMIT BEFORE YOU'VE RECEIVED THE PRODUCTION REQUIREMENTS.

***Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.***

In revising the manuscript for further consideration here, please ensure you address the specific points made by each reviewer and the editors. In your rebuttal letter you should indicate your response to the reviewers' and editors' comments and the changes you have made in the manuscript. Please submit a clean version of the paper as the main article file. A version with changes marked must also be uploaded as a marked up manuscript file.

Please also check the guidelines for revised papers at http://journals.plos.org/plosmedicine/s/revising-your-manuscript for any that apply to your paper. If you haven't already, we ask that you provide a short, non-technical Author Summary of your research to make findings accessible to a wide audience that includes both scientists and non-scientists. The Author Summary should immediately follow the Abstract in your revised manuscript. This text is subject to editorial change and should be distinct from the scientific abstract.

We expect to receive your revised manuscript within 1 week. Please email us (plosmedicine@plos.org) if you have any questions or concerns.

We ask every co-author listed on the manuscript to fill in a contributing author statement. If any of the co-authors have not filled in the statement, we will remind them to do so when the paper is revised. If all statements are not completed in a timely fashion this could hold up the re-review process. Should there be a problem getting one of your co-authors to fill in a statement we will be in contact. YOU MUST NOT ADD OR REMOVE AUTHORS UNLESS YOU HAVE ALERTED THE EDITOR HANDLING THE MANUSCRIPT TO THE CHANGE AND THEY SPECIFICALLY HAVE AGREED TO IT.

Please ensure that the paper adheres to the PLOS Data Availability Policy (see http://journals.plos.org/plosmedicine/s/data-availability), which requires that all data underlying the study's findings be provided in a repository or as Supporting Information. For data residing with a third party, authors are required to provide instructions with contact information for obtaining the data. PLOS journals do not allow statements supported by "data not shown" or "unpublished results." For such statements, authors must provide supporting data or cite public sources that include it.

If you have any questions in the meantime, please contact me (cmoyer@plos.org) or the journal staff on plosmedicine@plos.org.

We look forward to receiving the revised manuscript by May 25 2020 11:59PM.

Sincerely,

Caitlin Moyer, Ph.D.

Associate Editor

PLOS Medicine

plosmedicine.org

------------------------------------------------------------

Requests from Editors:

1. Response to Reviewer comments: As requested by the reviewer, please include your caveat regarding Reviewer 1, Point #4, as a discussion point in the manuscript (perhaps in the section describing limitations).

2. Title: Please mention the study population in the title (e.g. the UK Biobank, consortia). Please revise your title; we suggest: “Smoking, alcohol consumption and risk of cancer: A Mendelian randomisation study in UK Biobank and international genetic consortia participants” or similar.

3. Competing Interests: Please add this statement to the manuscript's Competing Interests in the manuscript submission form: "SB is a paid statistical consultant on PLOS Medicine's statistical board."

4. Data Availability: Thank you for providing the summary-level data as a supporting information file. However, we require that you please make the de-identified primary data available, or provide contact information for interested researchers to apply for access to such data (please note that the contact for data access cannot be one of the study’s authors).

PLOS defines the “minimal data set” to consist of the data set used to reach the conclusions drawn in the manuscript with related metadata and methods, and any additional data required to replicate the reported study findings in their entirety. Authors do not need to submit their entire data set, or the raw data collected during an investigation. Please submit the following data:

The values behind the means, standard deviations and other measures reported;

The values used to build graphs;

The points extracted from images for analysis.

5. Prospective analysis plan: Did your study have a prospective protocol or analysis plan? Please state this (either way) early in the Methods section.

a) If a prospective analysis plan (from your funding proposal, IRB or other ethics committee submission, study protocol, or other planning document written before analyzing the data) was used in designing the study, please include the relevant prospectively written document with your revised manuscript as a Supporting Information file to be published alongside your study, and cite it in the Methods section. A legend for this file should be included at the end of your manuscript.

b) If no such document exists, please make sure that the Methods section transparently describes when analyses were planned, and when/why any data-driven changes to analyses took place.

c) In either case, changes in the analysis, including those made in response to peer review comments, should be identified as such in the Methods section of the paper, with rationale.

6. Abstract (and throughout manuscript): Early in the abstract, the wording could be adjusted to acknowledge that there is, in fact, little or no doubt that smoking causes lung cancer. Also, and more generally throughout the paper, the wording should accommodate existing knowledge more realistically (e.g., the ACS website states quite unambiguously that "Smoking is the most important risk factor for bladder cancer. Smokers are at least 3 times as likely to get bladder cancer as non-smokers. Smoking causes about half of all bladder cancers in both men and women.") Perhaps "genetic predisposition to smoking" is an imperfect proxy for actual smoking.

7. Abstract: Introduction: Please revise the final sentence to: “Mendelian randomisation study sought to investigate associations in support of a causal relationship between smoking and alcohol consumption and 19 site-specific cancers.”

8. Abstract: Methods and Findings: Please identify some demographics of the participants included in the study (country, etc.) for the genome-wide association studies consortia.

9. Abstract: Methods and Findings: Please clarify that the associations between smoking and prostate cancer in the UK Biobank, and between overall cancer and alcohol consumption did not reach statistical significance. Please provide the confidence intervals and p-values for: “A positive association between alcohol consumption and lung cancer was observed in the International Lung Cancer Consortium, but not in UK Biobank.”

10. Abstract: Methods and Findings: Please revise this sentence to reflect that there was no statistically significant relationship between alcohol consumption and overall cancer, i.e. “no evidence” rather than “limited evidence”: “We found limited evidence that genetically-predicted alcohol consumption was associated with overall cancer (n=75 037 cases; OR 0.95; 95% CI 0.84-1.07; p=0.376).”

11. Abstract: Methods and Findings: In the last sentence of the Abstract Methods and Findings section, please describe the main limitation(s) of the study's methodology.

12. Abstract: Conclusions: Please revise the first sentence to: “Our findings support the well-established relationship between smoking and lung cancer... and suggest that smoking may also be a risk factor for cancer of the head and neck, oesophagus, stomach, cervix and bladder.” Please revise the final sentence to: “We found no evidence supporting a relationship between alcohol consumption and overall or site-specific cancer risk.” or similar, to clarify the meaning of “an association...cannot be precluded.” (please also revise this in the final “Conclusions” paragraph of the Discussion section).

13. Author Summary: Under “What did the researchers do and find?”: Please remove “strong or suggestive” from the first bullet point.

14. Author Summary: Under “What did the researchers do and find?”: Please remove “though several estimates were in the same direction as those reported by observational studies.”; as the directions aren’t specifically mentioned, this is not helpful information.

15. Author Summary: Under “What do these findings mean?”: This text should be distinct from the scientific abstract. (Please see our author guidelines for more information: https://journals.plos.org/plosmedicine/s/revising-your-manuscript#loc-author-summary)

Please revise the bullet points; we suggest: “In this study, we observed a relationship between smoking and lung cancer, as well as evidence that smoking may also be a risk factor for cancer of the head and neck, oesophagus, stomach, cervix, and bladder.” For the second bullet point, we suggest: “We found no evidence supporting a relationship between alcohol consumption and overall or site-specific cancer risk.” or similar, to clarify the meaning of “an association...cannot be precluded.”

16. Methods: “All participants provided informed consent.” Please specify whether consent was written or oral.

17. Methods: Please provide summary demographic information (population and setting, years of inclusion) for the study participants, in the UK Biobank and GWAS consortia.

18. Methods: Please add the following statement, or similar, to the Methods: "This study is reported as per the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guideline (S1 Checklist)."

19. Results: Paragraph 2 (and throughout): Please revise “strong or suggestive evidence” and “suggestive evidence” to be clearer: if statistical significance is intended, please indicate that. If clinical significance is intended, please remove the term because there is no clinical aspect in the study and so clinical significance cannot be addressed.

20. Results: Top of page 10: Please remove the word “slightly” from the sentence: “Adjustment for genetic predisposition to smoking using multivariable Mendelian randomization slightly attenuated the association of genetically-predicted alcohol consumption with lung cancer (OR 1.75; 95% CI 1.23-2.49; p=0.002) (S5 Table).”

21. Results: Where you describe relationships between alcohol consumption and site-specific cancers, please make it clear in the text whether these relationships were statistically significant or not.

22. Discussion: First sentence: Please avoid assertions of primacy ("We are the first...."). We suggest you temper this with the phrase “To our knowledge…” or similar.

23. Discussion: First paragraph (bottom of page 10): Please revise this sentence, to temper the causal implications: “... offering strong support of causation to previous observational studies.” (e.g., "supporting a causal relationship as consistent with previous observational ..." might be helpful).

24. Discussion: Please present and organize the Discussion as follows: a short, clear summary of the article's findings; what the study adds to existing research and where and why the results may differ from previous research; strengths and limitations of the study; implications and next steps for research, clinical practice, and/or public policy; one-paragraph conclusion. Specifically, your discussion is missing a final paragraph reflecting on the study’s implications.

25. Conclusion: Please revise this sentence to: “The results of this study support the well-established relationship between smoking and lung cancer, and suggest that smoking may also be a risk factor for cancer of the head and neck, oesophagus, stomach, cervix and bladder. We found no evidence in support of a relationship between alcohol consumption and overall cancer risk, but associations between alcohol consumption and risk of site-specific cancer should be further investigated.” or similar. Note that this paragraph appears to be identical to the abstract conclusions and the Author Summary.

26. Figure 1: In the legend, the “2” in I2 should be superscript.

Comments from Reviewers:

Reviewer #1: Larsson et al. have satisfactorily addressed the majority of my comments.

The only comment that I don't think was fully addressed was Major Comment 4. I think the caveat that Larsson et al. responded with should be added to the manuscript (i.e. that Wald-type methods are unbiased under the null, and that the estimators do not reflect meaningful causal parameters).

Alex Cornish (ICR, London).

Reviewer #2: The authors have adequately addressed my comments.

Kostas Tsilidis, Imperial College London and University of Ioannina

Decision Letter 2

Caitlin Moyer

25 Jun 2020

Dear Dr. Larsson,

On behalf of my colleagues and the academic editor, Dr. Konstantinos K Tsilidis, I am delighted to inform you that your manuscript entitled "Smoking, alcohol consumption and cancer: A Mendelian randomisation study in UK Biobank and international genetic consortia participants" (PMEDICINE-D-19-03758R2) has been accepted for publication in PLOS Medicine.

PRODUCTION PROCESS

Before publication you will see the copyedited Word document (in around 1-2 weeks from now) and a PDF galley proof shortly after that. The copyeditor will be in touch shortly before sending you the copyedited Word document. We will make some revisions at the copyediting stage to conform to our general style, and for clarification. When you receive this version you should check and revise it very carefully, including figures, tables, references, and supporting information, because corrections at the next stage (proofs) will be strictly limited to (1) errors in author names or affiliations, (2) errors of scientific fact that would cause misunderstandings to readers, and (3) printer's (introduced) errors.

If you are likely to be away when either this document or the proof is sent, please ensure we have contact information of a second person, as we will need you to respond quickly at each point.

PRESS

A selection of our articles each week are press released by the journal. You will be contacted nearer the time if we are press releasing your article in order to approve the content and check the contact information for journalists is correct. If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact.

PROFILE INFORMATION

Now that your manuscript has been accepted, please log into EM and update your profile. Go to https://www.editorialmanager.com/pmedicine, log in, and click on the "Update My Information" link at the top of the page. Please update your user information to ensure an efficient production and billing process.

Thank you again for submitting the manuscript to PLOS Medicine. We look forward to publishing it.

Best wishes,

Caitlin Moyer, Ph.D.

Associate Editor

PLOS Medicine

plosmedicine.org

Associated Data

    Supplementary Materials

    S1 STROBE Checklist. STROBE, Strengthening the Reporting of Observational Studies in Epidemiology.

    (DOCX)

    S1 Table. Sources and definition of cancers in UK Biobank.

    (XLSX)

    S2 Table. Power calculation for the associations of smoking and alcohol consumption with cancer.

    (XLSX)

    S3 Table. Associations of genetic predisposition to smoking initiation with site-specific cancers in the primary inverse-variance weighted analysis and in sensitivity analyses using other MR methods.

    MR, mendelian randomisation.

    (XLSX)

    S4 Table. Associations of genetic predisposition to smoking initiation with subtypes of lung cancer, non-Hodgkin lymphoma, and leukaemia.

    (XLSX)

    S5 Table. Associations of genetically predicted alcohol consumption with site-specific cancers in the primary inverse-variance weighted analysis and in sensitivity analyses using other MR methods.

    MR, mendelian randomisation.

    (XLSX)

    S6 Table. Associations of genetically predicted alcohol consumption with subtypes of lung cancer, breast cancer, non-Hodgkin lymphoma, and leukaemia in MR analyses.

    MR, mendelian randomisation.

    (XLSX)

    S1 Data. Derived summary statistics data supporting the result of this study.

    (XLSX)

    Attachment

    Submitted filename: Rebuttal letter.docx

    Attachment

    Submitted filename: Response to Editors 2nd revision.docx

    Data Availability Statement

    Primary data from the UK Biobank resource are accessible upon application (https://www.ukbiobank.ac.uk/). Derived data supporting the results of this study are available in the S1 Data file.


    Articles from PLoS Medicine are provided here courtesy of PLOS
