This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

medRxiv [Preprint]. 2024 Apr 6:2024.04.04.24305361. [Version 1] doi: 10.1101/2024.04.04.24305361

Genetic Risk, Health-Associated Lifestyle, and Risk of Early-onset Total Cancer and Breast Cancer

Yin Zhang 1,#, Sara Lindström 2,3, Peter Kraft 4,5,6, Yuxi Liu 5,7,#
PMCID: PMC11023660  PMID: 38633776

Abstract

Importance

Early-onset cancer (diagnosed under 50 years of age) is associated with aggressive disease characteristics and its rising incidence is a global concern. The association between healthy lifestyle and early-onset cancer and whether it varies by common genetic variants is unknown.

Objective

To examine the associations between genetic risk, lifestyle, and risk of early-onset cancers.

Design, Setting, and Participants

We analyzed a prospective cohort of 66,308 white British participants who were under age 50 and free of cancer at baseline in the UK Biobank.

Exposures

Sex-specific composite total cancer polygenic risk scores (PRSs), a breast cancer-specific PRS, and sex-specific health-associated lifestyle scores (HLSs, which summarize smoking status, body mass index [males only], physical activity, alcohol consumption, and diet).

Main Outcomes and Measures

Hazard ratios (HRs) and 95% confidence intervals (CIs) for early-onset total and breast cancer.

Results

A total of 1,247 incident invasive early-onset cancer cases (female: 820, male: 427, breast: 386) were documented. In multivariable-adjusted analyses with 2-year latency, higher genetic risk (highest vs. lowest tertile of PRS) was associated with significantly increased risks of early-onset total cancer in females (HR, 95% CI: 1.85, 1.50–2.29) and males (1.94, 1.45–2.59) as well as early-onset breast cancer in females (3.06, 2.20–4.25). An unfavorable lifestyle (highest vs. lowest category of HLS) was associated with higher risk of total cancer and breast cancer in females across genetic risk categories; the association with total cancer was stronger in the highest genetic risk category than in the lowest: HRs in females and males were 1.85 (1.02, 3.36) and 3.27 (0.78, 13.72), respectively, in the highest genetic risk category and 1.15 (0.44, 2.98) and 1.16 (0.39, 3.40), respectively, in the lowest.

Conclusions and Relevance

Both genetic and lifestyle factors were independently associated with early-onset total and breast cancer risk. Compared to those with low genetic risk, individuals with a high genetic risk may benefit more from adopting a healthy lifestyle in preventing early-onset cancer.

Introduction

Cancers arise from inherited variants and from acquired mutations, which are either induced by environmental factors or result from unavoidable DNA replication errors. Modifiable lifestyle factors, such as smoking, physical activity, and diet, and their joint effect have been linked with the incidence of overall and major cancer types.1,2 In the United States, data from the Centers for Disease Control and Prevention and the National Cancer Institute suggested that over 40% of all incident cancers and cancer deaths were attributable to unhealthy lifestyle factors.3 Evidence from large population-based prospective studies suggests that a substantial number of cancer cases may be preventable through lifestyle modification.1 On the other hand, genome-wide association studies (GWAS) have identified thousands of susceptibility loci across major cancer types.4 Emerging studies have investigated the degree to which adherence to a healthy lifestyle may attenuate the risk of cancer across strata defined by common genetic variants, aggregated into cancer polygenic risk scores (PRS). These studies have focused on overall cancer5 and a few major cancer types.6–14

Early-onset cancer, defined as cancer diagnosed under age 50, represents a unique spectrum of malignancies15,16 and generally manifests with a more aggressive disease phenotype.15,16 The incidence of early-onset cancer has increased globally over the past decades, possibly supporting a basis in changing environmental hazards or interactions between hazardous environments and genetics.15,17–19 Mounting evidence has established close epidemiological and biological links between unhealthy lifestyle and early-onset cancer.20 In addition, cancer PRS tend to be more strongly associated with early-onset than with late-onset cancer.21 However, no previous study has evaluated the extent to which adopting a healthy lifestyle may attenuate the impact of common genetic variants on early-onset cancer risk, highlighting a significant knowledge gap.

To address this important unanswered question, we conducted a large prospective cohort study to investigate the association between genetic risk, health-associated lifestyle, and risk of early-onset total cancer in both sexes, as well as early-onset breast cancer in females (which accounts for almost half of the early-onset cancer cases in our female study population).

Study Population

The study was performed using data from the UK Biobank longitudinal cohort, the details of which have been described previously.22–24 In brief, UK Biobank recruitment took place between 2006 and 2010, when more than 500,000 participants aged 40 to 70 years were enrolled at 22 assessment centers across England, Scotland, and Wales. Information on demographics (age, sex, ethnicity), lifestyle, and other health-related factors was collected via extensive baseline questionnaires, interviews, and physical measurements. Blood samples were collected at baseline and were used for genotyping. Informed consent was obtained from participants during the baseline assessment.

Ascertainment of Analytic Population

We restricted the study population to self-reported white British participants. We excluded participants diagnosed with any cancer before or at the cohort baseline, aged over 50 at the cohort baseline, or without genotype data. We further excluded individuals with genetic sex discordance, second (or higher)-degree related individuals (kinship coefficients >0.088), and heterozygosity or call rate outliers based on genotyping data. Individuals who had withdrawn consent to participate, or with missingness in smoking, body mass index (BMI), physical activity, alcohol intake, and diet were further removed, resulting in 66,308 eligible participants (females: 34,383, males: 31,925) for the final analyses.

Ascertainment of Genetic Risk

Genetic risk for early-onset total cancer was assessed using sex-specific composite PRSs. Briefly, we calculated composite PRSs as weighted sums of a spectrum of published cancer site-specific PRSs for European ancestry populations.25–34 These PRSs were selected based on their original training sample size, methods, test-set performance, potential for overfitting, and the availability of SNPs and weights. We used weights obtained from lasso regression by regressing early-onset total cancer on the PRSs of individual cancers (including lifestyle factors and a range of variables selected a priori [detailed in the Statistical Analysis section] as covariates, with a partial penalty on PRSs only).

We constructed sex-specific composite total cancer PRSs because the cancer spectra differ between females and males. For analyses focused on early-onset breast cancer among females, we constructed a breast cancer-specific PRS.25 The lists of published cancer site-specific PRSs included in the lasso regression and those ultimately selected for the development of sex-specific composite total cancer PRSs are summarized in eTables 1–2 in Supplement 1. The complete lists of SNPs included in the cancer site-specific PRSs are presented in eTables S1–S23 in Supplement 2. The early-onset cancer spectrum in females and males, showing the cancers with published site-specific PRSs that qualified for inclusion in the lasso regression, is summarized in eTable 3 in Supplement 1.

PRSs for individual cancer types were calculated for each individual as weighted sums of the effect allele dosages of the selected SNPs, assuming an additive model, based on the formula $\beta_1 \times \mathrm{SNP}_{i,1} + \beta_2 \times \mathrm{SNP}_{i,2} + \cdots + \beta_j \times \mathrm{SNP}_{i,j} + \cdots + \beta_n \times \mathrm{SNP}_{i,n}$, where $\mathrm{SNP}_{i,j}$ is the effect allele dosage for SNP $j$ for individual $i$, $\beta_j$ is the per-allele log odds ratio for SNP $j$ for a specific cancer type, and $n$ is the total number of selected SNPs for the specific cancer type. The sex-specific composite total cancer PRSs were constructed based on the formula $h_1 \times \mathrm{PRS}_{i,1} + h_2 \times \mathrm{PRS}_{i,2} + \cdots + h_k \times \mathrm{PRS}_{i,k}$, where $\mathrm{PRS}_{i,k}$ is the PRS for cancer type $k$ for individual $i$ and $h_k$ is the weight for cancer type $k$ obtained from lasso regression.
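As a minimal sketch of the two weighted sums described above, the following Python snippet computes two site-specific PRSs and a composite score for a single individual. All numeric values (the per-allele log odds ratios, the dosages, and the lasso weights) and the helper name `weighted_sum` are hypothetical and chosen for illustration only; they are not taken from the study.

```python
from typing import Sequence


def weighted_sum(weights: Sequence[float], values: Sequence[float]) -> float:
    """Return sum_j weights[j] * values[j]; both inputs must align by index."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical per-allele log odds ratios (beta_j) for two cancer types.
betas_a = [0.10, -0.05, 0.20]
betas_b = [0.30, 0.15]

# Hypothetical effect-allele dosages (0 to 2) for a single individual i.
dosages_a = [2.0, 1.0, 0.0]
dosages_b = [1.0, 2.0]

# Site-specific PRSs: beta_1 * SNP_{i,1} + ... + beta_n * SNP_{i,n}.
prs_a = weighted_sum(betas_a, dosages_a)
prs_b = weighted_sum(betas_b, dosages_b)

# Hypothetical lasso weights h_k, one per cancer type, for the composite score.
h = [0.6, 0.4]
composite = weighted_sum(h, [prs_a, prs_b])
```

The same helper serves both levels because the composite PRS has the same algebraic form as a site-specific PRS, with cancer-type PRSs in place of SNP dosages.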

Ascertainment of Health-associated Lifestyle Scores

Sex-specific health-associated lifestyle scores (HLSs) were calculated based on a combination of baseline smoking status, BMI (males only), physical activity, alcohol consumption, and diet.5–7 BMI was not included as a component of the female-specific HLS because of the widely recognized inverse association of BMI with the risk of early-onset/pre-menopausal breast cancer,35 coupled with the fact that early-onset total cancer in our female study population was predominantly driven by early-onset breast cancer. Healthy lifestyles were defined as no current smoking, normal BMI (18.5 to 25 kg/m2), adequate physical activity (at least 75 minutes of vigorous activity per week or 150 minutes of moderate activity per week or an equivalent combination), no alcohol consumption, and healthy diet (an increased amount of fruits, vegetables, and whole grains, and a reduced amount of red meats and processed meats).5–7,36,37 To keep the public health message simple, participants received a score of 1 if they did not meet the criterion for a specific lifestyle factor and 0 otherwise.5–7 We then summed the scores across the lifestyle components, resulting in a final unweighted HLS ranging from 0–4 (females) or 0–5 (males), with higher scores indicating an unhealthier lifestyle. More details of the sex-specific HLSs can be found in eTable 4 in Supplement 1.
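The scoring rule described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the `Lifestyle` record and the `hls` function are hypothetical names, the diet criterion is reduced to a single yes/no flag (its actual definition is in eTable 4 in Supplement 1), and the BMI range is taken as 18.5 ≤ BMI < 25 kg/m2, following the category shown in Table 1.

```python
from dataclasses import dataclass


@dataclass
class Lifestyle:
    current_smoker: bool
    bmi: float              # kg/m2
    meets_activity: bool    # met the physical activity guideline
    drinks_alcohol: bool    # any alcohol consumption
    healthy_diet: bool      # met the diet criterion (assumed yes/no flag)


def hls(p: Lifestyle, sex: str) -> int:
    """Unweighted score: 1 point per unmet criterion; higher = unhealthier."""
    if sex not in ("female", "male"):
        raise ValueError("sex must be 'female' or 'male'")
    points = [
        p.current_smoker,
        not p.meets_activity,
        p.drinks_alcohol,
        not p.healthy_diet,
    ]
    if sex == "male":
        # BMI counts for males only; assumed healthy range 18.5 <= BMI < 25.
        points.append(not (18.5 <= p.bmi < 25))
    return sum(points)


# Hypothetical participant: non-smoker, active, drinks alcohol, unhealthy diet.
person = Lifestyle(current_smoker=False, bmi=27.0, meets_activity=True,
                   drinks_alcohol=True, healthy_diet=False)
female_score = hls(person, "female")  # BMI ignored
male_score = hls(person, "male")      # BMI of 27.0 adds one point
```

With the same lifestyle profile, the male score is one point higher than the female score only when BMI falls outside the healthy range, which is why the two scores have different maxima (5 and 4).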

Ascertainment of Cancer and Death

Incident invasive cancer cases were ascertained via linkage to the National Health Service central registers and death registries in England, Wales, and Scotland. Cases were coded using the International Classification of Diseases (10th Revision). Early-onset cancer cases were defined as those diagnosed under the age of 50 years.17 Deaths were ascertained through linkage to death registries.

Statistical Analysis

The analyses of early-onset total cancer were conducted separately by sex, given the difference in cancer spectra between females and males. The analyses of early-onset breast cancer were restricted to females only. To minimize the potential impact of reverse causation, a latency period was introduced by excluding the first 2 years of follow-up.

Person-years of follow-up was calculated from the baseline until the date of diagnosis of any cancer (except non-melanoma skin cancer), death, loss to follow-up, or 50th birthday, whichever occurred earliest. Descriptive analyses were performed to assess population characteristics across categories of PRS and HLS. Multivariable Cox proportional hazards models with age as the timescale were used to estimate hazard ratios (HRs) and 95% confidence intervals (CIs) of early-onset total cancer and early-onset breast cancer across PRS and HLS groups (PRS: high, intermediate, vs. low, based on tertiles; HLS: unhealthy, intermediate, vs. healthy, based on the predefined cutoffs detailed in eTable 4 in Supplement 1) in females and males, as well as for their joint associations by using the collapsed categories. HRs and 95% CIs per 1-SD increase in PRS and per 1-unit increase in HLS were also estimated. Moreover, stratified analysis was performed to assess the association between HLS and early-onset total cancer and breast cancer risk within each PRS stratum. The proportional hazards assumption was tested using the likelihood ratio test to compare models with and without product terms between exposures and log-transformed age. No violation of the assumption was detected.
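To make the follow-up definition concrete, here is a small sketch of how entry and exit ages could be derived for one participant when age is the timescale. It assumes that the 2-year latency is implemented by starting time at risk 2 years after baseline and dropping anyone whose follow-up ends before then; the text does not spell out this detail, so treat it as one plausible reading. The function name `follow_up` and its parameters are hypothetical.

```python
from typing import Optional, Tuple


def follow_up(
    age_at_baseline: float,
    age_at_cancer: Optional[float] = None,
    age_at_death: Optional[float] = None,
    age_at_loss: Optional[float] = None,
    age_cutoff: float = 50.0,
    latency_years: float = 2.0,
) -> Optional[Tuple[float, float, bool]]:
    """Return (entry_age, exit_age, is_case), or None if no time at risk remains.

    Ages are in years. Follow-up ends at the earliest of cancer diagnosis,
    death, loss to follow-up, or the age cutoff (the 50th birthday).
    """
    candidates = [age_cutoff]
    for age in (age_at_cancer, age_at_death, age_at_loss):
        if age is not None:
            candidates.append(age)
    exit_age = min(candidates)

    # Assumed latency handling: time at risk starts latency_years after baseline.
    entry_age = age_at_baseline + latency_years
    if exit_age <= entry_age:
        return None

    # A case is a diagnosis before the cutoff that is itself the exit event.
    is_case = (
        age_at_cancer is not None
        and age_at_cancer == exit_age
        and age_at_cancer < age_cutoff
    )
    return entry_age, exit_age, is_case


# Hypothetical participants.
case = follow_up(44.0, age_at_cancer=48.5)        # early-onset case
censored = follow_up(45.0)                        # censored at 50th birthday
in_latency = follow_up(44.0, age_at_cancer=45.0)  # diagnosed within latency
late_onset = follow_up(46.0, age_at_cancer=52.0)  # diagnosed after age 50
```

Under this reading, a participant diagnosed after the 50th birthday contributes person-time up to age 50 but is not counted as an early-onset case.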

Covariates were selected a priori. Multivariable analyses of early-onset total cancer were stratified by sex and adjusted for the first 10 genetic principal components (continuous), genotyping batch (categorical), average total household income (less than 18,000, 18,000–30,999, 31,000–51,999, 52,000–100,000, greater than 100,000, prefer not to answer or do not know), education (college or university degree, some professional qualifications, secondary education, others, prefer not to answer or do not know), BMI (continuous, kg/m2; in analyses of females only), and family history of cancer (yes, no; family history of breast cancer was used in analyses of breast cancer). PRS and HLS were mutually adjusted.

Multivariable analyses of early-onset breast cancer were adjusted for the above-mentioned covariates, and were additionally adjusted for age at menarche (<12, 12–13, ≥14 years, prefer not to answer or do not know), parity (nulliparous, parous, prefer not to answer or do not know), age at first live birth (<25, 25–29, ≥30 years, prefer not to answer or do not know), oral contraceptive use (yes, no, prefer not to answer or do not know), menopausal status and hormone replacement therapy use (premenopausal, postmenopausal with hormone replacement therapy, postmenopausal without hormone replacement therapy, prefer not to answer or do not know), and history of mammograms (yes, no, prefer not to answer or do not know).

Data analyses were performed using R software (version 4.3.1, https://www.r-project.org/), PLINK (version 2.0, https://www.cog-genomics.org/plink/2.0/),38 and the Polygenic Score Catalog Calculator (version 2.0, https://github.com/PGScatalog/pgsc_calc).39 Tests were two-sided, with P values <0.05 indicating statistical significance.

Results

Population Characteristics

A total of 1,247 incident invasive early-onset cancer cases (820 in females, 427 in males) were documented among 66,308 eligible participants during 329,509 person-years of follow-up (females: 168,897, males: 160,612), including 386 cases of incident early-onset breast cancer in females. The three most common incident invasive early-onset cancers were breast cancer, melanoma, and colorectal cancer in females, and melanoma, colorectal cancer, and prostate cancer in males (eTable 3 in Supplement 1). The distribution of age, HLS and its components (smoking status, BMI, physical activity, alcohol consumption, and diet), total household income, and education did not vary appreciably across the PRSs of interest (female-specific total cancer PRS, male-specific total cancer PRS, and breast cancer PRS). Further, we observed no difference in age at menarche, parity, age at first live birth, oral contraceptive use, menopausal status, or hormone replacement therapy use across the PRSs. In contrast, individuals with higher PRSs (indicating higher genetic risk) in both sexes were more likely to have a family history of cancer, and females with higher PRSs were more likely to have undergone mammography (Table 1 and eTable 5 in Supplement 1).

Table 1.

Characteristics of female and male participants at baseline according to female-specific and male-specific total cancer PRSs

Characteristics [a,b] | Females (N=34383), by female-specific total cancer PRS: Low (N=11444), Intermediate (N=11476), High (N=11463) | Males (N=31925), by male-specific total cancer PRS: Low (N=10601), Intermediate (N=10607), High (N=10717)
Age, mean (SD) 45.0 (2.73) 45.0 (2.72) 45.0 (2.72) 44.9 (2.75) 45.0 (2.74) 44.9 (2.75)
Sex-specific HLS, n (%)
 Healthy 2566 (22.4%) 2515 (21.9%) 2507 (21.9%) 792 (7.5%) 740 (7.0%) 745 (7.0%)
 Intermediate 8462 (73.9%) 8527 (74.3%) 8477 (74.0%) 6625 (62.5%) 6560 (61.8%) 6636 (61.9%)
 Unhealthy 416 (3.6%) 434 (3.8%) 479 (4.2%) 3184 (30.0%) 3307 (31.2%) 3336 (31.1%)
Smoking, n (%)
 Current smoker 1204 (10.5%) 1216 (10.6%) 1225 (10.7%) 1551 (14.6%) 1547 (14.6%) 1568 (14.6%)
 Others 10240 (89.5%) 10260 (89.4%) 10238 (89.3%) 9050 (85.4%) 9060 (85.4%) 9149 (85.4%)
BMI, n (%)
 18.5≤ to <25 kg/m2 5447 (47.6%) 5477 (47.7%) 5538 (48.3%) 2912 (27.5%) 2914 (27.5%) 2964 (27.7%)
 Others 5997 (52.4%) 5999 (52.3%) 5925 (51.7%) 7689 (72.5%) 7693 (72.5%) 7753 (72.3%)
Physical activity, n (%)
 Met the guidelines [c] 5901 (51.6%) 5877 (51.2%) 5838 (50.9%) 6224 (58.7%) 6127 (57.8%) 6183 (57.7%)
 Didn’t meet the guidelines 5543 (48.4%) 5599 (48.8%) 5625 (49.1%) 4377 (41.3%) 4480 (42.2%) 4534 (42.3%)
Alcohol intake, n (%)
 Never 637 (5.6%) 575 (5.0%) 567 (4.9%) 413 (3.9%) 423 (4.0%) 406 (3.8%)
 Special occasions only 1274 (11.1%) 1311 (11.4%) 1222 (10.7%) 617 (5.8%) 567 (5.3%) 666 (6.2%)
 One to three times a month 1745 (15.2%) 1646 (14.3%) 1790 (15.6%) 1184 (11.2%) 1217 (11.5%) 1226 (11.4%)
 Once or twice a week 3432 (30.0%) 3403 (29.7%) 3533 (30.8%) 3318 (31.3%) 3236 (30.5%) 3286 (30.7%)
 Three or four times a week 2818 (24.6%) 2818 (24.6%) 2737 (23.9%) 2977 (28.1%) 3015 (28.4%) 3057 (28.5%)
 Daily or almost daily 1538 (13.4%) 1723 (15.0%) 1614 (14.1%) 2092 (19.7%) 2149 (20.3%) 2076 (19.4%)
Fruit and vegetable intake, n (%)
 <3 servings/day 3958 (34.6%) 4105 (35.8%) 4014 (35.0%) 4735 (44.7%) 4867 (45.9%) 4880 (45.5%)
 ≥3 to <5 servings/day 4966 (43.4%) 4906 (42.8%) 5025 (43.8%) 3886 (36.7%) 3867 (36.5%) 3849 (35.9%)
 ≥5 servings/day 2520 (22.0%) 2465 (21.5%) 2424 (21.1%) 1980 (18.7%) 1873 (17.7%) 1988 (18.6%)
Whole grain intake, n (%)
 <2 servings/day 7342 (64.2%) 7249 (63.2%) 7302 (63.7%) 5766 (54.4%) 5849 (55.1%) 5884 (54.9%)
 ≥2 to <5.5 servings/day 4062 (35.5%) 4187 (36.5%) 4120 (35.9%) 4525 (42.7%) 4454 (42.0%) 4542 (42.4%)
 ≥5.5 servings/day 40 (0.3%) 40 (0.3%) 41 (0.4%) 310 (2.9%) 304 (2.9%) 291 (2.7%)
Red and processed meat intake, n (%)
 <2 times/week 2330 (20.4%) 2362 (20.6%) 2274 (19.8%) 1033 (9.7%) 966 (9.1%) 986 (9.2%)
 ≥2 to <4 times/week 5478 (47.9%) 5400 (47.1%) 5416 (47.2%) 3847 (36.3%) 3807 (35.9%) 3848 (35.9%)
 ≥4 times/week 3636 (31.8%) 3714 (32.4%) 3773 (32.9%) 5721 (54.0%) 5834 (55.0%) 5883 (54.9%)
Average total household income, n (%)
 Less than 18,000 1199 (10.5%) 1181 (10.3%) 1208 (10.5%) 788 (7.4%) 833 (7.9%) 820 (7.7%)
 18,000 to 30,999 1907 (16.7%) 1881 (16.4%) 1945 (17.0%) 1525 (14.4%) 1546 (14.6%) 1587 (14.8%)
 31,000 to 51,999 3379 (29.5%) 3271 (28.5%) 3353 (29.3%) 3251 (30.7%) 3176 (29.9%) 3307 (30.9%)
 52,000 to 100,000 3268 (28.6%) 3396 (29.6%) 3292 (28.7%) 3467 (32.7%) 3502 (33.0%) 3414 (31.9%)
 Greater than 100,000 924 (8.1%) 937 (8.2%) 888 (7.7%) 1000 (9.4%) 981 (9.2%) 982 (9.2%)
 Prefer not to answer or do not know 767 (6.7%) 810 (7.1%) 777 (6.8%) 570 (5.4%) 569 (5.4%) 607 (5.7%)
Family history of cancer, n (%)
 No 8497 (74.2%) 8161 (71.1%) 8019 (70.0%) 7624 (71.9%) 7588 (71.5%) 7685 (71.7%)
 Yes 2947 (25.8%) 3315 (28.9%) 3444 (30.0%) 2977 (28.1%) 3019 (28.5%) 3032 (28.3%)
Education, n (%)
 College or University degree 4626 (40.4%) 4741 (41.3%) 4644 (40.5%) 4276 (40.3%) 4127 (38.9%) 4218 (39.4%)
 Some professional qualifications 297 (2.6%) 301 (2.6%) 309 (2.7%) 172 (1.6%) 197 (1.9%) 175 (1.6%)
 Secondary education 6160 (53.8%) 6115 (53.3%) 6190 (54.0%) 5679 (53.6%) 5793 (54.6%) 5820 (54.3%)
 Others 337 (2.9%) 294 (2.6%) 290 (2.5%) 437 (4.1%) 457 (4.3%) 466 (4.3%)
 Prefer not to answer 24 (0.2%) 25 (0.2%) 30 (0.3%) 37 (0.3%) 33 (0.3%) 38 (0.4%)
Age at menarche, n (%)
 <12 years of age 1970 (17.2%) 1953 (17.0%) 1983 (17.3%) -- -- --
 12 to 13 years of age 5054 (44.2%) 5088 (44.3%) 5015 (43.7%) -- -- --
 ≥14 years of age 4168 (36.4%) 4167 (36.3%) 4184 (36.5%) -- -- --
 Prefer not to answer or do not know 252 (2.2%) 268 (2.3%) 281 (2.5%) -- -- --
Parity, n (%)
 Nulliparous 3027 (26.5%) 3057 (26.6%) 3112 (27.1%) -- -- --
 Parous 8414 (73.5%) 8417 (73.3%) 8347 (72.8%) -- -- --
 Prefer not to answer or do not know 3 (0.0%) 2 (0.0%) 4 (0.0%) -- -- --
Age at first live birth, n (%)
 <25 years of age 1981 (17.3%) 2054 (17.9%) 1955 (17.1%) -- -- --
 25 to 29 years of age 2543 (22.2%) 2543 (22.2%) 2512 (21.9%) -- -- --
 ≥30 years of age 2090 (18.3%) 2056 (17.9%) 2074 (18.1%) -- -- --
 Prefer not to answer or do not know 4830 (42.2%) 4823 (42.0%) 4922 (42.9%) -- -- --
Oral contraceptive use, n (%)
 Yes 10504 (91.8%) 10573 (92.1%) 10560 (92.1%) -- -- --
 No 925 (8.1%) 894 (7.8%) 895 (7.8%) -- -- --
 Prefer not to answer or do not know 15 (0.1%) 9 (0.1%) 8 (0.1%) -- -- --
Menopausal status and hormone replacement therapy use, n (%)
 Postmenopausal (with hormone replacement therapy) 258 (2.3%) 272 (2.4%) 254 (2.2%) -- -- --
 Postmenopausal (without hormone replacement therapy) 508 (4.4%) 514 (4.5%) 476 (4.2%) -- -- --
 Premenopausal 10663 (93.2%) 10673 (93.0%) 10715 (93.5%) -- -- --
 Prefer not to answer or do not know 15 (0.1%) 17 (0.1%) 18 (0.2%) -- -- --
History of mammograms, n (%)
 Yes 2831 (24.7%) 2920 (25.4%) 3134 (27.3%) -- -- --
 No 8575 (74.9%) 8523 (74.3%) 8287 (72.3%) -- -- --
 Prefer not to answer or do not know 38 (0.3%) 33 (0.3%) 42 (0.4%) -- -- --

Abbreviations: PRS, polygenic risk score; HLS, health-associated lifestyle score; SD, standard deviation.

a. Characteristics of eligible female participants.

b. Mean (SD) for continuous variables and n (%) for categorical variables.

c. Met the guidelines of 150 minutes of moderate activity per week or 75 minutes of vigorous activity (or an equivalent combination).

For both sexes, individuals with a higher HLS (indicating an unhealthier lifestyle) tended to have lower total household income and education levels and were more likely to have a family history of cancer. Among females, those with a higher HLS were less likely to undergo mammography or be premenopausal. Additionally, they were less likely to have experienced menarche between 12–13 years of age and to have had their first birth before age 25 years. Moreover, they were more likely to use oral contraceptives (eTables 6–7 in Supplement 1).

PRS and the Risk of Early-onset Total and Breast Cancer

In multivariable-adjusted analyses with 2-year latency, higher genetic risk (highest vs. lowest tertile of PRS) was associated with significantly increased risks of early-onset total cancer in females (female-specific composite PRS HR=1.85; 95% CI, 1.50–2.29), early-onset total cancer in males (male-specific composite PRS HR=1.94; 95% CI, 1.45–2.59), and early-onset breast cancer in females (breast cancer PRS HR=3.06; 95% CI, 2.20–4.25). The HR (95% CI) per 1-SD increase in PRS was 1.29 (1.19, 1.40) for early-onset total cancer in females, 1.36 (1.22, 1.53) for early-onset total cancer in males, and 1.58 (1.41, 1.78) for early-onset breast cancer in females (Figure 1).
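As a reading aid, hazard ratio confidence intervals from Cox models are typically symmetric on the log scale, so the point estimate should sit near the geometric mean of the limits. The sketch below applies this check to the reported interval for early-onset total cancer in females. It assumes a Wald-type interval built as exp(log HR ± z × SE), which the text does not state explicitly; the function name `log_scale_summary` is hypothetical.

```python
import math


def log_scale_summary(lower: float, upper: float, z: float = 1.959964):
    """Return (point_estimate, standard_error) implied by a symmetric log-scale CI."""
    log_lo, log_hi = math.log(lower), math.log(upper)
    point = math.exp((log_lo + log_hi) / 2)  # geometric mean of the limits
    se = (log_hi - log_lo) / (2 * z)         # implied SE of the log hazard ratio
    return point, se


# Reported 95% CI for early-onset total cancer in females: 1.50 to 2.29.
hr, se = log_scale_summary(1.50, 2.29)
```

The recovered point estimate agrees with the reported HR of 1.85 to within rounding, which is consistent with the assumed interval construction.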

Figure 1.

PRS and early-onset total and breast cancer risk

A: Multivariable-adjusted analysis of female-specific total cancer PRS and early-onset total cancer in females.

B: Multivariable-adjusted analysis of male-specific total cancer PRS and early-onset total cancer in males.

C: Multivariable-adjusted analysis of breast cancer PRS and early-onset breast cancer in females.

Multivariable analyses of early-onset total cancer were stratified by sex and adjusted for the first 10 genetic principal components for ancestry, genotyping batch, average total household income, education, BMI (females only), and family history of cancer (family history of breast cancer was used in analyses of breast cancer), plus adjustment for HLS. Multivariable analyses of early-onset breast cancer were adjusted for the above-mentioned covariates and were additionally adjusted for age at menarche, parity, age at first live birth, oral contraceptive use, menopausal status and hormone replacement therapy use, and history of mammograms. Reference group: individuals with low PRS.

Abbreviations: PRS, polygenic risk score; HLS, health-associated lifestyle score; SD, standard deviation; BMI, body mass index.

HLS and the Risk of Early-onset Total and Breast Cancer

An unhealthy lifestyle (highest vs. lowest category of HLS) showed suggestive associations in females with increased risks of early-onset total cancer (female-specific HLS HR=1.49; 95% CI, 0.99–2.25) and early-onset breast cancer (HR=1.78; 95% CI, 0.97–3.24), but not with early-onset total cancer in males (male-specific HLS HR=1.14; 95% CI, 0.67–1.95). The associations among females were statistically significant when effect estimates were examined per 1-unit increase in HLS for early-onset total cancer (HR=1.12; 95% CI, 1.01–1.24) and early-onset breast cancer (HR=1.17; 95% CI, 1.02–1.35), but not in males for early-onset total cancer (HR=1.01; 95% CI, 0.90–1.14) (Figure 2).

Figure 2:

HLS and early-onset total and breast cancer risk

A: Multivariable-adjusted analysis of female-specific HLS and early-onset total cancer in females.

B: Multivariable-adjusted analysis of male-specific HLS and early-onset total cancer in males.

C: Multivariable-adjusted analysis of female-specific HLS and early-onset breast cancer in females.

Multivariable analyses of early-onset total cancer were stratified by sex and adjusted for the first 10 genetic principal components for ancestry, genotyping batch, average total household income, education, BMI (females only), and family history of cancer (family history of breast cancer was used in analyses of breast cancer), plus adjustment for PRS. Multivariable analyses of early-onset breast cancer were adjusted for the above-mentioned covariates and were additionally adjusted for age at menarche, parity, age at first live birth, oral contraceptive use, menopausal status and hormone replacement therapy use, and history of mammograms. Reference group: individuals with healthy HLS.

Abbreviations: PRS, polygenic risk score; HLS, health-associated lifestyle score; BMI, body mass index.

HLS with Early-onset Total and Breast Cancer Risk, Stratified by PRS Category

The HRs (95% CI) of early-onset total cancer in females and males, and early-onset breast cancer in females associated with adopting an unfavorable lifestyle (highest vs. lowest category of HLS) were 1.85 (1.02, 3.36), 3.27 (0.78, 13.72), and 1.67 (0.71, 3.90), respectively, in those with high genetic risk; 1.25 (0.60, 2.57), 1.11 (0.42, 2.89), and 1.66 (0.54, 5.11), respectively, in those with intermediate genetic risk; and 1.15 (0.44, 2.98), 1.16 (0.39, 3.40), and 2.10 (0.57, 7.75), respectively, in those with low genetic risk. On the scale of per 1-unit increase in HLS, the HRs (95% CI) of early-onset total cancer in females and males, and early-onset breast cancer in females were 1.20 (1.03, 1.40), 1.06 (0.87, 1.29), and 1.22 (1.00, 1.48), respectively, in those with high genetic risk; 1.00 (0.84, 1.19), 1.01 (0.81, 1.26), and 1.14 (0.88, 1.48), respectively, in those with intermediate genetic risk; and 1.14 (0.93, 1.40), 0.98 (0.77, 1.25), and 1.07 (0.76, 1.51), respectively, in those with low genetic risk. The P value for the 2-df interaction test on the log HR scale was 0.39 for early-onset total cancer in females, 0.77 for early-onset total cancer in males, and 0.11 for early-onset breast cancer in females. (Figure 3)

Figure 3:

HLS and early-onset total and breast cancer risk, stratified by PRS category

A: Multivariable-adjusted analysis of female-specific HLS and early-onset total cancer in females, stratified by female-specific total cancer PRS.

B: Multivariable-adjusted analysis of male-specific HLS and early-onset total cancer in males, stratified by male-specific total cancer PRS.

C: Multivariable-adjusted analysis of female-specific HLS and early-onset breast cancer in females, stratified by breast cancer PRS.

Multivariable analyses of early-onset total cancer were stratified by sex and adjusted for the first 10 genetic principal components for ancestry, genotyping batch, average total household income, education, BMI (females only), and family history of cancer (family history of breast cancer was used in analyses of breast cancer). Multivariable analyses of early-onset breast cancer were adjusted for the above-mentioned covariates and were additionally adjusted for age at menarche, parity, age at first live birth, oral contraceptive use, menopausal status and hormone replacement therapy use, and history of mammograms. Reference group: individuals with healthy HLS within each PRS stratum.

Abbreviations: PRS, polygenic risk score; HLS, health-associated lifestyle score; BMI, body mass index.

Joint Associations of PRS and HLS with Early-onset Total and Breast Cancer Risk

Compared to individuals with both low genetic risk (lowest tertile of PRS) and a favorable lifestyle (lowest category of HLS), those with both high genetic risk (highest tertile of PRS) and an unfavorable lifestyle (highest category of HLS) had increased risks of early-onset total cancer in females (HR=3.07; 95% CI, 1.64–5.78) and males (HR=2.18; 95% CI, 0.78–6.11), and significantly higher early-onset breast cancer risk in females (HR=4.11; 95% CI, 1.56–10.85). (Figure 4)

Figure 4.

Joint associations of PRS and HLS with early-onset total and breast cancer risk

A: Multivariable-adjusted joint association analysis of female-specific total cancer PRS and female-specific HLS with early-onset total cancer in females.

B: Multivariable-adjusted joint association analysis of male-specific total cancer PRS and male-specific HLS with early-onset total cancer in males.

C: Multivariable-adjusted joint association analysis of breast cancer PRS and female-specific HLS with early-onset breast cancer in females.

Multivariable analyses of early-onset total cancer were stratified by sex and adjusted for the first 10 genetic principal components for ancestry, genotyping batch, average total household income, education, BMI (females only), and family history of cancer (family history of breast cancer was used in analyses of breast cancer). Multivariable analyses of early-onset breast cancer were adjusted for the above-mentioned covariates and were additionally adjusted for age at menarche, parity, age at first live birth, oral contraceptive use, menopausal status and hormone replacement therapy use, and history of mammograms. Reference group: individuals with both low PRS and healthy HLS.

Abbreviations: PRS, polygenic risk score; HLS, health-associated lifestyle score; BMI, body mass index.

Discussion

Early-onset cancer is generally more aggressive than late-onset cancer, and its rising incidence has become a global concern. Deciphering the relationship between genetic risk, lifestyle modification, and risk of early-onset cancers may inform preventive strategies. In this large prospective cohort study among white British individuals, we present the first epidemiological evidence for this relationship for early-onset cancers.

Our results showed that genetic predisposition and lifestyle factors were each independently associated with the risk of early-onset total cancer and breast cancer. Stratified analyses by PRSs indicate that adopting a healthy lifestyle is likely beneficial for all individuals. Notably, individuals with a high genetic risk may derive greater benefits from adopting a healthy lifestyle to prevent early-onset total cancer, compared to those with a low or intermediate genetic risk, although we were unable to detect a statistically significant interaction between HLS and PRS, mainly due to the limited number of early-onset cancer cases. These findings may inform preventive strategies for early-onset cancer in populations with varying genetic risk profiles.

To our knowledge, no previous study has investigated the extent to which adopting a healthy lifestyle could mitigate the impact of common genetic variants on the risk of early-onset cancers. A prior study of a European ancestry population in the UK Biobank examined genetic risk and the benefits of adherence to a healthy lifestyle in relation to total cancer, irrespective of age at onset.5 Its findings demonstrated an additive interaction between genetic and lifestyle factors, indicating that individuals with a higher genetic risk may benefit more from lifestyle modification in relation to overall cancer risk in both sexes.5 Another study of invasive breast cancer among females of European ancestry in the UK Biobank reported that for premenopausal females, lifestyle intervention could have the greatest impact on those with a high genetic risk.6 Conversely, for postmenopausal females, lifestyle intervention might offer essentially similar benefits regardless of an individual’s genetic predisposition.6 The associations of genetic risk and lifestyle with risk of other major cancers, including cancers of the thyroid, lung, stomach, pancreas, colorectum, ovary, kidney, bladder, uterus, and prostate, as well as non-Hodgkin’s lymphoma, lymphocytic leukemia, and melanoma, have been reported.8–14 However, no such evidence has been reported for early-onset cancers.

Strengths and Limitations

Our study has several notable strengths. 1) To our knowledge, it provides the first epidemiological evidence on this important unanswered question in early-onset cancer prevention. 2) The prospective cohort design minimized the potential for recall bias and selection bias. 3) The sample size was large, with over 66,000 eligible participants in the final analytic population. 4) Ascertainment of cancer cases through linkage to national registries enhanced internal validity. 5) A wide spectrum of potential confounding factors was selected a priori and included as covariates in multivariable-adjusted analyses, providing relatively rigorous control for confounding. 6) Heritable and lifestyle factors were assessed using standardized protocols. 7) We used a novel approach to develop sex-specific composite total cancer PRSs to assess genetic risk of early-onset total cancer. 8) Concern about overfitting of the multicancer PRSs is mitigated because 19 of the 23 site-specific cancer PRSs included did not incorporate UK Biobank data in their training samples; on average, the UK Biobank constitutes only 13% of the case data across these PRSs. 9) We applied a 2-year latency to minimize the potential influence of reverse causation. 10) Analyses of early-onset total cancer were performed separately by sex to reflect the difference in cancer spectrum.
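
The 2-year latency mentioned above can be illustrated with a minimal sketch. The text states only that a 2-year latency was applied; the specific implementation below (dropping participants whose follow-up ended within the first 2 years and restarting the clock at year 2) is an assumption, and the `Participant` class and `apply_latency` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List

LATENCY_YEARS = 2.0  # latency window named in the text


@dataclass
class Participant:
    follow_up_years: float  # time from baseline to event or censoring
    had_event: bool         # True if follow-up ended with an incident cancer


def apply_latency(cohort: List[Participant],
                  latency: float = LATENCY_YEARS) -> List[Participant]:
    """Assumed latency rule: exclude anyone whose follow-up ended within the
    latency window, then measure remaining follow-up from the end of the window."""
    kept = []
    for p in cohort:
        if p.follow_up_years <= latency:
            # Event or censoring inside the window: excluded, since early
            # diagnoses may reflect disease already present at baseline.
            continue
        kept.append(Participant(p.follow_up_years - latency, p.had_event))
    return kept


cohort = [
    Participant(1.5, True),    # diagnosed within 2 years: excluded
    Participant(6.0, True),    # diagnosed after 6 years: kept
    Participant(10.0, False),  # censored at 10 years: kept
]
kept = apply_latency(cohort)
```

The design intent is that lifestyle changes caused by undiagnosed disease near baseline contribute less to the estimated associations; the filtered records would then be passed to whatever survival model is in use.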

Our study also has several limitations. 1) Residual confounding cannot be completely ruled out given the observational design, so causal interpretation should be approached with caution. 2) The aggregate approach used in analyses of early-onset total cancer may mask heterogeneity of effects across cancer types, although it increases the power to detect the effects of interest. 3) Because of limited power, we could not further explore heterogeneity across individual cancer types other than breast cancer. 4) The ethnic homogeneity of our analytic population (all white British) may limit the generalizability of the current findings to other groups. 5) Lifestyle factors were assessed from a single measurement at baseline rather than from repeated measurements (which can better depict long-term trends), raising concern about the potential influence of random within-person variation. 6) Due to the limited number of incident cases, we were not able to analyze early-onset breast cancer using the alternative age cutoff of 45 years.40 7) We were unable to conduct separate analyses by ER status because information on hormone-receptor subtype was unavailable in the UK Biobank.

Future large prospective investigations with repeated measurements and long-term follow-up are warranted to provide additional evidence on these associations in early-onset cancers in diverse populations and to examine potential heterogeneity across individual cancer types in the associations of interest.

Conclusions

Both genetic and lifestyle factors were independently associated with risks of early-onset total cancer and breast cancer. Compared to those with low genetic risk, individuals with a high genetic risk may benefit more from adopting a healthy lifestyle in preventing early-onset cancer.

Supplementary Material

Supplement 1
media-1.pdf (2.8MB, pdf)

Funding/Support and Role of the Funder/Sponsor

YZ is supported by the Irene M. & Fredrick J. Stare Nutrition Education Fund Doctoral Scholarship and the Mayer Fund Doctoral Scholarship. SL is supported by NIH grant R01CA194393. PK is supported by NIH grants R01 CA260352 and U01 CA249866. The funding sources played no role in the study design, data collection, data analysis, or interpretation of results, or in the decisions made in preparation and submission of the article. UK Biobank has received core funding from the Wellcome Trust medical charity, Medical Research Council, Department of Health, Scottish Government, the Northwest Regional Development Agency, the Welsh Government, British Heart Foundation, Cancer Research UK, Diabetes UK, and the National Institute for Health and Care Research. The details of UK Biobank core funding and additional funding are reported at https://www.ukbiobank.ac.uk/learn-more-about-uk-biobank/about-us/our-funding.

Footnotes

Conflict of Interest Disclosures

The authors declare no potential conflicts of interest.

Disclaimer

The authors obtained access to the UK Biobank data through the approved project application 70925. The authors assume full responsibility for analyses and interpretation of these data. The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of any agencies. The authors thank Dr. Kathryn L. Penney at the Harvard T. H. Chan School of Public Health and Harvard Medical School for her insightful comments and suggestions for this manuscript.

Ethical Approval

The details of UK Biobank research ethics approval are elaborated at https://www.ukbiobank.ac.uk/learn-more-about-uk-biobank/about-us/ethics.

References

  • 1. Song M, Giovannucci E. Preventable Incidence and Mortality of Carcinoma Associated With Lifestyle Factors Among White Adults in the United States. JAMA Oncol. 2016;2(9):1154–1161.
  • 2. van Dam RM, Li T, Spiegelman D, Franco OH, Hu FB. Combined impact of lifestyle factors on mortality: prospective cohort study in US women. BMJ. 2008;337:a1440.
  • 3. Islami F, Goding Sauer A, Miller KD, et al. Proportion and number of cancer cases and deaths attributable to potentially modifiable risk factors in the United States. CA Cancer J Clin. 2018;68(1):31–54.
  • 4. Sud A, Kinnersley B, Houlston RS. Genome-wide association studies of cancer: current insights and future perspectives. Nat Rev Cancer. 2017;17(11):692–704.
  • 5. Zhu M, Wang T, Huang Y, et al. Genetic Risk for Overall Cancer and the Benefit of Adherence to a Healthy Lifestyle. Cancer Res. 2021;81(17):4618–4627.
  • 6. Arthur RS, Wang T, Xue X, Kamensky V, Rohan TE. Genetic Factors, Adherence to Healthy Lifestyle Behavior, and Risk of Invasive Breast Cancer Among Women in the UK Biobank. J Natl Cancer Inst. 2020;112(9):893–901.
  • 7. Choi J, Jia G, Wen W, Shu XO, Zheng W. Healthy lifestyles, genetic modifiers, and colorectal cancer risk: a prospective cohort study in the UK Biobank. Am J Clin Nutr. 2021;113(4):810–820.
  • 8. Liang H, Zhou X, Zhu Y, et al. Association of outdoor air pollution, lifestyle, genetic factors with the risk of lung cancer: A prospective cohort study. Environ Res. 2023;218:114996.
  • 9. Feng X, Wang F, Yang W, et al. Association Between Genetic Risk, Adherence to Healthy Lifestyle Behavior, and Thyroid Cancer Risk. JAMA Netw Open. 2022;5(12):e2246311.
  • 10. Jin G, Lv J, Yang M, et al. Genetic risk, incident gastric cancer, and healthy lifestyle: a meta-analysis of genome-wide association studies and prospective cohort study. Lancet Oncol. 2020;21(10):1378–1386.
  • 11. Carr PR, Weigl K, Jansen L, et al. Healthy Lifestyle Factors Associated With Lower Risk of Colorectal Cancer Irrespective of Genetic Risk. Gastroenterology. 2018;155(6):1805–1815 e1805.
  • 12. Zeng L, Wu Z, Yang J, Zhou Y, Chen R. Association of genetic risk and lifestyle with pancreatic cancer and their age dependency: a large prospective cohort study in the UK Biobank. BMC Med. 2023;21(1):489.
  • 13. He Q, Wu S, Zhou Y, et al. Genetic factors, adherence to healthy lifestyle behaviors, and risk of bladder cancer. BMC Cancer. 2023;23(1):965.
  • 14. Byrne S, Boyle T, Ahmed M, Lee SH, Benyamin B, Hypponen E. Lifestyle, genetic risk and incidence of cancer: a prospective cohort study of 13 cancer types. Int J Epidemiol. 2023;52(3):817–826.
  • 15. Gupta S, Harper A, Ruan Y, et al. International Trends in the Incidence of Cancer Among Adolescents and Young Adults. J Natl Cancer Inst. 2020;112(11):1105–1117.
  • 16. Barr RD, Holowaty EJ, Birch JM. Classification schemes for tumors diagnosed in adolescents and young adults. Cancer. 2006;106(7):1425–1430.
  • 17. Sung H, Siegel RL, Rosenberg PS, Jemal A. Emerging cancer trends among young adults in the USA: analysis of a population-based cancer registry. Lancet Public Health. 2019;4(3):e137–e147.
  • 18. Miller KD, Fidler-Benaoudia M, Keegan TH, Hipp HS, Jemal A, Siegel RL. Cancer statistics for adolescents and young adults, 2020. CA Cancer J Clin. 2020;70(6):443–459.
  • 19. Tsilidis KK, Kasimis JC, Lopez DS, Ntzani EE, Ioannidis JP. Type 2 diabetes and cancer: umbrella review of meta-analyses of observational studies. BMJ. 2015;350:g7607.
  • 20. Ugai T, Sasamoto N, Lee HY, et al. Is early-onset cancer an emerging global epidemic? Current evidence and future implications. Nat Rev Clin Oncol. 2022;19(10):656–673.
  • 21. Archambault AN, Su YR, Jeon J, et al. Cumulative Burden of Colorectal Cancer-Associated Genetic Variants Is More Strongly Associated With Early-Onset vs Late-Onset Cancer. Gastroenterology. 2020;158(5):1274–1286 e1212.
  • 22. Bycroft C, Freeman C, Petkova D, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562(7726):203–209.
  • 23. Ronaldson A, Arias de la Torre J, Gaughran F, et al. Prospective associations between vitamin D and depression in middle-aged adults: findings from the UK Biobank cohort. Psychol Med. 2020:1–9.
  • 24. Sudlow C, Gallacher J, Allen N, et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12(3):e1001779.
  • 25. Mavaddat N, Michailidou K, Dennis J, et al. Polygenic Risk Scores for Prediction of Breast Cancer and Breast Cancer Subtypes. Am J Hum Genet. 2019;104(1):21–34.
  • 26. Kachuri L, Graff RE, Smith-Byrne K, et al. Pan-cancer analysis demonstrates that integrating polygenic risk scores with modifiable risk factors improves risk prediction. Nat Commun. 2020;11(1):6084.
  • 27. Archambault AN, Jeon J, Lin Y, et al. Risk Stratification for Early-Onset Colorectal Cancer Using a Combination of Genetic and Environmental Risk Scores: An International Multi-Center Study. J Natl Cancer Inst. 2022;114(4):528–539.
  • 28. Dareng EO, Tyrer JP, Barnes DR, et al. Polygenic risk modeling for prediction of epithelial ovarian cancer risk. Eur J Hum Genet. 2022;30(3):349–362.
  • 29. Conti DV, Darst BF, Moss LC, et al. Trans-ancestry genome-wide association meta-analysis of prostate cancer identifies new susceptibility loci and informs genetic risk prediction. Nat Genet. 2021;53(1):65–75.
  • 30. Fritsche LG, Patil S, Beesley LJ, et al. Cancer PRSweb: An Online Repository with Polygenic Risk Scores for Major Cancer Traits and Their Evaluation in Two Independent Biobanks. Am J Hum Genet. 2020;107(5):815–836.
  • 31. Hung RJ, Warkentin MT, Brhane Y, et al. Assessing Lung Cancer Absolute Risk Trajectory Based on a Polygenic Risk Model. Cancer Res. 2021;81(6):1607–1615.
  • 32. Choi J, Jia G, Wen W, Long J, Zheng W. Evaluating polygenic risk scores in assessing risk of nine solid and hematologic cancers in European descendants. Int J Cancer. 2020;147(12):3416–3423.
  • 33. Huyghe JR, Bien SA, Harrison TA, et al. Discovery of common and rare genetic risk variants for colorectal cancer. Nat Genet. 2019;51(1):76–87.
  • 34. Helgason H, Rafnar T, Olafsdottir HS, et al. Loss-of-function variants in ATM confer risk of gastric cancer. Nat Genet. 2015;47(8):906–910.
  • 35. Premenopausal Breast Cancer Collaborative G, Schoemaker MJ, Nichols HB, et al. Association of Body Mass Index and Age With Subsequent Breast Cancer Risk in Premenopausal Women. JAMA Oncol. 2018;4(11):e181771.
  • 36. Shams-White MM, Brockton NT, Mitrou P, et al. Operationalizing the 2018 World Cancer Research Fund/American Institute for Cancer Research (WCRF/AICR) Cancer Prevention Recommendations: A Standardized Scoring System. Nutrients. 2019;11(7).
  • 37. Lloyd-Jones DM, Hong Y, Labarthe D, et al. Defining and setting national goals for cardiovascular health promotion and disease reduction: the American Heart Association’s strategic Impact Goal through 2020 and beyond. Circulation. 2010;121(4):586–613.
  • 38. Purcell S, Neale B, Todd-Brown K, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559–575.
  • 39. Lambert SA, Gil L, Jupp S, et al. The Polygenic Score Catalog as an open database for reproducibility and systematic evaluation. Nat Genet. 2021;53(4):420–425.
  • 40. U.S. Centers for Disease Control and Prevention. Breast Cancer In Young Women. https://www.cdc.gov/cancer/breast/young_women/bringyourbrave/breast_cancer_young_women. Accessed 3/8, 2024.

Articles from medRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints