Abstract
In 2016, the International Agency for Research on Cancer (IARC) identified 13 obesity-related cancers. Here, using baseline WHO BMI categories, latent profile analysis (LPA) and latent class trajectory modelling (LCTM), we evaluated the usefulness of one-off BMI measures versus life-course changes when predicting cancer risk. The LPA results broadly concurred with the three basic WHO BMI categories, with a similar stepwise increase in cancer risk. Using LCTM, we identified 5 distinct trajectories in men and in women. Compared with the leanest class, a stepwise increase in risk of obesity-related cancer was observed for all classes. When latent class membership was compared with baseline BMI, we found that each trajectory comprised a range of baseline BMI categories. All methods reveal a link between obesity and the 13 cancers identified by IARC. However, the additional information captured by LCTM indicates that lifetime BMI may highlight additional groups of people who are at risk.
Introduction
Obesity is currently on the rise worldwide, with global prevalence having tripled between 1975 and 2016 (1). Obesity alone is linked to an increased risk of developing cardiovascular diseases, musculoskeletal disorders and cancer, making it one of the most deadly preventable conditions. This worldwide increase is linked to the rise of more sedentary professions (office jobs) and unhealthier diets (1). The World Health Organisation (WHO) currently characterises obesity as a Body Mass Index (BMI) of 30 kg/m² or more (2). BMI is a standardised measure of body fatness that takes into account a person's height and weight.
Cancer incidence is also on the rise worldwide, with WHO predicting that from 2008 to 2030 the number of new cancer cases will increase by >80% in low income countries and >40% in high income countries (3). The link between increased body fatness and 13 specific cancer types has already been well characterised. In 2016, the International Agency for Research on Cancer (IARC) determined that cancers of the gastric cardia, colon and rectum, liver, gallbladder, pancreas, postmenopausal breast, corpus uteri, ovary, kidney and thyroid, as well as meningioma, oesophageal adenocarcinoma and multiple myeloma, were all associated with increased risk in people who were overweight (4). However, these associations were made using baseline body fatness measures only, and therefore may not capture a person's full weight profile, which may change over the life course.
Although the use of a "one-off" BMI measure is common, we hypothesise that it is the change in weight over time that confers the risk of developing one of the 13 obesity related cancers (ORCs). For example, would a person who maintained a constant higher weight over time have the same risk of incident cancer as someone who started at a lower weight and put on weight over time?
When examining smoking, "pack years" (number of packs smoked per day × number of years smoked) can be used to characterise the cumulative exposure to tobacco. However, in obesity there is no such measure. Alternatives include maximum BMI (5), % weight change and number of years overweight, but as yet there is no single "best" method for looking at the longitudinal effects of obesity.
To address this, we developed latent class trajectory models (LCTMs) to simplify all the BMI data into more homogeneous clusters. These models are traditionally seen in the criminology and psychology literatures (6, 7) but are now increasingly used to identify groups of patients that display similar trends over time in drug response or disease progression (8-10). The resulting classes were then linked to ORC risk, and compared with ORC risk when using baseline BMI alone, or a baseline BMI cluster derived by latent profile analysis (LPA).
Methods
Study Populations
The National Institutes of Health (NIH)-AARP cohort was established in 1995 and enrolled 339,666 men and 226,732 women aged 50-71 years. The design of this study has been documented previously (11). Briefly, a baseline questionnaire was mailed to 3.5 million members of the American Association of Retired Persons in 3 waves from 1995. A risk factor questionnaire was mailed in 1996-97 and a follow-up questionnaire in 2004-06.
Body shape assessment
The study ascertained baseline BMI by asking for current height and body weight, together with recalled weights at earlier ages. Height at study entry was used to derive BMI (weight [kg] / height [m]²) at each age. Participants' recall BMI was calculated for ages 18, 35 and 50. Using these BMI records from NIH-AARP, we examined the association with ORC of baseline BMI categories, of latent profiles derived from baseline BMI, and of LCTM classes, in both men and women. This allowed us to distinguish any variability in risk patterns from the standard baseline WHO measures by incorporating different information.
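The BMI derivation described above can be sketched as follows. This is a minimal illustration, not the study's code: the function names, the example participant and the recalled weights are hypothetical, while the WHO cut-offs and the plausibility bounds (15 to 60 kg/m²) are those stated under Statistical Analysis.

```python
from typing import Optional

# WHO cut-offs (kg/m^2) as listed in this paper's Statistical Analysis section.
# Each tuple is (exclusive upper bound, label).
WHO_BANDS = [
    (18.5, "Underweight"),
    (25.0, "Normal weight"),
    (30.0, "Overweight"),
    (35.0, "Obese I"),
    (40.0, "Obese II"),
]


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight [kg] divided by height [m] squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def who_category(bmi_value: float) -> Optional[str]:
    """Return the WHO category, or None for implausible values (<15 or >60)."""
    if bmi_value < 15.0 or bmi_value > 60.0:
        return None
    for upper, label in WHO_BANDS:
        if bmi_value < upper:
            return label
    return "Obese III"


# Hypothetical participant: height at study entry is reused for every age,
# mirroring the approach described in the text.
height_at_entry_m = 1.75
recalled_weights_kg = {18: 65.0, 35: 78.0, 50: 92.0}
history = {
    age: who_category(bmi(weight, height_at_entry_m))
    for age, weight in recalled_weights_kg.items()
}
print(history)
```

Because height at entry is reused, any height loss with age would slightly understate BMI at the recalled ages; the sketch inherits that property from the design.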
Ascertainment of cancer incidence
In NIH-AARP, cancer incidence was monitored through cancer registries and active follow-up through questionnaires until cancer diagnosis, loss to follow-up, death or 31 December 2011, whichever came first (11). To illustrate how different latent class assignments or BMI categories affect the risk of an ORC, we determined the incidence risk of 12 of the IARC ORCs listed previously (multiple myeloma was excluded due to variability within its ICD code). The following ICD codes were used to identify the ORCs: 15.5, 15.8, 18.0-18.9, 19.9, 20.0, 22.0, 23.9, 25.0-25.9, 50.0-50.9, 54.0, 54.9, 55.9, 56.9, 64.9, 16.0, 70.0, 70.1, 70.9, 73.0, 73.9, 90.0.
Statistical Analysis
Out of the full NIH-AARP cohort we excluded 13,944 participants who had no baseline BMI measure, 712 participants with implausible BMI measures (<15 kg/m² or >60 kg/m²) and 229,915 participants who had only one BMI measure. Our final cohort consisted of 321,827 people (190,848 men and 130,979 women). All BMI data were categorised according to WHO definitions (<18.5 Underweight, 18.5-24.9 Normal weight, 25.0-29.9 Overweight, 30.0-34.9 Obese I, 35.0-39.9 Obese II, ≥40.0 Obese III) (2). BMI category change between each recall age and baseline was also recorded. To assess how these categories were linked to ORC incidence we fitted Cox proportional hazards models, with age in years as the time metric, adjusted for smoking status at baseline and with baseline hazards stratified by age category at study entry. From these we estimated hazard ratios (HRs) and 95% confidence intervals (CIs) for the association of each BMI category or latent class with ORC. As there are several confounding factors between obesity and cancer, a supplementary model for LCTM classes was also run that included diabetic status, presence of a heart condition, red meat intake, educational level, race and, in women, HRT status.
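For readers less familiar with stratified Cox models, the model described above can be written in the following general form. The notation is ours and is an assumption rather than the authors' specification: a is attained age (the time metric), s indexes the age category at study entry (the stratum), c indexes BMI categories or latent classes relative to a reference group, and z_i holds the adjustment covariates (smoking status at baseline in the main model). The Wald-type confidence interval shown is the conventional choice; the paper does not state how its intervals were computed.

```latex
h_{s}(a \mid \mathbf{x}_i)
  = h_{0s}(a)\,
    \exp\!\Big( \sum_{c \neq \mathrm{ref}} \beta_c \,\mathbb{1}[\mathrm{class}_i = c]
                + \boldsymbol{\gamma}^{\top}\mathbf{z}_i \Big),
\qquad
\mathrm{HR}_c = \exp(\beta_c),
\qquad
95\%\ \mathrm{CI}_c = \exp\!\big( \hat{\beta}_c \pm 1.96\,\mathrm{SE}(\hat{\beta}_c) \big).
```

Stratifying lets each entry-age category have its own baseline hazard h_{0s}(a) while the class effects β_c are shared across strata.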
Identifying subject subgroups (clusters) from baseline BMI
Next, we created latent profiles for the NIH-AARP cohort, based on baseline BMI only. To identify the best fitting number of profiles, models with 1 to 7 classes were fitted and compared using the Bayesian Information Criterion (BIC). Using the class proportions and the BIC, the preferred model was selected for both men and women. The classes from this preferred model were then entered into the Cox model described above.
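With a single continuous indicator, a latent profile model reduces to a univariate Gaussian mixture, so the class-enumeration step can be sketched as below. This is an illustrative, standard-library-only EM fit on simulated BMI values, not the software the authors used; the function names, the simulated data, the class-specific variances and the deterministic quantile start are all assumptions.

```python
import math
import random
from typing import Dict, List, Tuple


def _e_step(x, w, mu, var):
    """Return (log-likelihood, responsibilities) under the current parameters."""
    loglik, resp = 0.0, []
    for v in x:
        dens = [
            w[j] * math.exp(-0.5 * (v - mu[j]) ** 2 / var[j])
            / math.sqrt(2.0 * math.pi * var[j])
            for j in range(len(w))
        ]
        total = max(sum(dens), 1e-300)  # guard against log(0)
        loglik += math.log(total)
        resp.append([d / total for d in dens])
    return loglik, resp


def fit_profiles(x: List[float], k: int, iters: int = 200) -> Tuple[float, float]:
    """Fit a k-class 1-D Gaussian mixture by EM; return (log-likelihood, BIC)."""
    n = len(x)
    xs = sorted(x)
    # Deterministic start: class means at evenly spaced quantiles.
    mu = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    mean = sum(x) / n
    var = [sum((v - mean) ** 2 for v in x) / n] * k
    w = [1.0 / k] * k
    for _ in range(iters):
        _, resp = _e_step(x, w, mu, var)
        # M-step: update class weight, mean and variance.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue  # leave an empty class unchanged
            w[j] = nj / n
            mu[j] = sum(r[j] * v for r, v in zip(resp, x)) / nj
            spread = sum(r[j] * (v - mu[j]) ** 2 for r, v in zip(resp, x)) / nj
            var[j] = max(spread, 1e-3)  # variance floor avoids degenerate spikes
    loglik, _ = _e_step(x, w, mu, var)
    n_params = 3 * k - 1  # k means, k variances, k-1 free class weights
    return loglik, n_params * math.log(n) - 2.0 * loglik


# Simulated baseline BMI values with two well-separated groups (not study data).
rng = random.Random(0)
data = [rng.gauss(23.0, 1.5) for _ in range(200)]
data += [rng.gauss(33.0, 1.5) for _ in range(200)]

bics: Dict[int, float] = {k: fit_profiles(data, k)[1] for k in range(1, 4)}
for k, value in bics.items():
    print(k, round(value, 1))
```

Lower BIC indicates a better trade-off between fit and complexity. In practice the BIC often keeps falling as classes are added, which is why the class proportions are considered alongside it rather than taking the minimum mechanically.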
Identifying BMI trajectory subgroups
Finally, we followed the framework for developing latent class trajectories (12). A scoping model was run on a 1 class model with no random effects and the residuals examined to determine the random effect structure of the model. From this a cubic structure was selected and multiple shape structures and classes (up to k=7) were tested.
Twenty random start points were used to ensure that the model had converged upon the global maximum, and the log-likelihoods of all outputs were plotted to ensure that the majority had converged on the same point. Multiple metrics were examined to select the best fitting model for each gender in each data set. These included the Bayesian Information Criterion (BIC), the average posterior probability of assignment (APPA), the odds of correct classification (OCC), mismatch scores and relative entropy. By not relying on BIC alone we reduce the possibility that the model is overfitted to the data. Using the results of these tests we selected the favoured model for each subset of data. These trajectories were then plotted and the clinical plausibility of each was assessed. Model discrimination was also evaluated using degrees of separation (DoSk) and Elsensohn's envelope of residuals.
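The classification metrics named above can all be computed from the matrix of posterior class probabilities. The sketch below uses the usual definitions from the trajectory-modelling literature; the function name and the toy posterior matrix are hypothetical, and the estimated class proportion is approximated here by the mean posterior probability, which is an assumption.

```python
import math
from typing import Dict, List


def lctm_adequacy(post: List[List[float]]) -> Dict[str, object]:
    """Per-class APPA, OCC and mismatch, plus relative entropy, from posteriors.

    post[i][j] is the posterior probability that participant i belongs to class j.
    """
    n, k = len(post), len(post[0])
    # Modal assignment: each participant goes to the class with highest posterior.
    assigned = [max(range(k), key=lambda j, r=row: r[j]) for row in post]
    appa, occ, mismatch = [], [], []
    for j in range(k):
        members = [row[j] for row, a in zip(post, assigned) if a == j]
        appa_j = sum(members) / len(members) if members else float("nan")
        pi_j = sum(row[j] for row in post) / n  # approximate class proportion
        if 0.0 < appa_j < 1.0 and 0.0 < pi_j < 1.0:
            occ_j = (appa_j / (1.0 - appa_j)) / (pi_j / (1.0 - pi_j))
        else:
            occ_j = float("nan")  # undefined at the boundaries
        appa.append(appa_j)
        occ.append(occ_j)
        # Mismatch: estimated proportion minus proportion actually assigned.
        mismatch.append(pi_j - len(members) / n)
    # Relative entropy: 1 means perfect separation, 0 means none.
    raw = -sum(p * math.log(p) for row in post for p in row if p > 0.0)
    rel_entropy = 1.0 - raw / (n * math.log(k)) if k > 1 else float("nan")
    return {"appa": appa, "occ": occ, "mismatch": mismatch, "entropy": rel_entropy}


# Toy posterior matrix for four participants and two classes (not study data).
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
metrics = lctm_adequacy(posteriors)
print(metrics)
```

Commonly quoted rules of thumb from that literature are APPA above 0.7 and OCC above 5 for every class, mismatch close to zero and relative entropy close to 1; these thresholds are not taken from this paper.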
Sensitivity Analysis
All final models were re-run with only participants assigned to each class with > 80% certainty to confirm that the model shapes remained stable.
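The selection step for this sensitivity analysis amounts to keeping only participants whose highest posterior probability exceeds 0.8. A minimal sketch follows; the function name and the toy posterior matrix are hypothetical, and the strict inequality follows the wording "> 80%" above.

```python
from typing import List, Tuple


def confident_subset(post: List[List[float]], threshold: float = 0.8) -> List[Tuple[int, int]]:
    """Return (participant index, assigned class) pairs with max posterior > threshold."""
    kept = []
    for i, row in enumerate(post):
        best = max(range(len(row)), key=lambda j: row[j])
        if row[best] > threshold:
            kept.append((i, best))
    return kept


# Toy posterior matrix for four participants and two classes (not study data).
posteriors = [[0.95, 0.05], [0.60, 0.40], [0.15, 0.85], [0.80, 0.20]]
kept = confident_subset(posteriors)
print(kept)
```

The retained participants would then be used to refit the final model so that the trajectory shapes can be compared with those from the full cohort.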
Results
Baseline BMI clusters mirror WHO categories
In order to identify, in a data-driven manner, subgroups of individuals with differing patterns of single-measure BMI, we applied LPA to baseline BMI in both men and women. In both cases, the BIC decreased as the number of classes increased until a plateau was reached. Applying the "elbow criterion", this resulted in a 3-class fit for both men and women. The spread of BMI within each class is detailed in Figures 1A and 1B. These classes broadly concurred with the basic WHO categories (Normal weight, Overweight and Obese).
Figure 1 legend: Panels A and B show the final LPA model for men and women, with the × marking the mean BMI of each class. Panels C and D show the associated risk of ORC for each class.
Trajectories of BMI
We used an LCTM approach to identify subgroups of participants with similar trajectories of BMI over time. For both men and women, multiple models using linear, quadratic, cubic and natural cubic spline shapes were fitted for 1 to 7 classes. In men, as in the LPA, the BIC for each shape decreased as more classes were added; using this in conjunction with the other metrics, a 5-class spline model was chosen (knots placed at ages 35 and 50). In women, a 4-class cubic model and a 5-class spline model performed similarly, but when the raw data were plotted against the predicted trajectories, the spline model showed a better fit.
The final model in men (Figure 2A) consists of a "lean-increase" class that starts with a lower BMI and puts on weight over time, a "medium-increase" class that follows a similar trajectory but at a heavier weight, a "medium-stable" class that remains fairly constant at the borderline between normal weight and overweight, a "high-increase" class that starts overweight and puts on weight over time until obese, and an "n-shaped" class that starts overweight and puts on weight up until age 50, then dramatically loses weight. The two leanest classes account for the majority of the population, containing 38.9% and 36.6% respectively.
Figure 2: LCTM Model Development.
In women, the final model has a similar pattern to that seen in men but with more marked increases (Figure 2B). It consists of a "lean-small increase" class, in which those who are borderline underweight slowly put on weight over time while staying within the normal weight bounds; a "lean-medium increase" class, which follows a similar pattern with a heavier weight gain; a "lean-heavy increase" class, in which weight gain accelerates after age 50 and participants become obese; a "medium increase" class, in which participants start overweight and put on weight consistently until morbidly obese; and a "heavy - n increase" class, in which participants start overweight and put on weight until they are obese at age 35, after which weight gain continues at a slower rate as participants become morbidly obese.
ORC incident risk
Figure 2 legend: Panels A and B show the final LCTM model for men and women. Panels C and D show the HR and 95% CI for ORC.
The WHO BMI categories follow an expected pattern with associated ORC risk (Figure 3) as previously reported in NIH-AARP (13-15) and other cohorts (16). In men the underweight class appears to have a higher risk of ORC but this is non-significant and could be due to small numbers present in this group.
Figure 3: Baseline BMI risk of ORC.
A similar pattern is seen with the data-driven LPA-derived classes: the heavier the class, as determined by baseline BMI, the higher the risk of an ORC (Figures 1C and 1D). This relationship is stronger in men than in women, with larger HRs observed. A similar pattern was not observed for the non-ORCs, where no significant differences were found between the classes.
Figure 3 legend: Panel A depicts the HR and 95% CI of an ORC in men, panel B in women.
In men, when the LPA class assignments are compared with the baseline WHO BMI categories, there is again broad agreement between the two (Figure 4A). The lowest BMI LPA class consists of all the participants who are normal weight at baseline and a few who are overweight. The middle class consists of a large proportion of the overweight and obese groups, and the final class consists of the extreme weight classes (underweight and highly obese).
Figure 4: Comparison of LPA and LCTM class assignment with baseline BMI categories.
In women, a similar pattern is observed (Figure 4B), but with class 1 taking all underweight and most normal weight participants at baseline. Class 2 consists of a small proportion of normal weight participants, all the overweight participants and a large proportion of those classed as Obese I. Finally, class 3 consists solely of obese participants, ranging from threshold-level obese to morbidly obese.
Similarly, with the LCTM class assignments a stepwise increase in risk is observed in men (Figure 2C). In women there is a slight increase in ORC risk, although for the lean-moderate increase and n-increase classes this is not significant (Figure 2D). Again, when looking at non-ORCs, no significant results are found.
When trajectory classes are compared with baseline WHO BMI categories in men (Figure 4C), the Lean-Increase group consists of a large proportion of the normal weight participants and some overweight participants. The medium stable group is made up mostly of overweight participants and some normal weight participants. The medium increase group has a broader spread of participants, encompassing the normal weight, overweight and all obese categories. The heavy-increase group has small proportions in all categories, as does the N-shaped group.
When this comparison is made for female trajectory classes and baseline WHO BMI categories (Figure 4D) there is slightly better agreement between the two sets of groups. The Lean-stable increase class consists mainly of normal weight participants and nearly all underweight participants. The Lean-medium increase group is made up of a nearly even split of normal weight and overweight participants, plus a small group of Obese I participants. The Lean-heavy increase group surprisingly has a smaller proportion of normal weight participants and more overweight, obese I and obese II participants. The Medium increase group mainly consists of Obese II and III participants with a very small proportion of Overweight and Obese I. Finally, the Heavy increase class has small proportions in all Obese categories.
Discussion
Main findings
To our knowledge, this is the first study to examine, in a stepwise progression, the added benefit of latent classes derived from either snapshot or longitudinal data compared with established categories, in the context of BMI and cancer.
We found that participants who were consistently heavier had the highest risk of cancer, whereas those who maintained a normal body weight had the lowest risk. Compared with the latter group, those who gained weight consistently had a higher risk of developing an ORC, regardless of their weight in earlier life. Our derived LPA and LCTM classes followed a similar progression in ORC risk to that seen using WHO-derived baseline BMI categories.
Figure 4 legend: Panel A compares the LPA class assignments with baseline BMI categories in men and panel B in women. Panels C and D compare the class assignments from the LCTM with baseline BMI categories in men and women respectively. The width of the bands represents the proportion of the population assigned to each respective class or WHO BMI category.
Comparisons with other studies
Many studies investigating BMI and associations with cancer use a baseline BMI measure recorded at study onset and follow the individuals from that point (17, 18). While this ensures accurate data, it does not capture the complete picture of how a person's weight has changed over life.
Other studies simply use a single BMI measure per person taken at any given time point, which has the benefit of capturing a larger sample size in retrospective studies but makes it hard to address how a change in BMI might affect the resulting risk (19). Additional approaches, such as that taken by Renehan et al (20), summarised weight change over 4 separate periods and examined how each change was linked to cancer risk, to try to identify a "critical period" for intervention. It has also been demonstrated previously that the choice between maximum and baseline BMI can materially change the resulting HR (21). In that study, obese class II defined by a person's maximum BMI (at any point in life), compared with normal weight, was associated with an HR of 2.15 (95% CI: 1.47 - 3.14), whereas the same comparison using baseline BMI gave an HR of 1.31 (95% CI: 0.95 - 1.81).
A thorough study by Song et al (22) using LCTMs demonstrated that body shape trajectory had a significant impact on overall obesity-related cancer risk and on individual cancer sites, dependent on trajectory. As the trajectories derived in that study are for body shape score instead of BMI, and at different time points, they are hard to compare directly with ours. Nevertheless, the "lean-moderate increase", "lean marked increase" and "heavy-stable/increase" classes that they derived in women seem to concur with our "lean-small increase", "lean-heavy increase" and "medium-increase" groups. In men, the same three groups correspond to our "lean-increase", "medium-increase" and "heavy increase" classes, with additional similarities between the "medium stable" groups.
Trajectory modelling of BMI
By using LCTMs instead of single WHO BMI measures, or even categories derived by LPA, we can capture more variability within the population. As people's weight fluctuates over time, and many factors that influence their health and therefore their weight also change, LCTMs provide a useful approach to finding common patterns within age groups and demographic groups over the life course. Most of our derived BMI trajectories started at a normal weight and reached overweight or obesity in middle age or later life. Other social factors might therefore be at play in this increase in weight, something that could not be captured by looking at baseline BMI alone. LCTMs have also been useful in other fields to identify subgroups that have not previously been categorised. For example, Geifman et al (8) identified 3 distinct patterns of disease progression and cognitive decline in Alzheimer's patients.
Strengths and limitations of the study
We first note that our data-driven approach, identifying subgroups of participants from baseline BMI, strongly correlates with WHO BMI categories. These further show an increase in risk for developing ORCs, providing additional evidence to support previous studies using WHO categories. Additional strengths of this study include its large sample size in a well-documented cohort. By using a dataset that contained recall BMI, we were able to examine the effect of early life BMI and therefore change in weight over time and how this affects ORC risk. Through the use of multiple selection metrics instead of BIC alone to select the best fitting LCTM we ensured that the model best fitted the data, and by comparison with LPA and WHO BMI categories we examined the usefulness of additional data.
However, this study does have limited generalisability as the majority of the population were white and well educated. As body weight and height were recalled this could have led to some bias, although similar cohorts have found good correlation between self-reported and directly measured weight/height, both for current and recall data (23, 24). Finally, due to practical constraints only a limited number of latent classes could be tested as the computational power and time needed to run some of the larger number of classes was substantial. However, due to the good discrimination of our final models we can assume that these models do summarise the main patterns of weight change within the population. It was also reassuring to see that these patterns did not change when those assigned to each class with suboptimal probabilities (<0.8) were removed.
Future Work
Further work is needed to develop methods to suitably validate the patterns observed in LCTMs, as can be done in the risk prediction field. Testing these models in a more diverse data set would likely bring out new classes and characteristics previously not seen in the predominantly white cohorts.
Conclusions
While it is hard to determine which method performs best, an LPA or LCTM can incorporate more information, which captures more of the population heterogeneity and can take into account other background features, such as lifestyle choices or other medical conditions, that may not be directly incorporated using WHO BMI categories alone. Overall, these results indicate that there would be a health benefit to a weight intervention in early life and continued monitoring. Our findings provide evidence to support GP recommendations or a public health campaign for weight management and avoidance of weight gain for long term health benefits.
Acknowledgements
This research was funded by the NIHR Manchester Biomedical Research Centre. The views expressed are those of the author and not necessarily those of the NHS, the NIHR or the Department of Health. Many thanks to all at NIH-AARP who worked on the creation and curation of the data: https://dietandhealth.cancer.gov/acknowledgement.html
Figures & Table
Figure 1: LPA Model Development.
References
- 1. WHO. Obesity and overweight. https://www.who.int/news-room/fact-sheets/detail/obesity-and-overweight. 2018. [Accessed 29.01.2020]
- 2. WHO. BMI classification. http://apps.who.int/bmi/index.jsp?introPage=intro_3.html. 2006. [Accessed 29.01.2020]
- 3. World Health Organisation. Cancer. https://www.who.int/cancer/resources/keyfacts/en/. WHO; 2020. [Accessed 29.01.2020]
- 4. Lauby-Secretan B, Scoccianti C, Loomis D, Grosse Y, Bianchini F, Straif K, et al. Body Fatness and Cancer--Viewpoint of the IARC Working Group. N Engl J Med. 2016;375(8):794–8. doi: 10.1056/NEJMsr1606602.
- 5. Stokes A, Preston SH. Revealing the burden of obesity using weight histories. Proc Natl Acad Sci U S A. 2016;113(3):572–7. doi: 10.1073/pnas.1515472113.
- 6. Cole VT, Apud JA, Weinberger DR, Dickinson D. Using latent class growth analysis to form trajectories of premorbid adjustment in schizophrenia. J Abnorm Psychol. 2012;121(2):388–95. doi: 10.1037/a0026922.
- 7. Stevenson J, Thompson MJ, Sonuga-Barke E. Mental health of preschool children and their mothers in a mixed urban/rural population. III. Latent variable models. Br J Psychiatry. 1996;168(1):26–32. doi: 10.1192/bjp.168.1.26.
- 8. Geifman N, Kennedy RE, Schneider LS, Buchan I, Brinton RD. Data-driven identification of endophenotypes of Alzheimer’s disease progression: implications for clinical trials and therapeutic interventions. Alzheimers Res Ther. 2018;10(1):4. doi: 10.1186/s13195-017-0332-0.
- 9. Kelly SP, Lennon H, Sperrin M, Matthews C, Freedman ND, Albanes D, et al. Body mass index trajectories across adulthood and smoking in relation to prostate cancer risks: the NIH-AARP Diet and Health Study. Int J Epidemiol. 2019;48(2):464–73. doi: 10.1093/ije/dyy219.
- 10. Amico B, Dagliati A, Plant D, Barton A, Peek N, Geifman N. A Dashboard for Latent Class Trajectory Modeling: Application in Rheumatoid Arthritis. Stud Health Technol Inform. 2019;264:911–5. doi: 10.3233/SHTI190356.
- 11. Schatzkin A, Subar AF, Thompson FE, Harlan LC, Tangrea J, Hollenbeck AR, et al. Design and serendipity in establishing a large cohort with wide dietary intake distributions: the National Institutes of Health-American Association of Retired Persons Diet and Health Study. Am J Epidemiol. 2001;154(12):1119–25. doi: 10.1093/aje/154.12.1119.
- 12. Lennon H, Kelly S, Sperrin M, Buchan I, Cross AJ, Leitzmann M, Cook MB, Renehan A. A framework to construct and interpret latent class trajectory modelling. BMJ Open. 2018.
- 13. Adams KF, Leitzmann MF, Albanes D, Kipnis V, Mouw T, Hollenbeck A, et al. Body mass and colorectal cancer risk in the NIH-AARP cohort. Am J Epidemiol. 2007;166(1):36–45. doi: 10.1093/aje/kwm049.
- 14. Kitahara CM, Platz EA, Park Y, Hollenbeck AR, Schatzkin A, Berrington de Gonzalez A. Body fat distribution, weight change during adulthood, and thyroid cancer risk in the NIH-AARP Diet and Health Study. Int J Cancer. 2012;130(6):1411–9. doi: 10.1002/ijc.26161.
- 15. Brinton LA, Smith L, Gierach GL, Pfeiffer RM, Nyante SJ, Sherman ME, et al. Breast cancer risk in older women: results from the NIH-AARP Diet and Health Study. Cancer Causes Control. 2014;25(7):843–57. doi: 10.1007/s10552-014-0385-3.
- 16. Reeves GK, Pirie K, Beral V, Green J, Spencer E, Bull D, et al. Cancer incidence and mortality in relation to body mass index in the Million Women Study: cohort study. BMJ. 2007;335(7630):1134. doi: 10.1136/bmj.39367.495995.AE.
- 17. Greenlee H, Unger JM, LeBlanc M, Ramsey S, Hershman DL. Association between Body Mass Index and Cancer Survival in a Pooled Analysis of 22 Clinical Trials. Cancer Epidemiol Biomarkers Prev. 2017;26(1):21–9. doi: 10.1158/1055-9965.EPI-15-1336.
- 18. Berrington de Gonzalez A, Hartge P, Cerhan JR, Flint AJ, Hannan L, MacInnis RJ, et al. Body-mass index and mortality among 1.46 million white adults. N Engl J Med. 2010;363(23):2211–9. doi: 10.1056/NEJMoa1000367.
- 19. Renehan AG, Tyson M, Egger M, Heller RF, Zwahlen M. Body-mass index and incidence of cancer: a systematic review and meta-analysis of prospective observational studies. Lancet. 2008;371(9612):569–78. doi: 10.1016/S0140-6736(08)60269-X.
- 20. Renehan AG, Flood A, Adams KF, Olden M, Hollenbeck AR, Cross AJ, et al. Body mass index at different adult ages, weight change, and colorectal cancer risk in the National Institutes of Health-AARP Cohort. Am J Epidemiol. 2012;176(12):1130–40. doi: 10.1093/aje/kws192.
- 21. Stokes A. Using maximum weight to redefine body mass index categories in studies of the mortality risks of obesity. Popul Health Metr. 2014;12(1):6. doi: 10.1186/1478-7954-12-6.
- 22. Song M, Willett WC, Hu FB, Spiegelman D, Must A, Wu K, et al. Trajectory of body shape across the lifespan and cancer risk. Int J Cancer. 2016;138(10):2383–95. doi: 10.1002/ijc.29981.
- 23. Perry GS, Byers TE, Mokdad AH, Serdula MK, Williamson DF. The validity of self-reports of past body weights by U.S. adults. Epidemiology. 1995;6(1):61–6. doi: 10.1097/00001648-199501000-00012.
- 24. Connor Gorber S, Tremblay M, Moher D, Gorber B. A comparison of direct vs. self-report measures for assessing height, weight and body mass index: a systematic review. Obes Rev. 2007;8(4):307–26. doi: 10.1111/j.1467-789X.2007.00347.x.




