PLOS ONE. 2021 Oct 18;16(10):e0258748. doi: 10.1371/journal.pone.0258748

Non-linear interaction between physical activity and polygenic risk score of body mass index in Danish and Russian populations

Dmitrii Borisevich 1,*, Theresia M Schnurr 1, Line Engelbrechtsen 1,2, Alexander Rakitko 3, Lars Ängquist 1, Valery Ilinsky 3, Mette Aadahl 4,5, Niels Grarup 1, Oluf Pedersen 1, Thorkild I A Sørensen 1,5, Torben Hansen 1
Editor: David Meyre6
PMCID: PMC8523041  PMID: 34662357

Abstract

Body mass index (BMI) is a highly heritable polygenic trait. It is also affected by various environmental and behavioral risk factors. We used a BMI polygenic risk score (PRS) to study the interplay between the genetic and environmental factors defining BMI. First, we generated a BMI PRS that explained more variance than a BMI genetic risk score (GRS) that used only genome-wide significant BMI-associated variants (R2 = 13.1% compared to 6.1%). Second, we analyzed interactions between BMI PRS and seven environmental factors. We found a significant interaction between physical activity and BMI PRS, even when the well-known effect of the FTO region was excluded from the PRS, using a small dataset of 6,179 samples. Third, we stratified the study population into two risk groups using BMI PRS. The top 22% of the studied populations were included in a high PRS risk group. Engagement in self-reported physical activity was associated with a 1.66 kg/m2 decrease in BMI in this group, compared to a 0.84 kg/m2 decrease in BMI in the rest of the population. Our results (i) confirm that genetic background strongly affects adult BMI in the general population, (ii) show a non-linear interaction between BMI genetics and physical activity, and (iii) provide a standardized framework for future gene-environment interaction analyses.

Introduction

Body mass index (BMI) is a complex measure that has been robustly associated with cardiometabolic traits and diseases [1]. BMI is a highly heritable complex trait, with heritability estimated to be between 30% and 40% [2–5]. Studying how genetic variation affects BMI is important for understanding the biology of BMI-related diseases.

Genome-wide association studies (GWAS) have identified a multitude of BMI-associated genetic variants at the genome-wide significance threshold (p < 5 × 10^−8). The largest meta-analysis of GWAS so far, including ~700,000 adults, identified 941 genetic variants associated with BMI [6]. From the significant variants, genetic risk scores for BMI (BMI GRS) have been constructed, representing the number of BMI-increasing risk alleles weighted by their respective effect sizes within the discovery GWAS. However, the GRS constructed in that study explained only 6.0% of BMI variance, which is substantially less than the estimated heritability of BMI. This finding represents a marked case of missing heritability. There are many potential reasons for heritability missing from the GWAS findings [7], including non-significant variants with small effect size, rare variants, structural variation, and gene-gene and gene-environment interactions.
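The weighting described above can be illustrated with a minimal sketch. The variant names, effect sizes, and genotypes below are made up for illustration and are not taken from the study; treating a missing genotype as zero risk alleles is a simplifying assumption.

```python
# Illustrative only: variant IDs, effect sizes and genotypes are hypothetical.
def weighted_grs(dosages, weights):
    """Sum of risk-allele counts (0, 1 or 2), each weighted by its GWAS effect size.

    A variant missing from `dosages` is counted as 0 risk alleles (an assumption).
    """
    return sum(weights[snp] * dosages.get(snp, 0) for snp in weights)


# Hypothetical per-allele effects on BMI from a discovery GWAS.
weights = {"snp_a": 0.08, "snp_b": 0.03, "snp_c": 0.02}
# Number of BMI-increasing alleles carried by one hypothetical individual.
dosages = {"snp_a": 2, "snp_b": 0, "snp_c": 1}

grs = weighted_grs(dosages, weights)  # 2 * 0.08 + 0 * 0.03 + 1 * 0.02
```

A GRS of this kind uses only the genome-wide significant variants, whereas a PRS extends the same weighted-sum idea to all available common variants with re-estimated weights.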

One substantial reason for the missing heritability of BMI is that BMI is a polygenic trait. Thus, the set of genetic variants identified at the genome-wide significance threshold has limited predictive ability. The genetic susceptibility to BMI is accumulated from numerous genetic variants with individually small to modest effects [8]. Recently, computational algorithms have been developed to derive polygenic risk scores (PRS) that combine all available common genetic variants into a single quantitative measure [9]. Applied to BMI, a BMI PRS was shown to be a better predictor of BMI than a BMI GRS comprised of 141 BMI-associated genetic variants [10], as expected based on the highly polygenic nature of BMI.

Another source of the unexplained variation of BMI stems from gene-environment interactions [11, 12]. An interaction occurs when the biological effect of a genetic variant depends on a risk factor, such as an environmental stimulus or a lifestyle factor [13, 14]. For example, physical activity attenuates the effect on BMI of rs9939609, a common SNP within the FTO locus and the strongest common SNP known to associate with BMI [15, 16]. However, detection of the interactions driven by individual SNPs is challenging. A meta-analysis of 200,452 adults [12] reported only one additional SNP, rs986732, on top of the known FTO locus. Attempts have been made to aggregate genetic background using GRS to increase the power to detect gene-environment interactions. Recently, a large study in up to ~360,000 unrelated participants from UK Biobank identified several risk factors (alcohol intake, physical inactivity, socioeconomic status, mental health, and sleeping patterns) that influenced the effect on BMI of a BMI GRS comprised of 94 BMI-associated genetic variants [17].

In the present study, we investigated the interactions of a BMI PRS with environmental and lifestyle risk factors and used the interaction to develop a criterion for stratifying a population into two risk groups. We constructed and validated a BMI PRS. We detected an interaction between the BMI PRS and physical activity, which remained significant even when omitting FTO variants. Finally, we developed a simple non-linear criterion to translate this interaction into clinical practice and future research. We showed that in a subset of individuals with the highest 22% of BMI PRS values, self-reported physical activity was associated with a two-fold higher difference in BMI than in the remaining 78% of the individuals.

Materials and methods

Overall analysis workflow

The workflow of the analysis is presented in Fig 1. The polygenic risk score of BMI was built on UK Biobank and GIANT summary statistics as the source dataset and a “Training” subset from the Inter99 cohort as the target dataset, as described in the “BMI Polygenic Risk Score” section. The resulting PRS was validated in an independent “Validation/Discovery” subset of Inter99, using phenotypes described in the “Phenotypes” section. Both parts of the Inter99 dataset are described in the “Inter99 dataset” section. Interaction analyses were done in the same subset of Inter99 between the BMI PRS and risk factors described in the “Risk Factors” section. Both analyses were performed as described in the “Statistical Analysis” section. A post hoc criterion to stratify the individuals was developed using the “Validation/Discovery” subset, as described in the “Building the PRS Criterion” section. The criterion was validated in an independent “Replication” Genotek cohort of different origin, described in the “Genotek Dataset” section.

Fig 1. The overall workflow of the analysis.

Fig 1

The flow of the data and the steps of the analysis. The subsets of data are named “Training”, “Validation/Discovery”, and “Replication”, according to the naming in the text. Orange color highlights steps of analysis where models were trained, blue color highlights steps of analysis where models were validated. The N = 5,179 subset of Inter99 was used both for the validation of the BMI PRS and for the discovery of the interactions with risk factors.

BMI polygenic risk score

To generate a polygenic risk score (PRS), a source summary statistics dataset from the relevant GWAS study and a target cohort dataset with individual genotypes and phenotypes are required. A meta-analysis of UK Biobank and GIANT [6] (N ~ 700,000) was used as the source of summary statistics. The PRS was generated using the LDpred tool (v.1.0.6) [9] and its standard workflow. Briefly, the Danish population-based cohort Inter99 [18] (N = 6,179) was used. The cohort was randomly split into training (N = 1,000) and validation (N = 5,179) subsets. SNPs available in the input data, comprising both subsets and the summary statistics of the BMI GWAS, were aligned using the LDpred coord command. Eleven BMI PRS were generated, one for each value of the parameter f, which represents what LDpred calls the “fraction of causative variants”. The following values of f were used to cover different orders of magnitude: 1.0, 0.3, 0.1, 0.03, 0.01, 0.003, 0.001, 0.0003, 0.0001, 3 × 10^−5, 1 × 10^−5. The scores were generated using the LDpred gibbs command, and their performance was assessed in the training subset only, using the R2 provided by the LDpred score command. The best-performing score was selected. The score values were calculated for the samples in the validation subset using the LDpred score command. These values and this subset were used for the subsequent analyses.
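The selection step can be sketched in plain Python as follows. This is not LDpred code: it assumes that the per-individual scores for each value of f have already been computed, and it uses the squared Pearson correlation as a stand-in for the R2 reported by LDpred. The BMI values and scores are made up, and only three of the eleven fractions are filled in to keep the example short.

```python
# The eleven values of f listed in the text.
FRACTIONS = [1.0, 0.3, 0.1, 0.03, 0.01, 0.003, 0.001, 0.0003, 0.0001, 3e-5, 1e-5]


def r_squared(x, y):
    """Squared Pearson correlation between a score and a phenotype."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def select_best_fraction(scores_by_f, bmi):
    """Return the f whose score explains the most BMI variance in the training subset."""
    return max(scores_by_f, key=lambda f: r_squared(scores_by_f[f], bmi))


# Toy training data: hypothetical BMI values and hypothetical scores for three fractions.
bmi = [22.0, 24.5, 27.0, 31.0, 25.5]
scores_by_f = {
    1.0: [0.1, 0.3, 0.2, 0.6, 0.5],
    0.3: [0.0, 0.2, 0.5, 0.9, 0.3],
    0.1: [0.4, 0.1, 0.3, 0.2, 0.5],
}
best_f = select_best_fraction(scores_by_f, bmi)
```

Selecting f in the training subset and evaluating the chosen score in the separate validation subset keeps the reported explained variance free of selection bias.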

A genetic risk score (GRS) was constructed using only the genome-wide significant (p < 5 × 10^−8) variants from the same meta-analysis of UK Biobank and GIANT [6] and calculated in the validation subset of the Inter99 dataset.

To exclude the FTO effect from the PRS, all the SNPs within ± 100,000 nucleotides from the lead SNP rs9939609 were excluded from the score. This covered at least all the SNPs with estimated linkage disequilibrium R2 ≥ 0.1.
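A window filter of this kind might look like the sketch below. The SNP records and the lead-SNP coordinate are hypothetical placeholders (the real position of rs9939609 on chromosome 16 depends on the genome build), and treating the window boundary as inclusive is an assumption.

```python
WINDOW = 100_000  # nucleotides on each side of the lead SNP, as in the text


def outside_window(snp, lead_chrom, lead_pos, window=WINDOW):
    """True if the SNP lies outside lead_pos +/- window on the lead SNP's chromosome."""
    return snp["chrom"] != lead_chrom or abs(snp["pos"] - lead_pos) > window


# Hypothetical coordinate used only to make the example runnable.
LEAD_CHROM, LEAD_POS = "16", 1_000_000

snps = [
    {"id": "snp_far_left", "chrom": "16", "pos": 850_000},
    {"id": "snp_inside", "chrom": "16", "pos": 1_050_000},
    {"id": "snp_edge", "chrom": "16", "pos": 1_100_000},   # exactly on the boundary
    {"id": "snp_other_chrom", "chrom": "2", "pos": 1_000_000},
]

# SNPs that remain in the score after removing the window around the lead SNP.
kept = [s["id"] for s in snps if outside_window(s, LEAD_CHROM, LEAD_POS)]
```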

Phenotypes

We have reviewed the literature and created a shortlist of cardiometabolic phenotypes, which were previously reported to associate with BMI. For evaluation of the association between the BMI PRS and cardiometabolic traits, we selected well-known obesity-related traits. The literature review highlighted multiple biochemical and anthropometric measurements and functional tests associated with BMI. The following twenty phenotypes were available in the Inter99 cohort for analysis:

  • Dyslipidemia markers: fasting total cholesterol, high-density lipoprotein cholesterol, triglycerides,

  • Inflammation markers: fasting serum C-reactive protein (high sensitive hs-CRP), interleukin-18 (IL18),

  • Cardiovascular diseases markers: diastolic blood pressure, systolic blood pressure, pulse pressure,

  • Glucose level markers: hemoglobin A1C, fasting plasma glucose, 30 min & 120 min plasma glucose during oral glucose tolerance test (OGTT),

  • Insulin sensitivity or resistance markers: homeostatic model assessment of insulin resistance (HOMA-IR), fasting serum insulin,

  • Satiety markers: fasting leptin (LEP),

  • Anthropometrics: hip, waist circumference, waist-to-hip ratio, weight, height.

Risk factors

Different behavioral and environmental traits (called together “risk factors” in the text for simplicity) were shown to affect BMI via a gene-environment interaction [17] using a GRS approach. While Inter99 did not contain exactly the same risk factors, we found similar risk factors available in Inter99. The following factors were matched to those previously reported and analyzed in this study:

  • Smoking status [19], four categories: “smoking daily”, “smoking occasionally”, “never smoking”, and “previously smoking” (N = 5,155).

  • Alcohol consumption [20], two categories: “no or moderate alcohol consumption” defined as ≤ 6 units/week for women and ≤ 12 units/week for men; “high alcohol consumption” defined as > 6 units/week for women and > 12 units/week for men. 1 unit is equivalent to 12 g of pure alcohol (N = 5,002).

  • Diet quality, two categories: “poor diet” defined as 4–8 points on the diet quality score (DQS) system described at [21]; “healthy diet” defined as ≥ 9 points on the DQS score system (N = 5,020).

  • Physical activity level [22], two categories: “inactive” defined as self-reported commuting and leisure-time physical activity ≤ 225 min/week; “active” defined as self-reported commuting and leisure-time physical activity > 225 min/week (N = 4,859).

  • Mental health [23], two categories: “high” defined as mental health component score (MCS) higher than the 75th percentile within the study population of the same sex; “low” defined as MCS lower than the 75th percentile. MCS has been calculated as described in [23], using the Short Form 12 (SF-12) questionnaire [24] (N = 4,878).

  • Quality of sleep, two categories: “good” defined by the answer ‘no’ to the question ‘do you often suffer from insomnia’; “poor” defined by the answer ‘yes’ (N = 5,129).

  • Socioeconomic class [20], five categories: “not working, no education”, “not working, ≥ 1 year of education”, “working, no education”, “working, 1–3 years of education”, “working, ≥ 4 years of education”. Education is counted after mandatory school years. The categories were combined from education and employment statuses, reported in [20] (N = 4,807).

All risk factors were measured by self-report questionnaire in the Inter99 cohort.
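Two of the binary definitions above can be written as simple threshold functions. This is a sketch of the stated cut-offs only; the function names and the "female"/"male" labels are hypothetical and do not reflect the actual coding of the Inter99 questionnaire data.

```python
# Weekly alcohol limits (units/week) separating "no or moderate" from "high" consumption.
ALCOHOL_LIMIT = {"female": 6, "male": 12}
# Weekly minutes of commuting plus leisure-time activity separating "inactive" from "active".
ACTIVITY_LIMIT_MIN = 225


def alcohol_category(units_per_week, sex):
    """'high' strictly above the sex-specific weekly limit, else 'no or moderate'."""
    return "high" if units_per_week > ALCOHOL_LIMIT[sex] else "no or moderate"


def activity_category(minutes_per_week):
    """'active' strictly above 225 min/week, else 'inactive'."""
    return "active" if minutes_per_week > ACTIVITY_LIMIT_MIN else "inactive"


examples = [
    alcohol_category(7, "female"),   # above the limit for women
    alcohol_category(7, "male"),     # below the limit for men
    activity_category(225),          # boundary value counts as inactive
    activity_category(226),
]
```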

Statistical analysis

The validation subset of Inter99 was used for the analysis of associations. Exploratory data analyses and linear regression analyses were done in Python 3 using the numpy, pandas, matplotlib, seaborn and statsmodels packages.

Explained variances of BMI by the PRS and GRS were calculated using ordinary least squares models. For regression analysis, robust linear models were used. Formulas are provided inline in the Results text using the Python statsmodels standard, identical to the R glm standard [25]. Terms with “C()” represent categorical phenotypes. Terms combined using a colon, as in “X:Y”, represent the interaction between X and Y. The term “1” represents the intercept.
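To make the notation concrete, the sketch below spells out by hand the columns that a formula such as “BMI ~ 1 + PRS + C(activity) + PRS:C(activity)” stands for, for one individual and a two-level factor. It assumes dummy (treatment) coding with “active” as the reference level; the column labels are illustrative and are not the exact names a formula library would generate.

```python
def design_row(prs, activity, reference="active"):
    """Design-matrix columns for one individual and a two-level activity factor."""
    dummy = 0.0 if activity == reference else 1.0  # indicator of the non-reference level
    return {
        "Intercept": 1.0,                    # the term "1"
        "PRS": prs,                          # numeric term, used as is
        "C(activity)[inactive]": dummy,      # categorical main effect
        "PRS:C(activity)[inactive]": prs * dummy,  # interaction: product of the two columns
    }


row_inactive = design_row(0.5, "inactive")
row_active = design_row(0.5, "active")
```

The coefficient on the interaction column is the difference in the PRS slope between the inactive and the active group, which is the quantity tested in the interaction analyses.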

Findings were reported if they passed a study-wise Bonferroni-adjusted p-value significance cut-off. That is, the Bonferroni correction for multiple testing was applied to each analysis separately. Specifically, p-values in the analysis of associations of cardiometabolic phenotypes with the BMI PRS were adjusted for the number of outcome phenotypes tested (twenty tests, p = 2.5 × 10^−3). Analyses of the associations and interactions with categorical risk factors were each independently adjusted for the number of association or interaction tests performed (twelve tests, p = 4.17 × 10^−3).
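The two cut-offs follow from dividing a significance level by the number of tests. The sketch below assumes a family-wise level of 0.05, which is not stated explicitly in the text but reproduces both reported thresholds.

```python
ALPHA = 0.05  # assumed family-wise significance level


def bonferroni_cutoff(n_tests, alpha=ALPHA):
    """Per-test p-value cut-off under the Bonferroni correction."""
    return alpha / n_tests


phenotype_cutoff = bonferroni_cutoff(20)    # twenty cardiometabolic phenotypes
risk_factor_cutoff = bonferroni_cutoff(12)  # twelve risk-factor category contrasts
```

The twelve tests correspond to the non-reference categories of the seven risk factors: three for smoking, four for socioeconomic class, and one for each of the five binary factors.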

The presence of the non-linear associations between the BMI PRS and the BMI was analyzed by regressing out the linear effect of the PRS and covariates (age and sex) and checking if the square of the PRS was significantly associated with the residuals.
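The two-step procedure can be sketched with the standard library alone. The data are synthetic and noise-free, with a quadratic PRS effect built in by construction, and the sketch reports only the slope of the residuals on the squared PRS; a real analysis would obtain the p-value from a regression package. The helper names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def residuals(columns, y):
    """Residuals of y after a least-squares fit on an intercept plus the given columns."""
    rows = [[1.0, *values] for values in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return [t - sum(b * v for b, v in zip(beta, r)) for r, t in zip(rows, y)]


def slope(x, y):
    """Slope of a simple linear regression of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Synthetic data: BMI depends on age, sex, PRS and, by construction, on PRS squared.
n = 40
prs = [-2.0 + 4.0 * j / (n - 1) for j in range(n)]
age = [30.0 + (7 * j) % 31 for j in range(n)]
sex = [float(j % 2) for j in range(n)]
bmi = [25.0 + 0.05 * a + 0.5 * s + 1.0 * p + 0.3 * p * p for a, s, p in zip(age, sex, prs)]

# Step 1: regress out the covariates and the linear PRS effect.
resid = residuals([age, sex, prs], bmi)
# Step 2: regress the residuals on the squared PRS.
quadratic_slope = slope([p * p for p in prs], resid)
```

With a purely linear PRS effect the second-step slope would be zero up to rounding error; here it is positive because of the built-in quadratic term.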

Building the PRS criterion

The selection of the PRS cut-off for the stratification was made in the discovery subset of the Inter99. Ordinary least squares model “BMI ~ 1 + age + C(sex)” was used to regress out intercept, age, and sex fixed effects. Residuals of BMI were fitted to a family of models of the form “BMI residuals ~ 0 + PRS + C(physical activity) + C(physical activity):I(PRS > ith percentile)”. “I(PRS > ith percentile)” is a binary indicator, true when PRS is greater than the ith percentile of observed PRS values and false otherwise. One hundred models with cut-offs i = 0,1,..,99 were calculated. The cut-off i producing a model with the highest R2 was selected.
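The scan over cut-offs can be sketched as below, under several stated assumptions: the BMI residuals are taken as given, the activity factor is coded as one indicator column per level (so no separate intercept is needed), the percentile is a simple rank-based one, and the data are synthetic with a step at the 78th percentile built in. Because the response is the same for every candidate model, the model with the highest R2 is the one with the lowest residual sum of squares, which is what the sketch compares.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ssr(columns, y):
    """Residual sum of squares of a least-squares fit of y on the columns (no intercept added)."""
    rows = list(zip(*columns))
    k = len(columns)
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2 for r, t in zip(rows, y))


def percentile_value(sorted_values, i):
    """Simple rank-based i-th percentile (an assumption; other definitions interpolate)."""
    return sorted_values[(i * len(sorted_values)) // 100]


def scan_cutoffs(prs, active, resid):
    """Return the percentile cut-off i in 0..99 whose model fits the residuals best."""
    ordered = sorted(prs)
    inactive = [1.0 - a for a in active]
    best = None
    for i in range(100):
        cut = percentile_value(ordered, i)
        inter = [a * (1.0 if p > cut else 0.0) for a, p in zip(active, prs)]
        columns = [inactive, active, prs]
        if any(inter) and inter != active:  # drop a degenerate (collinear) interaction column
            columns = columns + [inter]
        fit = ssr(columns, resid)
        if best is None or fit < best[1]:
            best = (i, fit)
    return best[0]


# Synthetic residuals: the extra effect of activity switches on above the 78th percentile.
n = 200
prs = [-2.0 + 4.0 * j / (n - 1) for j in range(n)]
active = [float(j % 2) for j in range(n)]
true_cut = percentile_value(sorted(prs), 78)
resid = [1.0 * p - 0.7 * a - 1.4 * a * (p > true_cut) for p, a in zip(prs, active)]

best_cutoff = scan_cutoffs(prs, active, resid)
```

On real data the fit criterion is typically flat around its optimum, so the selected percentile should be read as a data-driven estimate rather than a sharp biological threshold.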

The criterion model was combined as “BMI ~ 1 + age + C(sex) + PRS + C(physical activity) + C(physical activity):I(PRS > ith percentile)”. It splits all the population individuals into two groups based on their BMI PRS for measuring the interaction with physical activity, instead of using precise PRS values. The model was validated and analyzed in the independent Genotek replication dataset.

Inter99 dataset

Inter99 is a previously described Danish population-based dataset consisting of 6,179 individuals [18]. The study was approved by the Regional Scientific Ethics Committee (KA 98 155) and the Danish Data Protection Agency.

Briefly, all individuals were genotyped using Illumina HumanOmniExpress-24 (v1.0A / v1.1A). Genotypes were called using the Genotyping module (version 1.9.4) of GenomeStudio software (version 2011.1; Illumina). Only individuals having a call rate ≥ 98% were included. Monomorphic SNPs and SNPs with a Hardy-Weinberg equilibrium p-value < 10^−5 were excluded. Genotypes were imputed using the Michigan imputation server with the HRC1.1 [26] reference panel.

The dataset was randomly split into two subsets. A “Training” subset (N = 1,000) was used for PRS generation. A “Validation/Discovery” subset (N = 5,179) was used for the analysis of the PRS, its interactions with risk factors, and building the stratification criterion.

Genotek dataset

Genotek is an unpublished Russian population-based set consisting of 3,415 unrelated individuals aged between 20 and 60 years old with self-reported measures.

All individuals were genotyped using Illumina Infinium Global Screening Array (v1.0 / v2.0). SNPs with call rate <0.9, the calls on Y chromosome for women and the heterozygous calls on X chromosome for men were removed. Genotypes were imputed using BEAGLE 5.1 with the HRC reference panel. After imputation, variants with MAF below 1% or DR2 below 0.7 were excluded.

The following phenotypes were used from questionnaire information: age, weight, height, sex, and physical activity. Three categories of physical activity levels were available: “sedentary” defined as self-reported “I have sedentary lifestyle”, “moderate” defined as self-reported “I take walks every single day”, and “high” defined as self-reported “My job involves physical activity, or I do a lot of sports”. To match the physical activity measures between the datasets, the “sedentary” group from the Genotek dataset was considered equivalent to the “inactive” group from the Inter99 dataset, and the “moderate” and “high” groups from the Genotek dataset were considered equivalent to the “active” group from the Inter99 dataset.
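The harmonization described above amounts to a fixed mapping from the three Genotek categories to the two Inter99 categories. A minimal sketch, with a hypothetical function name and made-up answers:

```python
# Mapping of Genotek activity levels to Inter99 categories, as described in the text.
GENOTEK_TO_INTER99 = {
    "sedentary": "inactive",
    "moderate": "active",
    "high": "active",
}


def harmonize_activity(levels):
    """Translate Genotek self-reported activity levels into Inter99 categories."""
    return [GENOTEK_TO_INTER99[level] for level in levels]


harmonized = harmonize_activity(["sedentary", "high", "moderate", "sedentary"])
```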

The dataset was used for validation of the stratification criterion.

Results

BMI polygenic risk score

To have a tool for measuring the genetic susceptibility to BMI, we constructed a BMI PRS. The BMI PRS was the score corresponding to the LDpred parameter “fraction of causative variants” f = 0.3, selected using the procedure described in “Materials and methods”. The explained variance of the score was 13.0% in the training subset (N = 1,000) and 13.1% in the validation subset (N = 5,179). In comparison, the best currently available GRS of BMI, which utilizes 941 SNPs [6], explained only 6.1% of the variance in the same validation subset. In the validation subset, we observed a significant association between BMI and the BMI PRS (p = 3.36 × 10^−172, linear regression BMI ~ 1 + PRS). The difference in the median BMI between the top and the bottom deciles of the individuals, according to their PRS, was 5.17 kg/m2 (Fig 2A), also showing an improvement over the GRS, which showed a median difference of 3.41 kg/m2 (Fig 2B) in the same dataset.
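The decile comparison can be computed as in the sketch below, which ranks individuals by score and compares the median BMI of the top and bottom tenths. The data are made up, and the sketch assumes at least ten individuals and deciles of equal size.

```python
from statistics import median


def decile_median_difference(score, bmi):
    """Median BMI in the top score decile minus median BMI in the bottom decile."""
    ranked = sorted(zip(score, bmi))       # order individuals by their score
    size = len(ranked) // 10               # number of individuals per decile
    bottom = [b for _, b in ranked[:size]]
    top = [b for _, b in ranked[-size:]]
    return median(top) - median(bottom)


# Toy data: 20 hypothetical individuals whose BMI rises steadily with the score.
score = [float(j) for j in range(20)]
bmi = [20.0 + 0.5 * j for j in range(20)]

difference = decile_median_difference(score, bmi)
```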

Fig 2. BMI distribution of individuals, stratified by genetic risk.

Fig 2

(A): Individuals stratified by BMI PRS deciles. (B): Individuals stratified by BMI GRS deciles. Both graphs are plotted in the same scale and cut at 10th / 90th percentiles on the y-axis. Box plots represent the median and 25th / 75th percentiles. Whiskers are set at 1.5 IQR. Both graphs show BMI, unadjusted for age and sex. Adjusted BMI residuals and outliers are available at Online Supplementary.

To check if the BMI PRS captures other BMI-related risk phenotypes, we analyzed whether known BMI-associated cardiometabolic traits were also correlated with BMI PRS. First, we confirmed that BMI was associated with all twenty known cardiometabolic traits in our dataset (p < 2.5 × 10^−3) (S1 Table). Next, we analyzed associations between the BMI PRS and the cardiometabolic traits. Each of the cardiometabolic traits was also associated (p < 2.5 × 10^−3) with the generated BMI PRS (S1 Table), when adjusted for age, sex and genetic principal components (gPCs) (linear regression trait ~ 1 + PRS + age + C(sex) + gPC1 + gPC2 + gPC3). The distributions of all traits across individuals stratified by PRS deciles are shown in S1 File.

While GWAS is based on univariate linear associations, the cumulative increase in risk may be non-linear. To understand if the impact of the genetic load for BMI was linear, we checked for the non-linear effects of the BMI PRS. We observed a significant association between the square of BMI PRS and the BMI residuals corrected for the covariates (age and sex) and the BMI PRS (p = 2.61 × 10^−3). In contrast, BMI GRS did not show significant non-linear effects (p = 0.50).

No evidence was observed for associations between BMI and any of the three genetic principal components (linear regression BMI ~ 1 + PRS + age + C(sex) + gPC1 + gPC2 + gPC3, p > 0.05). To simplify the analysis, we have excluded the adjustment for the gPCs in the subsequent analysis.

Associations between the BMI-associated risk factors and BMI

Several risk factors (environmental stimuli and lifestyle factors) are known to be associated with a change in BMI as described in Materials and methods. To study these factors, we first validated associations between them and BMI in our dataset. BMI was associated (Table 1) with smoking, physical activity, and alcohol consumption groups (twelve tests performed, p < 4.17 × 10^−3) when adjusted for age, sex, and BMI PRS (linear regression BMI ~ 1 + PRS + age + C(sex) + C(risk factor)). The same risk factors were found to be significantly associated when using the BMI GRS instead of the BMI PRS in our data (S2 Table). BMI was only nominally (p < 0.05) associated with diet quality, and BMI was not associated with socioeconomic class, mental health, or quality of sleep in our dataset (Table 1).

Table 1. Associations of the risk factors with BMI.

Risk factor | Category * | Effect size | P-value
Smoking | Never smoker | Reference group |
Smoking | Previous smoker | + 0.06 kg/m2 | 0.648
Smoking | Smoking occasionally | + 0.27 kg/m2 | 0.341
Smoking | Smoking daily | - 1.06 kg/m2 | 6.49 × 10^−18
Physical activity | Active | Reference group |
Physical activity | Inactive | + 0.79 kg/m2 | 8.49 × 10^−13
Alcohol consumption | No or moderate | Reference group |
Alcohol consumption | High | - 0.36 kg/m2 | 9.00 × 10^−4
Diet quality | Healthy | Reference group |
Diet quality | Poor | - 0.25 kg/m2 | 0.024
Socioeconomic class | Working, ≥ 4 years of education | Reference group |
Socioeconomic class | Working, 1–3 years of education | + 0.00 kg/m2 | 0.99
Socioeconomic class | Working, no education | + 0.21 kg/m2 | 0.24
Socioeconomic class | Not working, ≥ 1 year of education | + 0.18 kg/m2 | 0.40
Socioeconomic class | Not working, no education | + 0.01 kg/m2 | 0.98
Mental health | High | Reference group |
Mental health | Low | - 0.06 kg/m2 | 0.62
Quality of sleep | Good | Reference group |
Quality of sleep | Poor | + 0.15 kg/m2 | 0.28

* The categories are described in the “Risk Factors” section of Materials and methods.

Effect sizes and p-values for each category are reported relative to the Reference group.

Interactions between the BMI PRS and BMI-associated risk factors

To analyze if the presence of a risk factor alters the effect of BMI PRS on BMI, we examined potential interactions between the BMI PRS and all seven risk factors (linear regression BMI ~ 1 + PRS + age + C(sex) + C(risk factor) + PRS:C(risk factor)). We found a significant interaction (twelve tests performed, p < 4.17 × 10^−3) only between the BMI PRS and physical activity. Physically active individuals demonstrated 0.81 kg/m2 (p = 2.24 × 10^−13) lower BMI than inactive individuals and an additional 0.33 kg/m2 (p = 3.13 × 10^−3) lower BMI per standardized BMI PRS unit. Unadjusted BMI values per BMI PRS decile are visualized in Fig 3A. Unlike the generated BMI PRS, the BMI GRS did not demonstrate significant (p < 4.17 × 10^−3) interactions with any of the risk factors, but it showed nominal significance (p = 0.045) for the interaction with physical activity in the same direction as the PRS. Unadjusted BMI values per BMI GRS decile are visualized in Fig 3B. Physical activity was not itself significantly associated with the BMI PRS (logistic regression physical activity ~ 1 + age + C(sex) + PRS, p = 0.053).

Fig 3. BMI distribution of individuals, stratified by genetic risk and levels of physical activity.

Fig 3

(A): Individuals stratified by BMI PRS deciles. (B): Individuals stratified by BMI GRS deciles. Both graphs are plotted in the same scale and cut at 10th / 90th percentiles on the y-axis. Box plots represent the median and 25th / 75th percentiles. Whiskers are set at 1.5 IQR. Both graphs show BMI, unadjusted for age and sex. Adjusted BMI residuals and outliers are available at Online Supplementary.

The FTO locus is the primary locus known to interact with physical activity. In particular, the minor allele of the FTO lead SNP rs9939609 is associated with increased BMI. This effect was previously shown to be attenuated in physically active individuals in Inter99 [15]. To check if the BMI PRS interaction with physical activity was driven only by the effect of FTO, we performed the interaction analysis with a PRS without the FTO region. We still observed a significant interaction between the residual PRS and physical activity (p = 3.23 × 10^−3).

The observed BMI PRS interaction with physical activity was reproduced in a replication dataset of different origin (Russian). In this dataset, physically active individuals had 0.97 kg/m2 (p = 9.10 × 10^−14) lower BMI than inactive individuals and an additional 0.45 kg/m2 (p = 5.57 × 10^−4) lower BMI per one standardized BMI PRS unit.

A criterion for stratification based on the BMI PRS

In clinical genetics practice, simple binary criteria are preferred over numerical variables. To utilize our findings in future studies, we developed, in a post hoc analysis, a PRS-based criterion to divide the individuals into two different groups of genetic risk in a clinical setting. We selected the cut-off between the groups in a data-driven manner by selecting the highest R2 for BMI (S1 Fig, Materials and methods). The cut-off was selected at PRS = 78th percentile (PRS = 0.783 units in our dataset), dividing all the individuals into two groups. Here we coin the term “PRS0-78%” group for the individuals with PRS < 78th percentile, and the “PRS78-100%” group for the individuals with PRS ≥ 78th percentile. We observed a significant interaction between the two risk groups of BMI PRS and physical activity (p = 2.83 × 10^−6, regression model BMI ~ 1 + age + C(sex) + PRS + C(physical activity) + C(physical activity):I(PRS > 78th percentile)). In the PRS0-78% group, the average BMI of the inactive individuals was only 0.65 kg/m2 higher than the average BMI of the active individuals. In the PRS78-100% group, the average BMI of the inactive individuals was 2.07 kg/m2 higher than the average BMI of the active individuals. The BMI distribution is shown in Fig 4A.
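The within-group comparison can be illustrated with a small sketch that computes the difference in mean BMI between inactive and active individuals separately for each PRS group. The eight individuals are made up, the group assignment is taken as given, and the means are unadjusted, so the numbers are not comparable to the regression-based estimates in the text.

```python
from statistics import mean


def activity_gap(bmi, active, in_group):
    """Mean BMI of inactive minus mean BMI of active individuals within one group."""
    inactive_bmi = [b for b, a, g in zip(bmi, active, in_group) if g and not a]
    active_bmi = [b for b, a, g in zip(bmi, active, in_group) if g and a]
    return mean(inactive_bmi) - mean(active_bmi)


# Toy data: the first four individuals are below the PRS cut-off, the last four above it.
high_prs = [False, False, False, False, True, True, True, True]
active = [True, True, False, False, True, True, False, False]
bmi = [24.0, 25.0, 25.0, 25.5, 26.0, 27.0, 28.5, 28.5]

low_gap = activity_gap(bmi, active, [not h for h in high_prs])
high_gap = activity_gap(bmi, active, high_prs)
```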

Fig 4. BMI distribution of individuals, stratified by PRS groups and levels of physical activity.

Fig 4

(A): Discovery Inter99 dataset. (B): Validation in Genotek dataset. Graphs are cut at 10% / 90% quantiles on the y-axis (note the different scales on the panels). Solid lines represent median, boxplot outlines represent 25% / 75% quantiles, whiskers represent 10% / 90% quantiles. Dashed lines represent the mean of the respective box, which were compared in the statistical test. The difference in mean values and p-value of respective tests are shown on the plot. Both graphs show BMI, unadjusted for age and sex. Adjusted BMI residuals and outliers are available at Online Supplementary.

We replicated the interaction between BMI PRS risk groups and physical activity at the 78% cut-off in the replication dataset. In the PRS0-78% group, the average BMI of the inactive individuals was 0.84 kg/m2 higher than the average BMI of the active individuals. In the PRS78-100% group, the average BMI of the inactive individuals was 1.66 kg/m2 higher than the average BMI of the active individuals. The BMI distribution is shown in Fig 4B. The addition of the cut-off interaction increased R2 by +0.55% compared to the model without interaction. In comparison, the linear interaction increased R2 only by +0.22%.

Discussion

In this study, we have detected a non-linear interaction between BMI genetics and physical activity using BMI PRS. We have constructed a BMI PRS and assessed its performance. The BMI PRS demonstrated a substantial improvement in the explained variance of BMI over the BMI GRS. Using the BMI PRS, we have detected an interaction between the genetic component of BMI and physical activity in a relatively small dataset. This interaction was neither limited to the interaction driven by the FTO locus nor significant when using the BMI GRS for the same analysis. The application of PRS enabled us to identify 22% of individuals with the highest PRS. In this group, self-reported physical activity was associated with a 2-fold higher difference in BMI than in the remaining 78% of study participants. The model with a non-linear two-group division of individuals showed higher R2 than a model with a linear interaction.

Studies of BMI indicate that BMI is a highly heritable phenotype. However, existing tools to measure the genetic predisposition to BMI, namely BMI GRS, fail to capture most of this heritability. As a result, the predictive power of BMI GRS is limited, preventing BMI GRS from being used in clinical practice. Lack of proper genetic tools to assess predisposition to BMI hinders progress in this area of research. To address this limitation, we used a polygenic risk scoring approach. Studying interactions between the environmental and behavioral risk factors and BMI genetics is important to understand the mechanisms by which risk factors modify genetic predisposition to BMI. To address this, we analyzed interactions between BMI PRS and seven previously known risk factors.

Different socioeconomic and behavioral factors are known to affect BMI. Here we demonstrate a particularly strong connection between the genetics of BMI and physical activity. Physical activity levels have been shown to interact with BMI-associated variants using the GRS approach [17]. Here we also demonstrate the interaction between the BMI PRS and physical activity levels. Interestingly, this interaction was not significant when applying the BMI GRS, although we replicated the interaction effect in the same direction with nominal significance. This probably means that our study sample size was not large enough for the GRS to show a significant association. Application of the BMI PRS, whose interaction with physical activity was stronger than that of the BMI GRS, likely enabled us to see this interaction in a smaller dataset (N = 5,179) than the one used before (N = 362,496) [17]. It supports the idea that BMI PRS may enable analyses of phenotypes and their interactions in small datasets. This property of PRS would allow detailed phenotypes that are difficult to sample and are lacking from large biobanks to be studied using a PRS approach.

The SNP rs9939609 (commonly referred to by the name of the gene in which it lies, FTO) is of particular interest for studies of obesity. Previously, the effect of the FTO locus on BMI has been shown to interact with physical activity in a European population analysis [12]. When we excluded the FTO locus from the BMI PRS, the remaining PRS still showed significant interaction with physical activity, in line with previous GRS studies [17]. This observation indicates that our PRS captures more than just an interaction of physical activity with FTO by integrating weaker interactions of other SNPs. This demonstrates that a PRS may facilitate analyses of the interaction between BMI-associated genetic markers and physical activity in future studies.

We succeeded in replicating the interaction between physical activity and the BMI PRS observed in the Danish population. We replicated it in a Russian population sample with a similar effect size, despite genetic and cultural differences between the two populations. PRS prediction accuracy is known to decline when applied in populations different from the origin population of the source summary statistics [27]. In this study, we utilized previously unpublished data from a Russian study sample. Although the two populations are geographically close, genetic PCA [28] clearly separates the Russian population from the European population from which the GWAS was derived. Successful replication in Russians strengthens our findings, suggesting that both our European-based BMI PRS and the observed cut-off are generalizable and may be applied in other populations.

One issue we observed with the replication dataset was that the R2 of the regression models was higher on average than in the discovery dataset. This difference was driven by a higher R2 explained by age and sex alone: R2 for the model BMI ~ 1 + C(sex) + age was 21.5% in the replication dataset but only 2.7% in the discovery dataset. The individuals’ age, height, and weight in the replication dataset were self-reported, in contrast to the discovery dataset, where these measures were collected objectively. We speculate that the observed difference in R2 might have been caused by biased reporting of variables by individuals in the replication sample.
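
To make the baseline comparison concrete, the following is a minimal sketch of how R2 for BMI ~ 1 + C(sex) + age could be computed. It is not the authors' code: the helper names `ols_r2` and `baseline_r2` are hypothetical, sex is assumed to be coded 0/1, and the cohort is simulated, so the printed value has no relation to the 21.5% or 2.7% reported above.

```python
import random

def ols_r2(rows, y):
    """R^2 of an ordinary least squares fit of y on the columns of rows."""
    n, k = len(rows), len(rows[0])
    # Augmented normal equations [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [u - f * v for u, v in zip(a[r], a[c])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def baseline_r2(sex, age, bmi):
    """R^2 of BMI ~ 1 + C(sex) + age, with sex coded 0/1 (an assumption)."""
    design = [[1.0, float(s), float(a)] for s, a in zip(sex, age)]
    return ols_r2(design, bmi)

# Simulated cohort with hypothetical effect sizes (not Inter99 or Genotek data)
rng = random.Random(0)
sex = [rng.randint(0, 1) for _ in range(500)]
age = [rng.uniform(30, 60) for _ in range(500)]
bmi = [24 + 1.0 * s + 0.05 * a + rng.gauss(0, 4) for s, a in zip(sex, age)]
print(f"R2 of age and sex alone: {baseline_r2(sex, age, bmi):.3f}")
```

Running the same function on the discovery and the replication data would give the two baseline values being compared; a large gap points to the covariates themselves rather than to the PRS.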

There are a few potential caveats to this research. First, the observed interaction between BMI PRS and physical activity should be interpreted cautiously. The association between weight and physical activity is believed to be bi-directional [29]. A reduction in physical activity directly shifts energy balance in favor of weight gain. However, increased BMI also causes a reduction in physical activity, which may create artifacts in the analysis of the interaction between the BMI PRS and physical activity. Our study is cross-sectional, and it cannot provide insights into directionality or causality. In our cohort, 60–70% of the individuals within any BMI PRS decile were physically active, and physical activity did not decrease significantly with increased BMI PRS. Therefore, we speculate that the observed interaction is caused by physical inactivity directly and is not an artifact. If this hypothesis is correct, our findings would indicate that physical activity is more beneficial for individuals with a genetic predisposition to high BMI than for individuals without such a predisposition.

Second, the sample size of our study (N = 6,179) was relatively small for reaching high levels of significance in the performed tests and for drawing strong conclusions. To mitigate this issue, we used study-wise correction for multiple testing. That is, each analysis was corrected independently for the number of phenotypes tested, as described in detail in Materials and methods. Stronger claims could be made in future studies with larger samples.
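
As an illustration of what a study-wise correction can look like, the sketch below applies a Bonferroni threshold within a single analysis, dividing the significance level by the number of phenotypes tested in that analysis. Bonferroni is an assumption here; the exact procedure is the one given in Materials and methods. The function name `study_wise_significant`, the factor labels, and the p-values are all hypothetical.

```python
def study_wise_significant(p_values, alpha=0.05):
    """Flag p-values below alpha divided by the number of phenotypes in this analysis."""
    threshold = alpha / len(p_values)
    flags = {name: p < threshold for name, p in p_values.items()}
    return flags, threshold

# Hypothetical interaction p-values for seven environmental factors
# (illustrative numbers only, not results from the paper)
interaction_p = {
    "physical_activity": 0.001,
    "factor_2": 0.020,
    "factor_3": 0.300,
    "factor_4": 0.450,
    "factor_5": 0.080,
    "factor_6": 0.600,
    "factor_7": 0.900,
}

flags, threshold = study_wise_significant(interaction_p)
print(f"per-test threshold: {threshold:.5f}")
print([name for name, significant in flags.items() if significant])
```

Under this scheme a nominally significant result, such as the hypothetical p = 0.020 above, does not pass the corrected threshold.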

Third, in this paper we focus on BMI, but there is mounting evidence that other body composition measures are important when assessing obesity. The distribution of fat versus fat-free mass is an important measure [30], capturing differences in body constitution between people with the same BMI. Another important set of measures comprises waist-to-hip ratio, waist circumference, and similar indices [31], which capture how fat is distributed around the body. These measures complement BMI rather than duplicate it. Future studies of the genetics of these parameters using a PRS approach could add to the understanding of obesity.

Last, the cut-offs for “high PRS” groups are typically selected arbitrarily, by picking PRS percentiles [32] or by selecting a group based on the corresponding increase in risk [33]. We decided not to select the BMI PRS cut-off for population stratification a priori but to derive it as part of a post hoc analysis. In our study, we optimized the cut-off to increase the variance explained by the resulting criterion. Such a criterion would perform better than an arbitrarily selected one, but it would also be more prone to overfitting. To ensure that the resulting cut-off is not an artifact of overfitting, we performed an independent replication. We found the cut-off using the Inter99 dataset as a discovery dataset and then validated and analyzed the criterion in the independent Genotek dataset. The replication was successful, and we observed similar effect sizes and improvements in the regression models’ performance. These observations support the validity and generalizability of the criterion.
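
The idea of choosing a cut-off by explained variance can be sketched as a scan over candidate PRS percentiles. This is a simplified stand-in, not the authors' procedure: it fits only the dichotomized model BMI ~ active * high_PRS, whose fitted values are the four cell means, and it omits covariates and any continuous PRS term that a full analysis would likely include. The function names `cell_means_r2` and `best_cutoff` and the simulated data are hypothetical.

```python
import random

def cell_means_r2(bmi, active, high):
    """R^2 of the saturated model BMI ~ active * high (fitted values are cell means)."""
    cells = {}
    for y, a, h in zip(bmi, active, high):
        cells.setdefault((bool(a), bool(h)), []).append(y)
    grand_mean = sum(bmi) / len(bmi)
    ss_tot = sum((y - grand_mean) ** 2 for y in bmi)
    ss_res = sum((y - sum(v) / len(v)) ** 2 for v in cells.values() for y in v)
    return 1 - ss_res / ss_tot

def best_cutoff(bmi, active, prs, percentiles=range(5, 100, 5)):
    """Return (percentile, R^2) of the PRS cut-off that maximizes explained variance."""
    ordered = sorted(prs)
    best = None
    for pct in percentiles:
        threshold = ordered[int(len(ordered) * pct / 100)]
        high = [p >= threshold for p in prs]
        r2 = cell_means_r2(bmi, active, high)
        if best is None or r2 > best[1]:
            best = (pct, r2)
    return best

# Simulated discovery data with a step change in the inactivity effect
# at the 78th PRS percentile (hypothetical effect sizes and noise level)
rng = random.Random(1)
n = 2000
prs = [rng.gauss(0, 1) for _ in range(n)]
active = [rng.random() < 0.65 for _ in range(n)]
step = sorted(prs)[int(n * 0.78)]
bmi = [25 + (0 if a else 0.8 + 0.8 * (p >= step)) + rng.gauss(0, 1)
       for p, a in zip(prs, active)]
print(best_cutoff(bmi, active, prs))
```

Because the scan picks the maximum over many candidates, the selected percentile is noisy and its R2 is optimistic on the data used to find it, which is why evaluating the fixed cut-off on an independent dataset matters.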

In terms of application, BMI PRS may support a path towards personalized obesity prevention and treatment. BMI PRS enabled us to define a criterion that stratifies the population into two groups, PRS0-78% and PRS78-100%, with genetic predispositions to different BMI levels. These two groups differed in the size of the effect of physical activity on BMI. Using the criterion, we observed that for 78% of the examined population sample, the average increase in BMI associated with physical inactivity was 0.84 kg/m2. In comparison, for the top 22%, the average increase was 1.66 kg/m2. This cut-off interaction model also explained more variance than the linear interaction model. Together, the large effect size observed only in the PRS78-100% group and the simplicity of the criterion open a path for better research and clinical practice. The criterion may be applied in future recall-by-genotype interventions to dissect the causal interplay between BMI-associated genetic factors and physical activity in defining BMI. If physical inactivity is shown to drive the BMI increase, while BMI PRS defines how large the increase will be, then differential prevention of obesity could be implemented based on the individual’s genetic risk group.
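
The two-group comparison can be illustrated with a short sketch that splits a sample at the 78th PRS percentile and compares mean BMI between inactive and active individuals within each group. This is an unadjusted difference in means, not the regression-adjusted estimate from the paper. The function name `inactivity_effect_by_group` is hypothetical, and the toy data are constructed so that the differences equal the reported 0.84 and 1.66 kg/m2 purely for illustration.

```python
from statistics import mean

def inactivity_effect_by_group(bmi, active, prs, cutoff_pct=78):
    """Mean BMI of inactive minus active individuals, below and above the PRS cut-off."""
    threshold = sorted(prs)[int(len(prs) * cutoff_pct / 100)]
    groups = {"low": [], "high": []}
    for y, a, p in zip(bmi, active, prs):
        groups["high" if p >= threshold else "low"].append((y, a))
    effects = {}
    for name, members in groups.items():
        inactive_bmi = [y for y, a in members if not a]
        active_bmi = [y for y, a in members if a]
        effects[name] = mean(inactive_bmi) - mean(active_bmi)
    return effects

# Toy data: 100 individuals with PRS ranks 0..99, alternating activity status.
# The inactivity penalties are set to the reported values for illustration only.
prs = list(range(100))
active = [i % 2 == 0 for i in prs]
bmi = [(27.0 if p >= 78 else 25.0) + (0.0 if a else (1.66 if p >= 78 else 0.84))
       for p, a in zip(prs, active)]

for group, effect in inactivity_effect_by_group(bmi, active, prs).items():
    print(f"{group} PRS group: {effect:.2f} kg/m2 higher BMI when inactive")
```

In real data each group mean carries sampling error, so the two differences would be compared with confidence intervals or an interaction term rather than by inspection.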

In conclusion, our work showcases how a PRS can serve as a genetic tool for gene-environment interaction analyses and provides a useful criterion for future genetic studies of BMI. Our results demonstrate that a BMI PRS is a better instrument for measuring the genetic predisposition to BMI than a BMI GRS. By discovering the interaction between BMI PRS and physical activity, we show how a PRS may be used for studying gene-environment interactions in small datasets, where a GRS is unlikely to reveal significant findings. The developed cut-off model shows how findings from analyzing a PRS may be translated into clinical research. While the interaction between the BMI PRS and physical activity warrants careful interpretation, we suggest that our work may support the path towards personalized, physical activity-based prevention of obesity informed by genetic risk.

Supporting information

S1 File. Distributions of the BMI-associated phenotypes of individuals, stratified by BMI polygenic risk score (PRS) deciles.

(ZIP)

S1 Fig. Performance of models using two groups of genetic risk for interaction with physical activity.

(DOCX)

S1 Table. p-values of associations between cardiometabolic traits and BMI or BMI PRS.

(XLSX)

S2 Table. Associations of the risk factors with BMI adjusted for BMI GRS.

(XLSX)

S1 Appendix. List of online supporting materials.

(DOCX)

Acknowledgments

The authors would like to thank Sophia Metz and Jonathan J. Thompson for an independent review of the manuscript and valuable feedback.

Data Availability

The polygenic risk score is available via the online supplementary material, https://bmiprsxpa-pa7qfqmwhq-ew.a.run.app/. The underlying code is available at GitHub, https://github.com/borisevichdi/bmiprs-code. The underlying raw data comprise only sensitive data that cannot be made publicly available: genetic and biological data of human participants. For the Inter99 dataset, consent for publication of raw data was not obtained from the individuals, and the dataset could pose a threat to confidentiality. The restriction is imposed by the Regional Scientific Ethics Committee (KA 98 155) and the Danish Data Protection Agency. Data access is provided by the Phenomics Platform of the NNF Center for Basic Metabolic Research, https://cbmr.ku.dk/researchfacilities/phenomics/. Point of contact: CBMR-PhenomicsInfo@sund.ku.dk. For the Genotek dataset, the user agreement (available at https://www.genotek.ru) states that disclosure of individual-level genetic information and/or self-reported information to third parties for research purposes will not occur without explicit consent, and the consent was not obtained from the individuals. Due to the user agreement, the individual-level data cannot be made directly available, and the dataset could pose a threat to confidentiality. Data have to be accessed indirectly via Genotek Ltd, https://www.genotek.ru. Data requests should be sent to Genotek Ltd at info@genotek.ru.

Funding Statement

Data collection in the Inter99 study was supported economically by The Danish Medical Research Council (grant nr. 2028-00-0019 and 09-059174), The Danish Centre for Evaluation and Health Technology Assessment (grant nr. 3126-138-1998 and 263-12-1999 and 0-204-03-74), Novo Nordisk, Copenhagen County (grant nr. 9870006), The Danish Heart Foundation (grant nr. 98-2-5-71-22659 and 00-2-9-F4-22872 and 04-10-B201-A309-22171), The Danish Pharmaceutical Association (grant nr. 53-99 and 58-2003), Augustinus foundation (grant nr. 99-1663), Ib Henriksen foundation, and Becket foundation. Novo Nordisk Foundation Center for Basic Metabolic Research is an independent Research Center, based at the University of Copenhagen, Denmark, and partially funded by an unconditional donation from the Novo Nordisk Foundation (www.cbmr.ku.dk) (NNF18CC0034900). Dmitrii Borisevich is receiving funding from NNF Copenhagen Bioscience PhD Programme (NNF17CC0026760). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Genotek Ltd provided only financial support in the form of AR and VI salaries and the data for the analysis, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.

References

  • 1. Lyall DM, Celis-Morales C, Ward J, Iliodromiti S, Anderson JJ, Gill JMR, et al. Association of body mass index with cardiometabolic disease in the UK biobank: A mendelian randomization study. JAMA Cardiol. 2017;2: 882–889. doi: 10.1001/jamacardio.2016.5804
  • 2. Yang J, Bakshi A, Zhu Z, Hemani G, Vinkhuyzen AA, Lee SH, et al. Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index. Nat Genet. 2015;47: 1114–1120. doi: 10.1038/ng.3390
  • 3. Robinson MR, English G, Moser G, Lloyd-Jones LR, Triplett MA, Zhu Z, et al. Genotype-covariate interaction effects and the heritability of adult body mass index. Nat Genet. 2017;49: 1174–1181. doi: 10.1038/ng.3912
  • 4. Hemani G, Yang J, Vinkhuyzen A, Powell JE, Willemsen G, Hottenga JJ, et al. Inference of the genetic architecture underlying BMI and height with the use of 20,240 sibling pairs. Am J Hum Genet. 2013;93: 865–875. doi: 10.1016/j.ajhg.2013.10.005
  • 5. Wainschtein P, Jain DP, Yengo L, Zheng Z, TOPMed Anthropometry Working Group T-O for PMC, Cupples LA, et al. Recovery of trait heritability from whole genome sequence data. bioRxiv. 2019; 588020. doi: 10.1101/588020
  • 6. Yengo L, Sidorenko J, Kemper KE, Zheng Z, Wood AR, Weedon MN, et al. Meta-analysis of genome-wide association studies for height and body mass index in ∼700000 individuals of European ancestry. Hum Mol Genet. 2018;27: 3641–3649. doi: 10.1093/hmg/ddy271
  • 7. Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, et al. Finding the missing heritability of complex diseases. Nature. 2009;461: 747–753. doi: 10.1038/nature08494
  • 8. Loos RJF, Janssens ACJW. Predicting Polygenic Obesity Using Genetic Information. Cell Metab. 2017;25: 535–543. doi: 10.1016/j.cmet.2017.02.013
  • 9. Vilhjálmsson BJ, Yang J, Finucane HK, Gusev A, Lindström S, Ripke S, et al. Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores. Am J Hum Genet. 2015;97: 576–592. doi: 10.1016/j.ajhg.2015.09.001
  • 10. Khera AV, Chaffin M, Wade KH, Zahid S, Brancale J, Xia R, et al. Polygenic Prediction of Weight and Obesity Trajectories from Birth to Adulthood. Cell. 2019;177: 587–596.e9. doi: 10.1016/j.cell.2019.03.028
  • 11. Justice AE, Winkler TW, Feitosa MF, Graff M, Fisher VA, Young K, et al. Genome-wide meta-analysis of 241,258 adults accounting for smoking behaviour identifies novel loci for obesity traits. Nat Commun. 2017;8: 14977. doi: 10.1038/ncomms14977
  • 12. Graff M, Scott RA, Justice AE, Young KL, Feitosa MF, Barata L, et al. Genome-wide physical activity interactions in adiposity—A meta-analysis of 200,452 adults. PLoS Genet. 2017;13: e1006528. doi: 10.1371/journal.pgen.1006528
  • 13. Bouchard C. Gene–environment interactions in the etiology of obesity: defining the fundamentals. Obesity. 2008;16. doi: 10.1038/oby.2008.528
  • 14. Kilpeläinen TO, Franks PW. Gene-physical activity interactions and their impact on diabetes. 2014;60: 94–103.
  • 15. Andreasen CH, Stender-Petersen KL, Mogensen MS, Torekov SS, Wegner L, Andersen G, et al. Low physical activity accentuates the effect of the FTO rs9939609 polymorphism on body fat accumulation. Diabetes. 2008;57: 95–101. doi: 10.2337/db07-0910
  • 16. Kilpeläinen TO, Qi L, Brage S, Sharp SJ, Sonestedt E, Demerath E, et al. Physical activity attenuates the influence of FTO variants on obesity risk: a meta-analysis of 218,166 adults and 19,268 children. PLoS Med. 2011;8: 1543.
  • 17. Rask-Andersen M, Karlsson T, Ek WE, Johansson Å. Gene-environment interaction study for BMI reveals interactions between genetic factors and physical activity, alcohol consumption and socioeconomic status. PLoS Genet. 2017;13: e1006977. doi: 10.1371/journal.pgen.1006977
  • 18. Jørgensen T, Borch-Johnsen K, Thomsen TF, Ibsen H, Glümer C, Pisinger C. A randomized non-pharmacological intervention study for prevention of ischaemic heart disease: baseline results Inter99. Eur J Cardiovasc Prev Rehabil. 2003;10: 377–86. doi: 10.1097/01.hjr.0000096541.30533.82
  • 19. Pisinger C, Glümer C, Toft U, von Huth Smith L, Aadahl M, Borch-Johnsen K, et al. High risk strategy in smoking cessation is feasible on a population-based level. The Inter99 study. Prev Med (Baltim). 2008;46: 579–584. doi: 10.1016/j.ypmed.2008.02.026
  • 20. Toft U, Pisinger C, Aadahl M, Lau C, Linneberg A, Ladelund S, et al. The impact of a population-based multi-factorial lifestyle intervention on alcohol intake. The Inter99 study. Prev Med (Baltim). 2009;49: 115–121. doi: 10.1016/j.ypmed.2009.06.007
  • 21. Toft U, Kristoffersen LH, Lau C, Borch-Johnsen K, Jørgensen T. The Dietary Quality Score: Validation and association with cardiovascular risk factors: The Inter99 study. Eur J Clin Nutr. 2007;61: 270–278. doi: 10.1038/sj.ejcn.1602503
  • 22. Von Huth Smith L, Borch-Johnsen K, Jørgensen T. Commuting physical activity is favourably associated with biological risk factors for cardiovascular disease. Eur J Epidemiol. 2007;22: 771–779. doi: 10.1007/s10654-007-9177-3
  • 23. Pisinger C, Toft U, Aadahl M, Glümer C, Jørgensen T. The relationship between lifestyle and self-reported health in a general population. The Inter99 study. Prev Med (Baltim). 2009;49: 418–423. doi: 10.1016/j.ypmed.2009.08.011
  • 24. Ware J, Kosinski M, Keller S. How to Score SF-12, Physical and Mental Health Summary Scales. Med Outcomes Trust. 1995.
  • 25. R: Fitting Generalized Linear Models. [cited 9 Jun 2020]. https://stat.ethz.ch/R-manual/R-devel/library/stats/html/glm.html.
  • 26. McCarthy S, Das S, Kretzschmar W, Delaneau O, Wood AR, Teumer A, et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat Genet. 2016;48: 1279–1283. doi: 10.1038/ng.3643
  • 27. Martin AR, Kanai M, Kamatani Y, Okada Y, Neale BM, Daly MJ. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat Genet. 2019;51: 584–591. doi: 10.1038/s41588-019-0379-x
  • 28. Moffatt MF, Gut IG, Demenais F, Strachan DP, Bouzigon E, Heath S, et al. A Large-Scale, Consortium-Based Genomewide Association Study of Asthma. N Engl J Med. 2010;363: 1211–1221. doi: 10.1056/NEJMoa0906312
  • 29. Barone Gibbs B, Aaby D, Siddique J, Reis JP, Sternfeld B, Whitaker K, et al. Bidirectional 10-year associations of accelerometer-measured sedentary behavior and activity categories with weight among middle-aged adults. Int J Obes. 2020;44: 559–567. doi: 10.1038/s41366-019-0443-8
  • 30. Pramyothin P, Limpattanachart V, Dawilai S, Sarasak R, Sukaruttanawong C, Chaiyasoot K, et al. Fat-free mass, metabolically healthy obesity, and type 2 diabetes in severely obese asian adults. Endocr Pract. 2017;23: 915–922. doi: 10.4158/EP171792.OR
  • 31. Czernichow S, Kengne AP, Huxley RR, Batty GD, de Galan B, Grobbee D, et al. Comparison of waist-to-hip ratio and other obesity indices as predictors of cardiovascular disease risk in people with type-2 diabetes: A prospective cohort study from ADVANCE. Eur J Prev Cardiol. 2011;18: 312–319. doi: 10.1097/HJR.0b013e32833c1aa3
  • 32. de Toro-Martín J, Guénard F, Bouchard C, Tremblay A, Pérusse L, Vohl MC. The Challenge of Stratifying Obesity: Attempts in the Quebec Family Study. Front Genet. 2019;10. doi: 10.3389/fgene.2019.00994
  • 33. Khera AV, Chaffin M, Aragam KG, Haas ME, Roselli C, Choi SH, et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat Genet. 2018;50: 1219–1224. doi: 10.1038/s41588-018-0183-z

Decision Letter 0

David Meyre

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

28 Jun 2021

PONE-D-21-17351

Non-linear interaction between physical activity and polygenic risk score of body mass index in Danish and Russian populations

PLOS ONE

Dear Dr. Borisevich,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Aug 12 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

David Meyre

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please include additional information regarding the survey or questionnaire used in the study and ensure that you have provided sufficient details that others could replicate the analyses. For instance, if you developed a questionnaire as part of this study and it is not under a copyright more restrictive than CC-BY, please include a copy, in both the original language and English, as Supporting Information.

3. PLOS journals require authors to make all data underlying the findings described in their manuscript fully available without restriction unless the data are subject to ethical restrictions or owned by someone other than the authors (https://journals.plos.org/plosone/s/data-availability#loc-acceptable-data-access-restrictions). Therefore, we ask that you please upload underlying data to an appropriate data repository and update your Data Availability Statement accordingly or provide all contact details for where an interested researcher would need to apply to gain access to the relevant data. Please note that it is not acceptable for an author to be the sole named individual responsible for ensuring data access.

4. Thank you for stating the following in the Financial Disclosure section:

"Data collection in the Inter99 study was supported economically by The Danish Medical Research Council (grant nr. 2028-00-0019 and 09-059174), The Danish Centre for Evaluation and Health Technology Assessment (grant nr. 3126-138-1998 and 263-12-1999 and 0-204-03-74), Novo Nordisk, Copenhagen County (grant nr. 9870006), The Danish Heart Foundation (grant nr. 98-2-5-71-22659 and 00-2-9-F4-22872 and 04-10-B201-A309-22171), The Danish Pharmaceutical Association (grant nr. 53-99 and 58-2003), Augustinus foundation (grant nr. 99-1663), Ib Henriksen foundation, and Becket foundation. Novo Nordisk Foundation Center for Basic Metabolic Research is an independent Research Center, based at the University of Copenhagen, Denmark, and partially funded by an unconditional donation from the Novo Nordisk Foundation (www.cbmr.ku.dk) (NNF18CC0034900). Dmitrii Borisevich is receiving funding from NNF Copenhagen Bioscience PhD Programme (NNF17CC0026760).

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

We note that you received funding from a commercial source: Novo Nordisk

Please provide an amended Competing Interests Statement that explicitly states this commercial funder, along with any other relevant declarations relating to employment, consultancy, patents, products in development, marketed products, etc.

Within this Competing Interests Statement, please confirm that this does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include your amended Competing Interests Statement within your cover letter. We will change the online submission form on your behalf.

Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests

5. Thank you for stating the following in the Competing Interests section:

"I have read the journal's policy and the authors of this manuscript have the following competing interests: Alexander Rakitko and Valery Ilinsky are employees of Genotek Ltd. The rest of the authors declare no competing interests."

We note that one or more of the authors are employed by a commercial company: Genotek Ltd.

5.1. Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study. If the funding organization did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of authors' salaries and/or research materials, please review your statements relating to the author contributions, and ensure you have specifically and accurately indicated the role(s) that these authors had in your study. You can update author roles in the Author Contributions section of the online submission form.

Please also include the following statement within your amended Funding Statement.

“The funder provided support in the form of salaries for authors [insert relevant initials], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.”

If your commercial affiliation did play a role in your study, please state and explain this role within your updated Funding Statement.

5.2. Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc.  

Within your Competing Interests Statement, please confirm that this commercial affiliation does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If this adherence statement is not accurate and there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include both an updated Funding Statement and Competing Interests Statement in your cover letter. We will change the online submission form on your behalf.


6. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For more information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

7. We note that you have included the phrase “data not shown” in your manuscript. Unfortunately, this does not meet our data sharing requirements. PLOS does not permit references to inaccessible data. We require that authors provide all relevant data within the paper, Supporting Information files, or in an acceptable, public repository. Please add a citation to support this phrase or upload the data that correspond with these findings to a stable repository (such as Figshare or Dryad) and provide any URLs, DOIs, or accession numbers that may be used to access these data. Or, if the data are not a core part of the research being presented in your study, we ask that you remove the phrase that refers to these data.

8. Please ensure that you refer to Figure 1 in your text as, if accepted, production will need this reference to link the reader to the figure.

9. Please include captions for *all* your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The present study analyzes the interaction between physical activity and polygenic risk scores for obesity. The authors identify a significant interaction between physical activity and a BMI polygenic risk score (PRS). The effect size of physical activity on BMI among the high-risk PRS was two times greater than the effect size in the low-risk PRS group. In my opinion, this is a methodologically strong paper that contributes to the evidence base in this area. Please see attachment for the complete review.

Reviewer #2: This paper reports a well-designed gene x environment interaction study investigating how genetic liability for obesity interacts with physical activity. The results are timely and important for extending our understanding of how genetic risk factors could modulate the effect of lifestyle. The manuscript is well structured and clearly written.

However, there are some points that should be considered to improve the manuscript:

1. The authors found that well-known risk factors for obesity such as diet, socioeconomic class, mental health and sleep quality were not associated with BMI in their model when PRS was used but the associations were significant when GRS (or perhaps when no genetic variables) was in the model. It would be interesting to know how the authors interpret these findings. Did the BMI PRS show an association with these factors? Only associations of cardiometabolic factors were reported in Table S1. Reporting the associations of diet, socioeconomic class, mental health and sleep quality with BMI and BMI PRS would add to the quality of the paper. Also, the interpretation should be added to the discussion.

2. It is not clear from the methods how mental health was measured. Please, add further description and reference.

3. Also, it is not understandable what the reported socioeconomic class categories mean, how they were measured. Please, describe it further and provide reference.

4. Please add further details on how the genotype QC steps were carried out in the different cohorts. The description is especially incomplete for the Genotek dataset.

5. There is no description in the methods section of how LDpred was used to calculate the BMI PRS. How were the settings for LDpred defined? Why did the authors select a “fraction of causative variants” of 0.3?

6. The naming of the supplementary materials is confusing. It would be better to report them in a uniform way. Furthermore, are S1 Table and Table S1 the same?

7. Did the authors test whether the other factors (such as diet, socioeconomic class, mental health and sleep quality) interact with BMI PRS, even though they do not have a main effect? It is possible to see an interaction effect even in the absence of a main effect.

8. I miss an overview of past gene x physical activity interaction studies related to obesity or other conditions in the discussion, only FTO was mentioned.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: Plos One Review-June 7 (2021).docx

PLoS One. 2021 Oct 18;16(10):e0258748. doi: 10.1371/journal.pone.0258748.r002

Author response to Decision Letter 0


14 Sep 2021

Thank you to everyone for your exhaustive feedback. Please find our point-by-point rebuttal below.

To the academic editor

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

We have changed the style of the title page and headings, corrected minor details in the text, and uploaded the Figures and Supplementary files as separate files following the naming instructions.

2. Please include additional information regarding the survey or questionnaire used in the study and ensure that you have provided sufficient details that others could replicate the analyses. For instance, if you developed a questionnaire as part of this study and it is not under a copyright more restrictive than CC-BY, please include a copy, in both the original language and English, as Supporting Information.

We have not developed any questionnaires as part of this study. We used previously published questionnaires. We described the questions used in detail in the Materials and Methods (lines 129-148), and we have supplied additional references to the papers where the variables were originally published:

• “Smoking status [19], four categories: “smoking daily”, “smoking occasionally”, “never smoking”, and “previously smoking” (N = 5,155).

• Alcohol consumption [20], two categories: “no or moderate alcohol consumption” defined as ≤ 6 units/week for women and ≤ 12 units/week for men; “high alcohol consumption” defined as > 6 units/week for women and > 12 units/week for men. 1 unit is equivalent to 12 g of pure alcohol (N = 5,002).

• Diet quality, two categories: “poor diet” defined as 4-8 points on the diet quality score (DQS) system described at [21]; “healthy diet” defined as ≥ 9 points on the DQS score system (N = 5,020).

• Physical activity level [22], two categories: “inactive” defined as self-reported commuting and leisure-time physical activity ≤ 225 min/week; “active” defined as self-reported commuting and leisure-time physical activity > 225 min/week (N = 4,859).

• Mental health [23], two categories: “high” defined as mental health component score (MCS) higher than the 75th percentile within the study population of the same sex; “low” defined as MCS lower than the 75th percentile. MCS was calculated as described in [23], using the Short Form 12 (SF-12) questionnaire [24] (N = 4,878).

• Quality of sleep, two categories: “good” defined by the answer ‘no’ to the question ‘do you often suffer from insomnia’; “poor” defined by the answer ‘yes’ (N = 5,129).

• Socioeconomic class [20], five categories: “not working, no education”, “not working, ≥ 1 year of education”, “working, no education”, “working, 1-3 years of education”, “working, ≥ 4 years of education”. Education is counted after mandatory school years. The categories were combined from education and employment statuses, reported in [20] (N = 4,807).”
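
As an illustration only, the threshold-based definitions quoted above can be written as small functions. This is a minimal Python sketch, not code from the study: the function names, the sex labels "female" and "male", and the decision to reject diet scores below 4 (which the quoted definitions do not cover) are assumptions made for the example.

```python
def alcohol_category(units_per_week: float, sex: str) -> str:
    """Classify weekly alcohol intake using the sex-specific limits quoted above."""
    limits = {"female": 6, "male": 12}  # units/week; the sex labels are hypothetical
    if sex not in limits:
        raise ValueError(f"unknown sex label: {sex!r}")
    return "high" if units_per_week > limits[sex] else "no or moderate"


def activity_category(minutes_per_week: float) -> str:
    """'active' means more than 225 min/week of commuting and leisure-time activity."""
    return "active" if minutes_per_week > 225 else "inactive"


def diet_category(dqs_points: int) -> str:
    """Map a diet quality score (DQS) to 'poor' (4-8 points) or 'healthy' (>= 9)."""
    if dqs_points < 4:
        # Assumption: scores below 4 fall outside the quoted definitions.
        raise ValueError("DQS below 4 is not covered by the quoted categories")
    return "healthy" if dqs_points >= 9 else "poor"


print(alcohol_category(7, "female"))  # above the 6 units/week limit for women
print(activity_category(225))         # exactly 225 min/week does not exceed the cut-off
print(diet_category(9))
```

Note that the boundary values (6 or 12 units/week, 225 min/week) fall into the lower category, as in the quoted definitions.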

3. PLOS journals require authors to make all data underlying the findings described in their manuscript fully available without restriction unless the data are subject to ethical restrictions or owned by someone other than the authors (https://journals.plos.org/plosone/s/data-availability#loc-acceptable-data-access-restrictions). Therefore, we ask that you please upload underlying data to an appropriate data repository and update your Data Availability Statement accordingly or provide all contact details for where an interested researcher would need to apply to gain access to the relevant data. Please note that it is not acceptable for an author to be the sole named individual responsible for ensuring data access.

We have begun uploading our polygenic risk score weight matrix to the appropriate public repository, PGSCatalog. However, PGSCatalog processes submissions manually, and we are awaiting approval. In the meantime, we have made the score available as online supplementary material, described in S1 Appendix. We have also shared the underlying code on GitHub, as described in S1 Appendix.

4. Thank you for stating the following in the Financial Disclosure section:

"Data collection in the Inter99 study was supported economically by The Danish Medical Research Council (grant nr. 2028-00-0019 and 09-059174), The Danish Centre for Evaluation and Health Technology Assessment (grant nr. 3126-138-1998 and 263-12-1999 and 0-204-03-74), Novo Nordisk, Copenhagen County (grant nr. 9870006), The Danish Heart Foundation (grant nr. 98-2-5-71-22659 and 00-2-9-F4-22872 and 04-10-B201-A309-22171), The Danish Pharmaceutical Association (grant nr. 53-99 and 58-2003), Augustinus foundation (grant nr. 99-1663), Ib Henriksen foundation, and Becket foundation. Novo Nordisk Foundation Center for Basic Metabolic Research is an independent Research Center, based at the University of Copenhagen, Denmark, and partially funded by an unconditional donation from the Novo Nordisk Foundation (www.cbmr.ku.dk) (NNF18CC0034900). Dmitrii Borisevich is receiving funding from NNF Copenhagen Bioscience PhD Programme (NNF17CC0026760).

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

We note that you received funding from a commercial source: Novo Nordisk

Please provide an amended Competing Interests Statement that explicitly states this commercial funder, along with any other relevant declarations relating to employment, consultancy, patents, products in development, marketed products, etc.

Within this Competing Interests Statement, please confirm that this does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include your amended Competing Interests Statement within your cover letter. We will change the online submission form on your behalf.

Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests

Here is our Competing Interests Statement, including all the required information for Novo Nordisk and for Genotek Ltd:

“Novo Nordisk provided unrestricted grants for data collection. No authors were employed by or consulted Novo Nordisk during this study, and no conflict of interest exists in connection to patents, products in development, marketed products, or alike. This funding does not alter our adherence to PLOS ONE policies on sharing data and materials, and does not impose restrictions on sharing of data and/or material. AR and VI are employees of Genotek Ltd. and may own stock/stock options in the company. No other conflict of interest exists in connection to patents, products in development, marketed products, or alike. This funding does not alter our adherence to PLOS ONE policies on sharing data and materials, and does not impose restrictions on sharing of data and/or material. Other authors declare no conflict of interests.”

5. Thank you for stating the following in the Competing Interests section:

"I have read the journal's policy and the authors of this manuscript have the following competing interests: Alexander Rakitko and Valery Ilinsky are employees of Genotek Ltd. The rest of the authors declare no competing interests."

We note that one or more of the authors are employed by a commercial company: Genotek Ltd.

5.1. Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study. If the funding organization did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of authors' salaries and/or research materials, please review your statements relating to the author contributions, and ensure you have specifically and accurately indicated the role(s) that these authors had in your study. You can update author roles in the Author Contributions section of the online submission form.

Please also include the following statement within your amended Funding Statement.

“The funder provided support in the form of salaries for authors [insert relevant initials], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.”

If your commercial affiliation did play a role in your study, please state and explain this role within your updated Funding Statement.

The affiliation did not play a role in our study. We have updated the Funding Statement, including the following information:

“Genotek Ltd provided only financial support in the form of AR and VI salaries and the data for the analysis, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.”

5.2. Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc.

Within your Competing Interests Statement, please confirm that this commercial affiliation does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests) . If this adherence statement is not accurate and there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

See the Competing Interests Statement under paragraph 4.

Please include both an updated Funding Statement and Competing Interests Statement in your cover letter. We will change the online submission form on your behalf.

Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests

6. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For more information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

The underlying raw data comprises only sensitive data that cannot be made publicly available: genetic and biological data of human participants.

For the Inter99 dataset, consent for publication of raw data was not obtained from the individuals, and the dataset could pose a threat to confidentiality. The restriction is imposed by the Regional Scientific Ethics Committee (KA 98 155) and the Danish Data Protection Agency. Data access is provided by the Phenomics Platform of the NNF Center for Basic Metabolic Research, https://cbmr.ku.dk/researchfacilities/phenomics/.

For the Genotek dataset, the user agreement (available at https://www.genotek.ru) states that disclosure of individual-level genetic information and/or self-reported information to third parties for research purposes will not occur without explicit consent, and this consent was not obtained from the individuals. Due to the user agreement, the individual-level data cannot be made directly available, and the dataset could pose a threat to confidentiality. Data have to be accessed indirectly via Genotek Ltd, https://www.genotek.ru.

7. We note that you have included the phrase “data not shown” in your manuscript. Unfortunately, this does not meet our data sharing requirements. PLOS does not permit references to inaccessible data. We require that authors provide all relevant data within the paper, Supporting Information files, or in an acceptable, public repository. Please add a citation to support this phrase or upload the data that corresponds with these findings to a stable repository (such as Figshare or Dryad) and provide any URLs, DOIs, or accession numbers that may be used to access these data. Or, if the data are not a core part of the research being presented in your study, we ask that you remove the phrase that refers to these data.

We had two occurrences, and we have fixed both. We have added Supplementary Table S2, reporting p-values and effect sizes for associations between BMI and the environmental risk factors when adjusted for BMI GRS (line 244). We have removed from the text the information about simulations regarding the difference in the variance explained by age and sex alone, as we decided that they are not a core part of the research.

8. Please ensure that you refer to Figure 1 in your text as, if accepted, production will need this reference to link the reader to the figure.

We have referred to Figure 1 at the beginning of the Materials and Methods (line 79) with the following text: “The workflow of the analysis is present at Fig 1.”

9. Please include captions for *all* your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.

We have added the captions (lines 413-419):

“Supporting Information

S1 File. Distributions of the BMI-associated phenotypes of individuals, stratified by BMI polygenic risk score (PRS) deciles.

S1 Fig. Performance of models using two groups of genetic risk for interaction with physical activity.

S1 Table. p-values of associations between cardiometabolic traits and BMI or BMI PRS.

S2 Table. Associations of the risk factors with BMI adjusted for BMI GRS.

S1 Appendix. List of online supporting materials.”

To the reviewer #1

The present study analyzes the interaction between physical activity and polygenic risk scores for obesity. The authors identify a significant interaction between physical activity and a BMI polygenic risk score (PRS). The effect size of physical activity on BMI among the high-risk PRS was two times greater than the effect size in the low-risk PRS group. In my opinion, this is a methodologically strong paper that contributes to the evidence base in this area.

Major comments

1. What is the clinical significance of the 78% cut-off? The authors state that this has important clinical implications but this is only shown empirically through the difference in effect size becoming two-fold greater in this group for the interaction. The clinical implications of this are not clear. Are there other cardiometabolic traits that increase significantly (e.g., past a certain threshold or markedly increase disease risk) in the high-risk PRS group? There may be more empirical support needed to justify that this threshold has important clinical implications.

Thank you for your comment. The significance of the cut-off is utilitarian: in clinical genetics practice, it is easier to adopt a binary criterion than to use a continuous variable with no meaningful scale, such as a PRS. We agree that, because we focus on BMI, we should not make strong statements about clinical implications, even though we study a public health topic. Studying how the high-risk PRS group differs from the low-risk group in other cardiometabolic traits would be a very interesting focus for future studies. For the scope of this manuscript, we have removed the statement about the clinical significance from the abstract (line 30):

“Our results … (ii) show a non-linear interaction between BMI genetics and physical activity”.

2. BMI is an imperfect measure of obesity since it does not distinguish fat vs. fat free mass and this warrants mention in the limitations section.

Thank you for your comment. We agree with this statement, and we have added the following text (lines 373-378) to the Discussion to address this limitation:

“Third, in this paper we focus on BMI, but there is a growing body of evidence that other body composition measures are important when assessing obesity. The distribution of fat vs. fat-free mass is an important measure [30], capturing differences in body constitution between people with the same BMI. Another important set of measures comprises waist-to-hip ratio, waist circumference and similar measures [31], which capture the fat distribution around the body. These measures are orthogonal to BMI. Future studies of the genetics of these parameters using a PRS approach could provide additional value to the understanding of obesity.”

Minor Comments

Line 45: this section would benefit from describing other sources of missing heritability such as gene-gene interactions.

Thank you for the comment. We agree, and we have listed all major potential sources of missing heritability and provided a reference in the Introduction (lines 43-45):

“This finding represents a marked case of missing heritability. There are many potential reasons for heritability missing from the GWAS findings [7], including non-significant variants with small effect size, rare variants, structural variation, and gene-gene and gene-environment interactions.”

Line 186: is it possible to report the number and descriptive statistics of the individuals that were excluded based on missing data?

Thank you for the great remark. This was unfortunate phrasing on our part. 3,415 individuals were available after all exclusions based on age and questionnaire availability. No samples were excluded based on genetic QC: if a sample had a low call rate, the corresponding individual was asked to provide a new saliva sample, and the analysis was restarted from the beginning. We have removed the filtering description and instead described the dataset in lines 193-194 as:

“Genotek is an unpublished Russian population-based set consisting of 3,415 unrelated individuals aged between 20 and 60 years old with self-reported measures.”

Line 256: I appreciate the authors’ creativity to remove FTO from the PRS and demonstrate the interaction with physical activity after excluding this variant. Has the interaction with physical activity been tested with any individual SNPs beyond FTO? I appreciate that this may be beyond the scope of this paper but may be worth investigating in future studies if it is a question of interest.

Thank you! We have not checked other individual SNPs in our analysis. We agree and strongly believe that it would be interesting in future studies to combine individual SNPs with strong effects with the polygenic background of the remaining SNPs, as captured by a PRS.

Line 259: I believe “There” should be revised to “These”

“There” replaces “In the replication dataset” in this context, and it was not a typo. We rephrased it (lines 269-271) as:

“In this dataset, physically active individuals had 0.97 kg/m2 (p = 9.10 x 10-14) lower BMI than inactive individuals and an additional 0.45 kg/m2 (p = 5.57 x 10-4) lower BMI per one standardized BMI PRS unit.”
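
Read as a linear interaction model, the two coefficients quoted above imply that the BMI difference between active and inactive individuals grows with the PRS. A short worked sketch in Python follows. It assumes that the 0.97 kg/m2 estimate refers to an individual with an average PRS (standardized PRS of 0); the constant and function names are hypothetical.

```python
ACTIVE_MAIN_EFFECT = -0.97  # kg/m2, active vs. inactive at standardized PRS = 0 (assumed)
INTERACTION_PER_SD = -0.45  # additional kg/m2 per one standardized PRS unit


def bmi_difference_active_vs_inactive(prs_z: float) -> float:
    """Model-implied BMI difference (active minus inactive) at a given standardized PRS."""
    return ACTIVE_MAIN_EFFECT + INTERACTION_PER_SD * prs_z


for z in (-1.0, 0.0, 1.0, 2.0):
    print(f"PRS z = {z:+.1f}: {bmi_difference_active_vs_inactive(z):+.2f} kg/m2")
```

Under this reading, the difference is 1.42 kg/m2 at one standard deviation above the mean PRS and 0.52 kg/m2 at one standard deviation below it.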

Line 304: it would be informative for the reader to describe the nature of the interaction in the first paragraph of the discussion (e.g., physical activity was associated with a greater decrease in BMI among people with a higher PRS).

Thank you for the comment; we agree. We have swapped the first and second paragraphs of the Discussion, so that the paragraph presenting the results and describing the nature of the interaction now opens the Discussion (lines 305-313):

“In this study, we have detected a non-linear interaction between BMI genetics and physical activity using BMI PRS. We have constructed a BMI PRS and assessed its performance. The BMI PRS demonstrated a substantial improvement in the explained variance of BMI over the BMI GRS. Using the BMI PRS, we have detected an interaction between the genetic component of BMI and physical activity in a relatively small dataset. This interaction was neither limited to the interaction driven by the FTO locus nor significant when using the BMI GRS for the same analysis. The application of PRS enabled us to identify 22% of individuals with the highest PRS. In this group, self-reported physical activity was associated with a 2-fold higher difference in BMI than in the remaining 78% of study participants. The model with a non-linear two-group division of individuals showed higher R2 than a model with a linear interaction.”
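
The two-group division described in this paragraph can be sketched as follows. This is an illustrative Python example on synthetic data, not the analysis code of the study: the record fields, the helper names, and the effect sizes of -2.0 and -1.0 kg/m2 are invented, and the comparison is a raw difference in means without the age and sex adjustment used in the regression models.

```python
from statistics import mean


def split_by_prs(records, top_fraction=0.22):
    """Return (high, rest): the top `top_fraction` of records by PRS and the remainder."""
    ranked = sorted(records, key=lambda r: r["prs"], reverse=True)
    n_high = round(len(ranked) * top_fraction)
    return ranked[:n_high], ranked[n_high:]


def activity_difference(group):
    """Mean BMI of active minus mean BMI of inactive individuals (unadjusted)."""
    active = [r["bmi"] for r in group if r["active"]]
    inactive = [r["bmi"] for r in group if not r["active"]]
    return mean(active) - mean(inactive)


# Synthetic cohort of 100 individuals: activity lowers BMI by 2.0 kg/m2 in the
# 22 individuals with the highest PRS and by 1.0 kg/m2 in everyone else.
records = []
for i in range(100):
    active = i % 2 == 0
    effect = (-2.0 if i >= 78 else -1.0) if active else 0.0
    records.append({"prs": float(i), "active": active, "bmi": 27.0 + effect})

high, rest = split_by_prs(records)
print(len(high), activity_difference(high))
print(len(rest), activity_difference(rest))
```

A non-linear (two-group) model of this kind lets the activity effect change abruptly at the PRS cut-off, whereas a linear interaction forces it to change by the same amount for every PRS unit.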

Line 315-316: the flow of this sentence could be improved, perhaps by adding “identify a” after “to”

We agree, and we have replaced the verb “subset”, which may be confused for a noun, with the proposed “identify” verb.

To the reviewer #2

This paper reports a well-designed gene x environment interaction study investigating how genetic liability for obesity interacts with physical activity. The results are timely and important for extending our understanding of how genetic risk factors could modulate the effect of lifestyle. The manuscript is well structured and clearly written.

However, there are some points that should be considered to improve the manuscript:

1. The authors found that well known risk factors for obesity such as diet, socioeconomic class, mental health and sleep quality were not associated with BMI in their model when PRS was used but the associations were significant when GRS (or perhaps when no genetic variables) was in the model. It would be interesting to know how the authors interpret these findings.

Also, the interpretation should be added to the discussion.

Thank you for the detailed comment! There seems to be a misunderstanding. We studied seven risk factors known from the literature. In our data, the four factors mentioned (diet, socioeconomic class, mental health, and sleep quality) were not associated with BMI when PRS was in the model (linear model “BMI ~ 1 + PRS + age + C(sex) + C(risk factor)”). The same risk factors were also not associated with BMI in our data when GRS was in the model (linear model “BMI ~ 1 + GRS + age + C(sex) + C(risk factor)”) or when no genetic variables were used (linear model “BMI ~ 1 + age + C(sex) + C(risk factor)”). So, there was no difference between using PRS and GRS. We have added Supplementary Table S2 with the detailed results of the model with GRS to the manuscript.
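
The three model specifications named above differ only in the genetic term. A minimal sketch of how they could be fitted is given below, assuming a two-category risk factor, treatment (dummy) coding for C(sex) and C(risk factor), and synthetic noiseless data. All names and values are hypothetical, and the sketch estimates coefficients only, without the p-values reported in the manuscript. It uses plain Python rather than a statistics package so that it runs without dependencies.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def design_row(person, genetic=None):
    """Columns: 1 + [genetic score] + age + C(sex) + C(risk factor)."""
    row = [1.0]
    if genetic is not None:
        row.append(person[genetic])
    row += [
        float(person["age"]),
        1.0 if person["sex"] == "male" else 0.0,      # reference level: female
        1.0 if person["risk"] == "exposed" else 0.0,  # reference level: "reference"
    ]
    return row


# Synthetic data: BMI depends on PRS, age and sex, but not on the risk factor.
people = [
    {
        "prs": ((i * 13) % 17 - 8) / 4,
        "grs": ((i * 5) % 11 - 5) / 3,
        "age": 30 + (i * 7) % 31,
        "sex": "male" if i % 2 else "female",
        "risk": "exposed" if (i // 2) % 2 else "reference",
    }
    for i in range(40)
]
bmi = [20 + 1.5 * p["prs"] + 0.05 * p["age"] + 0.8 * (p["sex"] == "male") for p in people]

fits = {
    label: ols([design_row(p, genetic) for p in people], bmi)
    for label, genetic in [("none", None), ("GRS", "grs"), ("PRS", "prs")]
}
for label, beta in fits.items():
    print(label, "risk-factor coefficient:", round(beta[-1], 3))
```

The last coefficient of each fit is the risk-factor term, which is the quantity compared across the three specifications.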

Did the BMI PRS show an association with these factors?

Since there was no difference between using PRS or GRS, we had not checked whether BMI PRS showed association with the risk factors. To address your question, we have run multinomial logistic regression adjusted for age and sex (general formula “C(risk factor) ~ 1 + PRS + age + C(sex)”). Smoking, alcohol consumption, and mental health were significantly associated with PRS levels at the same cut-off as used for the original analysis (p < 4.17 x 10-3). We think that this might be because a PRS captures variants that are causal for the risk factors, since BMI itself is correlated with the risk factors. The increase in the explained variance (calculated using McFadden’s pseudo-R-squared) provided by PRS compared to the models with only sex and age was below 0.3% for each of the traits.
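The two quantities mentioned in this answer can be written out as a short sketch. It assumes that the cut-off is a Bonferroni correction of a 0.05 significance level over twelve tests (the level of 0.05 is an assumption; it is not stated in this response) and that the log-likelihoods of the fitted and intercept-only models are already available. The function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance cut-off after Bonferroni correction."""
    return alpha / n_tests


def mcfadden_r2(loglik_model, loglik_null):
    """McFadden's pseudo-R-squared: 1 - ln L(model) / ln L(intercept-only model)."""
    return 1.0 - loglik_model / loglik_null


def r2_gain(loglik_with_prs, loglik_covariates_only, loglik_null):
    """Increase in pseudo-R-squared from adding the PRS to an age + sex model."""
    return (mcfadden_r2(loglik_with_prs, loglik_null)
            - mcfadden_r2(loglik_covariates_only, loglik_null))
```

With these assumptions, `bonferroni_threshold(0.05, 12)` is approximately 4.17 x 10-3, which matches the cut-off quoted above.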

Only associations of cardiometabolic factors were reported in Table S1. Reporting the associations of diet, socioeconomic class, mental health and sleep quality with BMI and BMI PRS would add to the quality of the paper.

Thank you for the comment. We agree that associations between all the risk factors and BMI should be reported, and we have described these associations in Table 1 (see below). The associations between BMI PRS and risk factors were weak, as described in the paragraph above. We think it might distract the readers from the focus of the manuscript, so we have not included these new results in the manuscript.

“Table 1. Associations of the risk factors with BMI.

| Risk factor | Category * | Effect size | P-value |
|---|---|---|---|
| Smoking | Never smoker | Reference group | |
| | Previous smoker | + 0.06 kg/m2 | 0.648 |
| | Smoking occasionally | + 0.27 kg/m2 | 0.341 |
| | Smoking daily | - 1.06 kg/m2 | 6.49 x 10-18 |
| Physical activity | Active | Reference group | |
| | Inactive | + 0.79 kg/m2 | 8.49 x 10-13 |
| Alcohol consumption | No or moderate | Reference group | |
| | High | - 0.36 kg/m2 | 9.00 x 10-4 |
| Diet quality | Healthy | Reference group | |
| | Poor | - 0.25 kg/m2 | 0.024 |
| Socioeconomic class | Working, ≥ 4 years of education | Reference group | |
| | Working, 1-3 years of education | + 0.00 kg/m2 | 0.99 |
| | Working, no education | + 0.21 kg/m2 | 0.24 |
| | Not working, ≥ 1 year of education | + 0.18 kg/m2 | 0.40 |
| | Not working, no education | + 0.01 kg/m2 | 0.98 |
| Mental health | High | Reference group | |
| | Low | - 0.06 kg/m2 | 0.62 |
| Quality of sleep | Good | Reference group | |
| | Poor | + 0.15 kg/m2 | 0.28 |

2. It is not clear from the methods how mental health was measured. Please, add further description and reference.

3. Also, it is not understandable what the reported socioeconomic class categories mean, how they were measured. Please, describe it further and provide reference.

Thank you for the comments. We agree with both, and we have provided references to all the used questions / measures in the Methods (lines 140-143, 146-151):

“Mental health [23], two categories: “high” defined as mental health component score (MCS) higher than the 75th percentile within the study population of the same sex; “low” defined as MCS lower than the 75th percentile (N = 4,878). MCS has been calculated as described in [23], using the Short Form 12 (SF-12) questionnaire [24].

Socioeconomic class [20], five categories: “not working, no education”, “not working, ≥ 1 year of education”, “working, no education”, “working, 1-3 years of education”, “working, ≥ 4 years of education”. Education is counted after mandatory school years. The categories were combined from education and employment statuses, reported in [20] (N = 4,807).

All risk factors were measured by self-report questionnaire in the Inter99 cohort.”
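The two category definitions quoted above can be sketched as follows. This is an illustration under assumptions: the input encodings (a list of MCS values with matching sex labels, a boolean employment flag, and years of education counted after mandatory school) are hypothetical, the 75th percentile uses the default method of `statistics.quantiles`, and a score exactly at the cut-off is assigned to “low”.

```python
from statistics import quantiles


def mental_health_category(mcs_scores, sexes):
    """Label each sample "high" or "low" relative to the sex-specific 75th percentile of MCS."""
    by_sex = {}
    for score, sex in zip(mcs_scores, sexes):
        by_sex.setdefault(sex, []).append(score)
    # quantiles(..., n=4) returns the three quartile cut points; index 2 is the 75th percentile.
    cutoffs = {sex: quantiles(vals, n=4)[2] for sex, vals in by_sex.items()}
    return ["high" if s > cutoffs[sex] else "low" for s, sex in zip(mcs_scores, sexes)]


def socioeconomic_class(working, education_years):
    """Combine employment status and years of education (after mandatory school)."""
    if working:
        if education_years >= 4:
            return "working, ≥ 4 years of education"
        if education_years >= 1:
            return "working, 1-3 years of education"
        return "working, no education"
    if education_years >= 1:
        return "not working, ≥ 1 year of education"
    return "not working, no education"
```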

4. Please, add further details how genotype QC steps were carried out in the different cohorts. It is especially incomplete for the Genotek dataset.

Thank you for your comment, we have described the details of genotyping QC tools and parameters:

Inter99 dataset, lines 184-188:

“Genotypes were called using the Genotyping module (version 1.9.4) of GenomeStudio software (version 2011.1; Illumina). Only individuals having a call rate ≥98% were included. Monomorphic SNPs and SNPs with Hardy-Weinberg expectation p-value < 10-5 were excluded. Genotypes were imputed using the Michigan imputation server with the HRC1.1 [26] reference panel.”
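The Inter99 filters can be illustrated with a short sketch that works on genotype counts. The source does not say which Hardy-Weinberg test was used, so the one-degree-of-freedom chi-square test below is a stand-in chosen for brevity, and all function names are hypothetical.

```python
from math import erfc, sqrt


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Requires a polymorphic SNP (both alleles observed).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return erfc(sqrt(chi2 / 2))              # chi-square survival function for 1 df


def keep_snp(n_aa, n_ab, n_bb, hwe_cutoff=1e-5):
    """Drop monomorphic SNPs and SNPs with HWE p-value below the cut-off."""
    minor_count = min(2 * n_aa + n_ab, 2 * n_bb + n_ab)
    if minor_count == 0:                     # monomorphic
        return False
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= hwe_cutoff


def keep_sample(n_called, n_total, min_call_rate=0.98):
    """Keep individuals whose genotype call rate is at least 98%."""
    return n_called / n_total >= min_call_rate
```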

Genotek dataset, lines 195-198:

“All individuals were genotyped using Illumina Infinium Global Screening Array (v1.0 / v2.0). SNPs with call rate <0.9, the calls on Y chromosome for women and the heterozygous calls on X chromosome for men were removed. Genotypes were imputed using BEAGLE 5.1 with the HRC reference panel. After imputation, variants with MAF below 1% or DR2 below 0.7 were excluded.”
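A comparable sketch for the Genotek filters is given below. It assumes genotypes are represented as two-allele tuples and sex as the strings "female" and "male". Pseudoautosomal regions are ignored for simplicity, and the function names are hypothetical.

```python
def drop_implausible_sex_calls(chrom, genotype, sex):
    """Return None (treated as missing) for calls that contradict the reported sex."""
    if chrom == "Y" and sex == "female":
        return None                          # Y-chromosome call in a woman
    if chrom == "X" and sex == "male" and genotype[0] != genotype[1]:
        return None                          # heterozygous X-chromosome call in a man
    return genotype


def passes_pre_imputation(call_rate, min_call_rate=0.9):
    """SNPs with call rate below 0.9 are removed before imputation."""
    return call_rate >= min_call_rate


def passes_post_imputation(maf, dr2, min_maf=0.01, min_dr2=0.7):
    """Imputed variants with MAF below 1% or DR2 below 0.7 are excluded."""
    return maf >= min_maf and dr2 >= min_dr2
```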

5. There is no description in the methods section how the LDpred was used to calculate BMI PRS.

How was the setting for LDpred defined? Why they selected “fraction of causative variants” 0.3?

Thank you for your comment. We agree that the description was not clear enough. We set the “fraction of causative variants” parameter to 0.3 after screening eleven candidate fractions, evenly spaced on a logarithmic scale, in the training subset. We have added a more detailed LDpred protocol to the Methods and described the screening process in more detail (lines 92-102):

“PRS was generated using LDpred tool (v.1.0.6) [9] and its standard workflow. Briefly, a Danish population-based cohort Inter99 [18] (N = 6,179) was used. The cohort was randomly split into training (N = 1,000) and validation (N = 5,179) subsets. SNPs available in the input data, comprising subsets and the summary statistics of the BMI GWAS, were aligned using LDpred coord command. Eleven BMI PRS were generated for the different parameter f, representing what LDpred calls the “fractions of causative variants”. The following parameters f were used to cover different orders of magnitude: 1.0, 0.3, 0.1, 0.03, 0.01, 0.003, 0.001, 0.0003, 0.0001, 3 x 10-5, 1 x 10-5. The scores were generated using LDpred gibbs and their performance was assessed in the training subset only using R2 provided by LDpred score command. The best performing score was selected. The score values were calculated for the samples in the validation subset using LDpred score command. These values and this subset were used for the subsequent analyses.”
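The bookkeeping around this protocol (the random split, the grid of eleven fractions, and the selection of the best score) can be sketched as below. The LDpred commands themselves are not invoked here. The function names and the seed are hypothetical, and the R2 values passed to `pick_best_fraction` are assumed to come from the LDpred score output for the training subset.

```python
import random

# The eleven "fractions of causative variants" listed in the Methods.
FRACTIONS = [1.0, 0.3, 0.1, 0.03, 0.01, 0.003, 0.001, 0.0003, 0.0001, 3e-5, 1e-5]


def split_cohort(sample_ids, n_train=1000, seed=0):
    """Randomly split the cohort into training and validation subsets."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]


def pick_best_fraction(r2_by_fraction):
    """Return the fraction whose score had the highest R2 in the training subset."""
    return max(r2_by_fraction, key=r2_by_fraction.get)
```

For a cohort of 6,179 samples, `split_cohort` returns 1,000 training and 5,179 validation identifiers, and only the training identifiers are used when comparing the eleven scores.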

6. Naming of the supplementary materials is confusing. Would be better to report them in a uniform way.

We have adjusted our naming. Please, let us know if there is still any confusing naming.

Furthermore, S1 Table and Table S1 are the same?

Thanks for catching this, they are indeed the same. We have replaced “S1 Table” with “Table S1”.

7. Did the authors test whether the other factors (such as diet, socioeconomic class, mental health and sleep quality) interact with BMI PRS, even though they do not have a main effect? It is possible to see an interaction effect even in the absence of a main effect.

Thank you for your comment. The way we described it made it look as if we only checked the three significantly associated risk factors, but we have actually checked all seven factors. We have rephrased this block in the Results section (lines 251-254):

“To analyze if the presence of the risk factor alters the effect of BMI PRS on BMI, we examined potential interactions between the BMI PRS and all seven risk factors (linear regression BMI ~ 1 + PRS + age + C(sex) + C(risk factor) + PRS:C(risk factor)). We found a significant interaction (twelve tests performed, p < 4.17 x 10-3) only between the BMI PRS and physical activity.”
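For a binary risk factor, the interaction term has a simple reading: it is the difference between the PRS slopes of the two groups. The sketch below shows this for a simplified model, BMI ~ 1 + PRS + C(risk factor) + PRS:C(risk factor), with the age and sex covariates left out for brevity. That omission is an assumption of the example, and the function names are hypothetical.

```python
def slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def interaction_coefficient(prs, bmi, exposed):
    """Coefficient of PRS:C(risk factor) for a binary risk factor.

    Without further covariates, the model is equivalent to fitting a separate
    line in each group, so the interaction coefficient equals the slope in the
    exposed group minus the slope in the reference group.
    """
    ref = [(p, b) for p, b, e in zip(prs, bmi, exposed) if not e]
    exp = [(p, b) for p, b, e in zip(prs, bmi, exposed) if e]
    return slope(*zip(*exp)) - slope(*zip(*ref))
```

A positive value for an "inactive" flag would mean that the PRS has a larger effect on BMI among inactive participants than among active ones.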

8. I miss an overview of past gene x physical activity interaction studies related to obesity or other conditions in the discussion, only FTO was mentioned.

We have updated the Introduction to cover both studies discovering individual SNP-environment interactions, including FTO and CDH12 (rs986732), and studies on interactions visible via aggregated genetic background using risk scores in the following way (lines 53-63):

“Another source of the unexplained variation of BMI stems from gene-environment interactions [11,12]. An interaction occurs when the biological effect of a genetic variant depends on a risk factor, such as an environmental stimulus or a lifestyle factor [13,14]. For example, physical activity attenuates the effect of a common SNP rs9939609 within the FTO locus on BMI [15,16], the strongest common SNP known to associate with BMI. However, detection of the interactions driven by individual SNPs is challenging. A meta-analysis of 200,452 adults [12] reported only one additional SNP, rs986732, on top of the known FTO locus. Attempts to aggregate genetic background using GRS to increase power to detect gene-environment interactions have been made. Recently, a large study in up to ~360,000 unrelated participants from UK Biobank identified several risk factors – alcohol intake, physical inactivity, socioeconomic status, mental health, and sleeping patterns – that influenced the effect of a BMI GRS comprised of 94 BMI-associated genetic variants on BMI [17].”

Attachment

Submitted filename: To editor and reviewers.docx

Decision Letter 1

David Meyre

5 Oct 2021

Non-linear interaction between physical activity and polygenic risk score of body mass index in Danish and Russian populations

PONE-D-21-17351R1

Dear Dr. Borisevich,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

David Meyre

Academic Editor

PLOS ONE

Acceptance letter

David Meyre

8 Oct 2021

PONE-D-21-17351R1

Non-linear interaction between physical activity and polygenic risk score of body mass index in Danish and Russian populations

Dear Dr. Borisevich:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. David Meyre

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 File. Distributions of the BMI-associated phenotypes of individuals, stratified by BMI polygenic risk score (PRS) deciles.

    (ZIP)

    S1 Fig. Performance of models using two groups of genetic risk for interaction with physical activity.

    (DOCX)

    S1 Table. p-values of associations between cardiometabolic traits and BMI or BMI PRS.

    (XLSX)

    S2 Table. Associations of the risk factors with BMI adjusted for BMI GRS.

    (XLSX)

    S1 Appendix. List of online supporting materials.

    (DOCX)

    Attachment

    Submitted filename: Plos One Review-June 7 (2021).docx

    Attachment

    Submitted filename: To editor and reviewers.docx

    Data Availability Statement

    The polygenic risk score is available via the online supplementary, https://bmiprsxpa-pa7qfqmwhq-ew.a.run.app/. The underlying code is available at GitHub, https://github.com/borisevichdi/bmiprs-code. The underlying raw data comprises only sensitive data that cannot be made publicly available: genetic and biological data of human participants. For the Inter99 dataset, consent for publication of raw data was not obtained from the individuals, and the dataset could pose a threat to confidentiality. The restriction is imposed by the Regional Scientific Ethics Committee (KA 98 155) and the Danish Data Protection Agency. Data access is provided by the Phenomics Platform of the NNF Center for Basic Metabolic Research, https://cbmr.ku.dk/researchfacilities/phenomics/. Point of contact: CBMR-PhenomicsInfo@sund.ku.dk. For the Genotek dataset, the user agreement (available at https://www.genotek.ru) states that disclosure of individual-level genetic information and/or self-reported information to third parties for research purposes will not occur without explicit consent, and the consent was not obtained from the individuals. Due to the user agreement, the individual-level data cannot be made directly available, and the dataset could pose a threat to confidentiality. Data have to be accessed indirectly via Genotek Ltd, https://www.genotek.ru. Data requests should be sent to Genotek Ltd at info@genotek.ru.


    Articles from PLoS ONE are provided here courtesy of PLOS
