Abstract
Obesity is associated with changes in the plasma lipids. Although simple lipid quantification is routinely used, plasma lipids are rarely investigated at the level of individual molecules. We aimed at predicting different measures of obesity based on the plasma lipidome in a large population cohort using advanced machine learning modeling. A total of 1,061 participants of the FINRISK 2012 population cohort were randomly chosen, and the levels of 183 plasma lipid species were measured in a novel mass spectrometric shotgun approach. Multiple machine intelligence models were trained to predict obesity estimates, i.e., body mass index (BMI), waist circumference (WC), waist-hip ratio (WHR), and body fat percentage (BFP), and validated in 250 randomly chosen participants of the Malmö Diet and Cancer Cardiovascular Cohort (MDC-CC). Comparison of the different models revealed that the lipidome predicted BFP the best (R2 = 0.73), based on a Lasso model. In this model, the strongest positive and the strongest negative predictor were sphingomyelin molecules, which differ by only 1 double bond, implying the involvement of an unknown desaturase in obesity-related aberrations of lipid metabolism. Moreover, we used this regression to probe the clinically relevant information contained in the plasma lipidome and found that the plasma lipidome also contains information about body fat distribution, because WHR (R2 = 0.65) was predicted more accurately than BMI (R2 = 0.47). These modeling results required full resolution of the lipidome to lipid species level, and the predicting set of biomarkers had to be sufficiently large. The power of the lipidomics association was demonstrated by the finding that the addition of routine clinical laboratory variables, e.g., high-density lipoprotein (HDL)- or low-density lipoprotein (LDL)- cholesterol did not improve the model further. Correlation analyses of the individual lipid species, controlled for age and separated by sex, underscores the multiparametric and lipid species-specific nature of the correlation with the BFP. Lipidomic measurements in combination with machine intelligence modeling contain rich information about body fat amount and distribution beyond traditional clinical assays.
Obesity is associated with changes in plasma lipids, but while simple lipid quantification is routinely used, plasma lipids are rarely investigated at the level of individual molecules. A machine learning study based on lipidomes of a total of 1,311 individuals reveals improved associations of plasma lipids with total body fat and fat distribution compared to routine clinical laboratory variables.
Introduction
Obesity, the abnormal or excessive fat accumulation that may impair health [1], is associated with increased morbidity and mortality from diseases such as type 2 diabetes and cardiovascular disease [2, 3]. According to World Health Organization, obesity has nearly tripled since 1975, which resulted in 39% of overweight and 13% of obese adults worldwide in 2016 [1].
Obesity can be estimated in a variety of ways: Most commonly, the body mass index (BMI), a ratio of body weight-for-height [4], is used as an indicator of general adiposity. It is convenient and simple but results in varying cardiovascular and metabolic manifestations across individuals. Although BMI largely increases as adiposity increases, it does not distinguish between fat and lean mass, and therefore, individuals with greater muscle mass will also have higher BMIs [5]. The waist-hip ratio (WHR) is an easily accessible measure of body fat distribution and consists of a comparison of waist and hip circumferences. Larger WHR indicates more intra-abdominal fat and is associated with higher risk for type 2 diabetes, cardiovascular disease, and mortality [6]. Similarly, waist circumference (WC) can be used and has been considered a more straight forward and reliable measure compared with WHR [7]. Furthermore, body fat percentage (BFP) is a measure of proportion of adipose tissue in the body compared with lean mass and water [8] and is mostly determined using bioelectrical impedance in field methods. Bioelectrical impedance analysis is a repeatable, easy-to-use, and low-cost method for the estimation of BFP; however, its reliability can be influenced by various factors, including the equation used and the characteristics of the sample in which they have been validated in [9]. BFP is associated with increased all-cause mortality independently of BMI and is often suggested to be a better estimation of adiposity than BMI for prognostic and exploratory purposes [10].
The human genetic predisposition to obesity is rather low. For example, a set of 97 genetic loci have been found associated with BMI, but they accounted for only 2.7% of BMI variation [11]. Similarly, a set of 12 loci explained 0.58% of the variance in BFP [12]. Thus, the genotype may not provide sufficient information for reliable risk assessment of obesity and associated outcomes, highlighting the need for more direct, phenotypic read-outs.
Lipidomics is an omic science, which comprehensively measures the entirety of lipid molecules in a sample [13–15]—the lipid phenotype—and can be used to identify multiparametric biomarkers for disease detection, prediction, and patient stratification. For shotgun lipidomics, this can be obtained in a single mass spectrometric measurement after direct infusion of the sample. The plasma lipidome offers a plethora of information on lipids, the metabolism, and biological functions that are currently inaccessible to routine clinical lipid chemistry. This information can be used to obtain insights into many complex disease processes [16, 17]. The shotgun lipidomics technique, in which lipids are efficiently obtained from biological material by automated organic solvent extraction and measured quantitatively and reproducibly in an automated high throughput approach, allows fast screening of several thousand samples with high reproducibility [18], rendering this technology a promising tool for clinical risk assessment and precision biomedicine.
Although first lipidomic biomarkers are entering the clinic [19, 20], certain analytical standards, such as intersite reproducibility, need to be established in order to make lipidomic measurements generally accepted in clinical settings [21, 22].
Here, we applied machine learning to model obesity estimates for a lipidomics data set of the large FINRISK 2012 population cohort comprising 1,061 plasma samples [23]. We identified a complex lipidomic signature for BFP and validated the model with an independent data set of the Malmö Diet and Cancer Cardiovascular Cohort (MDC-CC) comprising randomly selected 250 plasma lipidomes [24, 25] measured on the same platform [18]. We could predict BFP with an error of 8% of its full range and explain 73% of its variation based on age, sex, and the lipidome. This lipidomic signature of obesity outperforms classical clinical lipid measures and provides fine-grained and quantitative molecular phenotype enabling stratification and identification of different obesity manifestations.
Analyzing the plasma lipidome or the metabolome [26] to estimate obesity is of course much more complicated than by direct measurement and not what we aimed for. Instead, we are investigating how the plasma lipids reflect metabolic status and whether the plasma lipidome can be used to predict health and disease. There is already ample evidence that the plasma lipidome is changing in different disease states [16, 27], and here, we show that the plasma lipidome indeed gives information beyond obesity measures and classical clinical lipid parameters, such as triglycerides and cholesterol.
We find that the lipidome gives information about the body fat distribution as measured by the WHR because a number of lipid species correlate with the WHR, even when controlled for BMI. Lipidomes show differences between the sexes, concerning lipid levels, lipid coefficients of variation, and correlations of lipid species with obesity measures. These correlation profiles were similar between the 3 obesity estimates but very different from those lipids correlating with high-density lipoprotein (HDL) cholesterol, low-density lipoprotein (LDL) cholesterol, and triglyceride levels indicating that these commonly used lipid markers only insufficiently capture molecular lipid metabolism. We discuss correlations with obesity measures and find that highest lipid impact on our modeling algorithm features 4E,14Z-sphingadiene containing sphingomyelins. Finally, we look the variation not explained by the BMI and BFP regression and find those related to other clinical parameters, such as HDL and LDL cholesterol.
Results and discussion
We performed lipidomics analysis of 1,061 plasma samples of the FINRISK 2012 cohort (S2 Table shows clinical baseline characteristics). Plasma lipid species vary substantially between individuals and on a day-to-day basis [28, 29]. Coefficients of variation for each lipid subspecies showed population variations of 23% to 150% (S1A Fig), which is considerably larger than our 6.0% median technical coefficient of subspecies variation as assessed by reference samples (method precision). Low biological variation was found in lipid classes such as cholesterol (26%) and sphingomyelin (SM, median of 26%), whereas high variation was seen in dietary lipids like triacylglyceride (TAG) and diacylglyceride (DAG) species but also for phosphatidylethanolamine (PE) species. There are differences in variations between the sexes (S1B Fig), with TAGs varying more in males and SM varying more in females. Sex-specific differences are well documented in lipidomics studies [27, 30, 31].
Modeling obesity
Associations of BMI and obesity with lipidomes were investigated before [27, 32], and a more detailed discussion can be found in the S1 Text. We proceeded to construct models predicting obesity from the lipidome of the FINRISK data set. Models were trained on lipid subspecies, including age and sex (S3 Fig) as covariables. Using a Lasso model [33] trained in a cross-validation loop, we first used BMI as our obesity measure and reached a mean absolute error (MAE) of 2.5 ± 0.18 and an explained variation of 47% (S7 Table). Then, using the same procedure, we analyzed how the plasma lipidome is predicting other obesity measures compared with the models we obtained for BMI using a normalized MAE. On this comparable metric, BMI was outperformed (Fig 1A and S7 Table) by WC (MAE = 6.5 ± 0.59, 64% variation explained, WHR (MAE = 0.039 ± 0.0033, 65% variation explained), and BFP (MAE = 3.6 ± 0.33, 73% variation explained). This indicates that the lipidomic information about adiposity, as measured by WHR, WC, and BFP, is more precise than for BMI. Therefore, lipidomes contain information about the actual amount of body fat (BFP) and its distribution (WHR/WC). In the case of BFP, the high variation explained by the model is probably due to specific lipids released by the adipose tissue into the plasma. A similar notion has been reported in the case of branched-chain and aromatic amino acids [34].
We tested the presence of BFP-specific information in the lipidome by creating linear models for each lipid subspecies controlled for age and sex. This returned 141 significant lipid species after controlling for multiple testing. A similar amount of lipid species remained significant, even when the model was controlled for BMI (n = 82), WHR (n = 109), or BMI and WHR together (n = 52, S5 Table). A similar situation is found for WHR, for which linear models controlled for age and sex still returned similarly high amount of lipid specific for WHR (n = 134), when additionally controlled for BMI (n = 103), BFP (n = 90), and the combination of BMI and BFP (n = 93). As the relation of WHR and BFP with BMI seems nonlinear (S2 Fig), we also tested the relation using natural splines with similar results (S5 Table). All these results argue for a BFP and WHR specific but BMI independent lipid biology captured by human plasma lipidome, which is still largely unexplored.
Different BFP models and conditions
Six different models predicting BFP were trained and their parameters learned on 796 random training samples in a cross-validation loop (Fig 1B, Results for WHR and BMI in S7 Table). Tree-based random forest [35] and stochastic gradient boosting [36] do not perform significantly better than an ordinary linear model [37] of all lipid predictors. Partial least squares [38], which is well suited for the multicollinearity characterizing lipidomic data sets, was performing better but the Lasso [33] and Cubist [39] models showed even better performance. The simple Lasso model fit the data equally well as the Cubist model, and we used it for all remaining analyses because of its simplicity and interpretability. We also tested whether normalizing absolute lipid amounts to the total lipid amount in a sample (molar fraction [mol%]) would improve the fit by removing the influence of different lipid levels between samples. However, we found no evidence of this (Fig 1C).
Description of the BFP model
The best performing BFP Lasso model (MAE = 3.61 ± 0.33, variation explained = 73.2 ± 5%) resulted in 58 predictors, but there is also a slightly less performing Lasso model (MAE = 3.65 ± 0.33, variation explained = 72.9 ± 5.1%) with only 45 predictors within 1 standard error (S4 Fig and S6 Table). The simpler multiparametric model based on 45 predictors is essentially a subset of the complex multiparametric model based on 58 predictors (Figs 2 and S5).
The Pearson correlation network of the predictors of both Lasso models (Fig 2) shows several interesting features. Within the common lipid predictors of both BFP Lasso models, SM 34:1;2 has the greatest negative and SM 34:2;2 the greatest positive lipid β-coefficients by far (S5 Fig and S6 Table), whereas both are correlated with each other in the correlation network within a cluster of other SM species. The additional double bond in SM 34:2;2 is likely due to an 18;2;2 long-chain base [40, 41], which is present in human plasma [41] and has been shown to be a 4E,14Z-sphingadiene [42], thus suggesting SM 18:2;2/16:0;0 as the subspecies for SM 34:2;2 in plasma [31]. SM 34:2;2 and further doubly unsaturated SMs correlate positively with BFP, i.e., SM 36:2;2 and SM 38:2;2 especially in females (S11 Table), in which they also show higher levels (S3 Fig and S4 Table, [27, 31]). The 4E,14Z-sphingadiene is suggested to be produced by an unknown desaturase, which also creates the single 14Z double bond in 1-deoxysphingolipids [43]. Its supposed higher activity in females results in higher levels of the respective ceramides (Cers) and SMs [27]. As SM 34:1;2 has been reported to be >96% SM 18:1;2/16:0;0 in plasma [31], it is the occurrence of 4E,14Z-sphingadiene in specifically SMs with a 16:0;0 fatty acid, which is the major correlation with BFP picked up by the Lasso models. Their significance is supported by the reduction in prediction power if the SM class is removed from the model (S8 Fig), the fact the SMs are a particularly stable lipid class in plasma (S1 Fig), and that long-chain base effects of plasma sphingolipids have been recently reported to correlate with BMI [31]. How the balance between sphingosine and 4E,14Z-sphingadiene is mechanistically related to the overall metabolic status and its usefulness as a general BFP biomarker needs to be further investigated.
Associated with the SM cluster are multiple lipid predictors (cholesteryl ester [CE] 15:0;0, CE 17:0;0, and PC O-17:0;0/17:1;0) with odd chain fatty acids (Fig 2), which could be due to dairy consumption [44] or dietary fiber intake [45]. However, their association with SM and Cer species (Fig 2) might also indicate that these fatty acids are derived from hydroxylated fatty acids in glycosphingolipids or phytosphingosine [46] and therefore link the model to sphingolipids not measured in this study. Furthermore, we find a cluster of correlated lyso-lipids and of TAG species (Fig 2). TAGs with positive β-coefficients are largely consistent with common fatty markers [47]. A more detailed discussion of this observation and the association of product-to-precursor ratios of lipid metabolism enzymes to obesity measures is provided in the S1 Text.
Although the Lasso models are dominated by 2 coefficients of the sphingadiene SMs, the error of the model increases significantly, when less than 20 lipid predictors are used (S4 Fig), arguing that a single biomarker, or a small set of biomarkers, are not sufficient to predict and to faithfully capture the complex molecular scenario associated with obesity. In the FINRISK data set, fasting duration for subjects peak around 5 hours (semifasting); however, we saw no trend in the model residuals with fasting length (Fig 1F), indicating that differences in fasting time do not have an impact on the accuracy of the prediction of BFP. This is likely due to the fact that our model is not only based on diet-derived lipids, the levels of which are acutely varying in the blood plasma, but that the predictors of the model are spread across all lipid classes except the one HexCer species (S12A Fig). For example, changes in the diet are reflected in serum TAGs within the first few hours, whereas serum CE and phospholipids reflect the last 3 to 6 weeks [48].
Independent validation of the obesity model
Training of the BFP model on the FINRISK test set resulted in a cross-validation MAE of 3.61 ± 0.33 BFP units, which is about 8% of the BFP range (S2C Fig). The training error of the model was found at a MAE of 3.33 BFP units, and the mean error of the hold out FINRISK test data was at 3.84 (Fig 1E & Table 1). We validated the FINRISK based BFP model in a second, independent data set (MDC-CC), the clinical baseline characteristics of which differ from the FINRISK data set (S2 Table).
Table 1. Reproducibility of the model.
MAE (BFP) | Data set | n | MAE |
---|---|---|---|
Cross validation | FINRISK | 796 | 3.61 ± 0.33 |
Training | FINRISK | 796 | 3.33 |
Testing | FINRISK | 206 | 3.84 |
Validation | MDC-CC | 250 | 3.67 |
Abbreviations: BFP, body fat percentage; MAE, mean absolute error; MDC-CC, Malmö Diet and Cancer Cardiovascular Cohort
This validation resulted in a MAE of 3.67, which is only slightly above the cross-validation error obtained with the FINRISK data set. The validation also confirms that the models obtained were independent of the fasting duration, because the participants from the MDC-CC cohort were fasted over night.
The MDC-CC validation data set was measured 2 years later than the FINRISK data set on the same platform, arguing for our shotgun lipidomic approach to be highly reproducible. Taken together, these results show that we have identified a robust BFP lipidomic signature (Fig 2 and S6 Table), which was validated in an entirely independent data set. It would be interesting to see whether the model is transferable to subjects from other geographic regions with different population structures and lifestyle habits, because both data sets used originate from northern European countries.
Comparison to a metabolomic obesity model
Recently, a metabolomic data set was used to model BMI using 49 selected metabolites. This study found that this set of metabolites explained 43% of BMI variation when age and sex were included [26]. If the model was extended to the full set of 650 metabolites measured in the study, 47% to 49% of the BMI variation could be explained. In both cases, a major fraction of the metabolites (47% and 40%, respectively) were associated with the lipid superpathway. Our BMI model, although similar in many modeling aspects, is exclusively based on shotgun lipidomics. With 75 predictors in a Lasso model, it explains 47% of the BMI variation, and a model with only 50 predictors resulted in 46.5% of the variation explained. Although the population, experimental set-up, and computational modeling in the metabolomic study and in our study are not directly comparable, this suggests that the data generated with our lipidome shotgun method provide predictions of comparable quality as liquid chromatography-mass spectrometry (LC-MS) metabolomic data used in the above-mentioned study. However, the goal was achieved with a single measurement in a fully high-throughput assay. Therefore, shotgun mass spectrometry lipidomics, with its quantitative and straight-forward approach, together with fast measurements, is reproducible, robust [18], and well prepared to be used in a routine clinical setting.
Although the metabolome-derived model [26] explained only about 50% of the actual BMI variation, the metabolome-predicted BMI had improved features, such as better correlations with other clinically important variables, e.g., insulin resistance and HDL cholesterol levels. In addition, if the metabolome predicted a substantially higher BMI than the actual BMI of the subject, these subjects scored worse on a set of clinical health measures. If, however, the metabolome predicted a lower BMI than their actual BMI, the subjects scored better on the respective health measures. Because of uneven distribution of outliers in our models, we were not able to fully show the just described outlier characteristics as in Cirulli and colleagues [26]. However, when we restrict our models to a range of the FINRISK data set, in which both over- and underestimated outliers are present, we observe similar effects (S9 Fig). Still, the overpredicted outliers are in the range of low observed BMI and the underpredicted outliers in a range of high observed BMI. Therefore, the mean BMI of overpredicted samples (24.3 ± 2.0) is much lower than the mean BMI of underpredicted samples (26.9 ± 2.1). Despite this adverse setting, we could validate that individuals who had a lower BMI than predicted from the plasma lipidome had worse routine clinical laboratory values, e.g., HDL and LDL cholesterol, than those individuals whose actual BMI was higher than predicted (S9 Fig). Similar results were obtained for our BFP regression of female subjects (S10 Fig), and weaker trends were observed for male subjects (S11 Fig). Therefore, our results confirm the earlier outlier findings [26] but extend them to a lipidomic setting and also to BFP as an obesity measure. They support the conclusion that a multiparametric lipidomic estimate has a stronger predictive value on obesity than the classical predictors, improving on shortcomings in terms of total fat amount and distribution. They further show that lipidomics captures obesity-related metabolic aberrations more accurately than classical clinical parameters. The fact that outliers of the obesity predictions align with better or worse clinical laboratory values suggests overlapping markers of obesity and other diseases, e.g., a dysregulated lipid metabolism, which not only align with obesity but a range of diseases [47]. Although the lipid metabolism measured by the lipidome would be interpreted by the model as, e.g., a higher BFP, these markers might actually be hinting at other diseases. This unaccounted variation should be further explored.
Effect of input variables and level of lipidome resolution on BFP
To test the quality of the lipidomic predictions, we compared our results with predictions of BFP based on clinical parameters (Fig 3 and S8 Table). As a zero model, we used the mean of the BFP distribution with a MAE of 7.3 (Fig 3B, −Age, −Sex, No Lipids [L]) or 0% of variation explained (S7 Fig and S9 Table). Addition of routine clinical laboratory values (e.g., total cholesterol, triglycerides, LDL cholesterol, HDL cholesterol) to the model hardly improved the BFP prediction (Fig 3B [C], MAE = 7.1 or 8.0% of variation explained). Inclusion of additional variables, e.g., smoking status or blood pressure treatment alone (Fig 3B [A], MAE = 7.2 or 5.8% variation explained) or together with the routine clinical laboratory values (Fig 3B [C + A], MAE = 7.0 or 11% of variation explained) also did not improve prediction of BFP.
We then assessed how increased structural resolution of the lipidome influenced the predictive outcome (Figs 3A and S6). Already including the total molar amounts of 4 plasma lipid categories [49], glycerolipids, glycerophospholipids, sphingolipids, and sterol lipids, improved prediction outcomes to a MAE of 6.7 to 7.1 (7.0%–18% variation explained; Categories). Because this enhancement is adding to the improvement obtained by clinical parameters alone, it shows that lipid category amounts add information not contained in the other variables. The next level of structural detail is that of plasma lipid classes (e.g., PC or PE). Addition of the total molar amounts of 14 lipid classes to the BFP model further improved the prediction to 6.0 to 6.1 or 27% to 32% variation explained (Classes). The biggest improvement of the model was obtained when information of molar amounts of individual lipid molecules (143 species or a mixture of 183 species and subspecies) was used, reaching a MAE of 4.8 ± 0.43 or 55% to 57% variation explained (Species/Subspecies). Therefore, molecular lipid information is clearly superior in predicting BFP over more aggregated measures, such as HDL cholesterol, LDL cholesterol, total cholesterol, lipid categories, or lipid classes. We confirmed that the prediction based on lipid subspecies was not improved by including information on classical clinical parameters. This is expected, since LDL cholesterol, HDL cholesterol, and triglycerides have very distinct correlation patterns with the lipid subspecies profile (S16 Fig) and are, therefore, already represented in the lipidome. The correlations are in line with reported relative amounts of lipids found in these lipoproteins [50]. We also observed multiple interesting lipid species-specific differences in these correlations, e.g., sex-specific signs for correlations of PE species with HDL cholesterol. In the case of the clinical triglycerides measurement, the major correlating lipid classes are expectedly TAG and DAG (S16 Fig). However, some highly unsaturated TAGs do not correlate strongly with the triglycerides value. Furthermore, sex-specific correlations of triglycerides are observed for cholesterol and ceramides, both of which show greater correlation coefficients in males than in females. This is in agreement with the results provided in previous studies [30, 51].
Contrary to BMI, BFP is strongly influenced by gender (S2 Fig), which is reflected by the improved prediction outcome after including age and sex variables into the model (S7 Fig). BFP predictions based on age and sex variables alone already have a MAE of 4.5 to 5.2 or 45% to 57% variation explained (Fig 3, +Age +Sex). When age and sex are considered, also routine clinical laboratory values result in improved BFP prediction (Fig 3, +Age +Sex: C, C+A). However, when the structural detail of lipid information is increased, predictions of BFP improved even further. For the model containing age, sex, and lipid subspecies, a MAE as low as 3.6 ± 0.33 or 73% variation explained was achieved. In this case, 62% of the variation not explained by age and sex is explained by the lipidome (S10 Table). These subspecies models are also not improved by the addition of clinical parameters or additional variables, which again shows that these parameters provide no additional information for BFP prediction. Similar models for BMI, WC, and WHR show comparable results (S7 Fig and S8 Table) with some variation on the magnitude of dependency on age and sex or classical clinical parameters.
We also tested whether a lipid class was necessary for prediction or could be compensated for by the other lipid classes. The most important lipid class for predicting BFP was SM. Apart from PC, the only lipid class, the removal of which reduces model performance is SM (S8 Fig). This is in agreement with complex correlation patterns observed for SM species. As mentioned above, SM 34:1;2 is inversely correlated with BFP in males, whereas SM 34:2;2 is directly correlated in females (Fig 4), both with a similar correlation estimate and the greatest positive (SM 34:1;2) and negative (SM 34:2;2) β-coefficients in the Lasso models (S5 Fig).
We conclude that the plasma lipidomes, measured by a single shotgun mass spectrometric analysis, have significantly more predictive power predicting obesity than classically used clinical parameters and that it is the resolution to molecular detail at the subspecies level that provides the relevant information. It is to be expected that similar predictive information on other metabolic states (disease and life style) are represented in the multiparametric lipidomics data.
Lipid correlations with BFP
Although the predictive algorithm (Lasso) selects features on the basis of overall prediction error, it cannot be employed to define the individual lipidomic features correlating with BFP. Therefore, a comprehensive Spearman correlation analysis of lipid subspecies and additional features was performed for male and female subjects individually, including age as a covariable (Figs 4 and S12). The results showed that 53.5% (108) of lipid subspecies in females and 65.3% (132) in males of a total of 202 tested lipid species correlated significantly with BFP after Benjamini-Hochberg correction for multiple hypothesis testing. The BFP to lipidome correlation profile is similar to the BMI (S13 Fig), WHR (S14 Fig), and WC (S15 Fig) profiles; however, the magnitude of the correlation coefficients reflects the respective MAEs in modeling (Fig 1A), with BFP showing highest correlation coefficients and lowest error. Lipidome correlations to HDL cholesterol, LDL cholesterol, or triglycerides (S16 Fig), on the other hand, show very different profiles. As expected from the prediction results, there is a lipid subspecies-specific effect, i.e., individual lipid species showing proportional or inverse correlations, despite being in the same lipid class, e.g., CE 20:1;0 and CE 20:2;0 are inversely correlating, whereas CE 20:3;0 and 20:4;0 are proportionally correlating to BFP. All lipid classes are observed to contribute significant correlations (Fig 4 and S1 Text). These results suggest complex systemic perturbations of lipid metabolism in the obese state.
Conclusion
We show that by exploiting the species diversity revealed by a single quantitative measurement in a lipidomics readout of a large population cohort, we can use machine learning to model and validate obesity estimates better than by using classical clinical parameters, such as total triglycerides and cholesterol. These results show that the molecular details of the plasma lipidome capture obesity-related metabolic aberrations more accurately than these classical clinical parameters. We further confirmed that outliers in the correlation between lipidomic profiles and obesity measures have clinical profiles that could predispose for later obesity-related noncommunicable diseases [26, 52]. The future challenge will be to use this technology to stratify obesity to accurately predict who will stay healthy and who will progress toward disease.
Materials and methods
FINRISK 2012 cohort
The National FINRISK Study is a Finnish population survey conducted every 5 years since 1972 [23]. Samples of the FINRISK 2012 underwent lipidomics measurements (1,141 randomly selected individuals) of which 1,061 were used (See S2 Table) after lipidomic quality control based on total lipid amount or disturbed lipid profile. FINRISK participants were advised to fast at least for 4 hours before the examination and avoid heavy meals earlier during the day. Measurements were obtained as described by Borodulin and colleagues [23]. BFP in the FINRISK study was measured using bioelectrical impedance device (Tanita TBF-300MA, Tanita Corporation, Tokyo, Japan).
The FINRISK 2012 survey was approved by the Coordinating Ethical Committee of the Helsinki and Uusimaa Hospital District, the participants gave a written informed consent, and the study was conducted following the principles of the declaration of Helsinki.
All data discussed in the paper can be made available to established researchers by a written application to the FINRISK Executive Board. Application portal is located at https://thl.fi/fi/tutkimus-ja-kehittaminen/tutkimukset-ja-hankkeet/finriski-tutkimus/tietoa-tutkijoille. More information can be obtained through info@med.lu.se.
The MDC-CC
MDC-CC is a Swedish cohort designed to study the epidemiology of carotid artery disease from 1991 through 1994 [24, 25]. The MDC-CC was approved by the Regional Board of Ethics in Lund, Dnr 2009/63, and all participants provided written informed consent. A total of 250 subjects were randomly selected as a validation data set, and clinical characteristics of the study samples are presented in S2 Table. MDC-CC participant plasma samples were collected after overnight fasting. Bioelectrical impedance analyzers (BIAs) were used to estimate body composition, and BFP was calculated using an algorithm according to procedures provided by the manufacturer (BIA 103, single-frequency analyzer, JRL Systems, Detroit, IL, USA) [8].
MDC-CC data discussed in the paper will be made available to readers based on a written application to the MDC-CC steering committee (info@med.lu.se).
Lipid nomenclature
Lipid molecules are identified as species or subspecies. Fragmentation of the lipid molecules in MSMS mode delivers subspecies information, i.e., the exact acyl chain (e.g., fatty acid) composition of the lipid molecule. MS only mode, acquiring data without fragmentation, cannot deliver this information and provides species information only. In that case, the sum of the carbon atoms and double bonds in the hydrocarbon moieties is provided. Lipid species are annotated according to their molecular composition as lipid class <sum of carbon atoms>:<sum of double bonds>;< sum of hydroxyl groups>. For example, PI 34:1;0 denotes phosphatidylinositol with a total length of its fatty acids equal to 34 carbon atoms, total number of double bonds in its fatty acids equal to 1 and 0 hydroxylations. In case of sphingolipids, SM 34:1;2 denotes a sphingomyelin species with a total of 34 carbon atoms, 1 double bond, and 2 hydroxyl groups in the ceramide backbone. Lipid subspecies annotation contains additional information on the exact identity of their acyl moieties and their sn-position (if available). For example, PI 18:1;0_16:0;0 denotes phosphatidylinositol with octadecenoic (18:1;0) and hexadecanoic (16:0;0) fatty acids, for which the exact position (sn-1 or sn-2) in relation to the glycerol backbone cannot be discriminated (underline "_" separating the acyl chains). On contrary, PC O-18:1;0/16:0;0 denotes an ether-phosphatidylcholine, in which an alkyl chain with 18 carbon atoms and 1 double bond (O-18:1;0) is ether-bound to sn-1 position of the glycerol and a hexadecanoic acid (16:0;0) is connect via an ester bond to the sn-2 position of the glycerol (slash “/” separating the chains signifies that the sn-position on the glycerol can be resolved). Lipid identifiers of the SwissLipids database [53] (http://www.swisslipids.org) are provided in S1 Table.
Analytical process design
Samples were divided into analytical batches of 84 samples each. Each batch was accompanied by a set of 4 blank samples (150 mM ammonium bicarbonate [in water]) and a set of identical 8 control reference samples (human blood plasma). These control samples in groups of 1 blank and 2 reference samples were distributed evenly across each batch and extracted and processed together with study samples to control for background and intrarun reproducibility.
Lipid extraction for mass spectrometry lipidomics
Mass spectrometry–based lipid analysis was performed as described by Surma and colleagues [18]. For lipid extraction, an equivalent of 1 μL of undiluted plasma was used, and plasma lipids were extracted with methyl tert-butyl ether/methanol (7:2, V:V) [54]. Internal standards (Avanti Polar Lipids, Birmingham, AL) were premixed with the organic solvents mixture. The internal standards included known amounts of: cholesterol D6 (Chol), cholesterol ester 20:0 (CE), ceramide 18:1;2/17:0 (Cer), diacylglycerol 17:0/17:0 (DAG), phosphatidylcholine 17:0/17:0 (PC), phosphatidylethanolamine 17:0/17:0 (PE), lysophosphatidylcholine 12:0 (LPC), lysophosphatidylethanolamine 17:1 (LPE), triacylglycerol 17:0/17:0/17:0 (TAG), and sphingomyelin 18:1;2/12:0 (SM). After extraction, the organic phase was transferred to an infusion plate and dried in a speed vacuum concentrator. Dried extract was resuspended in 7.5 mM ammonium acetate in chloroform/methanol/propanol (1:2:4, V:V:V). All liquid handling steps were performed using Hamilton Robotics STARlet (Hamilton Robotics, Reno, NV) with the Anti Droplet Control feature for organic solvents pipetting. Chemicals and solvents of HPLC/LC-MS analytical grade were used (Merck, Darmstadt, Germany).
MS data acquisition
Samples were analyzed by direct infusion in a QExactive mass spectrometer (Thermo Scientific, Bremen, Germany) equipped with a TriVersa NanoMate ion source (Advion Biosciences, Ltd., Ithaca, NY). Samples were analyzed in both positive and negative ion modes with a resolution of Rm/z = 200 = 280,000 for MS and Rm/z = 200 = 17,500 for MSMS experiments in a single acquisition. MSMS was triggered by an inclusion list encompassing corresponding MS mass ranges scanned in 1 Da increments. Both MS and MSMS data were combined to monitor CE, DAG, and TAG ions as ammonium adducts; PC, PC O−, as acetate adducts and PE, PE O−, and PI as deprotonated anions. MS only was used to monitor LPE and LPE O− as deprotonated anions; Cer, SM, LPC, and LPC O− as acetate adducts; and cholesterol as ammonium adduct.
Postprocessing
Spectra were analyzed with in-house developed lipid identification software based on LipidXplorer [55, 56]. TAGs are quantified as species (e.g., TAG 48:0;0). Fatty acid amounts within TAG species are achieved by distributing the total species amount by fatty acid fragment intensities. Data postprocessing and normalization were performed using an in-house developed data management system. Only lipid identifications with a signal-to-noise ratio >5, and a signal intensity 5-fold higher than in corresponding blank samples were considered for further data analysis. For FINRISK, using 3 reference samples per 96-well plate batch, lipid amounts were corrected for batch variations. An occupational threshold of 70% was applied to the data, keeping lipid species, which were present in at least 70% of the subjects. The median coefficient of subspecies variation, as accessed by reference samples, was 5.96%. In the MDC-CC data set, batch correction was applied using 8 reference samples per 96-well. Amounts were also corrected for analytical drift if the p-value of the slope was below 0.05 with an R2 greater than 0.75 and the relative drift was above 5%. Median coefficient of subspecies variation, as accessed by reference samples, was 10.49%, and no occupational threshold was applied. For predictive modeling, lipid species were matched between the FINRISK and MDC-CC datasets.
Data analysis
Data were analyzed with R version 3.4.2 [37] using tidyverse packages [57]. For correlations analysis, outliers were removed, which were more than 4.5 interquartile ranges from the median of the lipid species, whereas the full set was used for predictive modeling. The data set was split for the 2 sexes, and for each subset, age-adjusted Spearman correlation coefficients (ρ) were calculated using the RVAideMemoire::pcor.test() function [58]. CIs were thereby created using 1,000 bootstrap resamples. When testing whether male and female correlation coefficients were significantly different from each other, the cocor.indep.groups() function from the cocor package was was used with default parameters. Correlation coefficients were significantly different if the confidence interval of the difference did not include zero [59, 60]. The correlation network was calculated with the stats::cor() function using the Pearson correlation method and pairwise complete observations. The network was visualized using Cytoscape version 3.5.0 [61].
For linear models of obesity measures with covariables, outliers were removed, which were more than 4.5 interquartile ranges from the median of the lipid species. Regression models were created by the lm() function, and 95% CIs were calculated using the confint() function. Natural splines were created with the ns() function of the splines package. Degrees of freedom of the splines were analyzed in a 10-fold cross-validation loop with MAE as readout: no effect was determined for age, and 3 degrees of freedom were used as a default for the obesity estimate covariables.
A coefficient of partial determination [62] was calculated as the proportion of variation that cannot be explained in a reduced model, using the residual sum of squares (RSS) of the full model or reduced model:
Predictive modeling
Cubist [39], Lasso [33], partial least squares [38], stochastic gradient boosting [36], random forest [35], and linear models were trained within the caret package, version 6.0–76 [63]. Input data were randomly split into a training (80%, n = 796) and a test data set (20%, n = 206) were used for all models. The input data were also filtered to contain only complete measurements of all modeled variables (BFP, BMI, WHR, WC, n = 1,002). Models were trained using a 5× repeated 10-fold cross-validation loop. Within the cross-validation loop data were centered and scaled (Z-score) to avoid predominance of the most abundant features, missing values were imputed by the median value of the predictor and near zero-variance variables were removed [63]. The final model was fit on all training data. MAEs are calculated by predicting the training data (training MAE), hold out test data (test MAE), and validation data (validation MAE).
Product-to-precursor ratios
Fatty acid desaturases and elongation activities were estimated by calculating product-to-precursor ratios of sums of fatty acids in all lipids measured on the subspecies level (CE, DAG, TAG, LPC, LPE, PC, PC O−, PE, PE O−, PI) as described and discussed in work by Vessby and colleagues and Kjellqvist and colleagues [48, 64]. The following indices were used: The ratio of 16:1;0 to 16:0;0 (C16) and 18:1;0 to 18:0;0 (C18) was used to estimate the SCD1 Δ-9-desaturase (D9D) activity [48, 65, 66]. The ratio of 20:4;0 to 20:3;0 was used to estimate Δ-5-desaturase (D5D) activity [64, 66, 67]. The ratio of 18:3;0 to 18:2;0 for D6D activity [48, 66]. The ratio of 20:3;0 to 18:3;0 for ELOVL5 activity [64]. The ratio of 18:0;0 to 16:0;0 for ELOVL6 activity [68, 69]. The ratio of 16:0;0 to 18:2;0 as de novo lipogenesis index (DNL) [70].
Supporting information
Acknowledgments
We thank Markus Damm for the creation of modeling workflows and Kai Schuhmann for helpful discussions.
Abbreviations
- BFP
body fat percentage
- BIA
bioelectrical impedance analyzer
- BMI
body mass index
- CE
cholesteryl ester
- Cer
ceramide
- Chol
cholesterol
- DAG
diacylglyceride
- HDL
high-density lipoprotein
- LC-MS
liquid chromatography-mass spectrometry
- LDL
low-density lipoprotein
- LPC
lysophosphatidylcholine
- LPE
lysophosphatidylethanolamine
- MAE
mean absolute error
- MDC-CC
Malmö Diet and Cancer Cardiovascular Cohort
- mol%
molar fraction
- PC
phosphatidylcholine
- PE
phosphatidylethanolamine
- RSS
residual sum of squares
- SM
sphingomyelin
- TAG
triacylglyceride
- WC
waist circumference
- WHR
waist-hip ratio
Data Availability
All FINRISK data discussed in the paper can be made available to established researchers by a written application to the FINRISK Executive Board. The application portal is located at https://thl.fi/fi/tutkimus-ja-kehittaminen/tutkimukset-ja-hankkeet/finriski-tutkimus/tietoa-tutkijoille. More information can be obtained through finriski@thl.fi. MDC-CC data discussed in the paper will be made available to readers based on a written application to the MDC-CC steering committee (info@med.lu.se).
Funding Statement
VS has been supported by the Finnish Foundation for Cardiovascular Research (http://www.sydantutkimussaatio.fi/en/foundation) and EI by the Academy of Finland (https://www.aka.fi/en, grants 307415, 312491). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1.World Health Organization (2018) Obesity and overweight. Geneva: World Health Organization; Available from: http://www.who.int/news-room/fact-sheets/detail/obesity-and-overweight. [cited 2019 Sep 26] [Google Scholar]
- 2.Borrell LN, Samuel L. Body Mass Index Categories and Mortality Risk in US Adults: The Effect of Overweight and Obesity on Advancing Death. American Journal of Public Health. 2014;104(3):512–519. 10.2105/AJPH.2013.301597 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Flegal KM. Excess Deaths Associated With Underweight Overweight, and Obesity. JAMA. 2005;293(15):1861 10.1001/jama.293.15.1861 [DOI] [PubMed] [Google Scholar]
- 4.Keys A, Fidanza F, Karvonen MJ, Kimura N, Taylor HL. Indices of relative weight and obesity. Journal of chronic diseases. 1972;25(6):329–343. 10.1016/0021-9681(72)90027-6 [DOI] [PubMed] [Google Scholar]
- 5.Neeland IJ, Poirier P, Després JP. Cardiovascular and Metabolic Heterogeneity of Obesity: Clinical Challenges and Implications for Management. Circulation. 2018;137(13):1391–1406. 10.1161/CIRCULATIONAHA.117.029617 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Pischon T, Boeing H, Hoffmann K, Bergmann M, Schulze MB, Overvad K, et al. General and abdominal adiposity and risk of death in Europe. The New England journal of medicine. 2008;359(20):2105–2120. 10.1056/NEJMoa0801891 [DOI] [PubMed] [Google Scholar]
- 7.Visscher TL, Seidell JC, Molarius A, van der Kuip D, Hofman A, Witteman JC. A comparison of body mass index, waist-hip ratio and waist circumference as predictors of all-cause mortality among the elderly: the Rotterdam study. International journal of obesity and related metabolic disorders: journal of the International Association for the Study of Obesity. 2001;25(11):1730–1735. [DOI] [PubMed] [Google Scholar]
- 8.Calling S, Hedblad B, Engström G, Berglund G, Janzon L. Effects of body fatness and physical activity on cardiovascular risk: Risk prediction using the bioelectrical impedance method. Scandinavian Journal of Public Health. 2006;34(6):568–575. 10.1080/14034940600595621 [DOI] [PubMed] [Google Scholar]
- 9.Sergi G, De Rui M, Stubbs B, Veronese N, Manzato E. Measurement of lean body mass using bioelectrical impedance analysis: a consideration of the pros and cons. Aging clinical and experimental research. 2017;29(4):591–597. 10.1007/s40520-016-0622-6 [DOI] [PubMed] [Google Scholar]
- 10.Padwal R, Leslie WD, Lix LM, Majumdar SR. Relationship Among Body Fat Percentage, Body Mass Index, and All-Cause Mortality: A Cohort Study. Annals of internal medicine. 2016;164(8):532–541. 10.7326/M15-1181 [DOI] [PubMed] [Google Scholar]
- 11.Locke AE, Kahali B, Berndt SI, Justice AE, Pers TH, Day FR, et al. Genetic studies of body mass index yield new insights for obesity biology. Nature. 2015;518(7538):197–206. 10.1038/nature14177 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Lu Y, Day FR, Gustafsson S, Buchkovich ML, Na J, Bataille V, et al. New loci for body fat percentage reveal link between adiposity and cardiometabolic disease risk. Nature communications. 2016;7(1):10495. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Shevchenko A, Simons K. Lipidomics: coming to grips with lipid diversity. Nature reviews Molecular cell biology. 2010;11(8):593–598. 10.1038/nrm2934 [DOI] [PubMed] [Google Scholar]
- 14.Wenk MR. Lipidomics: new tools and applications. Cell. 2010;143(6):888–895. 10.1016/j.cell.2010.11.033 [DOI] [PubMed] [Google Scholar]
- 15.Harayama T, Riezman H. Understanding the diversity of membrane lipid composition. Nature reviews Molecular cell biology. 2018;19(5):281–296. 10.1038/nrm.2017.138 [DOI] [PubMed] [Google Scholar]
- 16.Meikle PJ, Summers SA. Sphingolipids and phospholipids in insulin resistance and related metabolic disorders. Nature reviews Endocrinology. 2017;13(2):79–91. 10.1038/nrendo.2016.169 [DOI] [PubMed] [Google Scholar]
- 17.Quehenberger O, Armando AM, Brown AH, Milne SB, Myers DS, Merrill AH, et al. Lipidomics reveals a remarkable diversity of lipids in human plasma. Journal of lipid research. 2010;51(11):3299–3305. 10.1194/jlr.M009449 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Surma MA, Herzog R, Vasilj A, Klose C, Christinat N, Morin-Rivron D, et al. An automated shotgun lipidomics platform for high throughput comprehensive, and quantitative analysis of blood plasma intact lipids. European Journal of Lipid Science and Technology. 2015;117(10):1540–1549. 10.1002/ejlt.201500145 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Havulinna AS, Sysi-Aho M, Hilvo M, Kauhanen D, Hurme R, Ekroos K, et al. Circulating Ceramides Predict Cardiovascular Outcomes in the Population-Based FINRISK 2002 Cohort. Arteriosclerosis, thrombosis, and vascular biology. 2016;36(12):2424–2430. 10.1161/ATVBAHA.116.307497 [DOI] [PubMed] [Google Scholar]
- 20.Summers SA. Could Ceramides Become the New Cholesterol? Cell metabolism. 2018;27(2):276–280. 10.1016/j.cmet.2017.12.003 [DOI] [PubMed] [Google Scholar]
- 21.Simons K. How Can Omic Science be Improved? PROTEOMICS. 2018;18(5–6):e1800039 10.1002/pmic.201800039 [DOI] [PubMed] [Google Scholar]
- 22.Bowden JA, Heckert A, Ulmer CZ, Jones CM, Koelmel JP, Abdullah L, et al. Harmonizing lipidomics: NIST interlaboratory comparison exercise for lipidomics using SRM 1950-Metabolites in Frozen Human Plasma. Journal of lipid research. 2017;58(12):2275–2288. 10.1194/jlr.M079012 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Borodulin K, Tolonen H, Jousilahti P, Jula A, Juolevi A, Koskinen S, et al. Cohort Profile: The National FINRISK Study. International journal of epidemiology. 2017;47(3):696–696i. [DOI] [PubMed] [Google Scholar]
- 24.Persson M, Hedblad B, Nelson JJ, Berglund G. Elevated Lp-PLA2 levels add prognostic information to the metabolic syndrome on incidence of cardiovascular events among middle-aged nondiabetic subjects. Arteriosclerosis, thrombosis, and vascular biology. 2007;27(6):1411–1416. 10.1161/ATVBAHA.107.142679 [DOI] [PubMed] [Google Scholar]
- 25.Rosvall M, Janzon L, Berglund G, Engström G, Hedblad B. Incident coronary events and case fatality in relation to common carotid intima-media thickness. Journal of internal medicine. 2005;257(5):430–437. 10.1111/j.1365-2796.2005.01485.x [DOI] [PubMed] [Google Scholar]
- 26.Cirulli ET, Guo L, Swisher CL, Shah N, Huang L, Napier LA, et al. Profound Perturbation of the Metabolome in Obesity Is Associated with Health Risk. Cell metabolism. 2018; p. 1–24. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Huynh K, Barlow CK, Jayawardana KS, Weir JM, Mellett NA, Cinel M, et al. High-Throughput Plasma Lipidomics: Detailed Mapping of the Associations with Cardiometabolic Risk Factors. Cell chemical biology. 2018;. [DOI] [PubMed] [Google Scholar]
- 28.Assfalg M, Bertini I, Colangiuli D, Luchinat C, Schäfer H, Schütz B, et al. Evidence of different metabolic phenotypes in humans. Proceedings of the National Academy of Sciences. 2008;105(5):1420–1424. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Begum H, Li B, Shui G, Cazenave-Gassiot A, Soong R, Ong RTH, et al. Discovering and validating between-subject variations in plasma lipids in healthy subjects. Scientific reports. 2016;6(1):19139. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Sales S, Graessler J, Ciucci S, Al-Atrib R, Vihervaara T, Schuhmann K, et al. Gender Contraceptives and Individual Metabolic Predisposition Shape a Healthy Plasma Lipidome. Scientific Reports. 2016;6(1). 10.1038/srep27710 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Chew WS, Torta F, Ji S, Choi H, Begum H, Sim X, et al. Large-scale lipidomics identifies associations between plasma sphingolipids and T2DM incidence. JCI insight. 2019;5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Frahnow T, Osterhoff MA, Hornemann S, Kruse M, Surma MA, Klose C, et al. Heritability and responses to high fat diet of plasma lipidomics in a twin study. Scientific reports. 2017;7(1):3750 10.1038/s41598-017-03965-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Zou H, Hastie T. elasticnet: Elastic-Net for Sparse Estimation and Sparse PCA; 2012. Version 1.1 [software]. Available from: https://CRAN.R-project.org/package=elasticnet. [cited 2019 Feb 20]
- 34.Newgard CB. Interplay between lipids and branched-chain amino acids in development of insulin resistance. Cell metabolism. 2012;15(5):606–614. 10.1016/j.cmet.2012.01.024 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Liaw A, Wiener M. Classification and Regression by randomForest. R News. 2002;2(3):18–22. Version 4.6.14 [software]. Available from: https://cran.r-project.org/web/packages/randomForest. [cited 2019 Feb 20] [Google Scholar]
- 36.Ridgeway G. gbm: Generalized Boosted Regression Models; 2017. Version 2.1.1 [software]. Available from: https://CRAN.R-project.org/package=gbm. [cited 2019 Feb 20]
- 37.R Core Team. R: A Language and Environment for Statistical Computing; 2017. Version 3.4.2 [software]. Available from: https://www.R-project.org/. [cited 2019 Feb 20]
- 38.Mevik BH, Wehrens R, Liland KH. pls: Partial Least Squares and Principal Component Regression; 2016. Version 2.7.0 [software]. Available from: https://CRAN.R-project.org/package=pls. [cited 2019 Feb 20]
- 39.Kuhn M, Quinlan R. Cubist: Rule- And Instance-Based Regression Modeling; 2017. Version 0.2.1 [software]. Available from: https://CRAN.R-project.org/package=Cubist. [cited 2019 Feb 20]
- 40.Manni MM, Sot J, Arretxe E, Gil-Redondo R, Falcón-Pérez JM, Balgoma D, et al. The fatty acids of sphingomyelins and ceramides in mammalian tissues and cultured cells: Biophysical and physiological implications. Chem Phys Lipids. 2018;217:29–34. 10.1016/j.chemphyslip.2018.09.010 [DOI] [PubMed] [Google Scholar]
- 41.Karlsson KA. Sphingolipid long chain bases. Lipids. 1970;5(11):878–891. 10.1007/bf02531119 [DOI] [PubMed] [Google Scholar]
- 42.Renkonen O, Hirvisalo EL. Structure of plasma sphingadienine. Journal of lipid research. 1969;10(6):687–693. [PubMed] [Google Scholar]
- 43.Steiner R, Saied EM, Othman A, Arenz C, Maccarone AT, Poad BLJ, et al. Elucidating the chemical structure of native 1-deoxysphingosine. Journal of lipid research. 2016;57(7):1194–1203. 10.1194/jlr.M067033 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Imamura F, Fretts A, Marklund M, Ardisson Korat AV, Yang WS, Lankinen M, et al. Fatty acid biomarkers of dairy fat consumption and incidence of type 2 diabetes: A pooled analysis of prospective cohort studies. PLoS Med. 2018;15(10):e1002670 10.1371/journal.pmed.1002670 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Weitkunat K, Schumann S, Nickel D, Hornemann S, Petzke KJ, Schulze MB, et al. Odd-chain fatty acids as a biomarker for dietary fiber intake: a novel pathway for endogenous production from propionate. The American journal of clinical nutrition. 2017;105(6):1544–1551. 10.3945/ajcn.117.152702 [DOI] [PubMed] [Google Scholar]
- 46.Pfeuffer M, Jaudszus A. Pentadecanoic and Heptadecanoic Acids: Multifaceted Odd-Chain Fatty Acids. Advances in nutrition (Bethesda, Md). 2016;7(4):730–734. 10.3945/an.115.011387 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Balgoma D, Pettersson C, Hedeland M. Common Fatty Markers in Diseases with Dysregulated Lipogenesis. Trends in endocrinology and metabolism: TEM. 2019;30(5):283–285. 10.1016/j.tem.2019.02.008 [DOI] [PubMed] [Google Scholar]
- 48.Vessby B, Gustafsson IB, Tengblad S, Boberg M, Andersson A. Desaturation and elongation of Fatty acids and insulin action. Ann N Y Acad Sci. 2002;967:183–195. 10.1111/j.1749-6632.2002.tb04275.x [DOI] [PubMed] [Google Scholar]
- 49.Fahy E, Subramaniam S, Brown HA, Glass CK, Merrill AHJ, Murphy RC, et al. A comprehensive classification system for lipids. Journal of lipid research. 2005;46(5):839–861. 10.1194/jlr.E400004-JLR200 [DOI] [PubMed] [Google Scholar]
- 50.Christinat N, Masoodi M. Comprehensive Lipoprotein Characterization using Lipidomics Analysis of Human Plasma. Journal of proteome research. 2017; p. acs.jproteome.7b00236. [DOI] [PubMed] [Google Scholar]
- 51.Ciucci S, Ge Y, Durán C, Palladini A, Jiménez-Jiménez V, Martnez-Sánchez LM, et al. Enlightening discriminative network functional modules behind Principal Component Analysis separation in differential-omic science studies. Scientific reports. 2017;7:43946 10.1038/srep43946 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Yanovski SZ, Yanovski JA. Toward Precision Approaches for the Prevention and Treatment of Obesity. JAMA. 2018;319(3):223–224. 10.1001/jama.2017.20051 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Aimo L, Liechti R, Hyka-Nouspikel N, Niknejad A, Gleizes A, Götz L, et al. The SwissLipids knowledgebase for lipid biology. Bioinformatics. 2015;31:2860–6. 10.1093/bioinformatics/btv285 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Matyash V, Liebisch G, Kurzchalia TV, Shevchenko A, Schwudke D. Lipid extraction by methyl-tert-butyl ether for high-throughput lipidomics. Journal of lipid research. 2008;49(5):1137–1146. 10.1194/jlr.D700041-JLR200 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Herzog R, Schuhmann K, Schwudke D, Sampaio J, Bornstein S, Schroeder M, et al. LipidXplorer: a software for consensual cross-platform lipidomics. PLoS ONE. 2012;7(1):e29851 10.1371/journal.pone.0029851 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Herzog R, Schwudke D, Schuhmann K, Sampaio J, Bornstein S, Schroeder M, et al. A novel informatics concept for high-throughput shotgun lipidomics based on the molecular fragmentation query language. Genome Biol. 2011;12:R8 10.1186/gb-2011-12-1-r8 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Wickham H. tidyverse: Easily Install and Load 'tidyverse' Packages; 2017. Version 1.1.1 [software]. Available from: https://CRAN.R-project.org/package=tidyverse. [cited 2019 Feb 20]
- 58.Hervé M. RVAideMemoire: Testing and Plotting Procedures for Biostatistics; 2017. Version 0.9.71 [software]. Available from: https://CRAN.R-project.org/package=RVAideMemoire. [cited 2019 Feb 20]
- 59.Diedenhofen B, Musch J. cocor: A Comprehensive Solution for the Statistical Comparison of Correlations. PLoS ONE. 2015;10(4):e0121945 10.1371/journal.pone.0121945 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Zou GY. Toward using confidence intervals to compare correlations. Psychological Methods. 2007;12(4):399–413. 10.1037/1082-989X.12.4.399 [DOI] [PubMed] [Google Scholar]
- 61.Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome research. 2003;13(11):2498–2504. Version 3.5.0 [software]. [cited 2019 Feb 20] 10.1101/gr.1239303 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Anderson-Sprecher R. Model Comparisons and R 2. The American Statistician. 1994;48(2):113–117. 10.1080/00031305.1994.10476036 [DOI] [Google Scholar]
- 63.Kuhn M. Building Predictive Models in R Using the caret Package. Journal of Statistical Software. 2008;28(5). 10.18637/jss.v028.i05 Version 6.0–76 [software]. [DOI] [Google Scholar]
- 64.Kjellqvist S, Klose C, Surma MA, Hindy G, Mollet IG, Johansson A, et al. Identification of Shared and Unique Serum Lipid Profiles in Diabetes Mellitus and Myocardial Infarction. Journal of the American Heart Association. 2016;5(12). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Attie AD, Krauss RM, Gray-Keller MP, Brownlie A, Miyazaki M, Kastelein JJ, et al. Relationship between stearoyl-CoA desaturase activity and plasma triglycerides in human and mouse hypertriglyceridemia. Journal of lipid research. 2002;43(11):1899–1907. 10.1194/jlr.m200189-jlr200 [DOI] [PubMed] [Google Scholar]
- 66.Warensjö E, Öhrvall M, Vessby B. Fatty acid composition and estimated desaturase activities are associated with obesity and lifestyle variables in men and women. Nutrition, metabolism, and cardiovascular diseases: NMCD. 2006;16(2):128–136. 10.1016/j.numecd.2005.06.001 [DOI] [PubMed] [Google Scholar]
- 67.Gieger C, Geistlinger L, Altmaier E, Hrabé de Angelis M, Kronenberg F, Meitinger T, et al. Genetics meets metabolomics: a genome-wide association study of metabolite profiles in human serum. PLoS Genet. 2008;4(11):e1000282 10.1371/journal.pgen.1000282 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Moon YA, Ochoa CR, Mitsche MA, Hammer RE, Horton JD. Deletion of ELOVL6 blocks the synthesis of oleic acid but does not prevent the development of fatty liver or insulin resistance. Journal of lipid research. 2014;55(12):2597–2605. 10.1194/jlr.M054353 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Beccarelli LM, Scherr RE, Newman JW, Borkowska AG, Gray IJ, Linnell JD, et al. Associations Among Fatty Acids, Desaturase and Elongase, and Insulin Resistance in Children. Journal of the American College of Nutrition. 2018;37(1):44–50. 10.1080/07315724.2017.1347908 [DOI] [PubMed] [Google Scholar]
- 70.Chong MFF, Hodson L, Bickerton AS, Roberts R, Neville M, Karpe F, et al. Parallel activation of de novo lipogenesis and stearoyl-CoA desaturase activity after 3 d of high-carbohydrate feeding. The American journal of clinical nutrition. 2008;87(4):817–823. 10.1093/ajcn/87.4.817 [DOI] [PubMed] [Google Scholar]
- 71.Ho-Pham LT, Campbell LV, Nguyen TV. More on body fat cutoff points. Mayo Clinic proceedings. 2011;86(6):584 10.4065/mcp.2011.0097 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Oreopoulos A, Lavie CV, Snitker S, Romero-Corral A. More on body fat cutoff points–reply–I. Mayo Clinic proceedings. 2011;86(6):584–585. 10.4065/mcp.2011.0097 [DOI] [PMC free article] [PubMed] [Google Scholar]