2024 Aug 20;14(8):463. doi: 10.3390/metabo14080463

Table 2.

Metabolites selected with proportion of explained variation for predicting breast cancer and colorectal cancer 1.

Metabolites Selected | Proportion of Explained Variation 2 | Direction of Coefficient for Metabolites 3
Breast Cancer All covariates + metabolites: 0.27
Serum
  LC-MS
    Azelaic acid 0.23
    Choline 0.23 +
    Cysteinyl glycine 0.23
    Ethanolamine 0.23 +
    Gamma tocopherol 0.23 +
    Hippuric acid 0.23
    Isovaleryl carnitine 0.23 +
    N-isovaleryl glycine 0.23
    Sucrose 0.23
    Trimethylamine-N-oxide 0.23 +
    Valine 0.23 +
    Xylose 0.23
  Lipidyzer 4
    Cholesteryl ester (CE 12:0) 0.23
    Cholesteryl ester (CE 20:0) 0.23
    Diacylglycerol (DAG 14:1) 0.24 +
    Free fatty acid (FFA 18:4) 0.23
    Free fatty acid (FFA 20:2) 0.23 +
    Hexosylceramide (HCER 22:0) 0.23 +
    Hexosylceramide (HCER 22:0) 0.23
    Phosphatidylcholine (PC 18:1) 0.23 +
    Phosphatidylcholine (PC 18:2) 0.23 +
    Phosphatidylcholine (PC 16:0/18:2) 0.24
    Phosphatidylethanolamine (PE 18:2) 0.23 +
    Triacylglycerol (TAG 12:0) 0.23
    Triacylglycerol (TAG 16:0) 0.23
    Triacylglycerol (TAG 18:0) 0.23
    Triacylglycerol (TAG 47:0/15:0) 0.23
    Triacylglycerol (TAG 48:4/18:2) 0.23
    Triacylglycerol (TAG 50:0/16:0) 0.23 +
    Triacylglycerol (TAG 50:2/18:2) 0.23
    Triacylglycerol (TAG 50:5/18:3) 0.24
    Triacylglycerol (TAG 52:2/18:2) 0.24 +
    Triacylglycerol (TAG 55:4/18:1) 0.23
Urine
  NMR
    Dimethylamine 0.23
    Propanediol 0.23
    Formate 0.23 +
    Sucrose 0.23
    Taurine 0.23 +
    Uracil 0.23
    Trimethylamine-N-oxide 0.23
    2-Hydroxyisobutyrate 0.23 +
    2-Oxoglutarate 0.23
  GC-MS
    Unknown 73.0 12.10 5 0.23
    Unknown 73.0 14.49 5 0.23 +
    Unknown 73.0 16.52 5 0.23 +
Colorectal Cancer All covariates + metabolites: 0.31
Serum
  LC-MS
    Adenosine 0.23
    Leucic acid 0.21 +
    Glycerate 0.25 +
    Myo-inositol 0.22 +
    N-Acetyl-glutamate 0.22
    N-Acetyl-glycine 0.23 +
    N-Acetylneuraminate 0.22 +
    2-Hydroxyglutarate 0.22 +
    Hydroxyproline 0.21 +
    7-Methylguanine 0.22 +
  Lipidyzer 4
    Lysophosphatidylcholine (LPC 20:3) 0.22
Urine
  NMR
    Acetate 0.21 +
    Allantoin 0.21
    Histidine 0.22
    Isoleucine 0.21 +
    Taurine 0.22 +
    Threonine 0.21 +
    Trimethylamine-N-oxide 0.21 +
    Uracil 0.22
  GC-MS
    Unknown 103 17.03 5 0.21
    Unknown 285 22.41 5 0.22 +
    Unknown 57 9.58 5 0.22 +
    Unknown 73 10.76 5 0.21
    Unknown 73 17.66 5 0.21 +

1 All variables listed below were selected by either the lasso or SL selection procedure in the corresponding platform-specific analysis. The base set of covariates (forced into all models) were age, WHI enrollment date, and self-reported race or ethnicity. Selected covariates for breast cancer: education level, income, alcohol intake, current smoking, total folate intake, Gail 5-year risk, family history of CRC, prior removal of ≤1 colon polyp, currently using estrogen, waist circumference, BMI (kg/m²), randomized to CaD or HT, date of sample draw visit. Selected covariates for colorectal cancer: age, self-reported race/ethnicity, education, income, alcohol intake, total folate intake, waist circumference, BMI (kg/m²), ≥1 colonoscopy, prior removal of ≥1 colon polyp, sample draw visit, randomized to DM control arm.
2 The proportion of explained variation (PEV) was estimated by first creating a dataset with only the selected metabolites and covariates for each outcome. Then, we used cross-validation to fit a logistic regression on each set of training data and predict on the test data; the PEV is defined as the correlation between the observed outcomes and the predictions.
3 Positive direction of the estimated coefficient from the multiple logistic regression model implies higher odds of being a case; negative direction implies lower odds of being a case.
4 In CE, X:A; FFA, X:A; DAG, X:A/Y:B; HCER, X:A; PC, X:A/Y:B; PE, X:A/Y:B; and LPC, X:A, X and Y indicate the number of carbon atoms and A and B indicate the number of double bonds in the fatty acid chains. Lipids without both A and B represent the sum of all fatty acids in that class. For example, DAG (14:1) equals the sum of all diacylglycerol, i.e., summing all DAG (x/14:1) and DAG (14:1/x). In TAG, X:A/Y:B, X indicates the total number of carbon atoms and A indicates the total number of double bonds in the three fatty acid chains, and Y indicates the number of carbon atoms and B indicates the number of double bonds in one of the fatty acid chains.
5 Values represent mass at retention time of the unknown metabolites, i.e., 73 12.10 indicates a mass of 73 at 12.10 min.
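As a rough illustration of the PEV calculation described in footnote 2, the sketch below fits a logistic regression on each training fold, predicts on the held-out fold, and takes the Pearson correlation between the observed outcomes and the predictions. It is not the authors' code. The simulated data, the plain gradient-descent fitter, the choice of five folds, the pooling of out-of-fold predictions before computing one correlation, and all function names are assumptions made for illustration.

```python
import math
import random


def predict_one(w, b, x):
    """Predicted probability of being a case for one feature vector."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Unpenalized logistic regression fitted by batch gradient descent (assumed fitter)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = predict_one(w, b, xi) - yi
            for j in range(p):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def cross_validated_pev(X, y, k=5, seed=0):
    """Correlation between observed outcomes and pooled out-of-fold predictions."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [0.0] * len(X)
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        w, b = fit_logistic([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            preds[i] = predict_one(w, b, X[i])
    return pearson(y, preds)


def simulate(n=200, seed=1):
    """Simulated case/control data with two informative features (hypothetical)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        p = 1.0 / (1.0 + math.exp(-(1.5 * x[0] - 1.0 * x[1])))
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


X, y = simulate()
pev = cross_validated_pev(X, y)
print(round(pev, 2))
```

Because every prediction comes from a model that never saw the corresponding observation, the resulting value reflects out-of-sample performance rather than in-sample fit. In practice a standard library fitter would replace the hand-written gradient descent.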