Table 2.
Identification of meaningful metabolites using XGBoost
| Putative identification | HMDB ID | m/z | Formula | Feature importance | Pancreatic cancer incidence/control |
|---|---|---|---|---|---|
| Eicosa-11,14,17-trienoic acid | HMDB0244373 | 306.2560 | C20H34O2 | 6.0 | 1.826 |
| Kynurenic acid | HMDB0000715 | 189.0429 | C10H7NO3 | 6.0 | 1.069 |
| γ-Glutamyl tyrosine | HMDB0011741 | 310.1166 | C14H18N2O6 | 5.0 | 1.230 |
| N(6)-Methyllysine | HMDB0002038 | 160.1214 | C7H16N2O2 | 5.0 | 0.875 |
| LysoPE(18:0/0:0) | HMDB0011130 | 481.3170 | C23H48NO7P | 5.0 | 1.040 |
| Trans-3'-hydroxy cotinine | HMDB0304504 | 192.0901 | C10H12N2O2 | 4.0 | 1.130 |
| Palmitic amide | HMDB0012273 | 255.2563 | C16H33NO | 4.0 | 0.915 |
| L-Leucine | HMDB0000687 | 131.0949 | C6H13NO2 | 4.0 | 1.144 |
| Adipic acid | HMDB0000448 | 146.0581 | C6H10O4 | 4.0 | 0.795 |
| 9-Decenoylcarnitine | HMDB0013205 | 313.2254 | C17H31NO4 | 4.0 | 0.794 |
| 5α-Pregnane-3,20-dione | HMDB0003759 | 316.2398 | C21H32O2 | 4.0 | 0.845 |
Metabolites with feature importance values ≥ 4.0 are listed in Table 2. Feature importance values were obtained from the XGBoost model fitted to the training set (n = 209; accuracy, 0.952; precision, 0.985; AUC, 0.998), which selected discriminant metabolites related to pancreatic cancer incidence. The pancreatic cancer incidence/control value was calculated using the relative abundance of each metabolite.
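
The two quantities in the table can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the source: the incidence/control value is taken to be the ratio of group means of relative abundance (a median or another summary may have been used), and the importance cutoff is applied as "greater than or equal to 4.0". The function names `case_control_ratio` and `select_by_importance` and all numeric inputs are hypothetical and serve only as illustration.

```python
from statistics import mean


def case_control_ratio(case_abundances, control_abundances):
    """Ratio of mean relative abundance in cases to that in controls.

    Assumption: group means are used; the source does not specify the summary.
    """
    control_mean = mean(control_abundances)
    if control_mean == 0:
        raise ValueError("control mean is zero; ratio is undefined")
    return mean(case_abundances) / control_mean


def select_by_importance(importances, threshold=4.0):
    """Keep features whose importance is >= threshold, highest first."""
    kept = [(name, score) for name, score in importances.items() if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Toy values for illustration only; these are not data from the study.
toy_cases = [1.2, 0.9, 1.5]
toy_controls = [1.0, 0.8, 1.2]
toy_ratio = case_control_ratio(toy_cases, toy_controls)

toy_importances = {"metabolite_a": 6.0, "metabolite_b": 3.0, "metabolite_c": 4.0}
toy_selected = select_by_importance(toy_importances)
```

Under this reading, a value above 1 indicates higher relative abundance in participants who later developed pancreatic cancer, and a value below 1 indicates lower abundance.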