Table 4.
NMR peaks selected by machine learning.
Peak | Peak min (ppm) | Peak max (ppm) | Delta (ppm) | Number of NMR data points | Max importance observed | Average importance observed | PC1 explained variance (%) | PC2 explained variance (%) | Components selected for gene correlation | Identified by OPLS modelling |
---|---|---|---|---|---|---|---|---|---|---|
1 | 1.45 | 1.50 | 0.058 | 18 | 0.25 | 0.14 | 79.5 | 14.9 | PC1 | No |
2 | 2.03 | 2.07 | 0.047 | 21 | 0.28 | 0.15 | 92.4 | 5.1 | PC1, PC4, PC5 | Yes (N-acetyl glycoprotein) |
3 | 2.65 | 2.66 | 0.011 | 40 | 0.54 | 0.25 | 98.6 | 0.8 | PC1 | Yes (unassigned peak) |
4 | 3.56 | 3.57 | 0.017 | 47 | 1.00 | 0.39 | 82.1 | 15.2 | PC1, PC2 | Yes (glycerol) |
5 | 7.20 | 7.22 | 0.018 | 58 | 0.57 | 0.22 | 84.7 | 2.7 | PC1 | Yes (phenylalanine) |
6 | 7.25 | 7.26 | 0.016 | 15 | 0.27 | 0.16 | 84.9 | 4.9 | PC1 | Yes (phenylalanine) |
Significant values are in bold.
Peaks identified by the random forest (RF) classifier when discriminating CD patients by their CRP status. Reported importances are scaled by the maximum importance observed.
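
The scaling mentioned in the caption can be illustrated with a minimal sketch. It assumes that each raw importance is divided by the largest importance found across all data points, so that the top value becomes 1.00, and that the "max" and "average" columns summarise the scaled values over the data points within one peak. The function name `scale_by_max` and the numbers are hypothetical and are not taken from the study.

```python
from statistics import mean


def scale_by_max(importances):
    """Divide every importance by the largest one, so the top value becomes 1.0."""
    largest = max(importances)
    if largest <= 0:
        raise ValueError("at least one importance must be positive")
    return [value / largest for value in importances]


# Hypothetical raw importances for four data points (illustrative only).
raw = [0.002, 0.008, 0.004, 0.001]
scaled = scale_by_max(raw)

# Assumed per-peak summaries: the two points below stand in for one peak region.
peak_points = scaled[1:3]
peak_max = max(peak_points)
peak_average = mean(peak_points)

print(scaled, peak_max, peak_average)
```

Under this reading, a peak with a maximum scaled importance of 1.00 contains the single most informative data point in the spectrum, and the other peaks are reported relative to it.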