2012 Jun 19;10:61. doi: 10.1186/1741-7015-10-61

Table 1.

Statistics of multivariate analysis models demonstrating an association between 1H NMR spectroscopic data and several biological and lifestyle factors.

| Model | Description | Y variable | Number of LVs | R2X | Q2Y | Significance (P value) |
|---|---|---|---|---|---|---|
| A | PLS | ln(U-Cd) | 3 | 0.251 | 0.237 | < 0.01 |
| B | PLS (current smokers excluded) | ln(U-Cd) | 5 | 0.308 | 0.330 | < 0.001 |
| C | PLS (past and current smokers excluded) | ln(U-Cd) | 1 | 0.0729 | 0.142 | < 0.001 |
| D | PLS | sex | 3 | 0.241 | 0.104 | > 0.05 |
| E | PLS | age | 2 | 0.216 | 0.224 | < 0.001 |
| F | PLS | ln(U-NAG) | 1 | 0.054 | 0.162 | < 0.001 |
| G | PLS-DA | Smoking history^a | 2 | 0.194 | 0.185 | < 0.01 |

^a Smoking history was defined as either 1 = never smoked or past smoker (n = 106) or 2 = current smoker (n = 20); one individual did not complete the lifestyle questionnaire.

Spectra that exhibited signs of bacterial contamination, analgesics or ethanol were excluded from these analyses. All variables were mean-centred and scaled to unit variance. NMR data were reduced to 1,127 data points of δ 0.01 resolution. Sample numbers for the PLS models were: A, D, E and F, n = 127; B, n = 106; C, n = 79. For the PLS-DA model (G), n = 126. The number of latent variables in each model was auto-fitted in SIMCA-P+. All models were assessed for validity by Y variable permutation analysis (1,000 permutations, see additional file 1, Figure S4). Scores scatter plots for each multivariate model can also be found in additional file 1 (Figure S5).

ln(U-Cd), natural logarithm of urinary cadmium; ln(U-NAG), natural logarithm of urinary N-acetyl-β-D-glucosaminidase; LV, latent variable; n, sample number; PLS, partial least squares; PLS-DA, partial least squares - discriminant analysis.

R2X is the proportion of variance in the X matrix (i.e. the spectral NMR data) described by the PLS model. Q2Y is the "cross-validated goodness-of-fit", i.e. the ability of the PLS model to predict the Y value (ln(U-Cd), sex, age, ln(U-NAG) or smoking status) of a novel sample.
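
The quantities reported in Table 1 can be illustrated with a minimal sketch in plain Python. It is not the SIMCA-P+ procedure used for the table. It assumes a single latent variable, a single Y variable, seven interleaved cross-validation groups, scaling applied once before cross-validation, and a permutation P value computed as (hits + 1) / (permutations + 1). None of these details is stated in the source. All function names and the synthetic data are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def autoscale(rows):
    """Mean-centre every column and scale it to unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def fit_pls1(X, y):
    """Fit a PLS model with one latent variable and a single Y variable."""
    cols = list(zip(*X))
    w = [dot(c, y) for c in cols]        # direction of maximum covariance with y
    norm = math.sqrt(dot(w, w))
    w = [v / norm for v in w]            # unit-length weight vector
    t = [dot(row, w) for row in X]       # scores of each sample
    tt = dot(t, t)
    p = [dot(c, t) / tt for c in cols]   # X loadings
    q = dot(y, t) / tt                   # Y loading
    return w, t, p, q


def r2x(X, t, p):
    """Proportion of the variance in X described by the latent variable."""
    ss_total = sum(v * v for row in X for v in row)
    ss_resid = sum((v - ti * pj) ** 2
                   for row, ti in zip(X, t) for v, pj in zip(row, p))
    return 1 - ss_resid / ss_total


def q2y(X, y, folds=7):
    """Cross-validated goodness of fit: 1 - PRESS / sum of squares of y."""
    press = 0.0
    for k in range(folds):
        train = [i for i in range(len(y)) if i % folds != k]
        w, _, _, q = fit_pls1([X[i] for i in train], [y[i] for i in train])
        # Predict the held-out samples from the model fitted without them.
        press += sum((y[i] - q * dot(X[i], w)) ** 2
                     for i in range(len(y)) if i % folds == k)
    return 1 - press / dot(y, y)         # y is assumed to be mean-centred


def permutation_p(X, y, n_perm=1000, seed=0):
    """Compare the observed Q2Y with Q2Y values from shuffled Y variables."""
    rng = random.Random(seed)
    observed = q2y(X, y)
    hits = 0
    for _ in range(n_perm):
        shuffled = y[:]
        rng.shuffle(shuffled)
        if q2y(X, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


def make_demo_data(n=40, n_vars=6, seed=1):
    """Synthetic data: y depends on the first two of six random variables."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_vars)] for _ in range(n)]
    y = [row[0] + row[1] + rng.gauss(0, 0.3) for row in X]
    return autoscale(X), [r[0] for r in autoscale([[v] for v in y])]


if __name__ == "__main__":
    X, y = make_demo_data()
    _, t, p, _ = fit_pls1(X, y)
    q2, p_value = permutation_p(X, y, n_perm=200)  # fewer permutations for speed
    print(f"R2X = {r2x(X, t, p):.3f}, Q2Y = {q2:.3f}, P = {p_value:.3f}")
```

In the synthetic data only two of the six X variables carry information about Y, so a model can predict Y well while describing a modest share of the variance in X. This may help when reading rows such as C and F, where Q2Y exceeds R2X.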