Table 1.
Factor Level | Training Data: Latent Variable<sup>b</sup> | Training Data: Cross-validation Accuracy<sup>c</sup> | Test Data: Overall Accuracy<sup>d</sup> | Test Data: Sensitivity<sup>e</sup> | Test Data: Specificity<sup>e</sup> |
---|---|---|---|---|---|
Metagenomics species taxonomy<sup>f</sup> | 1 | 54.5% | 30.8% | | |
PD | | | | 0.60 | 0.38 |
RD | | | | 0.00 | 1.0 |
D3M | | | | 0.33 | 0.60 |
D6M | | | | 0.00 | 1.0 |
Metagenomics functional gene clusters<sup>f</sup> | 4 | 57.4% | 61.5% | | |
PD | | | | 0.80 | 0.75 |
RD | | | | 0.67 | 1.0 |
D3M | | | | 0.33 | 0.80 |
D6M | | | | 0.50 | 0.91 |
Metabolomics: primary metabolism<sup>f</sup> | 9 | 66.1% | 61.5% | | |
PD | | | | 0.60 | 0.80 |
RD | | | | 1.0 | 1.0 |
D3M | | | | 0.33 | 0.80 |
D6M | | | | 0.5 | 0.82 |
<sup>a</sup>The four-class partial least squares-discriminant analysis (PLS-DA) model classifies age-matched, male University of California, Davis Type 2 Diabetes Mellitus (UCD-T2DM) rats before the onset of diabetes (PD; n = 15), 2 wk postonset of diabetes (RD; n = 10), 3 mo postonset of diabetes (D3M; n = 11), and 6 mo postonset of diabetes (D6M; n = 7). Data were split into training and test sets (67%:33%), stratified by class (i.e., the proportion of each group was the same in the training and test sets). Preprocessing and model fitting used training data only; test data were used only for class prediction.
<sup>b</sup>Number of latent variables associated with the highest overall 6-fold cross-validation prediction accuracy.
<sup>c</sup>Prediction accuracy from 6-fold cross-validation.
<sup>d</sup>Classification accuracy on the test data (n = 13) using the number of latent variables with the highest cross-validation accuracy in the training set.
<sup>e</sup>Sensitivity and specificity were calculated for each group by comparing that group against all remaining levels in the test set, i.e., one level vs. all remaining samples.
<sup>f</sup>Count data from metagenomics analysis were normalized by centered log transformations before modeling. Metabolomics quantifier ion peak height data were log-transformed and scaled to unit variance before modeling.
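
The one-level-vs.-all-remaining calculation of sensitivity and specificity described in the footnotes can be illustrated with a minimal Python sketch. The function names `one_vs_rest_metrics` and `overall_accuracy` and the toy labels below are assumptions made for illustration only; they are not the study's code or data, and the PLS-DA model that would produce the predictions is not shown.

```python
def one_vs_rest_metrics(y_true, y_pred, labels):
    """Return {label: (sensitivity, specificity)}, treating each label
    in turn as the positive class and all other labels as negative."""
    pairs = list(zip(y_true, y_pred))
    results = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        tn = sum(t != label and p != label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        # Guard against a class that is absent from the test set.
        sens = tp / (tp + fn) if (tp + fn) else float("nan")
        spec = tn / (tn + fp) if (tn + fp) else float("nan")
        results[label] = (sens, spec)
    return results


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels (hypothetical, NOT the study's test-set predictions).
y_true = ["PD", "PD", "RD", "RD", "D3M", "D6M"]
y_pred = ["PD", "RD", "RD", "RD", "PD", "D6M"]

metrics = one_vs_rest_metrics(y_true, y_pred, ["PD", "RD", "D3M", "D6M"])
for group, (sens, spec) in metrics.items():
    print(f"{group}: sensitivity={sens:.2f}, specificity={spec:.2f}")
print(f"overall accuracy={overall_accuracy(y_true, y_pred):.3f}")
```

Because each group is scored against all remaining samples, a group with few test samples can take only a few distinct sensitivity values (for example 0, 0.5, or 1 with two samples), which is worth keeping in mind when reading per-group values from a small test set.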