Preprocessing and pretreatment of the experimental data. (A) Matrix representation of the analytical chemical data. (B) Pretreatment of the raw data, resulting in the processed data used for analysis. For liquid chromatography-mass spectrometry (LC-MS), the preprocessed data is a list of identified features. For nuclear magnetic resonance (NMR) spectroscopy, the preprocessed data is binned spectra. For gas chromatography-mass spectrometry (GC-MS), the preprocessed data is a total ion chromatogram (TIC). (C) Schematic of the bootstrap uncertainty calculation described in Appendix A. The data matrix X is analyzed using principal component analysis (PCA), generating scores T and loadings P. Bootstrap matrices X* are resampled from X, generating bootstrap scores T* and loadings P*. The bootstrap scores T* are aligned with the original scores T using an orthogonal Procrustes algorithm to estimate the uncertainty in classifications.
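The workflow in panel (C) can be illustrated with a minimal NumPy sketch. It is not the procedure of Appendix A: it assumes that rows of X are resampled with replacement, that PCA is computed by singular value decomposition of the column-centered matrix, and that each set of bootstrap scores T* is rotated onto the matching rows of T (rotation and reflection only, with no scaling or translation). The function names `pca`, `procrustes_rotation`, and `bootstrap_scores` are hypothetical.

```python
import numpy as np


def pca(X, n_components=2):
    """PCA via SVD of the column-centered data; returns scores T and loadings P."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]  # scores
    P = Vt[:n_components].T                     # loadings
    return T, P


def procrustes_rotation(A, B):
    """Orthogonal matrix R minimizing the Frobenius norm of (A @ R - B)."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt


def bootstrap_scores(X, n_boot=200, n_components=2, seed=0):
    """Return the scores T and an array of aligned bootstrap scores.

    The returned array has shape (n_boot, n_samples, n_components).
    Samples not drawn in a given bootstrap replicate are left as NaN.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    T, _ = pca(X, n_components)
    T_boot = np.full((n_boot, n, n_components), np.nan)
    for b in range(n_boot):
        # Assumption: resample rows (samples) of X with replacement.
        idx = rng.integers(0, n, size=n)
        T_star, _ = pca(X[idx], n_components)
        # Align T* with the rows of T belonging to the resampled samples.
        R = procrustes_rotation(T_star, T[idx])
        # Duplicated samples have identical rows, so repeated indices are harmless.
        T_boot[b, idx] = T_star @ R
    return T, T_boot
```

Under these assumptions, a per-sample spread of the aligned scores, for example `np.nanstd(T_boot, axis=0)`, gives one rough indication of how stable each sample's position in score space is across bootstrap replicates.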