Author manuscript; available in PMC: 2015 Jun 13.
Published in final edited form as: Curr Metabolomics. 2013;1(1):92–107. doi: 10.2174/2213235X11301010092

Table 1.

Listing of the most commonly used data scaling methods in multivariate analyses of metabolic fingerprinting data.¹

| Method | Equation | Goal | Advantage | Disadvantage |
|---|---|---|---|---|
| Centering | $\tilde{x}_{ik} = x_{ik} - \bar{x}_k$ | Focus on differences, not similarities | Removes offset from the data | Unsuitable for heteroscedastic data |
| UV | $\tilde{x}_{ik} = \dfrac{x_{ik} - \bar{x}_k}{s_k}$ | Compare metabolites based on correlation | All metabolites equally important | Inflation of measurement errors |
| Range | $\tilde{x}_{ik} = \dfrac{x_{ik} - \bar{x}_k}{x_{k,\max} - x_{k,\min}}$ | Compare metabolites relative to biological response range | All metabolites equally important. Biologically related scaling | Inflation of measurement errors, sensitive to outliers |
| Pareto | $\tilde{x}_{ik} = \dfrac{x_{ik} - \bar{x}_k}{\sqrt{s_k}}$ | Reduce relative importance of large values, partially preserve data structure | Stays closer to original measurement than UV | Sensitive to large fold changes |
| Vast | $\tilde{x}_{ik} = \dfrac{x_{ik} - \bar{x}_k}{s_k} \cdot \dfrac{\bar{x}_k}{s_k}$ | Focus on small fluctuations | Aims for robustness, uses prior group knowledge | Not suited for large induced variation without group structure |
| Level | $\tilde{x}_{ik} = \dfrac{x_{ik} - \bar{x}_k}{\bar{x}_k}$ | Focus on relative response | Suited for biomarker identification | Inflation of measurement errors |
¹ Variable subscripts reflect the conventions shown in Figure 1, with the mean of the k-th variable in X represented by $\bar{x}_k$ and its deviation represented by $s_k$, the sample standard deviation. Reprinted with permission from reference [82] (Copyright 2006 van den Berg et al.).
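
The six scaling methods in Table 1 can be illustrated with a minimal sketch, given here only as an aid to reading the equations and not as the implementation used in the cited work. It assumes that X is stored as a list of rows (observations) whose columns are the variables k, and that s_k is the sample standard deviation with an n − 1 denominator. The function names `scale_column` and `scale_matrix`, and the small data matrix, are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def scale_column(values, method="uv"):
    """Scale one variable (one column k of X) with the named method."""
    m = mean(values)   # column mean, x-bar_k
    s = stdev(values)  # sample standard deviation s_k (n - 1 denominator)
    centered = [x - m for x in values]
    if method == "centering":
        return centered
    if method == "uv":
        return [c / s for c in centered]
    if method == "range":
        span = max(values) - min(values)
        return [c / span for c in centered]
    if method == "pareto":
        return [c / sqrt(s) for c in centered]
    if method == "vast":
        # UV scaling multiplied by the inverse coefficient of variation.
        return [(c / s) * (m / s) for c in centered]
    if method == "level":
        return [c / m for c in centered]
    raise ValueError(f"unknown scaling method: {method}")


def scale_matrix(X, method="uv"):
    """Apply scale_column to every column of X (rows are observations)."""
    columns = [scale_column(list(col), method) for col in zip(*X)]
    return [list(row) for row in zip(*columns)]


# Hypothetical data: 3 observations (rows) of 2 variables (columns).
X = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
for name in ("centering", "uv", "range", "pareto", "vast", "level"):
    print(name, scale_matrix(X, name))
```

In this sketch a constant column (s_k = 0, or zero range) or a zero column mean (for level scaling) raises a `ZeroDivisionError`; real data would need such variables filtered or handled before scaling.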