2014 Aug 30;7:581. doi: 10.1186/1756-0500-7-581

Table 6.

Details of data normalization

| GEO ID | Disease | Data set names / data retrieval methods | Data normalization timing | Data normalization methods |
|---|---|---|---|---|
| GSE46579 | AD | GSE46579_AD_ngs_data_summarized.xls.gz | before FE | zero mean, unit variance |
| GSE37472 | carcinoma | getGEO | before FE | zero mean, unit variance |
| GSE49823 | CAD | getGEO | after FE | zero mean, unit variance |
| GSE43329 | NPC | getGEO | before FE | zero mean, unit variance+ |
| GSE50013 | HCC | getGEO | before FE# | zero mean, unit variance |
| GSE41922 | BC | GSE41922_series_matrix.txt.gz | after FE | zero mean, unit variance |
| GSE49665 | AML | getGEO | after FE | zero mean, unit variance |

*No normalization for SVM/lasso; +no normalization for SVM with PCA-based FE; #after FE for PCA-based LDA with universal features. All normalizations were sample-based; i.e., each sample was normalized to have zero mean and unit variance. FE, feature extraction; AD, Alzheimer disease; CAD, coronary artery disease; NPC, nasopharyngeal carcinoma; HCC, hepatocellular carcinoma; BC, breast cancer; AML, acute myeloid leukemia. Data retrieval methods/data set names were used to name files and for analysis. getGEO indicates that individual sample profiles whose file names started with “GEO” were downloaded with the getGEO command in R.
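
The sample-based normalization described in the footnote can be illustrated with a minimal sketch. It is not the authors' code: the function name `normalize_samples`, the toy profiles, and the use of the n-1 standard deviation (the default of `sd()` in R) are assumptions made for illustration only. Each sample's values are shifted to zero mean and scaled to unit variance independently of all other samples.

```python
from statistics import mean, stdev


def normalize_samples(profiles):
    """Standardize each sample separately to zero mean and unit variance.

    `profiles` maps a sample ID to that sample's list of expression values.
    """
    normalized = {}
    for sample_id, values in profiles.items():
        mu = mean(values)
        # Assumption: n-1 denominator; the source does not state which was used.
        sd = stdev(values)
        if sd == 0:
            raise ValueError(f"sample {sample_id} has zero variance")
        normalized[sample_id] = [(v - mu) / sd for v in values]
    return normalized


# Hypothetical toy data: two samples with three features each.
profiles = {
    "sample_1": [2.0, 4.0, 6.0],
    "sample_2": [10.0, 10.5, 12.5],
}
normalized = normalize_samples(profiles)
```

Because the statistics are computed per sample rather than per feature, the result for one sample does not depend on which other samples are in the data set. Whether this step runs before or after FE only changes which features are passed in.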