2007 Oct 17;2(10):e1047. doi: 10.1371/journal.pone.0001047

Table 1. Experimental setup.

Features	n	n_opt	Training	Validation	Validation
Intra/Cross-lab Validation				Intra1	Cross1
Genes	10962	48	V₁	V₂	V₂+W₂
BC (V₁)	747	44	V₁	V₂	V₂+W₂
BCC (V₁+W₁+So)	911	66	V₁	V₂	V₂+W₂
HCC (Se)	1163	111	V₁	V₂	V₂+W₂
S456 (Se)	456	80	V₁	V₂	V₂+W₂
Inter-lab Validation				Inter1
Genes	10962	21	V	W
BC (V)	896	55	V	W
BCC (V+So)	934	137	V	W
HCC (Se)	1163	104	V	W
S456 (Se)	456	42	V	W
Intra/Cross-lab Validation				Intra2	Cross2
Genes	10962	101	W₁	W₂	V₂+W₂
BC (W₁)	576	59	W₁	W₂	V₂+W₂
BCC (V₁+W₁+So)	911	103	W₁	W₂	V₂+W₂
HCC (Se)	1163	71	W₁	W₂	V₂+W₂
S456 (Se)	456	67	W₁	W₂	V₂+W₂
Inter-lab Validation				Inter2
Genes	10962	58	W	V
BC (W)	704	17	W	V
BCC (W+So)	762	33	W	V
HCC (Se)	1163	78	W	V
S456 (Se)	456	10	W	V

Our experimental setup allows the classifiers to be validated on data from the same institution (Intra1 and Intra2), on data from the same and another institution combined (Cross1 and Cross2), and on data from another institution only (Inter1 and Inter2). In all cases the training and validation sets are non-overlapping, and thus independent. Moreover, the validation data were not used in the first step, in which the unsupervised approach extracts the modules. Each validation scheme includes a gene-based classifier (Genes) and several module-based classifiers (BC, BCC, HCC, and S456). For each module-based classifier, the Features column indicates the datasets from which the modules were extracted; n is the number of features, and n_opt is the optimal number of modules or genes selected by the train/test protocol. The Training column indicates the dataset on which the train/test protocol was run, and the Validation columns indicate the datasets used to validate the classifiers. Datasets are abbreviated as follows: V [4], W [3], So [2], and Se [10]. When a dataset is split into two equal, independent parts, subscripts indicate the training (1) and validation (2) parts.
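The six validation schemes in Table 1 can be sketched in a few lines of Python. This is only an illustration of how the training and validation sets relate to one another, not the authors' code. The function names (`split_in_half`, `validation_schemes`), the use of a seeded random shuffle to form the halves, and the sample identifiers are all assumptions. The table does not state how the halves were drawn.

```python
import random


def split_in_half(sample_ids, seed=0):
    """Split sample IDs into two disjoint halves (part 1, part 2).

    Assumption: a seeded random shuffle; the source does not say how
    the two equal parts were chosen.
    """
    ids = sorted(sample_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]


def validation_schemes(v_ids, w_ids, seed=0):
    """Return {scheme: (training IDs, validation IDs)} following Table 1.

    v_ids and w_ids are hypothetical sample identifiers for datasets
    V and W; they must not share any identifier.
    """
    v1, v2 = split_in_half(v_ids, seed)
    w1, w2 = split_in_half(w_ids, seed)
    schemes = {
        "Intra1": (v1, v2),            # train on V1, validate on V2
        "Cross1": (v1, v2 + w2),       # train on V1, validate on V2+W2
        "Inter1": (v1 + v2, w1 + w2),  # train on all of V, validate on W
        "Intra2": (w1, w2),            # train on W1, validate on W2
        "Cross2": (w1, v2 + w2),       # train on W1, validate on V2+W2
        "Inter2": (w1 + w2, v1 + v2),  # train on all of W, validate on V
    }
    # Training and validation sets must never overlap.
    for name, (train, valid) in schemes.items():
        assert not set(train) & set(valid), f"{name}: overlapping sets"
    return schemes


if __name__ == "__main__":
    # Toy identifiers, for illustration only.
    v = [f"V{i}" for i in range(10)]
    w = [f"W{i}" for i in range(8)]
    for name, (train, valid) in validation_schemes(v, w).items():
        print(name, len(train), len(valid))
```

Both Cross schemes share the same pooled validation set (V₂+W₂) and differ only in the training half, which is why the sketch builds the two halves once and reuses them. With an odd number of samples the two parts differ in size by one.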