PLoS One. 2007 Oct 17;2(10):e1047. doi: 10.1371/journal.pone.0001047

Table 1. Experimental setup.

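| Features | n | nopt | Training | Validation | Validation |
|---|---|---|---|---|---|
| Intra/Cross-lab Validation | | | | Intra1 | Cross1 |
| Genes | 10962 | 48 | V1 | V2 | V2+W2 |
| BC (V1) | 747 | 44 | V1 | V2 | V2+W2 |
| BCC (V1+W1+So) | 911 | 66 | V1 | V2 | V2+W2 |
| HCC (Se) | 1163 | 111 | V1 | V2 | V2+W2 |
| S456 (Se) | 456 | 80 | V1 | V2 | V2+W2 |
| Inter-lab Validation | | | | Inter1 | |
| Genes | 10962 | 21 | V | W | |
| BC (V) | 896 | 55 | V | W | |
| BCC (V+So) | 934 | 137 | V | W | |
| HCC (Se) | 1163 | 104 | V | W | |
| S456 (Se) | 456 | 42 | V | W | |
| Intra/Cross-lab Validation | | | | Intra2 | Cross2 |
| Genes | 10962 | 101 | W1 | W2 | V2+W2 |
| BC (W1) | 576 | 59 | W1 | W2 | V2+W2 |
| BCC (V1+W1+So) | 911 | 103 | W1 | W2 | V2+W2 |
| HCC (Se) | 1163 | 71 | W1 | W2 | V2+W2 |
| S456 (Se) | 456 | 67 | W1 | W2 | V2+W2 |
| Inter-lab Validation | | | | Inter2 | |
| Genes | 10962 | 58 | W | V | |
| BC (W) | 704 | 17 | W | V | |
| BCC (W+So) | 762 | 33 | W | V | |
| HCC (Se) | 1163 | 78 | W | V | |
| S456 (Se) | 456 | 10 | W | V | |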

Our experimental setup allows the classifiers to be validated on data from the same institution (Intra1 and Intra2), on data from both the same and another institution (Cross1 and Cross2), and on data from another institution only (Inter1 and Inter2). In all cases the training and validation sets are non-overlapping, and thus independent. Moreover, the validation data were not used in the first step, in which the unsupervised approach extracts the modules. Each validation scheme includes a gene-based classifier (Genes) and several module-based classifiers (BC, BCC, HCC, and S456). For each module-based classifier, the Features column gives, in parentheses, the datasets from which the modules were extracted; the table also lists the number of features (n) and the optimal number of modules/genes output by the train/test protocol (nopt). The Training column indicates the dataset on which the train/test protocol was run, and the Validation columns indicate the datasets used to validate the classifiers. Datasets are abbreviated as follows: V: [4], W: [3], So: [2], and Se: [10]. When a dataset is split into two equal, independent parts, the training part is marked 1 and the validation part 2 (written as subscripts in the original table, e.g. V1 and V2).
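The six validation schemes in Table 1 can be written down as a small lookup structure, which makes the non-overlap requirement easy to check mechanically. The following is a minimal sketch, not the authors' code: the function names (`split_in_half`, `build_parts`, `resolve`), the sample identifiers, and the use of a seeded random shuffle to produce the two halves are all assumptions made for illustration. The source states only that each dataset is split into two equal, independent parts.

```python
import random

# Scheme name -> (training set, validation sets), transcribed from Table 1.
SCHEMES = {
    "Intra1": ("V1", ["V2"]),
    "Cross1": ("V1", ["V2", "W2"]),
    "Inter1": ("V", ["W"]),
    "Intra2": ("W1", ["W2"]),
    "Cross2": ("W1", ["V2", "W2"]),
    "Inter2": ("W", ["V"]),
}


def split_in_half(sample_ids, seed=0):
    """Split sample IDs into two disjoint halves.

    Assumption: a seeded random shuffle; the source does not say how
    the halves were drawn.
    """
    ids = sorted(sample_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return set(ids[:half]), set(ids[half:])


def build_parts(v_ids, w_ids, seed=0):
    """Return the named sample sets used in Table 1 (V, W, V1, V2, W1, W2)."""
    v1, v2 = split_in_half(v_ids, seed)
    w1, w2 = split_in_half(w_ids, seed)
    return {"V": set(v_ids), "W": set(w_ids),
            "V1": v1, "V2": v2, "W1": w1, "W2": w2}


def resolve(scheme, parts):
    """Return (training IDs, validation IDs) for a scheme, checking independence."""
    train_name, validation_names = SCHEMES[scheme]
    train = parts[train_name]
    validation = set().union(*(parts[name] for name in validation_names))
    if train & validation:
        raise ValueError(f"{scheme}: training and validation sets overlap")
    return train, validation
```

For example, with hypothetical sample IDs, `resolve("Cross1", build_parts(v_ids, w_ids))` returns the V1 half for training and the union of the V2 and W2 halves for validation. The So and Se datasets are left out because, according to the table, they are used only for module extraction.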