2008 Jan 28;9:57. doi: 10.1186/1471-2105-9-57

Table 1.

Annotating Genes with Positive Samples (AGPS)

Input:

- positive training data P1

- validation set P2

- unlabeled data Ku

- unknown gene Ug

Output:

- Prediction results

Stage 1: Learning

U = Ku + P2;

Stage 1.1: Initial negative set generation

- Construct classifier f₁ based on P1 and U with one-class SVMs;

- Classify U using f₁. The predicted negative set N₁ is used as the initial negative training set in Stage 1.2;

- U = U - N₁.
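
The set bookkeeping of Stage 1.1 can be sketched as follows. This is only an illustration: the function names and the `radius_quantile` parameter are hypothetical, and a simple distance-to-centroid rule stands in for the one-class SVM, so the sketch shows how U is split into N₁ and the remaining unlabeled points rather than how the SVM itself decides.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def initial_negatives(P1, U, radius_quantile=0.9):
    """Stage 1.1 bookkeeping with a stand-in one-class model.

    Stand-in (NOT a one-class SVM): a point is treated as positive if it
    lies within a radius around the centroid of P1 that covers roughly
    `radius_quantile` of the P1 points; everything else is predicted negative.
    """
    c = centroid(P1)
    dists = sorted(math.dist(p, c) for p in P1)
    idx = min(len(dists) - 1, int(radius_quantile * len(dists)))
    radius = dists[idx]
    N1 = [u for u in U if math.dist(u, c) > radius]        # predicted negatives
    U_rest = [u for u in U if math.dist(u, c) <= radius]   # U = U - N1
    return N1, U_rest
```

With a real one-class SVM, only the two list comprehensions would change: the membership test would be replaced by the sign of the SVM's prediction.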

Stage 1.2: Negative set expansion

- Classifier set FC = [ ], negative set NS = [ ], i = 1.

- repeat

- i = i + 1;

- Construct classifier f_i based on P1 and N₁ with two-class SVMs;

- FC(i - 1) = f_i, NS(i - 1) = N₁;

- Classify U by f_i; N₂ is the predicted negative set, where |N₂| ≤ k|P1|;

- N₁ = [N₂; N_SV], where N_SV is the set of negative support vectors (SVs) of f_i from the previous step;

- U = U - N₂.

- until |U| < k|P1|
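
The control flow of Stage 1.2 can be sketched as below. Several things here are assumptions and not part of the table: `train_centroid` is a hypothetical nearest-centroid stand-in for the two-class SVM, the negatives closest to the positive centroid stand in for the negative support vectors, the cap |N₂| ≤ k|P1| is enforced by keeping the most confidently negative points, and a guard stops the loop if no new negatives are predicted.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def train_centroid(P1, N, n_sv=2):
    """Hypothetical stand-in for a two-class SVM (nearest-centroid rule).

    Returns a score function (score > 0 means "predicted negative") and,
    in place of negative support vectors, the n_sv negatives that lie
    closest to the positive centroid.
    """
    cp, cn = centroid(P1), centroid(N)
    score = lambda x: math.dist(x, cp) - math.dist(x, cn)
    neg_sv = sorted(N, key=lambda x: math.dist(x, cp))[:n_sv]
    return score, neg_sv

def expand_negatives(P1, N1, U, k=3, train=train_centroid):
    """Stage 1.2: grow the negative set until |U| < k * |P1|."""
    FC, NS = [], []
    U = list(U)
    cap = k * len(P1)
    while True:
        score, neg_sv = train(P1, N1)
        FC.append(score)      # FC(i - 1) = f_i
        NS.append(list(N1))   # NS(i - 1) = N1
        # Predicted negatives, most confident first, capped at k * |P1|.
        N2 = sorted((u for u in U if score(u) > 0), key=score, reverse=True)[:cap]
        if not N2:            # guard (an addition): no progress is possible
            break
        N1 = N2 + neg_sv      # N1 = [N2; N_SV]
        U = [u for u in U if u not in N2]   # U = U - N2
        if len(U) < cap:      # until |U| < k|P1|
            break
    return FC, NS, U
```

Passing a different `train` callable that wraps an actual SVM library would keep the loop unchanged, as long as it returns a score function and the negative support vectors.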

Stage 1.3: Classifier and negative set selection

- Classify U with classifiers from FC, and select the classifier FC(i) with the best prediction accuracy;

- Return negative set TN ← NS(i).
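
A minimal sketch of the selection step follows. The table does not spell out how prediction accuracy is measured, so this sketch assumes it is the fraction of the validation positives P2 (which were mixed into U) that a classifier labels positive. The function name and the convention that a score above zero means "predicted negative" are hypothetical.

```python
def select_classifier(FC, NS, P2):
    """Stage 1.3 sketch: pick the classifier that recovers the most of P2.

    Assumptions: each entry of FC is a score function where score(x) > 0
    means "predicted negative", and accuracy is the fraction of P2 with
    score(x) <= 0. Ties go to the earliest classifier.
    """
    def recovered(score):
        return sum(1 for x in P2 if score(x) <= 0) / len(P2)

    best = max(range(len(FC)), key=lambda i: recovered(FC[i]))
    return best, FC[best], NS[best]   # TN <- NS(best)
```

For example, with two toy classifiers that threshold a single feature at 2 and at 5, and P2 = [(1,), (3,), (4,)], the second classifier recovers all three validation positives and is selected together with its negative set.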

Stage 2: Classification

Classify Ug with a classifier trained on P and TN, where P = P1 + P2.