Input:
- positive training data P1
- validation set P2
- unlabeled data Ku
- unknown gene Ug
Output:
- Prediction results
Stage 1: Learning
U = Ku + P2;
Stage 1.1: Initial negative set generation
- Construct classifier f1 based on P1 and U with one-class SVMs;
- Classify U using f1. The predicted negative set N1 is used as the initial negative training set in Stage 1.2;
- U = U - N1.
Stage 1.2: Negative set expansion
- Classifier set FC = [ ], negative set NS = [ ], i = 1.
- repeat
  - i = i + 1;
  - Construct classifier fi based on P1 and N1 with two-class SVMs;
  - FC(i - 1) = fi, NS(i - 1) = N1;
  - Classify U by fi; N2 is the predicted negative set, where |N2| ≤ k|P1|;
  - N1 = [N2; NSV], where NSV is the set of negative support vectors (SVs) of the classifier fi constructed above;
  - U = U - N2.
- until |U| < k|P1|
Stage 1.3: Classifier and negative set selection
- Classify U with classifiers from FC, and select the classifier FC(i) with the best prediction accuracy;
- Return negative set TN ← NS(i).
Stage 2: Classification
Classify Ug with a classifier trained on P and TN, where P = P1 + P2.
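
The control flow of Stages 1.2 and 1.3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-class SVM is replaced by a toy 1-D centroid classifier (`train_centroid`, a hypothetical stand-in), NSV is imitated by the negative training point closest to the decision boundary, "best prediction accuracy" is assumed to mean recall on the validation positives P2, and an early exit is added for the case where no negatives are predicted. The function names and the demo data are invented for illustration.

```python
from statistics import mean


def train_centroid(pos, neg):
    """Toy stand-in for a two-class SVM on 1-D points (assumption, not the source's model).

    Returns (score, nsv): score(x) > 0 means "positive"; nsv imitates the
    negative support vectors with the negative point closest to the boundary.
    """
    c_pos, c_neg = mean(pos), mean(neg)

    def score(x):
        return abs(x - c_neg) - abs(x - c_pos)

    nsv = [max(neg, key=score)]
    return score, nsv


def expand_negatives(P1, U, N1, k, train=train_centroid):
    """Stage 1.2: return the classifiers FC, their negative sets NS, and the leftover U."""
    U, N1 = list(U), list(N1)
    cap = int(k * len(P1))              # upper bound on |N2| per round
    FC, NS = [], []
    while True:
        score, nsv = train(P1, N1)
        FC.append(score)
        NS.append(list(N1))
        # N2: the most confidently negative points of U, at most k*|P1| of them.
        N2 = sorted((x for x in U if score(x) < 0), key=score)[:cap]
        if not N2:                      # safeguard not in the source: avoid an endless loop
            break
        N1 = N2 + nsv                   # N1 = [N2; NSV]
        U = [x for x in U if x not in N2]
        if len(U) < k * len(P1):        # until |U| < k|P1|
            break
    return FC, NS, U


def select_negative_set(FC, NS, P2):
    """Stage 1.3 under an assumed criterion: recall on the validation positives P2."""
    def recall(score):
        return sum(score(x) > 0 for x in P2) / len(P2)

    # Ties go to the earliest classifier.
    best = max(range(len(FC)), key=lambda i: recall(FC[i]))
    return NS[best]


# Invented demo data: positives cluster near 10, negatives near 0.
P1 = [9.0, 10.0, 11.0]
P2 = [7.0, 10.5]
Ku = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 8.5]
N1_init = [0.0, 1.0]                    # pretend output of Stage 1.1
U0 = [x for x in Ku + P2 if x not in N1_init]

FC, NS, U_left = expand_negatives(P1, U0, N1_init, k=1)
TN = select_negative_set(FC, NS, P2)
print(len(FC), NS, U_left, TN)
```

In this toy run the last round absorbs a validation positive into the negative set, so the classifier trained on that round scores lower on P2 and is not selected. With real data, `train` could wrap an SVM library and return the negative support vectors that the library exposes.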