1: for all binary datasets B1 to Bq do
2:     Move 20% of the positive examples and 20% of the negative examples from Bi to a validation dataset (Vi).
3:     Put the remaining positive examples into a smaller training dataset (STSi).
4:     Score the remaining negative examples in Bi according to their similarity with the positive examples.
5:     Initialize the snapshot variable k = 1.
6:     while Bi is not empty do
7:         Remove the top-scoring 10% of the negative examples in Bi and add them to STSi.
8:         Record the k-th snapshot of the current training set STSi.
9:         Build a binary classifier for the i-th code with the k-th snapshot as the training dataset and record its F1.5 score on Vi.
10:        k = k + 1
11:    Set the optimal training set OTS = the snapshot with the highest F1.5 score.
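The procedure above can be sketched in Python for a single binary dataset Bi. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `select_training_set`, `similarity`, and `fit_and_score` are hypothetical, similarity is computed against the training positives only, each iteration moves 10% of the initial negative pool (the listing does not say whether the 10% refers to the initial or the remaining pool), and the toy nearest-centroid classifier on 1-D values merely stands in for whatever classifier is actually used.

```python
import random


def f_beta(tp, fp, fn, beta=1.5):
    """F-beta from confusion counts; returns 0.0 when the score is undefined."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def select_training_set(positives, negatives, similarity, fit_and_score,
                        val_frac=0.2, step_frac=0.1, seed=0):
    """Run steps 2-11 for one binary dataset and return the best snapshot."""
    rng = random.Random(seed)
    positives, negatives = list(positives), list(negatives)
    rng.shuffle(positives)
    rng.shuffle(negatives)

    # Step 2: hold out a fraction of each class for validation.
    n_vp = int(val_frac * len(positives))
    n_vn = int(val_frac * len(negatives))
    val_pos, train_pos = positives[:n_vp], positives[n_vp:]  # Step 3
    val_neg, pool = negatives[:n_vn], negatives[n_vn:]

    # Step 4: score each remaining negative once, most similar first.
    pool.sort(key=lambda x: similarity(x, train_pos), reverse=True)

    # Assumption: each step moves 10% of the initial pool (at least one example).
    chunk = max(1, round(step_frac * len(pool)))

    train_neg, snapshots = [], []
    best_neg, best_score = [], -1.0
    while pool:  # Step 6
        train_neg.extend(pool[:chunk])  # Step 7
        del pool[:chunk]
        score = fit_and_score(train_pos, train_neg, val_pos, val_neg)  # Step 9
        snapshots.append((len(train_neg), score))  # Step 8
        if score > best_score:  # Step 11; ties keep the smaller snapshot
            best_neg, best_score = list(train_neg), score
    return train_pos, best_neg, best_score, snapshots


def nearest_positive(x, train_pos):
    """Hypothetical similarity: negated distance to the closest positive."""
    return -min(abs(x - p) for p in train_pos)


def centroid_f15(train_pos, train_neg, val_pos, val_neg):
    """Toy stand-in classifier: nearest class centroid, scored with F1.5."""
    mp = sum(train_pos) / len(train_pos)
    mn = sum(train_neg) / len(train_neg)
    tp = sum(abs(x - mp) < abs(x - mn) for x in val_pos)
    fp = sum(abs(x - mp) < abs(x - mn) for x in val_neg)
    return f_beta(tp, fp, len(val_pos) - tp)


# Toy data: 10 positives near 10, 50 negatives between 0 and 4.9.
toy_pos = [10.0 + 0.1 * i for i in range(10)]
toy_neg = [0.1 * i for i in range(50)]
train_pos, best_neg, best_score, snapshots = select_training_set(
    toy_pos, toy_neg, nearest_positive, centroid_f15)
```

The outer loop of step 1 would call `select_training_set` once per dataset B1 to Bq. Storing only the size and score of each snapshot, plus a copy of the best one so far, avoids keeping every intermediate training set in memory.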