Author manuscript; available in PMC: 2018 Aug 13.
Published in final edited form as: IEEE/ACM Trans Comput Biol Bioinform. 2017 Jun 6;15(4):1315–1324. doi: 10.1109/TCBB.2017.2712607

ALGORITHM 1.

Pipeline of the semi-supervised self-training clustering algorithm under low-rank representation

Input X - The original data matrix;
λ - The control parameter of LRR;
distZ, distE - The distance metrics of K-means for clustering Z and E, respectively;
maxIterNum - The maximum number of iterations.

Output lZ - The clustering results, among which the labels of unlabeled samples are predicted.

Step 1: Perform LRR on original data matrix X.

$\min_{Z,E} \|Z\|_* + \lambda \|E\|_{2,1} \quad \text{s.t.} \quad X = XZ + E$;
Re-arrange Z and E as Z = [Zl, Zu] and E = [El, Eu], respectively;
currentIterNum ← 0; // Counter of clustering iterations

Step 2: Perform K-means algorithm on Z and E, respectively.

lZ = K-means(Z,distZ); //Using (1) to determine the initial point of each cluster.
lE = K-means(E,distE); // Using a method similar to (1) to determine the initial point of each cluster
// lZ and lE are the clustering results on Z and E, respectively;
currentIterNum ← currentIterNum + 1;

Step 3: Select unlabeled samples as labeled ones for next round clustering.

S ← ∅;
FOR each unlabeled sample i in Zu
 Let lzi and lei be the predicted labels of sample i according to the clustering results of lZ and lE, respectively.
 IF lzi = lei
  S ← S ∪ {i}; //select an unlabeled sample
 END IF
END FOR

Step 4: Decide whether to terminate the algorithm, and update Z and E.

IF S = ∅ or currentIterNum > maxIterNum
RETURN lZ;
ELSE
FOR each selected unlabeled sample i in S
 move ziu from Zu to Zl;
 move eiu from Eu to El;
END FOR
Z ← [Zl, Zu]; //update Z
E ← [El, Eu]; //update E
GOTO Step 2;
END IF
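
Steps 2 to 4 above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It takes the LRR outputs Z and E as given (Step 1 is not solved here) and receives them as lists of per-sample column vectors. Because rule (1) for the initial points is not reproduced in this algorithm box, the sketch assumes each cluster starts at the mean of the labeled samples of its class, uses squared Euclidean distance in place of distZ and distE, and keeps labeled samples fixed to their labels. The names `seeded_kmeans`, `self_training_clustering`, `z_cols`, and `e_cols` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def seeded_kmeans(samples, labels, n_classes, n_iter=100):
    """K-means seeded from labeled samples; labels[i] is None if unlabeled.

    Assumptions (not from the source): Euclidean distance, initial centers
    equal to the per-class means of the labeled samples, and labeled samples
    never change label. Every class needs at least one labeled sample.
    """
    centers = [centroid([s for s, lab in zip(samples, labels) if lab == k])
               for k in range(n_classes)]
    assign = list(labels)
    for _ in range(n_iter):
        # Labeled samples keep their label; others go to the nearest center.
        new_assign = [
            lab if lab is not None
            else min(range(n_classes), key=lambda k: sq_dist(s, centers[k]))
            for s, lab in zip(samples, labels)
        ]
        if new_assign == assign:
            break
        assign = new_assign
        centers = [centroid([s for s, a in zip(samples, assign) if a == k])
                   for k in range(n_classes)]
    return assign


def self_training_clustering(z_cols, e_cols, labels, n_classes, max_iter_num=20):
    """Steps 2-4: cluster Z and E, promote samples on which both agree."""
    labels = list(labels)  # work on a copy; None marks an unlabeled sample
    current_iter_num = 0
    while True:
        # Step 2: cluster the columns of Z and of E separately.
        l_z = seeded_kmeans(z_cols, labels, n_classes)
        l_e = seeded_kmeans(e_cols, labels, n_classes)
        current_iter_num += 1
        # Step 3: select unlabeled samples whose two predictions agree.
        selected = [i for i, lab in enumerate(labels)
                    if lab is None and l_z[i] == l_e[i]]
        # Step 4: stop, or treat the selected samples as labeled and repeat.
        if not selected or current_iter_num > max_iter_num:
            return l_z
        for i in selected:
            labels[i] = l_z[i]


# Toy usage: six samples, two classes, one labeled sample per class.
z_cols = [[0.0, 0.0], [0.1, 0.1], [0.2, 0.0], [5.0, 5.0], [5.1, 5.0], [5.2, 5.1]]
e_cols = [[1.0], [1.1], [0.9], [-3.0], [-3.1], [-2.9]]
labels = [0, None, None, 1, None, None]
print(self_training_clustering(z_cols, e_cols, labels, n_classes=2))
# [0, 0, 0, 1, 1, 1]
```

Seeding both clusterings from the same labeled samples is what makes the comparison lzi = lei meaningful in this sketch: cluster k in either run corresponds to class k, so the two label vectors can be compared index by index without a separate matching step.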