ALGORITHM 1.
Pipeline of the semi-supervised self-training clustering algorithm under low-rank representation
| Input | X - The original data matrix; | |
| λ - The control parameter of LRR; | ||
| distZ, distE - The distance metrics of K-means for clustering Z and E, respectively; | ||
| maxIterNum - The maximum number of iterations. | ||
| Output | lZ - The clustering results, in which the labels of the unlabeled samples are predicted. | |
| Step 1: | Perform LRR on original data matrix X. | |
| [Z, E] ← LRR(X, λ); // Z and E are the two matrices produced by LRR on X | ||
| Re-arrange Z and E as Z = [Zl, Zu] and E = [El, Eu], respectively; // labeled part first, then unlabeled part | ||
| currentIterNum ← 0; // Counter of clustering iterations | ||
| Step 2: | Perform K-means algorithm on Z and E, respectively. | |
| lZ ← K-means(Z, distZ); // Using (1) to determine the initial point of each cluster | ||
| lE ← K-means(E, distE); // Using a method similar to (1) to determine the initial point of each cluster | ||
| // lZ and lE are the clustering results on Z and E, respectively | ||
| currentIterNum ← currentIterNum + 1; | ||
| Step 3: | Select unlabeled samples to be treated as labeled ones in the next round of clustering. | |
| S ← ∅; // Set of selected samples | ||
| FOR each unlabeled sample i in Zu | ||
| Let lzi and lei be the labels predicted for sample i by the clustering results lZ and lE, respectively. | ||
| IF lzi = lei | ||
| S ← S ∪ {i}; // Both clusterings agree, so select sample i | ||
| END IF | ||
| END FOR | ||
| Step 4: | Decide whether to terminate the algorithm, and update Z and E. | |
| IF S = ∅ or currentIterNum > maxIterNum: | ||
| RETURN lZ; | ||
| ELSE | ||
| FOR each selected sample i in S | ||
| move sample i from Zu to Zl; | ||
| move sample i from Eu to El; | ||
| END FOR | ||
| Z ← [Zl, Zu]; // update Z | ||
| E ← [El, Eu]; // update E | ||
| GOTO Step 2; | ||
| END IF | ||
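
As a rough illustration, Steps 2 to 4 above can be sketched in Python as follows. This is a minimal sketch under several assumptions, not the authors' implementation: Step 1 is not implemented, so Z and E are assumed to have been produced already by an LRR solver and are passed in as lists of per-sample feature vectors; the function names `seeded_kmeans` and `self_training` are hypothetical; the initial point of each cluster is taken to be the mean of the currently labeled samples of that class, which is only an assumed stand-in for (1); labeled samples are assumed to keep their labels during K-means; and Euclidean distance stands in for distZ and distE.

```python
from math import dist  # Euclidean distance, Python 3.8+


def _mean(rows):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def seeded_kmeans(points, labeled, metric=dist, max_iter=100):
    """K-means seeded from labeled samples (assumed stand-in for Eq. (1)).

    points  : list of feature vectors, one per sample
    labeled : dict {sample index: class}; every class needs >= 1 entry
    Returns a list with one class label per sample.
    """
    classes = sorted(set(labeled.values()))
    # Initial point of each cluster: mean of that class's labeled samples.
    centroids = [_mean([points[i] for i, c in labeled.items() if c == k])
                 for k in classes]
    assign = None
    for _ in range(max_iter):
        new_assign = []
        for i, p in enumerate(points):
            if i in labeled:
                # Assumption: labeled samples keep their known class.
                new_assign.append(labeled[i])
            else:
                j = min(range(len(classes)),
                        key=lambda j: metric(p, centroids[j]))
                new_assign.append(classes[j])
        if new_assign == assign:  # assignments stable: converged
            break
        assign = new_assign
        # Each cluster contains its labeled seeds, so it is never empty.
        centroids = [_mean([p for p, a in zip(points, assign) if a == k])
                     for k in classes]
    return assign


def self_training(Z, E, labeled, dist_z=dist, dist_e=dist, max_iter_num=10):
    """Steps 2-4: cluster Z and E, promote agreeing samples, repeat."""
    labeled = dict(labeled)  # do not mutate the caller's labels
    current_iter_num = 0
    while True:
        # Step 2: cluster both representations.
        l_z = seeded_kmeans(Z, labeled, dist_z)
        l_e = seeded_kmeans(E, labeled, dist_e)
        current_iter_num += 1
        # Step 3: select unlabeled samples on which both results agree.
        S = [i for i in range(len(Z))
             if i not in labeled and l_z[i] == l_e[i]]
        # Step 4: terminate, or move the selected samples to the labeled set.
        if not S or current_iter_num > max_iter_num:
            return l_z
        for i in S:
            labeled[i] = l_z[i]


# Toy data: six samples, two classes, samples 0 and 2 initially labeled.
Z = [[0, 0], [0, 1], [5, 5], [5, 6], [1, 0], [6, 5]]
E = [[0.0], [0.1], [1.0], [1.1], [0.2], [0.9]]
print(self_training(Z, E, {0: 0, 2: 1}))  # prints [0, 0, 1, 1, 0, 1]
```

Because both clusterings are seeded from the same labeled samples, the cluster labels in lZ and lE refer to the same classes, which is what makes the agreement test in Step 3 meaningful in this sketch.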