Author manuscript; available in PMC: 2022 Apr 8.

Published in final edited form as: IEEE/ACM Trans Comput Biol Bioinform. 2021 Apr 8;18(2):633–643. doi: 10.1109/TCBB.2019.2921577

Algorithm 1.

Object weighting K-means

Input:
	- the dataset X with n objects
	- the object weighting scheme (WS 1 to 7 defined in Section 2.1)
Output:
	- the best clustering according to the selected WS
	- the object weights and the cluster centroids
1. Parameter setting. Choose the number of clusters K and the weighting scheme (see Equations 5–11). Set w_i = 1, for i = 1,…,n.
2. Setting initial centers and clusters. Select K objects from X at random as the initial cluster centers c₁, c₂,…, c_K. Assign each object x_i ∈ X to the cluster S_k represented by the nearest center c_k, as per (2), to form the initial clustering S.
3. Cluster update. Let $L(S_{k}, c_{k}, w) = \sum_{x_{i} \in S_{k}} w_{i} \sum_{v = 1}^{V} (x_{iv} - c_{kv})^{2}$ be the weighted within-cluster sum of squares of the cluster S_k. For all objects x_i ∈ S_k and all clusters S_k (k = 1,…,K) do:
If there exists a cluster S_k′ such that after moving x_i from S_k to S_k′, we have:
DIFF = L(S_k, c_k, w) + L(S_k′, c_k′, w) − L(S_k \ {x_i}, c_k, w) − L(S_k′ ∪ {x_i}, c_k′, w) > 0
then assign x_i to the cluster S_k′ that maximizes DIFF. The cluster centers and object weights in both S_k and S_k′ are updated when calculating DIFF. Once all objects of X have been examined, a new clustering S′ = {S′₁, S′₂,…, S′_K} is obtained.
4. Decision step. If S′ = S or the maximum number of K-means iterations I is reached, then output the generated clustering S′, cluster centers C and object weights w, and end the computation; otherwise go to Step 3.
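
Steps 2 to 4 above can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the object weights w are held fixed (the weight updates of Equations 5–11 are not reproduced here), each cluster center is taken to be the weighted mean of its members, the nearest-center rule of (2) is taken to be squared Euclidean distance, and a move that would empty a cluster is disallowed. The names `cluster_cost`, `diff_for_move` and `object_weighting_kmeans` are hypothetical.

```python
import numpy as np

def cluster_cost(X, w, members):
    """L(S_k, c_k, w) with c_k assumed to be the weighted mean of S_k."""
    if len(members) == 0:
        return 0.0
    center = np.average(X[members], axis=0, weights=w[members])
    sq_dist = np.sum((X[members] - center) ** 2, axis=1)
    return float(np.sum(w[members] * sq_dist))

def diff_for_move(X, w, labels, i, target):
    """DIFF obtained by moving object i from its cluster to `target`."""
    src = np.flatnonzero(labels == labels[i])
    tgt = np.flatnonzero(labels == target)
    src_after = src[src != i]
    if len(src_after) == 0:
        # Assumption: never empty a cluster.
        return -np.inf
    tgt_after = np.append(tgt, i)
    before = cluster_cost(X, w, src) + cluster_cost(X, w, tgt)
    after = cluster_cost(X, w, src_after) + cluster_cost(X, w, tgt_after)
    return before - after

def object_weighting_kmeans(X, K, w=None, max_iter=100, seed=0):
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Step 1: w_i = 1 unless fixed weights are supplied (assumption).
    w = np.ones(n) if w is None else np.asarray(w, dtype=float)
    rng = np.random.default_rng(seed)

    # Step 2: K random objects as centers, then nearest-center assignment.
    centers = X[rng.choice(n, size=K, replace=False)]
    dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)

    for _ in range(max_iter):
        moved = False
        # Step 3: move each object to the cluster with the largest DIFF > 0.
        for i in range(n):
            gains = [0.0 if k == labels[i] else diff_for_move(X, w, labels, i, k)
                     for k in range(K)]
            best = int(np.argmax(gains))
            if gains[best] > 1e-12:  # small tolerance against round-off
                labels[i] = best
                moved = True
        # Step 4: stop when the clustering no longer changes.
        if not moved:
            break

    # Final centers; an empty cluster (possible with duplicate objects) gets NaN.
    centers = np.full((K, X.shape[1]), np.nan)
    for k in range(K):
        members = np.flatnonzero(labels == k)
        if len(members) > 0:
            centers[k] = np.average(X[members], axis=0, weights=w[members])
    return labels, centers, w
```

For example, calling `object_weighting_kmeans(X, K=2)` on a small array containing two well-separated groups of points returns one label per object, the two cluster centers, and the (here unchanged) weights. Recomputing both cluster costs from scratch for every candidate move keeps the sketch short but is costly; incremental updates of the centers and costs would be preferable for larger n.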