Table 2. Clustering of 10 datasets (each generated by applying one of 10 attribute weighting algorithms) into T (mesophile) and F (thermophile) classes by four unsupervised clustering algorithms (K-Means, K-Medoids, SVC and EMC).
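Clustering algorithm | Chi Squared (T) | Chi Squared (F) | Correlation (T) | Correlation (F) | Deviation (T) | Deviation (F) | Gini Index (T) | Gini Index (F) | Information Gain (T) | Information Gain (F) | Relief (T) | Relief (F) | Rule (T) | Rule (F) | PCA (T) | PCA (F) | SVM (T) | SVM (F) | Uncertainty (T) | Uncertainty (F) |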
K-Means | 1461 | 596 | 1810 | 247 | 1222 | 835 | 1452 | 605 | 1333 | 724 | 1603 | 454 | 1601 | 456 | 434 | 1623 | 1076 | 981 | 372 | 1685 |
K-Medoids | 487 | 1570 | 1521 | 536 | 104 | 1953 | 1570 | 487 | 1152 | 905 | 583 | 1474 | 1652 | 405 | 1768 | 289 | 892 | 1165 | 939 | 1118 |
SVC | 363 | 1688 | 1701 | 328 | 1705 | 6 | 363 | 1688 | 570 | 1487 | 529 | 1324 | 1561 | 4 | 631 | 1426 | 0 | 2057 | 1089 | 947 |
EMC | 0 | 2057 | 1544 | 513 | 0 | 0 | 0 | 2057 | 0 | 2057 | 0 | 2057 | 4 | 2053 | 0 | 2057 | 0 | 2057 | 1544 | 513 |
The actual numbers of T (mesostable) and F (thermostable) classes in the original datasets were 1544 and 513, respectively. The highest accuracy (100%) was observed when the EMC clustering method was applied to the datasets generated by the Correlation and Uncertainty attribute weighting algorithms; these two cells, which reproduce the actual class sizes exactly, are highlighted in the table.
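The comparison between cluster sizes and actual class sizes can be reproduced directly from the table. The following is a minimal sketch, assuming the counts are transcribed as (T, F) pairs in column order; the names `WEIGHTINGS`, `TABLE`, `ACTUAL` and `size_matches` are hypothetical and not part of the original analysis. Note that matching sizes are only a necessary condition for perfect clustering: the 100% accuracy figure itself comes from the source, not from this check.

```python
# Column order of the attribute weighting algorithms in Table 2.
WEIGHTINGS = ["Chi Squared", "Correlation", "Deviation", "Gini Index",
              "Information Gain", "Relief", "Rule", "PCA", "SVM", "Uncertainty"]

# Cluster sizes as (T, F) pairs, transcribed from Table 2.
TABLE = {
    "K-Means":   [(1461, 596), (1810, 247), (1222, 835), (1452, 605), (1333, 724),
                  (1603, 454), (1601, 456), (434, 1623), (1076, 981), (372, 1685)],
    "K-Medoids": [(487, 1570), (1521, 536), (104, 1953), (1570, 487), (1152, 905),
                  (583, 1474), (1652, 405), (1768, 289), (892, 1165), (939, 1118)],
    "SVC":       [(363, 1688), (1701, 328), (1705, 6), (363, 1688), (570, 1487),
                  (529, 1324), (1561, 4), (631, 1426), (0, 2057), (1089, 947)],
    "EMC":       [(0, 2057), (1544, 513), (0, 0), (0, 2057), (0, 2057),
                  (0, 2057), (4, 2053), (0, 2057), (0, 2057), (1544, 513)],
}

# Actual class sizes (T, F) stated below the table.
ACTUAL = (1544, 513)


def size_matches(table, actual):
    """Return (clustering method, weighting) pairs whose cluster sizes equal `actual`."""
    return [
        (method, WEIGHTINGS[i])
        for method, row in table.items()
        for i, sizes in enumerate(row)
        if sizes == actual
    ]


matches = size_matches(TABLE, ACTUAL)
print(matches)
```

Under these assumptions, only the EMC row with the Correlation and Uncertainty datasets matches the actual class sizes, in agreement with the cells singled out in the table.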