Table 1. Performance comparison of the EM MSTkNN with k-Means, SOM, CLICK and the original MSTkNN algorithms, in terms of homogeneity and separation.
Data | Methods/Algorithm | Parameter | Havg | Savg | #Clusters | Time |
AD signature data set (n = 1,372) | k-Means | k = 5 | 0.179 | 0.121 | 5 | <0.5 min |
AD signature data set (n = 1,372) | k-Means | k = 120 | 0.394 | 0.172 | 120 | <1 min |
AD signature data set (n = 1,372) | SOM | 2×5 grid | 0.185 | 0.183 | 6 | <0.5 min |
AD signature data set (n = 1,372) | SOM | 5×5 grid | 0.217 | 0.142 | 14 | <1 min |
AD signature data set (n = 1,372) | CLICK | – | 0.606 | 0.245 | 5 | <1 min |
AD signature data set (n = 1,372) | MSTkNN | – | 0.780 | 0.369 | 226 | <0.5 min |
AD signature data set (n = 1,372) | EM MSTkNN (this paper) | – | 0.789 | 0.370 | 228 | <0.2 min |
AD ratios data set (n = 941,885) | k-Means, SOM, CLICK, MSTkNN | – | Not Available | Not Available | Not Available | Not Available |
AD ratios data set (n = 941,885) | EM MSTkNN (this paper) | – | 0.812 | 0.420 | 40,139 | 30 min |
AD ratios-sums-diffs-prods data set (n = 3,763,403) | EM MSTkNN (this paper) | – | 0.879 | 0.521 | 121,611 | 120 min |
The implementations of the k-Means, SOM and CLICK algorithms were obtained from the Expander microarray data clustering tool [124]. Homogeneity (Havg) and separation (Savg) are computed using the definitions in [124]. The AD ratios data set is generated by taking pairwise ratios between the features of the 1,372-probe AD signature [5] and adding the MMSE score, NFT count, Braak staging, JSDcontrol and JSDsevere as five progression markers. The AD ratios-sums-diffs-prods data set contains four types of metafeatures (ratios, sums, differences and products) together with the same five progression markers.
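
To make the construction of the metafeature data sets concrete, the following is a minimal Python sketch, not the implementation used in this paper. It assumes that the expression data are held in a NumPy array with one column per probe, that each unordered pair of probes contributes one metafeature of each kind, and that all values are non-zero so the ratios are defined. The function name `pairwise_metafeatures`, the `kinds` argument and the toy probe names are hypothetical and introduced only for illustration.

```python
from itertools import combinations
from math import comb

import numpy as np

# Hypothetical mapping from metafeature kind to (NumPy operation, label symbol).
OPS = {
    "ratio": (np.divide, "/"),
    "sum": (np.add, "+"),
    "diff": (np.subtract, "-"),
    "prod": (np.multiply, "*"),
}


def pairwise_metafeatures(X, names, kinds=("ratio",)):
    """Combine every unordered pair of columns of X (samples x probes).

    Returns an array with one column per (pair, kind) and matching labels.
    Assumption: values are non-zero, so ratios are well defined.
    """
    columns, labels = [], []
    for i, j in combinations(range(X.shape[1]), 2):
        for kind in kinds:
            func, symbol = OPS[kind]
            columns.append(func(X[:, i], X[:, j]))
            labels.append(f"{names[i]}{symbol}{names[j]}")
    return np.column_stack(columns), labels


# Toy data: 2 samples, 3 probes (values chosen only for illustration).
X = np.array([[1.0, 2.0, 4.0],
              [2.0, 4.0, 8.0]])
names = ["p1", "p2", "p3"]

ratios, ratio_labels = pairwise_metafeatures(X, names)
all_four, all_labels = pairwise_metafeatures(
    X, names, kinds=("ratio", "sum", "diff", "prod")
)

print(ratio_labels)       # labels of the three pairwise ratios
print(all_four.shape)     # 3 pairs x 4 kinds = 12 metafeature columns
print(comb(1372, 2))      # number of unordered probe pairs for 1,372 probes
```

Under the unordered-pair assumption, 1,372 probes yield 940,506 pairs. The values of n reported in Table 1 are somewhat larger, so the data sets contain entries beyond one metafeature per pair and kind; the sketch does not attempt to reproduce them, nor the appended progression markers.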