2015 Sep 25;16(Suppl 13):S8. doi: 10.1186/1471-2105-16-S13-S8

Table 4.

K-means clustering accuracy and running time on the SIDER2 dataset

T                  5       10      20      30      40      50
Purity** (k = 20)  0.41    0.44    0.53    0.53    0.53    0.58
Purity (k = 30)    0.41    0.44    0.56    0.50    0.54    0.60
Time (ms)          43,378  45,233  48,252  49,278  50,493  51,443

T                  60      70      80      90      100
Purity (k = 20)    0.59    0.55    0.57    0.56    0.54
Purity (k = 30)    0.59    0.57    0.57    0.56    0.56
Time (ms)          52,526  52,577  54,298  54,468  54,608

**The purity of a cluster is the ratio of correctly classified drugs to all drugs in that cluster; the dataset contains 996 drugs in total. Each value in the table is the average purity over the k clusters obtained for the corresponding topic model (T).
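
The average purity described in the footnote can be illustrated with a minimal Python sketch. It rests on an assumption the table does not spell out: a drug counts as "correctly classified" when its reference label matches the most common label in its cluster. The function name `average_cluster_purity` and the toy data are hypothetical and are not taken from the paper or the SIDER2 dataset.

```python
from collections import Counter, defaultdict


def average_cluster_purity(cluster_ids, true_labels):
    """Return (average purity over clusters, per-cluster purity dict).

    Assumption: an item is "correctly classified" if its label equals
    the majority label of the cluster it was assigned to.
    """
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have equal length")
    if not cluster_ids:
        raise ValueError("at least one item is required")

    # Group the reference labels by assigned cluster.
    members = defaultdict(list)
    for cluster, label in zip(cluster_ids, true_labels):
        members[cluster].append(label)

    # Purity of one cluster = size of its majority label / cluster size.
    purities = {}
    for cluster, labels in members.items():
        majority_count = Counter(labels).most_common(1)[0][1]
        purities[cluster] = majority_count / len(labels)

    # Unweighted mean over clusters, as the footnote describes.
    average = sum(purities.values()) / len(purities)
    return average, purities


# Toy example (hypothetical data): 9 items in 3 clusters.
clusters = [0, 0, 0, 1, 1, 2, 2, 2, 2]
labels = ["a", "a", "b", "b", "b", "c", "c", "a", "c"]
avg, per_cluster = average_cluster_purity(clusters, labels)
print(round(avg, 3), per_cluster)
```

This is an unweighted mean over clusters, so a small cluster contributes as much as a large one. A size-weighted purity (correct items divided by all items) is a common alternative and can give a different value.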