2015 Sep 25;16(Suppl 13):S8. doi: 10.1186/1471-2105-16-S13-S8

Table 4.

K-means clustering accuracy and running time on the SIDER2 dataset

T	5	10	20	30	40	50	60	70	80	90	100
Purity** (k = 20)	0.41	0.44	0.53	0.53	0.53	0.58	0.59	0.55	0.57	0.56	0.54
Purity (k = 30)	0.41	0.44	0.56	0.50	0.54	0.60	0.59	0.57	0.57	0.56	0.56
Time (ms)	43,378	45,233	48,252	49,278	50,493	51,443	52,526	52,577	54,298	54,468	54,608

**The purity of each cluster is calculated as the ratio of correctly classified drugs to all drugs in that cluster; the dataset contains 996 drugs in total. The values in the table are the average purities of the k clusters obtained for each topic model.
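
The purity calculation described in the footnote can be illustrated with a minimal Python sketch. It assumes that a "correctly classified" drug is one whose reference label matches the most common label in its cluster, which is the usual definition of cluster purity but is not spelled out in the source. The function names `cluster_purities` and `average_purity` and the toy labels are hypothetical and do not come from the original study.

```python
from collections import Counter


def cluster_purities(cluster_ids, labels):
    """Return a dict mapping each cluster id to its purity.

    Assumption: an item counts as correctly classified when its
    reference label equals the majority label of its cluster.
    """
    members = {}
    for cluster_id, label in zip(cluster_ids, labels):
        members.setdefault(cluster_id, []).append(label)
    purities = {}
    for cluster_id, cluster_labels in members.items():
        # Size of the largest label group in this cluster.
        majority_count = Counter(cluster_labels).most_common(1)[0][1]
        purities[cluster_id] = majority_count / len(cluster_labels)
    return purities


def average_purity(cluster_ids, labels):
    """Unweighted mean of the per-cluster purities over all clusters."""
    purities = cluster_purities(cluster_ids, labels)
    return sum(purities.values()) / len(purities)


# Toy example: five items assigned to two clusters.
toy_clusters = [0, 0, 0, 1, 1]
toy_labels = ["a", "a", "b", "c", "c"]
toy_average = average_purity(toy_clusters, toy_labels)
```

In the toy example, cluster 0 has purity 2/3 and cluster 1 has purity 1, so the unweighted average is 5/6. The average is taken over clusters rather than over drugs, following the footnote's wording "average purities of k clusters".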