2014 Oct 21;15(Suppl 11):S11. doi: 10.1186/1471-2105-15-S11-S11

Table 3.

Comparison of results on the lung cancer dataset for the proposed method (topic model-derived clustering based on feature selection) and two conventional baselines: k-means, and PCA followed by k-means.

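| Method | k | Cluster ID | Adenocarcinoma | Squamous cell carcinoma | No. of misclassified samples | NMI |
|---|---|---|---|---|---|---|
| Topic model-derived clustering based on feature selection | 2 | 1 | 42 | 11 | 22 | 0.2809 |
| | | 2 | 11 | 47 | | |
| | 3 | 1 | 40 | 8 | 21 | 0.2417 |
| | | 2 | 4 | 15 | | |
| | | 3 | 9 | 35 | | |
| | 4 | 1 | 37 | 8 | 18 | 0.2926 |
| | | 2 | 9 | 35 | | |
| | | 3 | 0 | 14 | | |
| | | 4 | 7 | 1 | | |
| k-means | 2 | 1 | 41 | 12 | 24 | 0.2461 |
| | | 2 | 12 | 46 | | |
| | 3 | 1 | 8 | 35 | 31 | 0.1365 |
| | | 2 | 27 | 17 | | |
| | | 3 | 18 | 6 | | |
| | 4 | 1 | 6 | 14 | 25 | 0.1602 |
| | | 2 | 22 | 6 | | |
| | | 3 | 18 | 6 | | |
| | | 4 | 7 | 32 | | |
| PCA (10 features) + k-means | 2 | 1 | 12 | 46 | 24 | 0.2461 |
| | | 2 | 41 | 12 | | |
| | 3 | 1 | 8 | 35 | 31 | 0.1456 |
| | | 2 | 22 | 6 | | |
| | | 3 | 23 | 17 | | |
| | 4 | 1 | 16 | 5 | 25 | 0.1605 |
| | | 2 | 6 | 14 | | |
| | | 3 | 7 | 32 | | |
| | | 4 | 24 | 7 | | |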
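
The two summary columns can be recomputed from the per-cluster counts. The following is a minimal sketch, not the authors' code. It assumes that a sample counts as misclassified when it falls outside the majority class of its cluster, and that NMI is mutual information divided by the arithmetic mean of the cluster and class entropies. The table does not state which normalization was used, so the NMI values it produces may differ slightly from those reported. The function names `misclassified` and `nmi` are illustrative.

```python
import math


def misclassified(table):
    """Count samples that fall outside the majority class of their cluster.

    `table` is a list of rows, one per cluster, each holding the
    per-class sample counts (here: adenocarcinoma, squamous cell carcinoma).
    """
    return sum(sum(row) - max(row) for row in table)


def nmi(table):
    """Normalized mutual information between cluster and class labels.

    Assumption: MI is normalized by the arithmetic mean of the two
    entropies; other normalizations (e.g. geometric mean) also exist.
    """
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]        # cluster sizes
    col_tot = [sum(col) for col in zip(*table)]  # class sizes

    # Mutual information, skipping empty cells (0 * log 0 is taken as 0).
    mi = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count > 0:
                mi += (count / n) * math.log(count * n / (row_tot[i] * col_tot[j]))

    def entropy(totals):
        return -sum((t / n) * math.log(t / n) for t in totals if t > 0)

    denom = (entropy(row_tot) + entropy(col_tot)) / 2
    return mi / denom if denom > 0 else 0.0


# Counts taken from the table: proposed method, k = 2.
topic_k2 = [[42, 11], [11, 47]]
print(misclassified(topic_k2), round(nmi(topic_k2), 4))
```

Under the majority-class assumption, `misclassified` reproduces the "No. of misclassified samples" column for every row of the table. For k = 2 the NMI from this sketch comes out close to the reported value. For k > 2 the choice of normalization matters more, so agreement with the reported values is not guaranteed.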