Table 2.
Performance comparison between the improved K-means algorithm based on semantic similarity and relatedness and the traditional K-means algorithm based on the bag-of-words model.
| K value | SSE: traditional K-means (bag-of-words model) | SSE: improved K-means (semantic similarity and relatedness) | Silhouette coefficient: traditional K-means (bag-of-words model) | Silhouette coefficient: improved K-means (semantic similarity and relatedness) |
|---|---|---|---|---|
| 3 | 683.2779 | 672.4002 | 0.075280 | 0.363640 |
| 4 | 671.9479 | 577.2000 | 0.081084 | 0.355436 |
| 5 | 667.6426 | 532.9228 | 0.028152 | 0.352955 |
| 6 | 668.5414 | 519.1228 | 0.072469 | 0.355115 |
| 7 | 660.5026 | 461.7781 | 0.064319 | 0.328729 |
| 8 | 645.9339 | 385.5498 | 0.075842 | 0.352197 |
| 9 | 654.0991 | 392.1088 | 0.071101 | 0.362227 |
| 10 | 632.2624 | 373.0217 | 0.018480 | 0.344523 |
When K = 8, both the improved and the traditional K-means algorithms attain a relatively high silhouette coefficient (0.352197 and 0.075842, respectively).
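For readers who want to see how the two metrics in Table 2 are defined, the following is a minimal sketch of SSE and the mean silhouette coefficient for a given clustering. It assumes Euclidean distance over numeric vectors, which is not necessarily the distance used by the improved algorithm (that one relies on semantic similarity and relatedness); the function names `sse` and `silhouette` are hypothetical and chosen only for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def group_by_label(labels):
    """Map each cluster label to the list of point indices assigned to it."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    return clusters


def sse(points, labels):
    """Sum of squared distances from every point to its cluster centroid."""
    total = 0.0
    for indices in group_by_label(labels).values():
        members = [points[i] for i in indices]
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum(euclidean(m, centroid) ** 2 for m in members)
    return total


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (singleton clusters score 0)."""
    clusters = group_by_label(labels)
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")
    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in idx) / len(idx)
            for other, idx in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)
```

Under these definitions a lower SSE indicates more compact clusters, and a silhouette coefficient closer to 1 indicates better separated clusters, which is the sense in which the columns of Table 2 are compared.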