Table 2:
Strengths and weaknesses of clustering techniques used in the intrusion detection literature
| Clustering Technique | Strength | Weakness | Ref |
|---|---|---|---|
| K-means | High detection rate (DR), false alarm rate (FAR) below 4%, and low time complexity | Number of clusters must be defined in advance | Meng et al., Gerhard et al., Vipin et al. |
| Hierarchical clustering | Overcame the shortcoming of K-means clustering in predicting the number of clusters | Cluster width W had to be determined manually; if W was not chosen properly, normal instances could be mislabeled as abnormal and vice versa | Leonid et al. |
| Y-means | Overcame the dependency on the number of clusters and the degeneracy of K-means | High false-positive rate compared to other algorithms | Guan et al. |
| Graph-based | Identifies clusters of any shape, uses only a single parameter, and does not require the number of clusters to be defined | Computation increased as the number of records increased | Zhou et al. |
| K-medoids | Overcame drawbacks of the existing algorithm such as dependency on initial centroids, dependency on the number of clusters, and irrelevant clusters | Detection rate was low for probing attacks (70.51) and user-to-root attacks (70.13) | Ravi et al. |
| IFCA | Introduced a validity function for choosing the number of clusters | | Wei et al. |
| IIDBC | Detection rate was high and the performance of DBSCAN was improved | Selection of Parameters | Li-XUE et al. |
| Grid-based | Performance was insensitive to variation in the clustering convergence criterion and in attack and normal conditions | Low detection rate and high false-positive rate | Zhong et al. |
| CANN | Detection rate was high for 6-dimensional KDD datasets | Misclassified U2R and R2L as normal in case of 6-dimensional KDD datasets | Lin et al. |
| TANN | Accuracy and detection rate were high | | Chih-Fong et al. |
| Improved K-means | | False alarm rate was relatively high | Li Tian et al. |
| Fuzzy C-means | Achieved good performance compared to K-means methods | | Witcha et al. |
| Density+Grid based | High detection rate | High false positive rate | Leung et al. |
| K-means+OneR classification | Detection rate above 99.0 and false alarm rate below 2.75; the hybrid classifier outperformed the single classifier | Could not classify U2R and R2L attacks | Z. Muda et al. |
| K-medoids+Naive Bayes | Performed better than the hybrid of K-means and Naive Bayes | | Chitrakar et al. |
| PSO+K-means | Accuracy was very good for U2R and DoS | Low detection rate and high false positive | Lizhong et al., Zhiengje et al. |
| SVM+HC | Detection rate was high for all four types of attack | | Horng et al. |
| K-means+Naive Bayes | Shown to be efficient in detecting network intrusions | High false-positive rate | Sanjay et al., Warsula et al. |
| K-means+C4.5 | Gave a notable detection rate | | Amuthan et al. |
| Random Forest+Weighted K-means | | High false-positive rate | Reda et al. |
| Fuzzy C-means+SVM | Accuracy was slightly better than the existing K-medoids+SVM and far better than SVM alone; it was shown to be stable over other methods | | Abhaya et al. |
| SOM+K-means+Fuzzy C-means | Outperformed competing methods by reducing FAR | | Fatma et al. |
| K-means+Fuzzy+SVM | Effective for low-frequency attacks such as U2R and R2L | | Chandrashekar et al. |
| K-means+KNN+NB | | May sometimes misclassify records | Hari Om et al. |
| Fuzzy C-means+K-means | | Very low detection rate | Partha et al. |
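
Table 2 compares methods mainly by detection rate (DR) and false alarm rate (FAR), but it does not define either metric. The following is a minimal sketch, assuming the commonly used definitions DR = TP / (TP + FN) and FAR = FP / (FP + TN) with attacks as the positive class. The function name `dr_far` and the toy labels are illustrative assumptions and are not taken from any of the cited works.

```python
def dr_far(y_true, y_pred):
    """Return (detection rate, false alarm rate).

    Labels: 1 = attack (positive class), 0 = normal.
    Assumed definitions: DR = TP / (TP + FN), FAR = FP / (FP + TN).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal passed
    # Guard against empty classes to avoid division by zero.
    dr = tp / (tp + fn) if (tp + fn) else 0.0
    far = fp / (fp + tn) if (fp + tn) else 0.0
    return dr, far


# Toy example (made-up labels): 4 attacks, 6 normal records.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
dr, far = dr_far(y_true, y_pred)
print(f"DR = {dr:.2f}, FAR = {far:.2f}")  # prints "DR = 0.75, FAR = 0.17"
```

Under these definitions, a method can report a high DR while still having a high FAR, which is the pattern several rows of the table describe.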