2025 Aug 8;11:e3066. doi: 10.7717/peerj-cs.3066

Table 4. Comparison of advantages and disadvantages of traditional machine learning-based streaming data anomaly detection algorithms.

| Model | Advantage | Disadvantage | References |
| --- | --- | --- | --- |
| Statistical model | Capable of modeling data and inferring relationships between variables | Requires certain a priori assumptions; model reliability must be validated; a suitable data processing method must be selected for fitting | Hunt & Willett (2018), Tao & Michailidis (2019), Yu, Jibin & Jiang (2016) |
| Distance model | Can mine data in depth | High requirements for data preprocessing; demanding distance measurement methods; sensitive to noise | Zhu et al. (2020), Ma, Aminian & Kirby (2019), Miao et al. (2018) |
| Clustering model | Broad applicability; robust interpretability | Not suitable for high-dimensional or large-scale streaming data; sensitive to initial values; high requirements for preprocessing | Lee & Lee (2022), Raut et al. (2023) |
| Density model | Simple to implement; quickly reveals potential structures; robust to noise | Suffers from the curse of dimensionality in high-dimensional data; computationally intensive for large-scale data | Liu et al. (2020), Zhang, Zhao & Li (2019) |
| Isolation model | Capable of modeling data distribution; suitable for complex data distributions | Performance may decrease with high-dimensional data | Liu, Ting & Zhou (2008) |
| Frequent item mining | Effective at identifying outliers and anomalies in low-density areas; no need for labeled data; supports unsupervised learning | Potential for false positives due to noise and outliers in the dataset | Cai et al. (2020a), Hao et al. (2019), Cai et al. (2020b) |
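
To make the first row of the table concrete, the following is a minimal sketch of a statistical-model detector for a data stream. It is not taken from any of the cited works: the class name `StreamingZScoreDetector`, the helper `detect`, and the `threshold` and `warmup` parameters are hypothetical names chosen for illustration. It assumes the stream is roughly Gaussian and stationary, which is exactly the kind of a priori assumption the table lists as a disadvantage of this model family.

```python
import math


class StreamingZScoreDetector:
    """Flags a point when it lies more than `threshold` standard
    deviations from the running mean of the points seen before it.

    Mean and variance are maintained in one pass with Welford's
    algorithm, so memory use is constant regardless of stream length.
    """

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # z-score cutoff (assumed value)
        self.warmup = warmup        # points to observe before scoring
        self.n = 0                  # number of points absorbed so far
        self.mean = 0.0             # running mean
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, x):
        """Score x against past statistics, then absorb x into them."""
        is_anomaly = False
        if self.n >= max(self.warmup, 2):
            std = math.sqrt(self.m2 / (self.n - 1))
            # A zero standard deviation gives no scale to compare against,
            # so nothing is flagged in that case.
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                is_anomaly = True

        # Welford's online update of mean and squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return is_anomaly


def detect(stream, threshold=3.0, warmup=10):
    """Return the indices of the points flagged as anomalous."""
    detector = StreamingZScoreDetector(threshold, warmup)
    return [i for i, x in enumerate(stream) if detector.update(x)]


if __name__ == "__main__":
    readings = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 10.1, 9.9,
                10.0, 10.2, 9.8, 10.1, 50.0, 10.0, 9.9]
    print(detect(readings))
```

In this sketch a flagged point is still absorbed into the running statistics, which inflates the variance afterwards and can mask later anomalies. Excluding flagged points, or using a sliding window, are alternative design choices with their own trade-offs.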