2020 Nov 30;6:e321. doi: 10.7717/peerj-cs.321

Table 1. Comparison of Spark-based clustering methods in terms of the supported Big Data characteristics (volume, variety and velocity) and the type of data (real or synthetic) on which each proposed method was validated.

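| Category | Sub-category | Paper | Supported: volume | Supported: variety | Supported: velocity | Validated on: real | Validated on: synthetic |
|---|---|---|---|---|---|---|---|
| K-means | Machine Learning | Kusuma et al. (2016) | | | | | |
| | | Sarazin, Lebbah & Azzag (2014) | | | | | |
| | | Gao & Zhang (2017) | | | | | |
| | | Thakur & Dharavath (2018) | | | | | |
| | | Lighari & Hussain (2017) | | | | | |
| | | Kamaruddin, Ravi & Mayank (0000) | | | | | |
| | Fuzzy | Wu et al. (2017) | | | | | |
| | | Win et al. (2019a) | | | | | |
| | | Win et al. (2019b) | | | | | |
| | | Liu et al. (2019) | | | | | |
| | | Bharill, Tiwari & Malviya (0000) | | | | | |
| | Statistics | Shah (2016) | | | | | |
| | | Lavanya, Sairabanu & Jain (2019) | | | | | |
| | | Pang et al. (0000) | | | | | |
| | | Chakravorty et al. (2014) | | | | | |
| | Scalable | Wang et al. (2016) | | | | | |
| | | Sinha & Jana (2016) | | | | | |
| | | Backhoff & Ntoutsi (2016) | | | | | |
| | | Ding et al. (2017) | | | | | |
| | | Sharma, Shokeen & Mathur (2016) | | | | | |
| | | Ben HajKacem, Ben N’Cir & Essoussi (2017) | | | | | |
| | | Ben HajKacem, Ben N’cir & Essoussi (0000) | | | | | |
| | | Zayani, Ben N’Cir & Essoussi (2016) | | | | | |
| | | Chitrakar & Petrovic (2018) | | | | | |
| | | Fatta & Al Ghamdi (2019) | | | | | |
| | | Zhang et al. (2019) | | | | | |
| | | Solaimani et al. (2014) | | | | | |
| | | Mallios et al. (0000) | | | | | |
| Hierarchical | Data Mining | Guo, Zhang & Zhang (2016) | | | | | |
| | | Ianni et al. (2020) | | | | | |
| | | Lee & Kim (2018) | | | | | |
| | Machine Learning | Sarazin, Azzag & Lebbah (2014) | | | | | |
| | | Malondkar et al. (2019) | | | | | |
| | Scalable | Jin et al. (2015) | | | | | |
| | | Solaimani et al. (0000) | | | | | |
| | | Hassani et al. (2016) | | | | | |
| Density | Graph | Rui et al. (2017) | | | | | |
| | | Zhou & Wang (0000) | | | | | |
| | | Kim et al. (2018) | | | | | |
| | | Lulli, Dell’Amico & Ricci (2016) | | | | | |
| | Data Mining | Han et al. (2018a) | | | | | |
| | | Hosseini & Kourosh (2019) | | | | | |
| | | Aryal & Wang (2018) | | | | | |
| | Machine Learning | Hosseini & Kiani (2018) | | | | | |
| | | Corizzo et al. (2019) | | | | | |
| | | Liang et al. (2017) | | | | | |
| | Scalable | Luo et al. (2016) | | | | | |
| | | Han et al. (2018b) | | | | | |
| | | Baralis, Garza & Pastor (2018) | | | | | |
| | | Gong, Sinnott & Rimba (0000) | | | | | |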