Table 7.
Machine learning (ML) and deep learning (DL) models and algorithms used by previous researchers in mobile phone data studies. Abbreviations of the algorithms are listed in the back matter.
| References | Algorithm/Model | Objective |
|---|---|---|
| [102] | SVM, NB | To classify user relationships |
| [8] | FCM | To classify urban land use in Singapore |
| [4] | GCN | To distinguish criminals from non-criminals |
| [39] | BN | To distinguish suspect users from non-suspect users |
| [103] | GBDT | To detect significant locations in users’ visiting patterns |
| [29,30] | RF | To classify geographical areas into two classes: high or low crime level |
| [7] | RF | To predict population density in Portugal and France |
| [23] | RF | To classify urban areas in Tel Aviv |
| [104] | DBSCAN, GMM | The DBSCAN algorithm is used to cluster users’ trajectories into meaningful places, while GMM is used to identify users’ habits |
| [11] | SVM | To classify urban land use in Beijing into six classes: (a) residential, (b) business, (c) scenic, (d) open, (e) other, and (f) entertainment |
| [12] | K-means | To identify urban functional areas (UFAs) in Beijing |
| [105] | GAN | To create artificial maps of population density distributions |
| [106,107] | K-means | To classify the users of city geographic areas, based on their calling behaviors, into types including residents, visitors, and commuters |
| [108] | MLP | To predict the real estate price in Budapest, Hungary |
| [109] | RF, GBDT, SVM, Adaptive boosting | To reconstruct individual trajectories |
| [110] | MLP, CNN, LSTM | To predict crowd distributions of people in urban areas |
| [111] | NB, LR, RF, DT, KNN | To recommend the best mobile phone contract services based on customer communication behaviors |
| [112] | BP | To estimate individual exposure to particulate matter (PM2.5) air pollution |
| [113] | ADTree, FT, RF | To detect subscriber identity module box (SIMbox) fraud |
| [114] | LR, SVM-Linear, SVM-RBF, KNN, RF | To predict demographic features such as age and gender |
| [115] | SVM-Linear, LR | To predict demographic features such as age and gender |
| [116] | NB, SVM, DS, RF, RNN | To predict the next location of tourists |
| [117] | HC | To cluster human mobility patterns based on similar individual trajectories |
| [118] | GAN | To generate synthetic data of mobile phone data |
| [119] | KNN, RF, SVM-Linear, SVM-RBF+CNN, LSTM, SDAE | To construct a classifier that enables the recognition of fraudulent phone calls |
| [120] | RF, GBDT, SVM+CNN | To distinguish churners from non-churners among customers |
| [121] | DT, RF, GBDT, XGBoost | To predict customer churn in Syriatel telecom company |
| [122] | MLP, SVM, Bayesian networks | To detect prepaid customer churn in mobile telecommunications companies |
| [123] | RF, DT, MLP, GBDT | To build predictive models that can classify customers into different categories of loyalty, such as very high value customers (greater loyalty), medium value customers (average loyalty), and others |
| [124] | LDA, SVM-RBF, XGBoost, RF, LR, NB, KNN, Bagged CART, CART, GBDT, C5.0 | To predict customer demographic variables such as age and gender at the Syriatel telecom company |
| [125] | K-means, DBSCAN | To detect fraudulent calls in telecommunications companies such as |
| [126] | GMM, ANN | To build a clustering-based classification model to classify cellular network traffic patterns into high-activity area, medium-activity area, low-activity area, etc. |
| [92,94,127] | K-means, GMM+CNN | To detect anomalous behavior by identifying anomalous activities of mobile phone subscribers [92]; to detect anomalies in a cellular network, such as sleeping cells or unusually high call volume in a given region (traffic activity) [94] |
| [128] | FCM | To classify mobile subscribers, based on their extracted calling features, into three classes: genuine, fraudulent, and suspicious |
| [129,130] | HC, K-means, FCM, SVM | To detect fraudulent behaviors in telecom companies, such as fraudulent calls |
| [10] | K-means, FCM, spectral clustering, consensus clustering | To cluster land use in Madrid |
| [131] | FKNN, MLP, C4.5, SVM, GBDT, LR, RF, Adaptive boosting | To classify mobile customers into two classes: churners or non-churners |
| [132] | K-means | To cluster users according to their weekly mobility patterns into six different profiles |