Table 2.
List of applications, methods, and features used in criminology based on mobile phone data.
| Reference | Analysis Perspective | Feature/Characteristic | Application | Description | Algorithm/Technique | Geographical Unit/Spatial Unit |
|---|---|---|---|---|---|---|
| [29,30] | Human mobility patterns | Spatiotemporal features: cell tower IDs and timestamps to calculate the total number of mobile phone devices in each cell tower every hour | Crime prediction | Human mobility patterns extracted from mobile phone data can be used to predict crime hotspots | Random forest (RF) | Cellular network cells: 124,119 cells |
| [41] | Human mobility patterns | Spatiotemporal features such as cell tower IDs and timestamps to estimate footfall count entries in each cell per hour | Crime prediction | The results show that crime activity is negatively correlated with the diversity of the ages and ratios of visitors | Correlational analysis: Tjostheim’s coefficient | Grid cells: the geographic area is divided into 23,164 grid cells. |
| [42] | Human daily mobility patterns or daily population mobility patterns | Extracted spatiotemporal features: cell tower IDs and timestamps | Crime prediction | The daily mobility flows of the general population have been captured to provide a template of the daily mobility of criminals | Regression analysis: conditional logit discrete choice models | Census units: 1616 census units |
| [33] | Human mobility patterns and social activities | Spatiotemporal features and call logs: cell tower IDs, timestamps, and the number of phone calls or short message service (SMS) messages made and received | Crime prediction | Mobile phone data have been used to measure the ambient population at risk, and results showed a strong correlation between ambient population and criminal activities | Correlation analysis: Moran’s I statistic. Regression analysis: negative binomial regression | Grid cells: the study region is partitioned into equally sized grid cells of 306 × 306 m. |
| [28] | Mobile phone activity | Spatiotemporal features: timestamps and cell tower IDs to estimate or count the number of times a mobile phone device communicates with the cell tower; this count was later used to measure the size of the ambient population | Crime prediction | The results showed strong correlations between the ambient population measures (workday population, mobile phone data, and population 24/7 daytime estimates) and crime patterns (the crime of theft from person) | Correlation analysis: Spearman’s rank correlation coefficient (ρ) statistics | Lower super output areas (LSOAs): cellular network grid cells converted to LSOA geographical units |
| [31,43] | Mobile phone activity | Spatiotemporal features: timestamps and cell tower IDs to calculate the total number of mobile phone devices in each cell tower every hour over a 3-month period | Crime prediction | A stronger correlation was found between ambient population and crime rates | Correlation analysis: Pearson correlation coefficient and point-biserial correlation coefficient | Grid cells of 200 × 200 m |
| [27] | Human mobility patterns | Spatiotemporal features: timestamps and cell tower IDs | Crime prediction | The results demonstrate a negative relationship between ambient population and street robbers’ criminal activities, in which ambient population has a significant effect by reducing opportunities to commit crimes | Correlation and regression analysis: discrete choice models and negative binomial regression | The geographical areas were created using Thiessen polygons, where 52,026 cell towers were mapped onto polygons |
| [44,45] | Intra-daily mobility patterns of the population | Spatiotemporal features: timestamps and cell tower IDs to identify the origin and destination of each user | Crime prediction | These studies proposed a new measure for calculating crime rates and exploring crime patterning: the exposed population at risk, a mixed population that includes, for example, criminals, victims, and guardians. The results showed that the exposed population is more significant than the ambient population in exploring violent crimes in public spaces | Correlation analysis: Spearman’s rank correlation coefficient (ρ) statistics [44]. Regression analysis: negative binomial regression model (NBM) [45] | Lower super output areas: 1673 LSOAs |
| [32] | Daily movement patterns of migrant and native offenders | Spatiotemporal features extracted: timestamp and cell tower ID to count the number of mobile phone devices connected to a given cell tower on a per-hour basis. This feature helps to estimate ambient population and criminal movements when a crime takes place | Detecting criminal mobility patterns | The results show that the ambient population has a positive relationship with dynamic patterns of violent crimes committed by migrant offenders | Descriptive statistics and negative binomial regression models | The geographical areas were shaped using the Thiessen polygon technique, where 52,026 cell towers were represented as Voronoi cells |
| [5] | Spatiotemporal mobility patterns of terrorists | Spatiotemporal features of terrorists | Detecting mobility patterns of terrorists | This study identified the meaningful places for criminals based on the digital traces they left at home and other visited locations. The traces were then analyzed to determine the changes in the terrorist’s spatial behaviors | Correlation analysis: Spearman’s rank coefficient (ρ) statistics, Pearson’s correlations, and statistical analysis: the cumulative distribution function | Cellular network cells: cell tower locations were spatially approximated to the postcode area, which in the United Kingdom covers a small area of approximately 0.14 km². |
| [3,34,35,36,37,46,47] | Criminal communication behaviors | Call features: outgoing/incoming calls, call frequency, maximum and minimum numbers of incoming or outgoing calls and messages, call timestamps, temporal changes in mobile phone call patterns, caller ID, called ID, type of communication (phone call, SMS, MMS, or voice), and call duration | Detecting criminal networks | These studies built multiple forensic systems to detect criminal networks based on their calling characteristics. Here, a criminal network is represented by a set of nodes (criminals), and the edges or links between them represent a communication (i.e., a phone call or SMS) | Social network analysis tools and graph algorithms such as Prim’s minimum spanning tree algorithm [35], the Girvan–Newman algorithm [34], Space algorithm [3], Blondel’s community detection algorithm [4], and Fruchterman–Reingold algorithm [47] | N/A: location data are missing (i.e., the geographical position of nodes is unknown) |
| [48,49,50,51,52] | Communication and mobility patterns of suspects | Spatiotemporal and calling features: the SIM numbers and location ID of the suspects, calls made between the suspects, maximum call duration, call frequency, phone calls made at the crime location, the most frequent caller, the number of times the suspect called other suspects, suspect trajectories, and others | Identifying suspects and their associates | These studies built a call detail record query system to detect suspects and suspicious groups. | Big data technologies and analytics such as Hive, Hadoop MapReduce, and the Hadoop Distributed File System | The coverage areas of cell towers have not been intersected with any geographical units. |
| [4,39] | Suspects’ communication behaviors | Call features: call duration between suspects, maximum and average call duration, maximum duration of outgoing and incoming calls, standard deviation duration of incoming calls, phone calls made at the crime location, and others | Suspect classification | These studies built suspect classification models based on machine learning approaches that can classify suspects from non-suspects | Bayesian network [39] and graph convolutional networks [4] | N/A |
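
Several rows in Table 2 rely on the same two steps: counting the mobile phone devices seen at each cell tower per hour as a proxy for the ambient population, and then relating that measure to crime counts with a rank correlation such as Spearman's ρ. The following is a minimal sketch of those two steps, not the implementation used in any of the cited studies. The record layout `(device_id, tower_id, timestamp)`, the function names, and all numbers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import datetime


def hourly_device_counts(records):
    """Count distinct devices seen at each (tower, hour).

    `records` is an iterable of (device_id, tower_id, iso_timestamp)
    tuples; this layout is an assumption, not a real CDR schema.
    """
    devices = defaultdict(set)
    for device_id, tower_id, ts in records:
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        devices[(tower_id, hour)].add(device_id)
    return {key: len(ids) for key, ids in devices.items()}


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy records (hypothetical): device d1 appears twice at tower T1 in the
# same hour and is counted once.
records = [
    ("d1", "T1", "2020-01-01T08:05:00"),
    ("d2", "T1", "2020-01-01T08:40:00"),
    ("d1", "T1", "2020-01-01T08:55:00"),
    ("d3", "T2", "2020-01-01T08:10:00"),
]
counts = hourly_device_counts(records)

# Toy per-area ambient population and crime counts (hypothetical).
ambient = [120, 340, 90, 410, 260]
crimes = [3, 9, 1, 7, 6]
rho = spearman_rho(ambient, crimes)  # approximately 0.9 for this toy data
```

In practice the per-tower counts would first be mapped onto the spatial unit listed in the last column (grid cells, LSOAs, or Thiessen polygons) before any correlation is computed; that spatial step is omitted here.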