Abstract
In this study we analyze one year of anonymized telecommunications data for over one million customers from a large European cellphone operator, and we investigate the relationship between people's calls and their physical location. We discover that more than 90% of users who have called each other have also shared the same space (cell tower), even if they live far apart. Moreover, we find that close to 70% of users who call each other frequently (at least once per month on average) have shared the same space at the same time - an instance that we call co-location. Co-locations appear indicative of coordination calls, which occur just before face-to-face meetings. Their number is highly predictable based on the number of calls between two users and the distance between their home locations - suggesting a new way to quantify the interplay between telecommunications and face-to-face interactions.
Introduction
The interplay between telecommunications, travel and face-to-face meetings is an unresolved puzzle. In some cases it has been suggested that telecommunications may be a substitute for physical interaction [1] - an idea that gained traction during the nineties and the rapid expansion of the Internet [2], [3]. In other cases conflicting hypotheses have been made, including those of a complementary [4], [5], neutral [6] or reinforcing [7] effect. Recently, social networks have been identified as possible predictors of travel behavior, as well as the possible decision to telecommute [8], [9]. Social interaction has thus been integrated in activity-travel models [10], in addition to the existing categories of travel such as commuting, leisure and business. Furthermore, researchers such as Urry and others [11]–[13] have argued that flows and meetings of people produce small worlds, which require connections and meeting places - a phenomenon which is also known as the new mobilities paradigm.
This study aims to provide a new perspective into the relationship between telecommunicating people and their physical locations through an assessment of anonymized Call Detail Records (CDRs). CDRs show great promise for academic research: they have recently been used to explore human communications [14], [15], the geography of social networks [16], [17], urban dynamics [18], and human mobility patterns [19]–[22]. In this paper we use them for the first time to study the relationship between the telecommunications patterns of any two people and their physical locations.
Results
We use a large anonymized dataset of billing records for over one million mobile phone users, which was gathered in Portugal over a twelve month period between 2006 and 2007 (see Methods). We look at all communications between pairs of users, together with their locations at call time. As we are interested in comparing people's locations, we discard users for whom we do not have enough samples. We use two subsets: D1, which contains all reciprocal communications between the top 100,000 callers; and D2, which contains 10,000 pairs from D1, sampled at different home distances so as to preserve the distribution of home distances found in D1 (see Text S2). In what follows, we use D2 in cases where computational complexity limits the use of a larger set.
We discover that at least 93% of users in D1 who reciprocally call each other have shared the same cell tower area at least once in one year. The percentage decreases slightly as the distance between their homes increases, but the value is still above 90% for users living 100 km apart (see Figure 1). It appears that almost all remote communications are associated with physically sharing space. It may also be noted that we are underestimating the percentage, as our data is only based on locations at call time, so users might have also shared space without this being recorded in our data. Results are consistent with what was recently found by analyzing spatio-temporal coincidences in a database of geo-tagged pictures to infer social ties [23].
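The shared-space statistic above can be sketched in a few lines. This is an illustration only, not the authors' pipeline: the record layout `(caller, callee, caller_tower, callee_tower)` and all function names are assumptions made for the example, and "shared space" is read as "both users were seen on the same tower at some point, not necessarily at the same time".

```python
from collections import defaultdict


def towers_by_user(records):
    """Map each user ID to the set of towers observed at call time."""
    seen = defaultdict(set)
    for caller, callee, caller_tower, callee_tower in records:
        seen[caller].add(caller_tower)
        seen[callee].add(callee_tower)
    return seen


def reciprocal_pairs(records):
    """Return unordered pairs of users who have each called the other."""
    directed = {(caller, callee) for caller, callee, _, _ in records}
    return {tuple(sorted(p)) for p in directed if (p[1], p[0]) in directed}


def shared_space_fraction(records):
    """Fraction of reciprocal pairs whose tower sets overlap (any time)."""
    seen = towers_by_user(records)
    pairs = reciprocal_pairs(records)
    if not pairs:
        return 0.0
    sharing = sum(1 for a, b in pairs if seen[a] & seen[b])
    return sharing / len(pairs)
```

Because locations are only known at call time, a computation of this kind gives a lower bound on actual shared space, as noted in the text.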
If we also consider the temporal component, we can look at how often and where users are sharing the same space at the same time. We restrict our attention to the case when two users call each other using the same cell tower. This scenario is based on the hypothesis that they are calling each other to coordinate a meeting in a nearby area, also called a “coordination knot” [24]. Of course, two people living or working close by could also call each other very often without physically meeting. We therefore exclude users living or working in the same cell tower area, estimated as described in Text S2. We define a co-location event between two users (who live and work in distinct locations) as a call between the users while they are connected to the same cellphone tower. Each co-location is characterized by a specific time and place. Based on this definition, we characterize the spatio-temporal features of co-location events, to see whether they represent a reasonable subset of actual face-to-face meetings between users.
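The co-location definition above translates into a simple filter over call records. The sketch below is hypothetical: the `Call` type, the `home` and `work` dictionaries (user ID to tower ID) and the function names are assumptions, and the exclusion rule is read as "drop pairs whose home towers coincide or whose work towers coincide". The actual home and work estimation is described in Text S2 and is not reproduced here.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Call:
    timestamp: datetime
    caller: str
    callee: str
    caller_tower: str
    callee_tower: str


def shares_anchor(a, b, home, work):
    """True if the two users have the same home tower or the same work tower."""
    same_home = a in home and home[a] == home.get(b)
    same_work = a in work and work[a] == work.get(b)
    return same_home or same_work


def co_location_events(calls, home, work):
    """Return (pair, timestamp, tower) for every call made while both
    users are connected to the same tower, skipping excluded pairs."""
    events = []
    for c in calls:
        if c.caller_tower != c.callee_tower:
            continue  # users are on different towers: not a co-location
        if shares_anchor(c.caller, c.callee, home, work):
            continue  # neighbours or colleagues: excluded by definition
        pair = tuple(sorted((c.caller, c.callee)))
        events.append((pair, c.timestamp, c.caller_tower))
    return events
```

Each returned event carries a time and a place, matching the characterization of a co-location given in the text.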
Starting with the larger subset (D1), we analyze the relationship between calling activity and users' locations. Among the pairs of communicating users, 400,000 cases have two users calling each other while in the same cell tower area, 350,000 of which have distinct home and work locations. Interestingly, 38.33% of the communicating users co-locate at least once during the period examined. When stronger relationships are considered (users who call on average at least once per month) the percentage increases to 69.41%.
Call duration appears to increase with the distance between users' homes (see black line in Figure 2). Calls that occur between co-located people (red line) have a shorter average call duration, suggesting that people who co-locate call each other briefly to coordinate the exact meeting place and time.
We also find that the number of calls between two users increases just before and after their co-location (Figure 3). The probability is rather constant in the interval, with two peaks around 0 and 1 (consecutive co-location events). The presence of these peaks suggests that the considered events (co-locations) represent a reasonable proxy for face-to-face meetings. In particular, a peak of calls just before the co-location event suggests that the two people are talking on the phone to arrange a meeting, in line with what is hypothesized in [16], [24]. The peak right after the co-location event might be explained by a follow-up call after the meeting.
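One way to read the 0-to-1 axis mentioned above is as the position of a call within the interval between two consecutive co-location events of the same pair. The following sketch implements that reading; it is an interpretation rather than the authors' stated procedure, and the function name and the use of plain numeric timestamps are assumptions.

```python
from bisect import bisect_right


def normalized_call_positions(call_times, colocation_times):
    """Position of each call between consecutive co-locations of a pair.

    0 means "at the previous co-location", values near 1 mean "just
    before the next one". Calls before the first or after the last
    co-location are skipped because they fall in no closed interval.
    """
    events = sorted(colocation_times)
    positions = []
    for t in call_times:
        i = bisect_right(events, t)
        if i == 0 or i == len(events):
            continue
        start, end = events[i - 1], events[i]  # start <= t < end
        positions.append((t - start) / (end - start))
    return positions
```

A histogram of these positions over all pairs would show whether calls cluster near 0 and 1, which is the pattern the text describes.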
We analyze the features of co-location places and compare them with geographical and communication differences between users. For every co-location event, we consider the distances traveled by the two users, and compute three measures of comparison:
- The median ratio between the shortest and longest distance at co-location time.
- The fraction of times a given user travels less than its peer.
- The fraction of times one of the users travels less than the peer.
The first measure allows a comparison to be made between the lengths of the two users' trips. On the D2 subset, we find that on average one user travels about 3 times less than the other one. Given this asymmetry in the length of trips, we ask whether the shorter trips are always taken by the same user, or whether the two users share the short trips. The third measure allows an evaluation of the asymmetry at the pair level, and suggests that in 94% of the selected pairs there is one user who consistently travels less than its peer. The second measure is a directed measure and is computed to see whether geographical and communication differences allow the user that travels less to be predicted. Text S3 reports how these measures vary with home distance, population density, normalized tie strength and call direction. In particular we find that as the distance between users' homes increases, co-locations occur in a place that is closer to one of the users. Moreover, the more the normalized tie strength differs between users, the more the co-locations occur in places close to one of them.
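The three measures can be sketched as follows. The notation is not taken from the paper: the function names, the use of planar (x, y) coordinates in arbitrary units, and the handling of ties (an event where both users travel the same distance counts for neither) are all assumptions made for illustration.

```python
from math import hypot
from statistics import median


def euclidean(p, q):
    """Straight-line distance between two (x, y) points."""
    return hypot(p[0] - q[0], p[1] - q[1])


def travel_measures(home_i, home_j, places):
    """Return (median_ratio, frac_i_less, frac_pair) for one pair.

    median_ratio: median over events of shortest / longest distance.
    frac_i_less:  fraction of events where user i travels less than j.
    frac_pair:    the larger of the two users' "travels less" fractions.
    """
    if not places:
        raise ValueError("at least one co-location place is required")
    d_i = [euclidean(home_i, p) for p in places]
    d_j = [euclidean(home_j, p) for p in places]
    ratios = [min(a, b) / max(a, b) for a, b in zip(d_i, d_j) if max(a, b) > 0]
    median_ratio = median(ratios) if ratios else float("nan")
    frac_i_less = sum(a < b for a, b in zip(d_i, d_j)) / len(places)
    frac_j_less = sum(b < a for a, b in zip(d_i, d_j)) / len(places)
    return median_ratio, frac_i_less, max(frac_i_less, frac_j_less)
```

For a pair whose homes are at (0, 0) and (10, 0) and who co-locate at (1, 0), (2, 0) and (8, 0), the first user travels less in two of three events, so the pair-level fraction is 2/3 and the median ratio is 0.25.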
Our definition of distance is based on the Euclidean distance between home and co-location places. Two limitations arise from this choice: 1) the Euclidean distance does not take into account the real path taken by a person; 2) the person might not travel directly from home, so the origin of the trip to the co-location place could be different. However, as we are interested in the relative distances traveled by the two peers, we can assume that both limitations affect the two measures in a similar manner, thus limiting the potential bias.
We evaluate the relationship between the distance between home locations and the number of co-locations between users. Figure 4(a) shows the average number of co-locations, which decreases with distance. The result is consistent with what was found in [12], [25]–[27] using data from surveys. If we compare this decrease with that of the number of phone calls and of the total call time (see Figure 4(a)), we find different decays with distance. Total call time is the least affected by distance (slope −0.04), followed by the number of calls (slope −0.07). In contrast, the number of co-locations is strongly affected by distance (slope −0.14). Even if we consider a broader definition of co-location, in which two users are considered co-located if they each make a phone call (not necessarily to each other) from the same cell tower area within one hour, we still find a similar decreasing trend, as shown in Figure 4(b), computed for the D2 subset. The results are consistent with those from the analysis of fixed phone data combined with interviews, showing the effect of distance on call duration and frequency of meetings [28], [29].
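The broader co-location definition can be sketched as a search for calls by two users from the same tower within a time window. The input layout `(user, tower, timestamp)`, the function name, and the choice to report every matching pair of calls (which may count one encounter several times) are assumptions for illustration; the text does not specify how matches are aggregated.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def broad_co_locations(sightings, user_a, user_b, window=timedelta(hours=1)):
    """Return (tower, time_a, time_b) for every pair of calls that the two
    users made from the same tower no more than `window` apart."""
    by_tower = {user_a: defaultdict(list), user_b: defaultdict(list)}
    for user, tower, ts in sightings:
        if user in by_tower:
            by_tower[user][tower].append(ts)
    matches = []
    for tower, times_a in by_tower[user_a].items():
        for ta in times_a:
            # .get() avoids creating empty entries in the defaultdict
            for tb in by_tower[user_b].get(tower, []):
                if abs(ta - tb) <= window:
                    matches.append((tower, ta, tb))
    return matches
```

The nested loop is quadratic in the number of calls per tower, which is one reason a computation like this may be run on a smaller subset such as D2.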
The number of calls has a strong influence on the number of co-locations, suggesting that the more people call each other, the more they co-locate (see Figure 5). As there appears to be a clear relationship between call patterns, distance and co-locations, we tried to build a predictor of the number of co-locations, starting from a measure of interaction (number of calls) and the geographical distance between users' homes, obtaining R² = 0.61 with the model shown in Figure 6.
This result suggests that geography and telecommunication interactions account for 61% of variations in the number of co-locations (see also Text S4). This is consistent during the one year time frame under analysis, as reported in Text S5. The exponent for the number of calls reveals the correlation between an increase in the number of calls and an increase in the number of co-locations. This result suggests that telecommunications might play a complementary role in facilitating face-to-face interactions, supporting the observations found in other studies [4], [5].
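To make the idea of such a predictor concrete, the sketch below fits a power-law model by ordinary least squares on log-transformed values and reports the coefficient of determination. The functional form is an assumption suggested only by the mention of an exponent; the fitted equation and its coefficients are not reproduced in this text, and all names here are hypothetical. Pairs with zero co-locations cannot be log-transformed and would need separate handling.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_log_linear(calls, distances, colocations):
    """Fit log(colocations) = b0 + b1*log(calls) + b2*log(distance).

    Returns ([b0, b1, b2], r_squared). All inputs must be positive.
    """
    rows = [[1.0, math.log(c), math.log(d)] for c, d in zip(calls, distances)]
    y = [math.log(n) for n in colocations]
    k = 3
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot
```

Under this assumed form, `b1` plays the role of the exponent on the number of calls and `b2` that of the exponent on home distance; an R² of 0.61 would mean the two predictors explain 61% of the variance of the log-transformed co-location counts.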
Discussion
In this study we analyze one year of telecommunications data from a large European cellphone operator to investigate the relationship between people's calls and their physical location.
We discover that more than 90% of users who called each other have also shared the same space (cell tower), even if they live far apart. Moreover, we find that 69% of users who call each other frequently (at least once per month on average) have shared the same space at the same time - an instance that we call co-location. Co-locations appear highly indicative of coordination calls occurring just before face-to-face meetings. We are able to predict 61% of variations in the number of co-locations from the number of calls and the distance between users' homes. In particular, as the distance between homes increases, the expected number of co-locations decreases.
We also characterize the co-location places in terms of distance from the home locations. As the distance between users' homes increases, co-locations occur in a place that is closer to one of the users. In more than 90% of the cases, co-locations take place in an area that is closer to the same user of the pair (there is low reciprocity in the travel distance covered). Telecommunication strength helps predict which person of the pair travels less.
We believe that the above results suggest new ways to use CDRs to investigate the old conundrum of the interplay between telecommunications, travel and face-to-face meetings - with applications in the social sciences, urban planning and transportation studies.
Methods
Dataset
We use a large anonymized dataset of billing records for over one million mobile phone users, which was gathered in Portugal over a twelve month period between 2006 and 2007. To safeguard personal privacy, individual phone numbers were anonymized by the operator before leaving storage facilities, and they were identified with a security ID (hash code). Each entry in the dataset is a CDR, which consists of the following information: timestamp, caller's ID, callee's ID, call duration, caller's cell tower ID, and callee's cell tower ID. This metadata on each call allows us to study both the mobile social interaction and the physical location of the users within the dataset. Notably, the dataset does not contain information regarding text messages (SMS) or data usage (internet). More details about the dataset can be found in Text S1.
Supporting Information
Acknowledgments
The authors thank Dima Ayyash, Dominik Dahlem, Santi Phithakkitnukoon and Prudence Robinson for their feedback, and Orange Labs, IBM Research, the National Science Foundation, the AT&T Foundation, the MIT SMART program, GE, Audi Volkswagen, SNCF, ENEL and the members of the MIT Senseable City Lab Consortium for supporting the research.
Footnotes
Competing Interests: Francesco Calabrese is employed by IBM Research. Zbigniew Smoreda is employed by Orange Labs. These affiliations do not alter the authors' adherence to all PLoS ONE policies on the sharing of data and materials.
Funding: The authors were partially funded by the AT&T Foundation, the National Science Foundation (grant number 0735956), the MIT SMART program, GE, Audi Volkswagen, SNCF and ENEL. Francesco Calabrese was partially funded by IBM Research, which had no role in study design, data collection, data analysis, decision to publish, or preparation of the manuscript. Zbigniew Smoreda was funded by Orange Labs, which contributed to data collection and had no role in study design, data analysis, decision to publish, or preparation of the manuscript. The other funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Albertson LA. Telecommunications as a travel substitute: Some psychological, organizational, and social aspects. Journal of Communication. 1977;27:32–43.
- 2. Cairncross F. The death of distance. Harvard Business School Press; 1997.
- 3. Mitchell WJ. City of bits: space, place, and the infobahn. MIT Press; 1996.
- 4. Mokhtarian PL. Telecommunications and travel: The case for complementarity. Journal of Industrial Ecology. 2003;6:43–57.
- 5. Mok D, Wellman B, Carrasco J. Does distance matter in the age of the internet? Urban Studies. 2010;47:2747–2783.
- 6. Choo S, Lee T, Mokhtarian PL. Do transportation and communications tend to be substitutes, complements, or neither? U.S. consumer expenditures perspective, 1984–2002. Transportation Research Record. 2010:121–132.
- 7. Sasaki K, Nishii K. Measurement of intention to travel: Considering the effect of telecommunications on trips. Transportation Research Part C. 2010;18:36–44.
- 8. Salomon I. Technological change and social forecasting: the case of telecommuting as a travel substitute. Transportation Research Part C: Emerging Technologies. 1998;6:17–45.
- 9. Paez A, Scott D. Social influence on travel behavior: a simulation example of the decision to telecommute. Environment and Planning A. 2007;39:647–665.
- 10. Arentze T, Timmermans H. Social networks, social interactions, and activity-travel behavior: a framework for microsimulation. Environment and Planning B. 2008;35:1012–1027.
- 11. Urry J. Sociology beyond societies: mobilities for the 21st century. Routledge, London; 1999.
- 12. Larsen J, Urry J, Axhausen K. Mobilities, networks, geographies. Ashgate, Aldershot; 2006.
- 13. Sheller M, Urry J. The new mobilities paradigm. Environment and Planning A. 2006;38:207–226.
- 14. Onnela J, Saramaki J, Hyvonen J, Szabo G, Lazer D, et al. Structure and tie strengths in mobile communication networks. Proceedings of the National Academy of Sciences. 2007;104:7332. doi: 10.1073/pnas.0610245104.
- 15. Eagle N, Pentland AS, Lazer D. Inferring friendship network structure by using mobile phone data. Proceedings of the National Academy of Sciences. 2009;106:15274–15278. doi: 10.1073/pnas.0900282106.
- 16. Lambiotte R, Blondel V, de Kerchove C, Huens E, Prieur C, et al. Geographical dispersal of mobile communication networks. Physica A: Statistical Mechanics and its Applications. 2008;387:5317–5325.
- 17. Krings G, Calabrese F, Ratti C, Blondel V. A gravity model for inter-city telephone communication networks. Journal of Statistical Mechanics: Theory and Experiment. 2009:L07003.
- 18. Reades J, Calabrese F, Sevtsuk A, Ratti C. Cellular census: Explorations in urban data collection. IEEE Pervasive Computing. 2007;6:30–38.
- 19. Gonzalez M, Hidalgo C, Barabasi AL. Understanding individual human mobility patterns. Nature. 2008;453:779–782. doi: 10.1038/nature06958.
- 20. Wang P, Gonzalez M, Hidalgo C, Barabasi AL. Understanding the spreading patterns of mobile phone viruses. Science. 2009;324:1071–1076. doi: 10.1126/science.1167053.
- 21. Song C, Qu Z, Blumm N, Barabasi AL. Limits of Predictability in Human Mobility. Science. 2010;327:1018–1021. doi: 10.1126/science.1177170.
- 22. Calabrese F, Pereira F, DiLorenzo G, Liu L. The geography of taste: analyzing cell-phone mobility and social events. International Conference on Pervasive Computing; 2010.
- 23. Crandall D, Backstrom L, Cosley D, Suri A, Huttenlocher D, et al. Inferring social ties from geographic coincidences. Proceedings of the National Academy of Sciences. 2010;107:22436–22441. doi: 10.1073/pnas.1006155107.
- 24. Diminescu D, Licoppe C, Smoreda Z, Ziemlicki C. Tailing untethered mobile users: Studying urban mobilities and communication practices. In: The reconstruction of space and time: Mobile communication practices. New Brunswick and London: Transaction Publishers; 2008.
- 25. Larsen J, Axhausen K, Urry J. Geographies of social networks: Meetings, travels and communications. Mobilities. 2006;1:261–283.
- 26. Wheeler J, Stutz F. Spatial dimensions of urban social travel. Annals of the Association of American Geographers. 1971;61:371–386.
- 27. Carrasco J, Miller E, Wellman B. How far and with whom do people socialize?: Empirical evidence about distance between social network members. Transportation Research Record: Journal of the Transportation Research Board. 2008;2076:114–122.
- 28. Licoppe C, Smoreda Z. Rhythms and ties: towards a pragmatics of technologically-mediated sociability. In: Computers, Phones, and the Internet: Domesticating Information Technology. Oxford University Press; 2006.
- 29. Licoppe C, Smoreda Z. Liens sociaux et régulations domestiques dans l'usage du téléphone. de l'analyse quantitative de la durée des conversations à l'examen des interactions. Réseaux. 2000;18:253–276.