PLOS ONE. 2012 Apr 18;7(4):e34487. doi: 10.1371/journal.pone.0034487

Collective Human Mobility Pattern from Taxi Trips in Urban Area

Chengbin Peng 1,2, Xiaogang Jin 2, Ka-Chun Wong 1, Meixia Shi 3, Pietro Liò 4,*
Editor: Matjaz Perc 5
PMCID: PMC3329492  PMID: 22529917

Abstract

We analyze passenger traffic patterns in 1.58 million taxi trips in Shanghai, China. By employing non-negative matrix factorization and optimization methods, we find that people travel on workdays mainly for three purposes: commuting between home and workplace, traveling from workplace to workplace, and others such as leisure activities. Therefore, traffic flow in one area or between any pair of locations can be approximated by a linear combination of three basis flows, corresponding to the three purposes respectively. We call the coefficients in the linear combination traffic powers; each indicates the strength of one basis flow. The traffic powers on different days typically differ even for the same location, owing to the uncertainty of human motion. Therefore, we provide a probability distribution function for the relative deviation of the traffic power. This distribution function is expressed as a series of normalized binomial distribution functions. It can be well explained by statistical theory and is verified by empirical data. These findings are applicable to predicting road traffic, tracing traffic patterns, and diagnosing traffic-related abnormal events. These results can also be used to infer land uses of urban areas quite parsimoniously.

Introduction

Urban traffic has drawn the attention of physicists for more than a decade. Generally, there have been two kinds of approaches to traffic analysis. In microscopic models, some researchers represent vehicles as particles interacting with each other [1], [2], while others use the cellular automata framework [1], [3], [4]. Based on game theory, the impact of individuals' irregular behaviors on the traffic system has also been emphasized [5]. On the other hand, from the macroscopic perspective, ideas from fluid dynamics have been introduced [1], [6].

In recent years, a new and more fundamental approach to traffic analysis has been emerging: the study of human mobility, which draws statistical inferences from enormous amounts of empirical data [7]–[9]. Several reasons motivate research in this area.

Firstly, knowledge of the mobility pattern is essential in traffic modeling [10], [11] for simulation, forecasting [12], [13] and control [11]. In addition, by checking whether the traffic flow measured during some time interval agrees with a verified estimate, collective mobility analysis can serve as a tool for defining and detecting abnormalities [14], [15]. Compared to computer-vision-based detection [16], [17], abnormality detection based on a collective mobility model can be applied over a much larger area, for example, a whole city.

Secondly, the mobility pattern and the resulting traffic flow also interact with land use. The characteristics of travel strongly influence urban formation, evolution, and future planning [18]–[21], whereas land use in turn affects urban traffic [22]–[24] and human mobility [25].

Thirdly, a better understanding of human mobility can help control the spread of contagious diseases by limiting contact among individuals [26], since the movement of infected people from one place to another is an important route of infection for susceptible individuals, whether in a small-scale area [27], [28] or from a worldwide viewpoint [29]–[31]. Similar theories hold for the spread of malicious code among wireless communication devices [32], [33].

Owing to the importance of human mobility research, and to the availability of large amounts of empirical data as a consequence of the prevalence of wireless communication devices, researchers have become increasingly interested in the statistical features of human mobility patterns in real-world data [34]. Ref. [7] and Ref. [9] suggest that human travel is reminiscent of Lévy flights [35], based on the trajectories of bank notes and taxis respectively, while Ref. [36] reports some deviations based on GPS information from volunteers. These differences were later recognized as a result of the periodic pattern of individuals' travel [8], and more recently Ref. [37] finds that individual locations are predictable for up to 93% of the total time in its data set, which contains trajectories of mobile phone users. For taxi trips, Ref. [38] studies the distribution of travel distances and times.

Nevertheless, previous statistical inferences of human mobility mostly focus on the individual level, while this article analyzes citizens' collective dynamics in the urban area. In our research, based on travel purposes, we discover three distinct basis patterns for collective traffic flow regardless of location. In addition, we reveal a distribution that characterizes the fluctuation of the traffic flow at any time in each location. As mentioned above, these findings can be useful for urban planning, traffic estimation and anomaly detection. Further studies on the interaction between different areas will provide a more detailed collective mobility model, and would additionally benefit research on epidemic spreading in urban areas.

Analysis

Data Description and Background Assumptions

In this research, the data [39] were collected from about two thousand taxis operating within the urban area of Shanghai, China. These data mainly cover the central part of the city, whose population is about seven million according to the fifth national population census [40]. The information about when and where passengers were picked up and dropped off can be retrieved from the raw data, and every pair of pick-up and drop-off records is defined as a taxi trip. The data set includes about 1.58 million taxi trips. The GPS longitude and latitude in the data are converted to positions in a planar coordinate system, with the city landmark Oriental Pearl Tower as the origin. For ease of analysis and representation, the urban area is divided into squares, similar to a chessboard. Each square has a side length of 200 meters. In our context, each location corresponds to one of these squares. More details can be found in Appendix S1.

Basis Traffic Flows: the Constancy

As we know, even a 200 m × 200 m area in a city can contain land of several different types, for example, schools, shops and apartments at the same time. In this section, we will discuss how to categorize the taxi trips according to their travel purposes, and then use these categories to infer the land use composition of each square.

First of all, we consider the categorization of taxi trips. People setting out from the same location may have different purposes: some may go to workplaces while others may go for entertainment. Meanwhile, for trips belonging to the same category but in different locations, the collective pattern of departure and arrival times should be similar given a large amount of data. For example, if the number of trips between residential areas and workplaces (for commuting) peaks at 8:00 am (going to work) and 5:00 pm (getting off work), then the number of trips in this category in any place would peak at almost the same times, although the scale may differ.

In short, we can define a set of basis collective patterns, each corresponding to one trip category. Linear combinations of these patterns can then describe the macro travel pattern of each location. Finally, the coefficients in a linear combination reflect the land uses of the location.

Directly from the taxi data, we can only calculate the macro patterns. Therefore, we should adopt appropriate inference methods to find the basis patterns and the coefficients for each location.

To describe our method more formally, we define Inline graphic to index the square in the Inline graphic-th row and Inline graphic-th column among all the squares into which the city is divided. If Inline graphic is the number of rows and Inline graphic is the number of columns of squares in the map, then Inline graphic, and Inline graphic. Let Inline graphic be the number of time slots, normally 24 for one day. Therefore, for location Inline graphic, the numbers of departure and arrival trips (the macro pattern) over time each day can be represented by a Inline graphic vector Inline graphic, which is easy to calculate. We can also define a set of Inline graphic vectors containing normalized numbers of trips over time: Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic, each for one basis pattern that we seek.

The macro pattern is a linear combination of basis patterns, so we have

graphic file with name pone.0034487.e019.jpg (1)

where Inline graphic is a row vector containing Inline graphic coefficients for the linear combination on the right-hand side.

By taking all the locations into account, it can also be written as

graphic file with name pone.0034487.e022.jpg (2)

and abbreviated as

graphic file with name pone.0034487.e023.jpg (3)
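
Because the equation images did not survive in this copy, the following is only a sketch of what Eqs. (1)–(3) plausibly express, inferred from the surrounding text. Apart from B, which appears in the caption of Fig. 1, every symbol is an assumption made for illustration: A_{i,j} for the macro pattern of location (i,j), P_{i,j} for its row vector of coefficients, K for the number of basis patterns, M and N for the numbers of rows and columns of squares, and L for the vector length (the text ties it to the number of time slots, but the exact dimension is not recoverable here).

```latex
% Eq. (1), assumed form: the macro pattern is a nonnegative
% linear combination of the K basis patterns
A_{i,j} \;=\; P_{i,j}\, B \;=\; \sum_{k=1}^{K} P_{i,j}^{(k)}\, B_k ,
\qquad
A_{i,j} \in \mathbb{R}_{\ge 0}^{1 \times L},\quad
P_{i,j} \in \mathbb{R}_{\ge 0}^{1 \times K},\quad
B_k \in \mathbb{R}_{\ge 0}^{1 \times L}

% Eqs. (2)-(3), assumed form: stacking all M N locations row by row
\begin{bmatrix} A_{1,1} \\ \vdots \\ A_{M,N} \end{bmatrix}
=
\begin{bmatrix} P_{1,1} \\ \vdots \\ P_{M,N} \end{bmatrix}
\begin{bmatrix} B_1 \\ \vdots \\ B_K \end{bmatrix}
\quad\Longleftrightarrow\quad
A = P\,B
```

In this reading, each row of P acts only on the matching row of A, and the rows of B are shared by all locations.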

Because both matrices on the right-hand side of Eq. (3) are unknown, many matrix decomposition methods could apply. However, according to the physical meaning of Inline graphic and Inline graphic, all the entries of these two matrices should be nonnegative. Therefore, we choose nonnegative matrix factorization (NMF) [41], [42] for the decomposition.

In our context, it is a method to factorize a matrix Inline graphic into two nonnegative factors Inline graphic and Inline graphic approximately. By this approach, we can find the basis patterns (the row vectors of Inline graphic) and the parameter vectors (the row vectors of Inline graphic) simultaneously. As vector Inline graphic (the Inline graphicth row of matrix Inline graphic) is only responsible for vector Inline graphic (the Inline graphicth row of matrix Inline graphic), in fact, each element of Inline graphic denotes the scale of traffic flow with respect to the corresponding category, in location Inline graphic. Hence, we also call these elements the traffic power because they reflect how strong the traffic flows of different categories are.
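
The factorization described above can be sketched in a few lines. This is a minimal illustration rather than the implementation of Appendix S2: it uses the standard multiplicative update rules for the squared-error objective, and the names `nmf`, `A`, `P`, `B` as well as the synthetic data are assumptions made for the example.

```python
import numpy as np

def nmf(A, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix A (locations x time slots) as P @ B.

    P (locations x k) holds the coefficients of each location and
    B (k x time slots) holds the basis patterns; both stay nonnegative.
    """
    rng = np.random.default_rng(seed)
    n_loc, n_time = A.shape
    P = rng.random((n_loc, k)) + eps
    B = rng.random((k, n_time)) + eps
    for _ in range(n_iter):
        # Multiplicative updates for the squared Frobenius error ||A - P B||^2.
        B *= (P.T @ A) / (P.T @ P @ B + eps)
        P *= (A @ B.T) / (P @ B @ B.T + eps)
    # Rescale so each basis pattern sums to 1; the scale moves into P,
    # which leaves the product P @ B unchanged.
    scale = np.maximum(B.sum(axis=1), eps)
    return P * scale[None, :], B / scale[:, None]

# Synthetic example (hypothetical data): 60 locations, 24 hourly slots,
# three hidden patterns peaking at 8:00, 13:00 and 20:00.
hours = np.arange(24)
true_B = np.stack([np.exp(-0.5 * ((hours - c) / 1.5) ** 2) for c in (8, 13, 20)])
true_P = np.random.default_rng(1).random((60, 3)) * 100
A = true_P @ true_B

P, B = nmf(A, k=3)
rel_err = np.linalg.norm(A - P @ B) / np.linalg.norm(A)
print(f"relative reconstruction error: {rel_err:.4f}")
```

After the rescaling step, each row of `B` is a normalized pattern and each row of `P` plays the role of the traffic power of one location.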

Now the only thing left is to determine Inline graphic, the number of the basis patterns.

From the algorithmic perspective, we note that NMF starts with random initial conditions [41]. In experiments on the taxi data with many different random initial conditions, we find that the factorization results are stable only when Inline graphic equals 3. This indicates that with parameter Inline graphic, NMF can extract statistically significant characteristics of the data. Fig. 1 shows the resulting basis patterns Inline graphic, Inline graphic and Inline graphic.
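
One way to probe this kind of stability is to repeat the factorization from several random seeds and compare the resulting basis patterns. The text does not state how stability was measured, so the cosine-similarity score below is an assumption, as are the function names and the synthetic data.

```python
import numpy as np

def nmf_basis(A, k, seed, n_iter=300, eps=1e-9):
    """Return unit-norm basis patterns from one randomly initialized NMF run."""
    rng = np.random.default_rng(seed)
    P = rng.random((A.shape[0], k)) + eps
    B = rng.random((k, A.shape[1])) + eps
    for _ in range(n_iter):
        B *= (P.T @ A) / (P.T @ P @ B + eps)
        P *= (A @ B.T) / (P @ B @ B.T + eps)
    norms = np.maximum(np.linalg.norm(B, axis=1, keepdims=True), eps)
    return B / norms

def stability(A, k, seeds=(0, 1, 2, 3, 4)):
    """Average best-match cosine similarity between runs (1.0 = identical)."""
    runs = [nmf_basis(A, k, s) for s in seeds]
    ref, scores = runs[0], []
    for other in runs[1:]:
        sim = ref @ other.T  # k x k cosine similarities between patterns
        scores.append(sim.max(axis=1).mean())
    return float(np.mean(scores))

# Hypothetical data: three hidden patterns, sparse coefficients, slight noise.
hours = np.arange(24)
true_B = np.stack([np.exp(-0.5 * ((hours - c) / 1.5) ** 2) for c in (8, 13, 20)])
rng = np.random.default_rng(42)
true_P = rng.random((80, 3)) * (rng.random((80, 3)) < 0.6)
A = true_P @ true_B + 0.01 * rng.random((80, 24))

scores = {k: stability(A, k) for k in (2, 3, 4, 5)}
for k, s in scores.items():
    print(f"K={k}: stability score {s:.3f}")
```

A score that stays near 1 for one value of K and drops for the others would be consistent with the behavior reported for the taxi data, although the criterion used there may differ.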

Figure 1. Basis Pattern B: Green is B1, Red is B2, and Blue is B3.

Solid Lines Represent the Mean Inline graphic, while Dashed Lines Represent the Positive and Negative Deviations Averaged on Different Days.

On the other hand, from the land-use and trip-category perspective, Inline graphic is a reasonable choice in categorizing trip purposes.

There are several land-use definitions related to the topic of mobility. For example, each place may be classified as a residential (home), working, shopping, or recreational location [27]. It may also be regarded as one of the following types: a residential area, a workplace, a commercial zone, a recreation area and educational facilities [43]. In Ref. [44], these types are simplified into workplace, home and shop. Specifically for the city of Shanghai based on GIS information, Ref. [45] refers to the land types including residence, industry, agriculture, roads, water, land for construction and other urban land. In our context, we can simplify the land-use definition to be: residences, workplaces and others. Here workplaces shall include any industrial and office workplaces as well as schools, and other places can include shopping and recreational facilities, hospitals, etc.

For trips, some scientists categorize these individual activities into several orientations: family, work, leisure and service-based movement [46]. Similarly, according to our land-use definition, we can use three purpose-based categories for the trips: commuting between home and workplace (Inline graphic), business travel between two workplaces (Inline graphic), and trips from or to other places (Inline graphic). This representation accords with the algorithmic result in Fig. 1. Taking a typical workday as an example, under our three categories the major traffic flows in the city are expected to be as follows: from home to workplaces in the early morning (green line), from one workplace to another in the daytime (red line), from workplace to home or to other places at dusk (green line again), and between other places and home at night (blue line).

Therefore, Inline graphic is an effective and reasonable choice.

In the following sections with Inline graphic, for clarity, we will use Inline graphic, Inline graphic and Inline graphic to replace Inline graphic, Inline graphic and Inline graphic respectively:

graphic file with name pone.0034487.e058.jpg (4)

We also use Inline graphic, Inline graphic and Inline graphic to represent the three entries in vector Inline graphic:

graphic file with name pone.0034487.e063.jpg (5)

Appendix S2 describes in detail how NMF is applied to this problem. The basis patterns on different days are averaged to Inline graphic. Then, Inline graphic, the traffic power, can be recalculated based on Inline graphic for each day. If it varies within an acceptable interval from day to day, the daily average of Inline graphic, represented by Inline graphic, can indicate the land use of location Inline graphic. For example, if Inline graphic is large, then the traffic flow corresponding to basis pattern Inline graphic is large, suggesting that location Inline graphic serves mainly as residences or workplaces, while if Inline graphic is the largest, we can be quite sure that this location is mainly for workplaces. In addition, if the variation of Inline graphic on some day falls outside the acceptable interval, it indicates that something abnormal happened on that day. This feature can be helpful for detecting anomalies in human activities over a large area. In the next section, we analyze the variation of Inline graphic to determine what constitutes an acceptable interval.
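
Appendix S2 is not reproduced here, so the following is only one plausible way to recompute the traffic power of a single day once the averaged basis pattern is fixed: a nonnegative least-squares fit obtained by multiplicative updates on the coefficients alone. The names `traffic_power`, `A_day`, `B` and the toy numbers are assumptions.

```python
import numpy as np

def traffic_power(A_day, B, n_iter=200, eps=1e-12):
    """Nonnegative coefficients P with A_day ~= P @ B, keeping the basis B fixed.

    A_day: (locations x time slots) trip counts for one day.
    B:     (k x time slots) averaged basis patterns.
    """
    P = np.ones((A_day.shape[0], B.shape[0]))
    ABt, BBt = A_day @ B.T, B @ B.T
    for _ in range(n_iter):
        # Multiplicative update for P only; B is never modified.
        P *= ABt / (P @ BBt + eps)
    return P

# Toy basis with three non-overlapping patterns over six time slots.
B = np.array([[0.5, 0.5, 0.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0, 0.5, 0.5]])
P_true = np.array([[20.0, 0.0, 10.0],
                   [0.0, 30.0, 40.0]])
P_est = traffic_power(P_true @ B, B)
print(np.round(P_est, 3))
```

Running this once per day gives a time series of coefficients per location, whose daily average and day-to-day spread are the quantities discussed above.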

Daily Traffic Power: the Variation

Typically in a city, the volume of the traffic flow is quite regular every day [8]. However, even at the same time in the same location, the volume varies within a certain range from day to day. This section is devoted to analyzing how Inline graphic fluctuates from day to day. Here, Inline graphic is calculated from the average basis pattern Inline graphic according to Appendix S2.

We define a random variable Inline graphic to represent the relative deviation of the traffic power.

The empirical distribution function of Inline graphic can be simply extracted from a collection of the following expressions in different locations on different days:

graphic file with name pone.0034487.e081.jpg (6)

where Inline graphic means the daily average, as we have used.
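
The image of Eq. (6) is missing from this copy. Judging from the later statement that the variable equals a count minus its mean, divided by the mean, a reasonable reading is (value − daily average) / daily average. The sketch below computes that quantity for an array of daily traffic powers; the function name and array layout are assumptions.

```python
import numpy as np

def relative_deviation(power):
    """Relative deviation from the daily average.

    power: array of shape (days, locations, categories).
    Returns an array of the same shape; entries whose daily average is
    zero are set to NaN because the ratio is undefined there.
    """
    mean = power.mean(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        x = (power - mean) / mean
    return np.where(mean > 0, x, np.nan)

# Two days, one location, two categories (hypothetical numbers).
power = np.array([[[2.0, 1.0]],
                  [[4.0, 3.0]]])
x = relative_deviation(power)
print(x[:, 0, 0])  # deviations of the first category on the two days
```

Pooling the non-NaN entries of `x` over all locations and days gives the sample from which an empirical distribution function can be built.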

We also find the theoretical distribution function of Inline graphic, which is more complex.

First, we try to find Inline graphic only for the first category of trips in location Inline graphic. We define Inline graphic as the potential population that may affect the first-category traffic in this location, and Inline graphic as the probability (ratio) that an individual in the population finally becomes part of that traffic flow. Then the number of such trips follows a binomial distribution:

graphic file with name pone.0034487.e088.jpg (7)

where Inline graphic can be any non-negative integer less than Inline graphic. Because it is a binomial distribution, the corresponding CDF can be written in terms of the beta functions:

graphic file with name pone.0034487.e091.jpg (8)

where Inline graphic. Inline graphic is the incomplete beta function as Inline graphic and Inline graphic is the beta function as Inline graphic. Eq. (8) is strictly equal when Inline graphic is a positive integer, while for a real positive number of Inline graphic, we may use this approximation:

graphic file with name pone.0034487.e099.jpg (9)
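
For reference, the standard identities that Eqs. (7)–(9) appear to rely on are written out here. The notation is assumed, since the original symbols are lost: n is the number of trips, N the potential population, p the per-individual probability, and k a count.

```latex
% Binomial probability mass function
\Pr(n = k) \;=\; \binom{N}{k}\, p^{k}\, (1-p)^{N-k},
\qquad k = 0, 1, \dots, N

% Binomial CDF through the regularized incomplete beta function
% (exact for integer k with 0 \le k < N)
\Pr(n \le k) \;=\; I_{1-p}(N-k,\; k+1)
\;=\; \frac{B(1-p;\; N-k,\; k+1)}{B(N-k,\; k+1)}

% with the incomplete and complete beta functions
B(x;\, a,\, b) \;=\; \int_{0}^{x} t^{\,a-1} (1-t)^{\,b-1}\, dt,
\qquad
B(a,\, b) \;=\; B(1;\, a,\, b)
```

Because the right-hand side of the CDF identity is defined for real arguments, evaluating it at a non-integer k gives a smooth interpolation between the integer points, which matches the role the text assigns to Eq. (9).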

According to the definition, Inline graphic, where Inline graphic is equivalent to Inline graphic by the property of expectation of the binomial distribution, and can be treated as a constant for a given location. Therefore, the probability density function (PDF) of Inline graphic is:

graphic file with name pone.0034487.e104.jpg (10)

where Inline graphic should satisfy the condition that Inline graphic is a non-negative integer. The cumulative distribution function (CDF) is

graphic file with name pone.0034487.e107.jpg (11)

where Inline graphic, and ⌊·⌋ denotes the floor function. We call this distribution the normalized binomial distribution of Inline graphic. As listed in Appendix S3, the moment generating functions of Inline graphic indicate that Inline graphic plays an essential role in the distribution. Numerical simulations also provide evidence that the distribution of Inline graphic is strongly affected by Inline graphic (the product of Inline graphic and Inline graphic), but depends very little on Inline graphic or Inline graphic alone. Therefore, we can assign a constant integer Inline graphic to Inline graphic.
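
The claim that the product of the two parameters matters far more than either one alone can be checked numerically. The sketch below evaluates the CDF of the relative deviation X = (n − Np)/(Np) for a binomial count n; the symbols N and p and the function name are assumptions, and the small tolerance added before taking the floor only guards against floating-point rounding.

```python
from math import floor

def normalized_binomial_cdf(x, n_pop, p):
    """P(X <= x) for X = (n - N p) / (N p) with n ~ Binomial(N, p), 0 < p < 1."""
    mean = n_pop * p
    k_max = floor((1.0 + x) * mean + 1e-9)  # largest count with deviation <= x
    if k_max < 0:
        return 0.0
    k_max = min(k_max, n_pop)
    pmf = (1.0 - p) ** n_pop  # P(n = 0)
    total = pmf
    for k in range(k_max):
        # Recurrence P(n = k+1) = P(n = k) * (N - k)/(k + 1) * p/(1 - p).
        pmf *= (n_pop - k) / (k + 1) * p / (1.0 - p)
        total += pmf
    return total

# Same product N p = 20, very different N and p.
cdf_a = normalized_binomial_cdf(0.2, 1_000, 0.02)
cdf_b = normalized_binomial_cdf(0.2, 10_000, 0.002)
print(cdf_a, cdf_b)
```

This agrees with a short calculation: X has mean 0 and variance (1 − p)/(Np), so for small p the spread is governed almost entirely by Np.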

Let Inline graphic be a vector containing all the possible values of Inline graphic. Then the PDF of Inline graphic with Inline graphic can be written in this form

graphic file with name pone.0034487.e124.jpg (12)

and the CDF is

graphic file with name pone.0034487.e125.jpg (13)

Finally, we discuss how to make Inline graphic representative for variations of any traffic category in any location. We define a vector Inline graphic, in which each entry Inline graphic denotes the proportion of traffic flow corresponding to Inline graphic. Then for a randomly selected traffic flow, when the average number of trips Inline graphic is not given, a general expression for the CDF of Inline graphic is

graphic file with name pone.0034487.e132.jpg (14)

Using the beta approximation in Eq. (9), it can be written in a continuous form

graphic file with name pone.0034487.e133.jpg (15)
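
The mixture described for Eq. (14) can be sketched as a weighted sum of normalized binomial CDFs that share one fixed population size and differ in their mean number of trips. This is an illustration of the discrete form only; the parameter names, the value of `n_pop`, and the example means and weights are assumptions, not fitted values.

```python
from math import floor

def normalized_binomial_cdf(x, n_pop, p):
    """P(X <= x) for X = (n - N p) / (N p) with n ~ Binomial(N, p), 0 < p < 1."""
    mean = n_pop * p
    k_max = floor((1.0 + x) * mean + 1e-9)
    if k_max < 0:
        return 0.0
    k_max = min(k_max, n_pop)
    pmf = (1.0 - p) ** n_pop
    total = pmf
    for k in range(k_max):
        pmf *= (n_pop - k) / (k + 1) * p / (1.0 - p)
        total += pmf
    return total

def mixture_cdf(x, means, weights, n_pop=10_000):
    """Weighted sum of normalized binomial CDFs, one per mean trip count."""
    total_w = sum(weights)
    return sum(
        (w / total_w) * normalized_binomial_cdf(x, n_pop, m / n_pop)
        for m, w in zip(means, weights)
    )

# Hypothetical mixture: mean trip counts and the share of flows with each mean.
means = [5.0, 20.0, 80.0]
weights = [0.5, 0.3, 0.2]
grid = [-1.0, -0.5, -0.2, 0.0, 0.2, 0.5, 1.0]
values = [mixture_cdf(x, means, weights) for x in grid]
for x, v in zip(grid, values):
    print(f"x = {x:+.1f}  F(x) = {v:.4f}")
```

Components with small means spread the deviation widely, while components with large means concentrate it near zero, so the mixture generally has a different shape from a single normal curve.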

Results

In this section, we demonstrate how our theoretical results are supported by the empirical investigation.

The general characteristics of our data set, such as the displacement distribution in Fig. 2 and the visiting frequency distribution in Fig. 3, are similar to those reported by others [8], [38]. The plot of daily traffic flow in Fig. 4 shows some hot areas in red, including the most flourishing commercial street, Nanjing Road, as the largest red block, Shanghai Railway Station, Shanghai South Railway Station, Lujiazui Finance & Trade Zone, etc. The largest isolated area in blue is the Pudong International Airport.

Figure 2. Traveling Distance Distribution.

Figure 3. Visiting Frequency Distribution of Different Locations.

Figure 4. The Average Traffic Flow of Each Location, and the Tags Corresponding to Following Locations:

① Shanghai Railway Station; ② Nanjing Road & People's Square; ③ Lujiazui Finance & Trade Zone; ④ Shanghai South Railway Station; ⑤ Pudong International Airport.

Without any intentional intervention, NMF with random initial values yields normalized basis patterns that are generally quite similar across workdays (Fig. 1). Therefore, we can use the traffic power Inline graphic to analyze the mean and the deviation of daily traffic.

In Fig. 5, the three components of Inline graphic in every location are normalized and represented by yellow, red and blue respectively. For example, a location shown in yellow means that the traffic flow of the first category (Inline graphic: commuting between home and workplace) is dominant there. Mixed colors in some places indicate a mixture of traffic flows of different categories.

It is noticeable that in areas where the traffic flow is large, the positive (Fig. 6(a)) and negative (Fig. 6(b)) deviations of the traffic power Inline graphic are quite small. The distribution of this deviation is shown in Fig. 7(a) and Fig. 7(b), and is fitted well by Eq. (15). The fit is quite different from the best-fitting normal distribution suggested by the central limit theorem, which supports the statement of Eq. (14) and Eq. (15) that Inline graphic should be a collection of random variables following a set of distributions with different parameters. The proportion of traffic flow with Inline graphic is Inline graphic, as plotted in Fig. 8. Here we limit each Inline graphic to be no larger than twice the empirical value. According to the result in Fig. 7, for the whole city, 80% of the deviations are within the range of Inline graphic. Although the lengths of vectors Inline graphic and Inline graphic are both 50 in our estimation, the number of active pairs (Inline graphic) of Inline graphic and Inline graphic is only about 10, and this number can be reduced if we calculate only for a small area, given a sufficient amount of data. In short, Eq. (15) is a reasonable approximation for the relative deviation of the daily traffic flow.

Fig. 9(b) presents the components of Inline graphic for the central part of the city in comparison with the urban planning map for 2004–2020 in Fig. 9(a). Generally, residential areas have a large volume of traffic with respect to Inline graphic and Inline graphic, corresponding to trips between home and workplaces and trips for other purposes, while in workplace areas, especially business areas, many flows correspond to the second category Inline graphic, and in the remaining areas the third category Inline graphic is quite significant. Note that the urban planning map (2004–2020) is not an exact description of land uses in 2007, and consequently the patterns of the two figures may not agree well in some small areas. For example, the red patch around point Inline graphic in Fig. 9(a) is planned as industrial land, namely, a workplace in our context, while in fact it was a construction site for Expo 2010 Shanghai China with very little taxi traffic in 2007. Yet it is still reasonable for a construction site to have major taxi flows of type Inline graphic as shown in Fig. 9(b), because in the evening workers would be very likely to go out for recreation, entertainment, etc.

Figure 5. The Average Component Proportions of Inline graphic in Each Location, Equivalent to the Categorical Proportion of the Traffic.

Figure 6. The Relative Deviation for Components of Inline graphic in Each Location: (a) the Average Positive Deviation; (b) The Average Negative Deviation.

Figure 7. The Distribution of the Relative Deviation for Components of Inline graphic: (a) CDF; (b) PDF.

Figure 8. The Parameters for the Distribution.

Figure 9. Comparing the Empirical Data to Urban Planning Map: (a) the Area Type from Urban Planning [47] for Central Part of the City; (b) the Average Categorical Proportion of Traffic for Central Part of the City.

In addition, we can see how government planning [47] is shaped by the current situation. For example, Nanjing Road and its vicinity form the largest block with high traffic throughput, and its traffic flows consist mainly of the workplace-related (Inline graphic) and other-facility-related (Inline graphic) categories. In the plan, it is designated as a public activity center for administrative, business and shopping purposes. Lujiazui is a similar but smaller zone, which is planned mainly for business and shopping centers.

Discussion

In this research, we find that the traffic on workdays can be divided into three categories according to the different purposes: commuting between home and workplaces, traveling from workplace to workplace, and others such as leisure activities. Each of these categories has a highly distinguishable basis pattern: Inline graphic, Inline graphic or Inline graphic. The relative daily deviation of the traffic flow in each category can be modeled as Eq. (14), which is a mixture of normalized binomial distributions, with a continuous approximation as Eq. (15).

This basis pattern theory is applicable to data sets containing the beginning and ending information of trips, such as the bicycle departure and arrival data [48], cell phone based mobility information [8], GPS based data, etc.

First, this research provides a very economical approach to understanding how the urban traffic at different locations is composed of the three categories. For instance, a large Inline graphic means there is a large portion of traffic between home and workplaces at location Inline graphic. This theory can also help to infer the land use composition in an easy, real-time, and automated way. For example, a large Inline graphic every day indicates that location Inline graphic is mainly for residential or working purposes, while a large Inline graphic can imply that it has many workplaces. A mixture of different land uses in a single location can be found by this method as well.

Second, based on the NMF approach, the time series of the total traffic at any location can be expressed as a linear combination of the basis patterns. Therefore, we can compress the traffic data of a large area into a very small data size, but still with a quite high resolution. Namely, we only need to store the global basis patterns, and for each location, we use a small vector for the traffic power to represent how strong each basis pattern is.
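
A quick count shows the size of the saving described here. The grid size in the example is hypothetical; only the 24 daily time slots and the three basis patterns come from the text.

```python
def storage_counts(n_rows, n_cols, n_slots, k):
    """Numbers stored for the raw time series versus the factored form."""
    raw = n_rows * n_cols * n_slots                # one value per location and slot
    factored = k * n_slots + n_rows * n_cols * k   # shared basis + per-location powers
    return raw, factored, raw / factored

# Hypothetical 100 x 100 grid with 24 time slots and 3 basis patterns.
raw, factored, ratio = storage_counts(100, 100, 24, 3)
print(raw, factored, round(ratio, 2))
```

For large grids the ratio approaches the number of time slots divided by the number of basis patterns, since the shared basis becomes negligible next to the per-location coefficients.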

Third, we find that the distribution of the relative deviation is not a normal distribution, indicating that the random variable Inline graphic is not identically distributed from one place to another, or from time to time. The significance of Eq. (14) and Eq. (15) is that they provide an expression for how traffic fluctuates over various unknown positions and time intervals. This description of relative deviation can also help to estimate changes in the traffic flow, which would be important in traffic prediction, traffic control and urban planning.

Finally, with the deviation distribution, we can not only predict changes in traffic, but also diagnose traffic abnormalities: where, when, why, and how. The first two are obvious, while ‘why’ a flow is abnormal can be disclosed by the traffic power, and ‘how’ abnormal it is can be revealed by the probability of the deviation. For example, if some traffic flow is very abnormal on one day, the probability density of its deviation on that day should be very small.
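
As a rough illustration of this kind of diagnosis, the sketch below flags relative deviations that fall outside a central band of previously observed deviations. Using empirical quantiles in place of the fitted distribution, the 80% default (chosen to echo the 80% range reported in the Results), and all names are assumptions made for the example.

```python
import numpy as np

def flag_abnormal(history, new_values, coverage=0.8):
    """Flag deviations falling outside the central `coverage` band of past deviations."""
    tail = (1.0 - coverage) / 2.0
    low, high = np.quantile(history, [tail, 1.0 - tail])
    new_values = np.asarray(new_values, dtype=float)
    return (new_values < low) | (new_values > high)

# Hypothetical past deviations spread evenly between -0.3 and 0.3.
history = np.linspace(-0.3, 0.3, 101)
flags = flag_abnormal(history, [0.0, 0.5, -0.25, 0.2])
print(flags)
```

The location and day of a flagged value answer 'where' and 'when', the category of the traffic power that deviated hints at 'why', and the distance from the band gives a sense of 'how' abnormal the flow is.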

Our analysis focuses on the traffic flows in different locations on different workdays. Our results can also be extended to the traffic on a road. Road traffic is the sum of the traffic passing along the road from several sources to several destinations. Therefore, the volume and the deviation of the road traffic flow can also be explained in our framework.

Supporting Information

Appendix S1

More on Data Description and Background Assumptions.

(PDF)

Appendix S2

Implementation Details about the Factorization.

(PDF)

Appendix S3

Moment Generation Function of Inline graphic.

(PDF)

Acknowledgments

We would like to thank the Wireless and Sensor networks Lab (WnSN, Shanghai Jiao Tong University, China) for providing the data source. We thank Dr. Min-You Wu and Yang Yang (Shanghai Jiao Tong University, China) for support with the data. We also thank Xianchuang Su, Dr. Yixiao Li, Dr. Yong Min and Chuanzi Chen (Zhejiang University, China), and Dr. David Keyes and Dr. Xiangliang Zhang (King Abdullah University of Science and Technology, Saudi Arabia) for valuable suggestions. For computer time, this research used the resources of the Supercomputing Laboratory at King Abdullah University of Science & Technology (KAUST) in Thuwal, Saudi Arabia.

Footnotes

Competing Interests: The authors have declared that no competing interests exist.

Funding: CP was supported by Graduate Fellowship from King Abdullah University of Science and Technology. CP and XJ were supported by the National Science Foundation of China under Grant No. 61070069. PL was supported by the following project: RECOGNITION: Relevance and Cognition for Self-Awareness in a Content-Centric Internet (257756), which is funded by the European Commission within the 7th Framework Programme (FP7). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Chowdhury D, Santen L, Schadschneider A. Statistical physics of vehicular traffic and some related systems. Physics Reports. 2000;329:199–329. [Google Scholar]
  • 2.Nagel K. Particle hopping models and traffic flow theory. Physical Review E. 1996;53:4655. doi: 10.1103/physreve.53.4655. [DOI] [PubMed] [Google Scholar]
  • 3.Esser J, Schreckenberg M. Microscopic simulation of urban traffic based on cellular automata. International Journal of Modern Physics C-Physics and Computer. 1997;8:1025–1036. [Google Scholar]
  • 4.Simon P, Nagel K. A simplified cellular automaton model for city traffic. Arxiv preprint cond-mat/ 1998;9801022 [Google Scholar]
  • 5.Perc M. Premature seizure of traffic flow due to the introduction of evolutionary games. New Journal of Physics. 2007;9:3. [Google Scholar]
  • 6.Helbing D. Improved fluid-dynamic model for vehicular traffic. Physical Review E. 1995;51:3164. doi: 10.1103/physreve.51.3164. [DOI] [PubMed] [Google Scholar]
  • 7.Brockmann D, Hufnagel L, Geisel T. The scaling laws of human travel. Nature. 2006;439:462–465. doi: 10.1038/nature04292. [DOI] [PubMed] [Google Scholar]
  • 8.González M, Hidalgo C, Barabási A. Understanding individual human mobility patterns. Nature. 2008;453:779–782. doi: 10.1038/nature06958. [DOI] [PubMed] [Google Scholar]
  • 9.Jiang B, Yin J, Zhao S. Characterizing the human mobility pattern in a large street network. Physical Review E. 2009;80:021136. doi: 10.1103/PhysRevE.80.021136. [DOI] [PubMed] [Google Scholar]
  • 10.Leutzbach W. Springer Verlag; 1987. Introduction to the theory of traffic flow. [Google Scholar]
  • 11.Kerner B. Springer Verlag; 2009. Introduction to modern traffic flow theory and control: the long road to threephase traffic theory. [Google Scholar]
  • 12.Kitamura R, Chen C, Pendyala R, Narayanan R. Micro-simulation of daily activity-travel patterns for travel demand forecasting. Transportation. 2000;27:25–51. [Google Scholar]
  • 13.Kuppam A, Pendyala R. A structural equations analysis of commuters' activity and travel patterns. Transportation. 2001;28:33–54. [Google Scholar]
  • 14. Liao Z, Yang S, Liang J. IEEE/ACM International Conference on Green Computing and Communications & IEEE/ACM International Conference on Cyber, Physical and Social Computing. IEEE; 2010. Detection of Abnormal Crowd Distribution. pp. 600–604.
  • 15. Candia J, González M, Wang P, Schoenharl T, Madey G, et al. Uncovering individual and collective human dynamics from mobile phone records. Journal of Physics A: Mathematical and Theoretical. 2008;41:224015.
  • 16. Andrade E, Blunsden S, Fisher R. Proceedings of the 18th International Conference on Pattern Recognition. IEEE, volume 1; 2006. Modelling crowd scenes for event detection. pp. 175–178.
  • 17. Mehran R, Oyama A, Shah M. IEEE Conference on Computer Vision and Pattern Recognition. IEEE; 2009. Abnormal crowd behavior detection using social force model. pp. 935–942.
  • 18. Handy S. Methodologies for exploring the link between urban form and travel behavior. Transportation Research Part D: Transport and Environment. 1996;1:151–165.
  • 19. Horner M, O'Kelly M. Embedding economies of scale concepts for hub network design. Journal of Transport Geography. 2001;9:255–265.
  • 20. Dieleman F, Dijst M, Burghouwt G. Urban form and travel behaviour: micro-level household attributes and residential context. Urban Studies. 2002;39:507.
  • 21. Waddell P. Modeling urban development for land use, transportation, and environmental planning. Journal of the American Planning Association. 2002;68:297–314.
  • 22. Boarnet M, Crane R. The influence of land use on travel behavior: specification and estimation strategies. Transportation Research Part A: Policy and Practice. 2001;35:823–845.
  • 23. Wegener M. Overview of land use transport models. Handbook of transport geography and spatial systems. 2004;5:127–146.
  • 24. Handy S. Smart growth and the transportation-land use connection: what does the research tell us? International Regional Science Review. 2005;28:146.
  • 25. Han X, Hao Q, Wang B, Zhou T. Origin of the scaling law in human mobility: Hierarchy of traffic systems. Physical Review E. 2011;83:036117. doi: 10.1103/PhysRevE.83.036117.
  • 26. Longini I, Nizam A, Xu S, Ungchusak K, Hanshaoworakul W, et al. Containing pandemic influenza at the source. Science. 2005;309:1083. doi: 10.1126/science.1115717.
  • 27. Eubank S, Guclu H, Kumar V, Marathe M, Srinivasan A, et al. Modelling disease outbreaks in realistic urban social networks. Nature. 2004;429:180–184. doi: 10.1038/nature02541.
  • 28. Easley D, Kleinberg J. Cambridge University Press; 2010. Networks, crowds, and markets: Reasoning about a highly connected world.
  • 29. Anderson R, Fraser C, Ghani A, Donnelly C, Riley S, et al. Epidemiology, transmission dynamics and control of SARS: the 2002–2003 epidemic. Philosophical Transactions of the Royal Society of London Series B: Biological Sciences. 2004;359:1091. doi: 10.1098/rstb.2004.1490.
  • 30. Hufnagel L, Brockmann D, Geisel T. Forecast and control of epidemics in a globalized world. Proceedings of the National Academy of Sciences of the United States of America. 2004;101:15124. doi: 10.1073/pnas.0308344101.
  • 31. Riley S. Large-scale spatial-transmission models of infectious disease. Science. 2007;316:1298. doi: 10.1126/science.1134695.
  • 32. Kleinberg J. The wireless epidemic. Nature. 2007;449:287–288. doi: 10.1038/449287a.
  • 33. Hu H, Myers S, Colizza V, Vespignani A. WiFi networks and malware epidemiology. Proceedings of the National Academy of Sciences. 2009;106:1318. doi: 10.1073/pnas.0811973106.
  • 34. Castellano C, Fortunato S, Loreto V. Statistical physics of social dynamics. Reviews of Modern Physics. 2009;81:591–646.
  • 35. Shlesinger M, Zaslavsky G, Frisch U. Lévy flights and related topics in physics. In: Lévy Flights and Related Topics in Physics: Proceedings of the International Workshop Held at Nice, France. volume 450; 1995.
  • 36. Rhee I, Shin M, Hong S, Lee K, Chong S. INFOCOM 2008. The 27th Conference on Computer Communications. IEEE; 2008. On the Levy-walk nature of human mobility. pp. 924–932.
  • 37. Song C, Qu Z, Blumm N, Barabasi A. Limits of predictability in human mobility. Science. 2010;327:1018. doi: 10.1126/science.1177170.
  • 38. Liang X, Zheng X, Lv W, Zhu T, Xu K. The scaling of human mobility by taxis is exponential. Physica A: Statistical Mechanics and its Applications. 2012;391:2135–2144.
  • 39. Shanghai Jiao Tong University, China. SUVnet-Trace data. 2007;9 Available: http://wirelesslab.sjtu.edu.cn. Accessed 2012 Mar.
  • 40. Shanghai Population and Family Planning Commission, China. From the fifth population census to evaluate the population condition for the sustainable development of Shanghai. 2001;9 Available: http://www.popinfo.gov.cn/yearbook/2001nj/zhuanwen/7-4.htm. Accessed 2012 Mar.
  • 41. Lee D, Seung H. Learning the parts of objects by non-negative matrix factorization. Nature. 1999;401:788–791. doi: 10.1038/44565.
  • 42. Lin C. Projected gradient methods for nonnegative matrix factorization. Neural Computation. 2007;19:2756–2779. doi: 10.1162/neco.2007.19.10.2756.
  • 43. Hollick M, Krop T, Schmitt J, Huth H, Steinmetz R. Modeling mobility and workload for wireless metropolitan area networks. Computer Communications. 2004;27:751–761.
  • 44. Ben-Akiva M, Bowman J, Ramming S, Walker J. Transportation Models in the Policy-Making Process: Uses, Misuses and Lessons for the Future; 1998. Behavioral realism in urban transportation planning models. pp. 4–6.
  • 45. Zhang L, Wu J, Zhen Y, Shu J. A GIS-based gradient analysis of urban landscape pattern of Shanghai metropolitan area, China. Landscape and Urban Planning. 2004;69:1–16.
  • 46. Onnela J, Saramäki J, Hyvönen J, Szabó G, Lazer D, et al. Structure and tie strengths in mobile communication networks. Proceedings of the National Academy of Sciences. 2007;104:7332. doi: 10.1073/pnas.0610245104.
  • 47. Shanghai Municipal Bureau of Planning and Land Resources, China. Shanghai urban planning: land-use planning. 2009;9 Available: http://www.china.com.cn/aboutchina/zhuanti/09dfgl/2009-09/08/content184882372.htm. Accessed 2012 Mar.
  • 48. Kaltenbrunner A, Meza R, Grivolla J, Codina J, Banchs R. arXiv preprint arXiv:0810.4187; 2008. Bicycle cycles and mobility patterns: Exploring and characterizing data from a community bicycle program.

Associated Data

Supplementary Materials

Appendix S1

More on Data Description and Background Assumptions.

(PDF)

Appendix S2

Implementation Details about the Factorization.

(PDF)

Appendix S3

Moment Generation Function of [inline symbol; image not recovered].

(PDF)


Articles from PLoS ONE are provided here courtesy of PLOS
