Abstract
We study the dynamical process of congestion formation for large-scale urban networks by exploring a unique dataset of taxi movements in a megacity. We develop a dynamic model based on a reaction and a diffusion term that properly reproduces the cascade phenomena of traffic. The interaction of these two terms brings the values of the speeds on road network in self-organized patterns and it reveals an elegant physical law that reproduces the dynamics of congestion with very few parameters. The results presented show a promising match with an available real data set of link speeds estimated from more than 40 millions of GPS coordinates per day of about 20,000 taxis in Shenzhen, China.
Subject terms: Civil engineering, Complex networks
Introduction
Traffic models are fundamental for engineers and researchers in transportation to design and apply control strategies in order to mitigate urban congestion. Combining realistic modeling of congestion and efficient traffic control for large scale urban systems remains a big challenge. This is due to the high unpredictability of choices of travelers (in terms of route, time of departure and mode of travel), the uncertainty in their reactions to the actions of others (users or controllers), the spatiotemporal propagation of congestion, the lack of coordinated actions coupled with the limited infrastructure available1,2 and the strong scatter of the data for link-level traffic sensors. The growing research interest on this topic is testified also by several recent works, (for example3–11) where the authors try to recover travelers’ behavior through extensive data-driven trip prediction, i.e. collecting GPS of mobile phones. While there is a vast literature on congestion dynamics, control and spreading in one-dimensional traffic systems with a single mode of traffic12–17, most of the analysis at the network level is based on simplistic models or detailed simulation, which requires a large number of input parameters (sometimes not observable with existing data) and cannot be solved in real-time.
In literature, we can distinguish four families of traffic flow models: network, macroscopic, mesoscopic and microscopic models, that often are implemented in micro-simulators (a survey on this argument can be found in ref. 18). Based on the purpose of the research, one can choose to engage a model of an appropriate scale. Network level models normally use a regional description of the network and aggregate traffic variables. The advantages of this type of models are the low computation cost and the possibility to use well-defined empirical relations between the main traffic variables: flow, density and speed, known as Fundamental Diagram (FD). Note that these relations are not for individual links as in the traditional link-level models, but for homogeneously congested regions of a city. It has been empirically observed19,20 that by spatially aggregating the highly scattered plots of flow vs. density from individual links (e.g. 1 min data), the scatter decreases and a well-defined curve exists between space-mean flow and density. Thus instead of high scatter local FDs, a Macroscopic Fundamental Diagram (MFD) is established at a network level21–23. The shape of MFD is a property of the network infrastructure and control and not very sensitive to the demand, but it depends on the spatial heterogeneity of congestion. This is important network property, as details at the micro-level are not necessary to describe congestion propagation. It can also be utilized to introduce simple control strategies to improve mobility in multi-region city centers building on the concept of an MFD, like in refs. 24,25, and others. Recent findings from empirical and simulated data26,27 have identified the spatial distribution of vehicle density in the network as one of the key components that influence the shape and the scatter of an MFD. These findings are of great importance because the concept of an MFD can be applied for heterogeneously loaded cities with multiple centers of congestion if these cities can be partitioned into a number of homogeneous spatially connected clusters19,28–30.
The mesoscale models simulate traffic for smaller network elements, like roads or for groups of cars. Another powerful and detailed class of models are those based on the micro-simulators. They are agent-based models where the decision variables, necessary to run microsimulations, are chosen at the driver level. This fact implies a high computational cost and also greater stochasticity of the results. The car-following model is one of the most common for lower level modeling, proposed in ref. 31 and in a simplified version in ref. 32. Nevertheless, it emerges, for example in ref. 33, the complexity to calibrate the huge amount of parameters (route and mode choice, traffic dynamics, demand generation and more) in the widely used micro-simulators for driving behavior models and the difficulty to find a universal and definitive values system.
The model proposed here is a link-based model, but without requiring detailed origin-destination information, like turning ratios at intersections, or queue lengths and drivers’ behaviour. The objective of this model is not to predict accurately the speed at link level, but to reproduce network characteristics such as speed distribution, the spatial pattern of congested areas in the road network as well as the relation between mean and standard deviation in quasi-congested period and their evolution over time. In the same spirit as the network MFD approach, it only requires a few parameters to be calibrated. The average link speed (in a time horizon of a few minutes) is the main variable of interest. Nevertheless, mesoscopic models often require tedious calibration as specifying capacity for each link and run route choice algorithms, detailed or aggregated34–36.
This work has been inspired by a very general version of the reaction-diffusion phenomenon that is at the basis of various and widespread aspects in nature (described in continuous space in refs. 37,38 and generalized for networks in ref. 39). As in ref. 40, we combine the traffic variables in a dynamical system governed by classical physical laws. In fact, these systems develop self-organized spatial patterns that share, in particular configurations, similar characteristics with those that can be observed during a peak hour for connected components of congested links in urban networks (for example in ref. 41). Other works, as in ref. 42, point out the influential distance and correlation between the origin of traffic jams and the consecutive spatio-temporal spreading over all the road network during the peak-hours.
The original idea is to invert the common point of view of the models, that is not to estimate congestion given an OD matrix, drivers’ behavior, road capacity, and other specific traffic variables, but look at the product of all these variegate factors as an unique cause that makes the speed values on the road network tractable with a physical dynamical system. Now, it is worth to remark the reasons for which we chose the link-speed as the main variable and not, like in many other classical conservation-based traffic models, the link density. Many real datasets and the most used on-line measurement techniques are nowadays based on probe traffic data as GPS signals from taxis, buses and mobile phone of private cars’ drivers and all of these provide only trustworthy speed information by recording position and time. This technology is considered by far much cheaper and potentially ubiquitous information than the sensors used to measure the density, based mainly on measurements from loop detectors that yield only point density and not link density. For example, in the Shenzhen data used in the paper, the penetration rate is less than 1%, meaning that on average we observe 1 or 2 vehicles every 5 minutes in a link. This information is sufficient to have a proxy of speed, but it creates significant uncertainty if density needs to be estimated.
The more visible advantage of this model with respect to the other models in the literature is that it can reproduce congestion propagation in large-scale networks with thousands of roads by clustering only a few parameters. The dynamical process of urban traffic in networks43 is here simulated with an elegant model and without any exogenous information like origin-destination (OD) matrix and drivers’ path choice. Moreover, the distribution characteristics such as standard deviation of link speeds and the average mean are accurately reproduced by the reaction-diffusion model at all moments.
A reaction-diffusion model
The master differential equation of the proposed model is composed of two parts, a diffusion and a reaction term, operating in a spatial network, represented by a graph of nodes and links.
The diffusion part is regulated by the combinatorial Laplacian where is the corresponding adjacency matrix of graph and the diagonal matrix with the corresponding node degree of each node . For the purpose of this work, it will be simpler and useful to define the roads of the transportation network as nodes and the intersection points between road segments as links. In particular, the element if and only if road and road are adjacent, otherwise. The elements on the diagonal of the matrix , , represent the number of the adjacent links of each intersection point . In literature, one might find another definition of the combinatorial Laplacian, with opposite sign: , instead of , if there is a link between and , and in the diagonal. We chose to keep the definition in the cited seminal paper about reaction-diffusion system in networks39. The reaction-diffusion equation that we propose was inspired by Fick’s law of concentration of particles, but models with mass and flow conservation have been studied in transportation with various limitations, mostly related to transient states (high scatter FDs) and route choice behavior. We aim to demonstrate that the abstraction effort that consists to pass from a road density description to a link velocity perspective is still valid and that it is worth considering.
For our traffic model the vectorial variable will represent the vector of average link speed for each road at a certain time . If then, on the one hand, the diffusion simulates the linear propagation of congestion, on the other hand, the reaction adds to the system a non-linear effect, characteristic of the vehicular traffic, as the non-uniform demand in space and/or in time. In particular, the reaction term will be represented by non-linear function depending, in general, on a weight , on the speeds’ vector , the expected network average speed , on the location and time and parameterized by a factor that represents the reaction rigidity as we will explain in the next section. In the following, we will refer to as the reaction function. The differential equation at the core of the model, at link level , is:
| 1 |
where the reaction and diffusion parameters, and respectively, can locally depend on space.
The diffusion term changes the distribution of link speed values among the network while the reaction term is the responsible for the change of the network average speed (). The weights for the effects of two terms can be regulated by their respective parameters and . Finally, is a random number from a zero-mean symmetric density probability function. We can assume that the noise does not influence the network average speed but it actually has an effect on the distribution of the speed values and it measures the ‘randomness’ of the real system. Moreover, the term is related with the estimation of the metastability of the model (similarly to the technique in ref. 44) and especially at the optimal configuration of the parameters , in the sense that the error remains small (minimal) for limited random input values . For our applications we considered a uniform distribution for . Gaussian white noise for the random term, with the same standard variation (/) has been tested in the model and the results appear totally similar with respect to the uniform distribution.
Principles of reaction and diffusion
We decided to keep the mathematical form as elegant as possible and look at what are the two fundamental principles that regulate the observable traffic propagation: the linear effect of spatial diffusion and the non-linear interaction between congested and non-congested adjacent links.
The two principles of reaction and diffusion that are expressed in the differential Eq. (1) can be seen as the counterpart of two following empirical facts (observed, also, in ref. 45):
A congested link leads the drivers to prefer one of its neighbor links;
If a link is surrounded by congested links with high probability it will get congested with a brief delay.
In other words, captures in an rough way the awareness of drivers’ about the traffic condition ahead46–48 while the dispersal/aggregation dynamics that occur in an emerging congested pattern during the onset and offset of peak-hour events49.
For each link and time , assertion is simulated by the diffusion term , where is the set of the adjacent link to . When diffusion is applied, the consequence is a value transfer from a link to its neighbors , that means to get higher or lower value proportionally to the difference with its adjacent links maintaining the sum of all link speed values unchanged. On the other hand, is made effective by the reaction term. After computing the difference in speed between a link and its neighbors, the reaction function and the parameter calibrate the increment or decrement of the value at each time step . For this reason, for each link , we introduced as argument of the reaction function the variable . It is worthy to remark here that the non-linearity of and the lack of an explicit counterbalancing term in the Eq. (1) distinguishes the reaction from the diffusion, that is the simulation of the effect of an exogenous demand (characterized by , weighted by and locally influenced by ) from the endogenous diffusion dynamics. The assumption restricts the type of reaction function that we can choose. In particular, it has to be monotonic in in order to follow the trend of the values of the neighborhood, bounded and non linear, to be able to translate the complexity of the congestion phenomenon and its characteristic self-dependent and self-enhanced growth and dispersion50,51.
Parameters region-to-region
An urban transportation network is composed by streets classifiable according to their characteristics (speed limit, capacity, number of lanes, etc.) and functionality (e.g. type of intersection control) as periphery, primary or secondary roads, highway and others. Therefore, it is reasonable to imagine that the traffic flow between two roads belonging to two different types can be correlated in a different way than, for example, two consecutive links.
In this work, we apply another differentiation using clustering algorithms that divide the street network into large zones based on the level of congestion or well defined Macroscopic Fundamental Diagrams (for example, as done in refs. 29,52). For this reason, in order to reach more accuracy, it can be useful to set reaction () and diffusion () parameters for each cluster. For example, traffic congestion might propagate differently in freeways than urban streets. In general, and will be represented by two squared matrices of dimension , where is the number of the regions or types identified in the network. We will denote the set of links belonging to a certain region . Using this notation, the parameters used in Eq. (1) will be and with respectively the index of the clusters of road and road .
The values of these parameters have been calibrated looking at the principal effect that they have on the speed distribution. In particular, the reaction parameter influences the changes of the network average speed and the diffusion parameter affects the distribution of the link speed values, namely on the standard deviation. It is a fact that a higher value for leads the network speed distribution towards more homogeneity.
Results
In the case that congestion grows in a network, e.g. during the morning onset of congestion, the parameters of the model do not significantly change and the value of the reaction term , that can be set at the beginning of the peak hour, is proportional to the average decrements of the network speed. But, in order to simulate the fluctuating traffic behavior during the day, instead of using demand information in the model (the rate of new trips that are generated), which is difficult to obtain, we integrate an adaptive mechanism based on real-time observable measurements. More specifically, we monitor every some time interval (e.g. 20 minutes) the online error between the real and the estimated network average speed . For the whole-day case, we looked for a symmetric function, bounded and where the online adaptation algorithm is easy to apply and set up. In fact, the on-line adaptation is based on a constant that brings the system to increase or decrease the network average speed based on the real online measurements at time , with , where is the total number of the calls of the adaptive algorithm. For this aim, we chose to use the function , with , that respects all our requirements. We want to notice the other functions, as the sigmoid or arctan, could fit the principles that we sustained above, but the main scope of this work is to demonstrate the existence of physical dynamics of traffic propagation described by a local differential equation of type (1).
Adaptive parameters implementation: a whole day example
We tested the reaction-diffusion model to simulate a whole day with on- and off-peak of the congestion in the Shenzhen downtown. In particular, all the results that we show in Figs. 1–4 are relative to the September 2011 (Day 1), after the calibration process from all the available dataset explained in the corresponding section.
Figure 1.
Simulation accuracy for regional average link speeds. In the panel on the top-right, it is reported the average speed for each cluster of Shenzhen downtown. The continuous lines correspond to the real data while the dashed line come from the simulation of the reaction-diffusion model with , and . Each color corresponds to the region of the same color on the Shenzhen map on the right. On the bottom, the average speed for all the network. The simulation is referred to the whole day on September 2011 and the 3 regions are the result of an off-line clustering algorithm that tends to minimize the variance inside each region of the city in congested scenario.
Figure 4.
Histogram of the link speeds distribution. The panels from the top-right to the bottom-left report every 4 hours the histograms of the link speeds distribution on Shenzhen network for the real data of Day 1 (in blue) and the corresponding simulation (in red). Being and the two empirical cumulative density functions (ECDFs) of the observed and the simulated link speed distribution at time , the two-sample Kolmogorov-Smirnov (KS) statistic has an average of overall the day. In this picture, all the distributions reported from 8am to midnight pass the KS test with , while it gives negative answer during the night (2 am–6:40 am) due to the scarcity of GPS data in that time window. For an easier visual comparison, the fitted normal distribution of the two histograms have been represented in each panel.
For each region , we define and the average speed at time across the links belonging to for the real data and the simulation respectively. As we anticipated in the previous section, we considered for this scope the symmetric function , for each region and time , with to add to the argument of the reaction function in order to calibrate the model. With the control parameter , we adjust the reaction function in order to follow each regional speed and, as consequence, the whole network speed (Fig. 1). The weight is a scalar to regulate the strength of the adaptive algorithm and has been chosen to be the same for all regions. Note that can be estimated in real-time as it only looks for the difference of regional speeds and, in addition, this is also straightforward to estimate if data from loop detectors is available in line with the MFD theory19. In all the results shown in this paper, the parameter has been evenly updated 70 times in a day (), that corresponds to every 20 minutes in a 24h day.
We define a connected component of congested links as a set of links whose observed velocity is below a certain threshold, that we fixed at 8 [mph], and connected each other by at least a path all contained in the same set. The value of 8 [mph] has been chosen in accordance with the traffic flow theory of MFD, where the definition of congestion is when the saturation flow is reached and any increment in road density corresponds to a decrease of traffic flow. Experience and large-scale measurements from different case-studies, as in ref. 19, demonstrate the consistence of the choice of this particular value. In Fig. 2, we illustrate as example the evolution of the spatial pattern of the congestion during the morning peak-hour while in Fig. 3 we analyze the fluctuation of the Largest Connected Component (LCC) of congested links during the whole day simulation and we compared it with the real data. As one can notice, the model reproduces the size of the largest component very precisely except for the evening peak hour where the model underestimates it. Nevertheless, the total number of the congested links in all the network is reproduced with high accuracy during the whole simulation. We can observe the typical congestion pattern formation during a peak hour period: from some critical links the congestion spreads in neighbor links forming large connected components (a network analysis of this transition mechanism is done in refs. 42,45,53). This phenomenon is also enhanced by the gridlock effect and traffic spillback2,54. Applications of percolation theory to traffic jams have attracted attention recently and provide intuitive explanations for the identification of critical links and bottlenecks in a network55,56. Usually, a link is considered not operational when the velocity is very low and it is removed from a connected cluster. Nevertheless, in the physical world of traffic congestion, when a link reaches such a state, it contains a large number of vehicles that still need to travel to their destinations. Thus, these links (even at a very congested state) are still connected with the other parts of the network. Reaction-diffusion models still consider the whole network during speed propagation and might be considered an alternative to study these phenomena.
Figure 2.
Map of congested links from real data and simulation during morning peak-hour. Links colored in red are considered congested (velocity less than 8 [mph]) at 6 am, 9 am and 12 pm of Day 1.
Figure 3.

Comparison between the LCC of congested links in real data and in the simulation of the model. The continuous lines indicate the total number of the links (out of 2013) that are considered congested at each time of the day, that is with a velocity less than 8 [mph]. The blue color is used for the real data and the red for the simulation. The dashed lines report the size of the corresponding LCC.
We want to stress the fact that during the simulation we never upload the link speeds of the model with the available real data but we only consider the average over a macro-region (or the whole network) to update the parameter for the argument of the reaction function. In fact, the model reproduces by itself the local dynamics given the general traffic condition and this guarantees the continuity and consistency of the simulated speed values and the applicability of the model using the estimations for macro-region deducible from historical data and easier online measurements. In this sense, the model is able to simulate the on-peak and off-peak of the congestion without losing the accuracy on the distribution of the link speed values (Fig. 4) and the evolving spatial congestion pattern.
Calibration of adaptive strength and randomness
In order to calibrate the parameters and of the model for the case of the whole day simulation, we used the measure
| 2 |
where, as mentioned before, is the number of the calls of the adaptive algorithm, the time at which each call corresponds and stands for standard deviation of the speed distribution vector. The quantity is the average Euclidean distance in the plane between each corresponding couple of points of the estimated speed and real speed data . For each value , we change the interval () relative to the noise and compute the accuracy in the relation between network average speed and standard deviation of the link speed distribution through the quantity defined in formula (2).
For the calibration of and , we used the data from different working days in September in Shenzhen. We report here 3 representative cases that we named Day 1, 2 and 3. We notice that the best parameters for the model are almost the same for all days. In order to minimize the error among all dataset, a unique couple has been fixed and applied for all days and the standard deviation of the link speeds remains always less than 0.13 [mph] and the MS error below 0.3. The results showed in this section all refer to Day 1 ( September 2011) if not specified otherwise. Figure 5 shows the relation between standard deviation and average of the link speed values for 3 iconic values of when . It clearly appears a mathematical relation between these variables that the reaction-diffusion model reproduces and maintains all simulation long, especially when the congestion starts to appear. We call this the quasi-congested period, when the network speed is below the critical threshold [mph]. The mathematical reason of the value of has been discussed in the following section about heteroscedasticity. In this regime, we observed that the standard deviation decreases as the congestion in the network increases (and speed decreases) and the relation with the network speed goes from a scattered behaviour () to a well-defined linear dependence (). Empirically, the standard deviation of link speed distribution can be approximated by a linear relation in terms of the mean speed , that is . This shows that even complex and large urban networks experience a simple law between the level of congestion (as expressed by a space-mean speed) and spatial heterogeneity (as expressed by standard deviation of link speeds distribution) when speed is below a critical value. We found that this relation is maintained also for different days, as show in the following. For the points in the quasi-congested period () out of the 288 measurements par day evenly sampled from our 72-hours link speed data, the best linear fitting is reached with and . Figure 5 shows also how that only depends on while the slope remains the same at fixed . Figure 6 represents the boxplot of the average error in mean speed for each parameter varying the noise . All the panels in Fig. 6 show the existence of an optimum value for and that varies just a little in different days. We observe that this relation between average mean speed and standard deviation reaches its best fit in for Day 1 and 3 and in for Day 2 where the MS error defined in formula (2) has its minimum value. However, it seems that Day 2 does not show a clear U-shape like the other two days. In the panel on the right, it is reported the obtained distribution of the standard deviation error between the real data and the simulation when the adaptation is made every 20 minutes in a day. For Day 1, we also show the same results considering a normal distribution of the noise instead of the uniform presented in the text. In order to compare the two distributions, we considered a normal distribution with zero-mean and same standard deviation than the uniform one (that in our case is /).
Figure 5.
Relation between the mean and the standard deviation of the distribution of link speeds (Day 1). Every 5 minutes the values of network average speed and standard deviation () are reported in this graph and visualized as circled points. In particular, the blue circled points correspond to the real data measurements and they are compared with the simulations for 3 different levels of noise , that is (no noise), (best-fit) and . The parameter has been fixed at . The parameter has been evenly updated every 20 minutes (). In the quasi-congested regime the values of the linear relation between standard deviation and network speed (continuous blue line) are and .
Figure 6.
Calibration of a and b parameters for 3 days. For each value , we run 30 simulations by increasing the noise parameter from to by steps of . In the panel on the left, it is shown the obtained distribution of MS defined in formula (2), when the adaptation is made every 20 minutes in a day. On the right, for each , it is reported the boxplot of the distribution of the standard deviation of the error between the real data and the simulation , . For Day 1, we also show the same results considering a normal distribution of the noise instead of the uniform presented in the main text. In each plot, we put in evidence the average of the error values, in particular, in blue for the uniform distribution of and in red for the Gaussian distribution of . Both for the and for the , the uniform and the Gaussian noise distribution gives equivalent results.
Heteroscedasticity in link speed distribution
Merging the data of Day 1, 2 and 3 as explanatory representatives of the dataset, we have been able to show the implicit and fundamental relation already investigated in Fig. 5. We can clearly distinguish two regimes for the standard deviation/average speed relation by two factors: the slope and the Residual Standard Deviation (RSD) of the linear regression in each of the two parts. A phenomenon of heteroscedasticity between the quasi-congested regime and the free-flow regime has been, here, measured. An important fact is that the model reproduces and maintains the same type of relation for all different days and for both regimes, as shown in Fig. 7. Computing the RSD of the linear regression of the data with an average speed with , we pointed out a drastic transition at that determines a phenomenon of heteroscedasticity in the relation between and . In particular, is the threshold that we take into account to define the quasi-congested regime and separate it from the free-flow condition. This important relation represents a global network behaviour that combines the topology structure with the traffic distribution and propagation.
Figure 7.
Heteroscedasticity in the distribution of link speed values. The data plotted here are from 3 days of the dataset of Shenzhen and one point every 20 minutes. From this scatter plot it is possible to deduce that the relation between average link speeds and standard deviation is independent of the day and it follows two distinguished regimes: the free-flow and the quasi-congested one. The subpanel in the top-left represents the Residual Standard Deviations (RSD), defined as the average norm of the residual of a linear fitting of the data whose average speed is less than the threshold . At it clearly appears a critical increment of data dispersion and, consequentially, the heteroscedasticity. Therefore, the value has been used to distinguish the two regimes. In the plot on the bottom, it is reported the residual plot of the linear regression of the points of the flee-flow part (blue) and the quasi-congested part (red). The RSD for these two parts are 0.277 and 0.116 respectively.
Discussion
The reaction-diffusion model had been applied in many different fields in science since its complete mathematical formulation. A common feature that all the previous applications shared was a physical and/or biological nature without neither external factors nor decision-makers. Applying such a system to traffic and congestion propagation might appear to exile the natural limits of the reaction-diffusion models. Nevertheless, we showed that the evolution of the speed values in a road network affected by daily congestion and the traffic emerging patterns can be accurately simulated by a reaction-diffusion differential system. We illustrated that it is possible to reproduce recurrent dynamics for the congestion propagation considering, on one hand, a general principle with a common reaction function, on the other hand, some diffusion and reaction parameters for the characteristics of the urban transportation system studied. An important observation is that the model parameters are defined based on principles of aggregated traffic models of MFD type. The ultimate scope of this work was to find how the traffic evolves given an initial state and a behavior for the global network speed.
We demonstrated that this kind of model is able to reproduce the average network link speeds with simple, global or regional, adaptive control parameters, the link speeds distribution and the congested spatial patterns during a whole day simulation. We obtained all these results without using any external information like traffic demand or drivers’ behavior that means not only light computational effort but also less real data needed.
Moreover, we pointed out the existence of an empirical relation between the network average speed and standard deviation of speed values distribution that appears in an urban network that experiences congestion and how this relation is reproduced by the reaction-diffusion model.
We believe that this work brings significant theoretical results and also computational advantages. In fact, in literature, it has been proved the existence of some internal and peculiar laws that connect the most common traffic variable like speed, density and flow, like the Macroscopic Fundamental Diagram Theory19 and also some similarities with physical systems40,48,57. We aimed to set a mathematical description of the congestion propagation which maintained the empirical relations in spatial networks with limited capacity, like the urban transportation systems.
Further applications of this work can be done for active traffic management and, in particular, for perimeter control as explained, for example, in refs. 1,58,59. In addition, because our model requires very few and aggregate data, like the average speed in a macro-region and a few parameters to be calibrated, it can be particularly adapted for the simulation of traffic in those cities in the developing countries where the cost of link detailed sensors remains not affordable.
Data description
The data set used in this paper consists of several millions of daily taxis GPS data of the city of Shenzhen, China. Shenzhen is, nowadays, one of the most important cities in China, and the largest city in the Pearl River Delta region, in the south of Southern China’s Guangdong Province, known for a significant growth of its manufacturing market. In 1980, Shenzhen was designated China’s first Special Economic Zones (SEZs) and this testifies his crucial and important role in Chinese economic. The high and fast-growing density of people in Shenzhen, that in 2017 counted more than 12 millions of inhabitants, created, as expected, large congestion problems in the urban network. The daily dataset is composed by more than 40 millions of spatio-temporal GPS coordinates with a resolution of about 30 s for each taxi and each trip and they are located inside an area of about of the city center. In particular, we based our analysis on the estimated speed of 3 whole days, from to of September 2011, of the Shenzhen network profile. We showed the speed profile only for the (Day 1) as explanatory example. We used the other days for the calibration of the parameters and and to ascertain the stability of the model that maintains its accuracy for different traffic conditions. The estimation of average speed for each link has been done after applying a map matching algorithm in the trajectories of the vehicles. Then, the average space-mean speed every 5 minutes of those taxis that drove along that link, as the total distance travelled divided by the total time spent in the link during this period. In particular, the trajectories have been reconstructed by assigning each GPS point to the closest link on the road network and using the shortest path algorithm to select the links between two consecutive GPS traces. Then, the vehicular speed between two consecutive traces has been calculated dividing the partial trip length by the recorded time difference of the two GPS points. The network consists in links and points. To integrate the differential equation (1) we use an Euler integration with and time steps for a whole day simulation.
Reaction-diffusion parameters ρ and σ.
For the division of the traffic map of Shenzhen into clusters, we used the optimal solution of the algorithm described in ref. 29, the algorithm identifies 3 regions and, in particular, the green in Fig. 1 stands for the highway that surrounds the city center and the south and north part of the downtown that, for their different functional characteristics, may be consider separated and with slightly different parameters of the RD model for traffic propagation.
In general, and will be represented by two squared matrices of dimension , where is the number of the regions or types identified in the network. The calibration of these parameters has been done at looking at the best benchmark with the comparison with real data in terms of final average speed and distribution at regional level. In our case, with 3 regions obtained from the clustering algorithm described in ref. 29, we end up with the following values: and . The reasons at the bases of these values are the highest empirical correlation between roads belonging to the same clusters for traffic propagation dynamics. In particular, the third cluster (in green) delimited by the algorithms, highlights the highway that surrounds the city center and that it maintains a higher speed during the all day and, also it results less influenced by the neighbor links but of those adjacent and part of the same high-speed road (upstream and downstream links).
Acknowledgements
This research has been supported by the European Research Council (ERC) Starting Grant “METAFERW: Modelling and controlling traffic congestion and propagation in large-scale urban multimodal networks” (Grant 338205).
Author contributions
L.B. and N.G. conceived of the presented idea. L.B. developed the theory and performed the computations. L.B. and N.G. verified the analytical methods. L.B. and N.G. wrote the paper. All authors discussed the results and contributed to the final manuscript.
Data availability
All the data related to this study and its documentation are accessible using the following links: 10.6084/m9.figshare.7212230.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Ramezani M, Haddad J, Geroliminis N. Dynamics of heterogeneity in urban networks: aggregated traffic modeling and hierarchical control. Transportation Research Part B: Methodological. 2015;74:1–19. doi: 10.1016/j.trb.2014.12.010. [DOI] [Google Scholar]
- 2.Mahmassani HS, Saberi M, Zockaie A. Urban network gridlock: Theory, characteristics, and dynamics. Transportation Research Part C: Emerging Technologies. 2013;36:480–497. doi: 10.1016/j.trc.2013.07.002. [DOI] [Google Scholar]
- 3.Gonzalez MC, Hidalgo CA, Barabasi A-L. Understanding individual human mobility patterns. Nature. 2008;453:779. doi: 10.1038/nature06958. [DOI] [PubMed] [Google Scholar]
- 4.Wang P, Hunter T, Bayen AM, Schechtner K, González MC. Understanding road usage patterns in urban areas. Scientific reports. 2012;2:1001. doi: 10.1038/srep01001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Wang J, Wei D, He K, Gong H, Wang P. Encapsulating urban traffic rhythms into road networks. Scientific reports. 2014;4:4141. doi: 10.1038/srep04141. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6. Candia, J. et al. Uncovering individual and collective human dynamics from mobile phone records. Journal of Physics A: Mathematical and Theoretical41, 224015, http://stacks.iop.org/1751-8121/41/i=22/a=224015 (2008).
- 7.Hasan S, Schneider CM, Ukkusuri SV, González MC. Spatiotemporal patterns of urban human mobility. Journal of Statistical Physics. 2013;151:304–318. doi: 10.1007/s10955-012-0645-0. [DOI] [Google Scholar]
- 8.Chen, G., Jin, X. & Yang, J. Study on spatial and temporal mobility pattern of urban taxi services. In Intelligent Systems and Knowledge Engineering (ISKE), 2010 International Conference on, 422–425 (IEEE, 2010).
- 9.Ratti C, Frenchman D, Pulselli RM, Williams S. Mobile landscapes: using location data from cell phones for urban analysis. Environment and Planning B: Planning and Design. 2006;33:727–748. doi: 10.1068/b32047. [DOI] [Google Scholar]
- 10.Louail T, et al. Uncovering the spatial structure of mobility networks. Nature Communications. 2015;6:6007. doi: 10.1038/ncomms7007. [DOI] [PubMed] [Google Scholar]
- 11.Bazzani A, Giorgini B, Rambaldi S, Gallotti R, Giovannini L. Statistical laws in urban mobility from microscopic GPS data in the area of florence. Journal of Statistical Mechanics: Theory and Experiment. 2010;2010:P05001. doi: 10.1088/1742-5468/2010/05/P05001. [DOI] [Google Scholar]
- 12.Nagel K, Schreckenberg M. A cellular automaton model for freeway traffic. Journal de physique. 1992;I:2221–2229. [Google Scholar]
- 13.Nagel K, Paczuski M. Emergent traffic jams. Phys. Rev. E. 1995;51:2909–2918. doi: 10.1103/PhysRevE.51.2909. [DOI] [PubMed] [Google Scholar]
- 14.Helbing D, Hennecke A, Shvetsov V, Treiber M. Micro-and macro-simulation of freeway traffic. Mathematical and computer modelling. 2002;35:517–547. doi: 10.1016/S0895-7177(02)80019-X. [DOI] [Google Scholar]
- 15.Louail T, et al. From mobile phone data to the spatial structure of cities. Scientific reports. 2014;4:5276. doi: 10.1038/srep05276. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Gallotti R, Bazzani A, Rambaldi S, Barthelemy M. A stochastic model of randomly accelerated walkers for human mobility. Nature communications. 2016;7:12600. doi: 10.1038/ncomms12600. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Scellato S, Fortuna L, Frasca M, Gomez-Gardenes V, Latora J. Traffic optimization in transport networks based on local routing. The European Physical Journal B. 2010;73:303–308. doi: 10.1140/epjb/e2009-00438-2. [DOI] [Google Scholar]
- 18.Nagatani T. The physics of traffic jams. Reports on progress in physics. 2002;65:1331. doi: 10.1088/0034-4885/65/9/203. [DOI] [Google Scholar]
- 19.Geroliminis N, Daganzo CF. Existence of urban-scale macroscopic fundamental diagrams: Some experimental findings. Transportation Research Part B: Methodological. 2008;42.9:759–770. doi: 10.1016/j.trb.2008.02.002. [DOI] [Google Scholar]
- 20.Buisson C, Ladier C. Exploring the impact of homogeneity of traffic measurements on the existence of macroscopic fundamental diagrams. Transportation Research Record. 2009;2124:127–136. doi: 10.3141/2124-12. [DOI] [Google Scholar]
- 21.Geroliminis N, Sun J. Properties of a well-defined macroscopic fundamental diagram for urban traffic. Transportation Research Part B: Methodological. 2011;45:605–617. doi: 10.1016/j.trb.2010.11.004. [DOI] [Google Scholar]
- 22. Leclercq, L., Chiabaut, N. & Trinquier, B. Macroscopic fundamental diagrams: A cross-comparison of estimation methods. Transportation Research Part B: Methodological62, 1–12, 10.1016/j.trb.2014.01.007 (2014).
- 23. Mariotte, G., Leclercq, L. & Laval, J. A. Macroscopic urban dynamics: Analytical and numerical comparisons of existing models. Transportation Research Part B: Methodological101, 245–267, 10.1016/j.trb.2017.04.002 (2017).
- 24.Ramezani M, Haddad J, Geroliminis N. Dynamics of heterogeneity in urban networks: aggregated traffic modeling and hierarchical control. Transportation Research Part B: Methodological. 2015;74:1–19. doi: 10.1016/j.trb.2014.12.010. [DOI] [Google Scholar]
- 25.Haddad J, Shraiber A. Robust perimeter control design for an urban region. Transportation Research Part B: Methodological. 2014;68:315–332. doi: 10.1016/j.trb.2014.06.010. [DOI] [Google Scholar]
- 26.Geroliminis N, Jie S. Properties of a well-defined macroscopic fundamental diagram for urban traffics. Transportation Research Part B: Methodological. 2011;45.3:605–617. doi: 10.1016/j.trb.2010.11.004. [DOI] [Google Scholar]
- 27.Knoop VL, Hoogendoorn SP. Empirics of a generalized macroscopic fundamental diagram for urban freeways. Transportation Research Record. 2013;2391:133–141. doi: 10.3141/2391-13. [DOI] [Google Scholar]
- 28. Saeedmanesh, M. & Geroliminis, N. Dynamic clustering and propagation of congestion in heterogeneously congested urban traffic networks. Transportation Research Part B: Methodological105, 193–211, 10.1016/j.trb.2017.08.021 (2017).
- 29. Saeedmanesh, M. & Geroliminis, N. Clustering of heterogeneous networks with directional flows based on “snake” similarities. Transportation Research Part B: Methodological, 91, 250–269, 10.1016/j.trb.2016.05.008 (2016).
- 30.Lopez C, Leclercq L, Krishnakumari P, Chiabaut N, Lint H. Revealing the day-to-day regularity of urban congestion patterns with 3d speed maps. Scientific Reports. 2017;7:14029. doi: 10.1038/s41598-017-14237-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Gipps PG. A behavioural car-following model for computer simulation. Transportation Research Part B: Methodological. 1981;15:105–111. doi: 10.1016/0191-2615(81)90037-0. [DOI] [Google Scholar]
- 32. Newell, G. A simplified car-following theory: a lower order model. Transportation Research Part B: Methodological36, 195–205, 10.1016/S0191-2615(00)00044-8 (2002).
- 33.Kim J, Mahmassani HS. Correlated parameters in driving behavior models: Car-following example and implications for traffic microsimulation. Transportation Research Record. 2011;2249:62–77. doi: 10.3141/2249-09. [DOI] [Google Scholar]
- 34.Leclercq, L. Hybrid approaches to the solutions of the “lighthill-whitham-richards” model. Transportation Research Part B: Methodological41, 701–709 (2007).
- 35.Daganzo CF. The cell transmission model: A dynamic representation of highway traffic consistent with the hydrodynamic theory. Transportation Research Part B: Methodological. 1994;28:269–287. doi: 10.1016/0191-2615(94)90002-7. [DOI] [Google Scholar]
- 36.Daganzo CF. The cell transmission model, part ii: network traffic. Transportation Research Part B: Methodological. 1995;29:79–93. doi: 10.1016/0191-2615(94)00022-R. [DOI] [Google Scholar]
- 37.Turing AM. The chemical basis of morphogenesis. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences. 1952;237:37–72. doi: 10.1098/rstb.1952.0012. [DOI] [Google Scholar]
- 38. Murray, J. D. Mathematical biology. second corrected edition (1993).
- 39.Nakao H, Mikhailov AS. Turing patterns in network-organized activator-inhibitor systems. Nature Physics. 2010;6:544. doi: 10.1038/nphys1651. [DOI] [Google Scholar]
- 40.Helbing D, Hennecke A, Shvetsov V, Treiber M. Master: macroscopic traffic simulation based on a gas-kinetic, non-local traffic model. Transportation Research Part B: Methodological. 2001;35:183–211. doi: 10.1016/S0191-2615(99)00047-8. [DOI] [Google Scholar]
- 41.Biham O, Middleton AA, Levine D. Self-organization and a dynamical transition in traffic-flow models. Physical Review A. 1992;46:R6124. doi: 10.1103/PhysRevA.46.R6124. [DOI] [PubMed] [Google Scholar]
- 42. Jiang, Y., Kang, R., Li, D., Guo, S. & Havlin, S. Spatio-temporal propagation of traffic jams in urban traffic networks. arXiv preprint arXiv:1705.08269 (2017).
- 43. Barrat, A., Barthelemy, M. & Vespignani, A. Dynamical processes on complex networks (Cambridge university press, 2008).
- 44. Ciuffo, B., Casas, J., Montanino, M., Perarnau, J. & Punzo, V. From theory to practice: Gaussian process meta-models for sensitivity analysis of traffic simulation models: Case study of aimsun mesoscopic model. 92th TRB Annual Meeting, Washington DC. (2013).
- 45.Daqing L, Yinan J, Rui K, Havlin S. Spatial correlation analysis of cascading failures: congestions and blackouts. Scientific reports. 2014;4:5381. doi: 10.1038/srep05381. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Daganzo CF. Requiem for second-order fluid approximations of traffic flow. Transportation Research Part B: Methodological. 1995;29:277–286. doi: 10.1016/0191-2615(95)00007-Z. [DOI] [Google Scholar]
- 47. Whitham, G. B. Linear and nonlinear waves, vol. 42 (John Wiley & Sons, 2011).
- 48.Helbing D. Gas-kinetic derivation of navier-stokes-like traffic equations. Phys. Rev. E. 1996;53:2366–2381. doi: 10.1103/PhysRevE.53.2366. [DOI] [PubMed] [Google Scholar]
- 49. Jordan, P. Growth and decay of shock and acceleration waves in a traffic flow model with relaxation. Physica D: Nonlinear Phenomena207, 220–229, 10.1016/j.physd.2005.06.002 (2005).
- 50.Kerner BS, Rehborn H. Experimental properties of complexity in traffic flow. Phys. Rev. E. 1996;53:R4275–R4278. doi: 10.1103/PhysRevE.53.R4275. [DOI] [PubMed] [Google Scholar]
- 51.Kerner BS, Konh�user P. Structure and parameters of clusters in traffic flow. Phys. Rev. E. 1994;50:54–83. doi: 10.1103/PhysRevE.50.54. [DOI] [PubMed] [Google Scholar]
- 52. Ji, Y., Jun, L. & Geroliminis, N. Empirical observations of congestion propagation and dynamic partitioning with probe data for large-scale systems. In Transportation Research Record: Journal of the Transportation Research Board 2422, 1–11 (2014).
- 53.Li D, et al. Percolation transition in dynamical traffic network with evolving critical bottlenecks. Proceedings of the National Academy of Sciences. 2015;112:669–672. doi: 10.1073/pnas.1419185112. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54. Ji., Y., Luo, J. & Geroliminis, N. Empirical observations of congestion propagation and dynamical partitioning with probe data for large-scale systems. Transportation Research Record 1–11 (2014).
- 55.Zeng G, et al. Switch between critical percolation modes in city traffic dynamics. Proceedings of the National Academy of Sciences. 2019;116:23–28. doi: 10.1073/pnas.1801545116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Zhang L, et al. Scale-free resilience of real traffic jams. Proceedings of the National Academy of Sciences. 2019;116:8673–8678. doi: 10.1073/pnas.1814982116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Helbing D. Improved fluid-dynamic model for vehicular traffic. Phys. Rev. E. 1995;51:3164–3169. doi: 10.1103/PhysRevE.51.3164. [DOI] [PubMed] [Google Scholar]
- 58.Kouvelas A, Saeedmanesh M, Geroliminis N. Enhancing model-based feedback perimeter control with data-driven online adaptive optimization. Transportation Research Part B: Methodological. 2017;96:26–45. doi: 10.1016/j.trb.2016.10.011. [DOI] [Google Scholar]
- 59. Haddad, J. Optimal perimeter control synthesis for two urban regions with aggregate boundary queue dynamics. Transportation Research Part B: Methodological96, 1–25, 10.1016/j.trb.2016.10.016 (2017).
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Data Availability Statement
All the data related to this study and its documentation are accessible using the following links: 10.6084/m9.figshare.7212230.






