Significance
How properties of cities vary with their population is of great interest for elaborating theoretical models but also for making predictions about the future of cities. However, it is still unclear whether large cities are just scaled-up versions of smaller cities, which casts some doubt on measures mixing different urban systems of different sizes and histories. Here we show, using the example of congestion-induced traffic delays, that the path dependency is so strong that these delays depend on both the population and the whole history of the system, prohibiting the existence of simple scaling forms.
Keywords: science of cities, scaling, dynamics of cities, urban modeling, traffic congestion
Abstract
Scaling has been proposed as a powerful tool to analyze the properties of complex systems and in particular for cities where it describes how various properties change with population. The empirical study of scaling on a wide range of urban datasets displays apparent nonlinear behaviors whose statistical validity and meaning were recently the focus of many debates. We discuss here another aspect, which is the implication of such scaling forms on individual cities and how they can be used for predicting the behavior of a city when its population changes. We illustrate this discussion in the case of delay due to traffic congestion with a dataset of 101 US cities in the years 1982–2014. We show that the scaling form obtained by agglomerating all of the available data for different cities and for different years does display a nonlinear behavior, but which appears to be unrelated to the dynamics of individual cities when their population grows. In other words, the congestion-induced delay in a given city does not depend on its population only, but also on its previous history. This strong path dependency prohibits the existence of a simple scaling form valid for all cities and shows that we cannot always agglomerate the data for many different systems. More generally, these results also challenge the use of transversal data for understanding longitudinal series for cities.
The recent availability of data for cities opens the fascinating possibility of a science of cities (1, 2) and has led numerous scientists to search for general laws (3, 4) ruling the evolution of various socioeconomic and structural indicators such as patent production, personal income, or total electric cable length. In ref. 3, it was suggested that, assuming the population $P$ to be the most important determinant for cities, we could study the evolution of many different features as $P$ increases. In ref. 4, many socioeconomic factors were studied vs. population, indicating the existence of simple scaling laws in the form of power laws. For each indicator $Y$, Bettencourt et al. (4) found a power law of the form $Y \sim P^{\beta}$, where the exponent $\beta$ depends on the quantity considered. Some quantities evolve superlinearly with the population ($\beta > 1$), for instance new patents, gross domestic product (GDP), or serious crime, while some others behave sublinearly ($\beta < 1$), such as gasoline stations or sales. Quantities that are independent of the size of the city, typically human-related quantities such as water consumption, scale with an exponent $\beta \approx 1$. The usual explanation for these effects is the impact of interactions (scaling as $P^2$) for superlinear quantities and economies of scale for sublinear quantities. This publication (4) was followed by a wealth of other measurements such as the abundance of business categories (5), the number of sexually transmitted infections (6), road networks (7), or carbon dioxide emissions (8–12).
Scaling in urban systems has, however, been criticized in some recent papers (10, 13–16). A first reanalysis of the data for the GDP and income (13) showed that the power law could not be distinguished from other functional forms, or that the linear fit is better (14). In ref. 15 the authors conducted a rigorous investigation into the statistical quality of scalings for various quantities and found that in many superlinear cases the linear assumption could in fact not be rejected. They also showed that the fitting results depend crucially on the assumptions about noise. From another point of view, the authors of ref. 16 showed that, for some socioeconomic indicators, those scalings are not universal and could depend on details of urban systems. More precisely, they showed with data from French cities that two different definitions of the cities [unité urbaine (urban units) and aire urbaine (metropolitan areas)] led to different values of the scaling exponent for a given quantity, a result confirmed for transport-emitted CO2 in ref. 10. Not only can the value of the exponent change, but in some cases the scaling regime itself changes with the definition of the city: For instance, the number of jobs in the manufacturing sector grows superlinearly with the population of urban units, but sublinearly if one considers metropolitan areas (16). We can expect the results to change quantitatively, but here we have changes from the superlinear to the sublinear regime, casting some doubt on this nonlinear scaling and its universality.
In this paper we consider another problem, that is, the relevance of such a scaling for the individual dynamics of cities. At a more theoretical level, we question here the scaling assumption where a quantity $Y$ (usually extensive) is assumed to be determined by the population only, $Y = F(P)$ (where $F$ is in general an unknown function). Even if the population is an important determinant for cities, we cannot exclude time effects and path dependency, which would then imply that the quantity $Y$ depends also on time $t$ and possibly on all $P(t')$ for $t' < t$. In other words, path dependency means that it does not make sense in general to compare two cities having the same population but at very different dates: Both central Paris and Phoenix, AZ had a population of about 1 million inhabitants, the former in 1840 and the latter in 1990, and it is very likely that the dynamics, for most of the relevant quantities, from 1840 in Paris will be very different from those starting in 1990 in Phoenix, implying that the usual scaling form does not apply in general. In this paper, we investigate this question and test whether a scaling exponent computed by aggregating data for different cities (usually at the same date) is relevant for predicting what will happen at the level of individual cities as their populations grow. We illustrate this discussion in the case of congestion-induced delays, but our results could have far-reaching consequences for many other scaling results for cities.
Aggregating All Cities: Global Scaling
We focus on the particular case of traffic congestion and its impact on time delays. Previous studies have empirically tested and theoretically explained how traffic congestion scales with the population. In refs. 17 and 18, for instance, the authors propose a theory of urban growth which accounts for some of the observed scalings. The theoretical predictions are tested against several datasets, collected by the Organization for Economic Cooperation and Development (OECD) or by a GPS device company (TomTom) (17). Here, we study the dataset (freely available from ref. 19) published by the Texas A&M Transportation Institute (TTI) in the Urban Mobility Report (UMR), obtained for 101 cities in the United States over 33 y from 1982 to 2014 (the methodology used for constructing this dataset is described in ref. 20, and we also give more details about this dataset in SI Appendix, section 1). This database was investigated in 2017 by the authors of ref. 21, who agglomerated all of the data corresponding to different cities and performed the usual power-law fit of the form
$$\delta\tau_i \sim P_i^{\beta} \tag{1}$$
where $\delta\tau_i$ is the annual congestion-induced delay corresponding to city $i$. In this study we take for $P_i$ (also denoted by $P$ in the following) the number of car commuters for the city $i$ rather than the population, because this is the relevant parameter in many models that deal with congestion in cities (18). If we take the population instead of the number of car commuters, our results are qualitatively the same and our conclusions remain unchanged, even if all of the exponent values change slightly (a fit for all cities and all years shows that the number of car commuters is approximately a constant fraction of the population). In ref. 21, the least-squares method was used to estimate $\beta$. For the year 2014 (the last available year in the UMR), this method gives a superlinear exponent ($\beta > 1$). We plot the data and the corresponding fit in Fig. 1.
The quality of a fit must in general be carefully checked with the help of statistical methods (15), and computing a good estimate of this exponent value relies on several assumptions: Data points are independent, and the noise is multiplicative and has a variance independent of $P$ (homoscedasticity). It should also be checked that the nonlinear fit, which has an additional parameter compared with the linear one, is much better than what would be expected by pure chance. In this case, however, the trend seems to fit the data reasonably well with a large $R^2$, even if we have only two decades here. The value of $\beta$ larger than $1$ indicates a superlinear behavior of the traffic congestion, a fact in agreement with recent empirical (21) and theoretical approaches (18, 22).
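As an illustration of the fitting procedure discussed above, the following minimal sketch performs an ordinary least-squares fit in log-log coordinates. It relies on the assumptions listed in the text (independent points, multiplicative noise with constant variance). The data are synthetic: the function name `fit_power_law`, the exponent 1.3, the prefactor, and the noise level are hypothetical choices for the example and are not taken from the UMR dataset.

```python
import numpy as np

def fit_power_law(P, delay):
    """Ordinary least squares on log(delay) = log(a) + beta * log(P)."""
    x = np.log(np.asarray(P, dtype=float))
    y = np.log(np.asarray(delay, dtype=float))
    beta, log_a = np.polyfit(x, y, 1)      # slope first, then intercept
    residuals = y - (log_a + beta * x)
    r2 = 1.0 - residuals.var() / y.var()   # coefficient of determination
    return beta, np.exp(log_a), r2

# Synthetic cross-section (assumption): 101 cities spanning two decades in P,
# with multiplicative log-normal noise of constant variance.
rng = np.random.default_rng(0)
P = np.logspace(5, 7, 101)
true_beta = 1.3                            # hypothetical exponent
delay = 2e-3 * P**true_beta * np.exp(rng.normal(0.0, 0.2, P.size))

beta, a, r2 = fit_power_law(P, delay)
print(f"beta = {beta:.2f}, R^2 = {r2:.3f}")
```

A log-log regression is only one possible estimator; as noted in ref. 15, the result depends on the assumptions made about the noise.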
We can repeat this fit for each year separately, from 1982 to 2014. Formally, we test for each time $t$ the relationship $\delta\tau_i(t) \sim P_i(t)^{\beta(t)}$, where $\beta(t)$ is the scaling exponent to be determined. We show the values of $\beta(t)$ vs. $t$ in Fig. 2 and we observe that $\beta(t)$ is not constant through time and displays nonnegligible fluctuations. However, all these values are larger than $1$, indicating a consistent superlinear behavior. In ref. 21 a least-squares method was used on all of the points available: The authors mix all of the 33 y available for each of the 101 cities and get $33 \times 101 = 3{,}333$ points, leading to a scaling exponent that is again consistent with a superlinear relation. For this dataset, we plot the scatterplot and the corresponding nonlinear fit in Fig. 3, Top (note that we plot here the delay per capita). We observe some variability but the global increasing trend seems to be correct. This way of proceeding with data is common: One mixes data for different cities and for the available years and then performs a regression over the whole set. The scaling that is obtained, which we qualify as “global,” is then used for discussing theoretical approaches. For instance, in ref. 22, this approach is used for computing some scaling exponents (for quantities such as land area, wages, etc.), which are compared with the exponents expected from theoretical calculations. In ref. 23, empirical regularities are found by applying this methodology to different indicators, suggesting the existence of universal socioeconomic dynamics. Beyond statistical problems related to fitting procedures, the exact meaning and the relevance of this global scaling for individual cities are, however, not clear. In other words, when we know that a certain quantity scales for all cities as $Y \sim P^{\beta}$, what can we say about the evolution of a single city?
In the following we address this question in the case of congestion delay by studying in detail the dynamics of every individual city and comparing its behavior with the global scaling described above.
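The two ways of aggregating described above (one fit per year, and one fit over all city-years pooled together) can be sketched as follows. This is an illustration on synthetic data only: the array layout (cities in rows, years in columns), the growth rates, the exponent 1.2, and the helper name `loglog_slope` are assumptions made for the example.

```python
import numpy as np

def loglog_slope(P, delay):
    """Least-squares slope of log(delay) against log(P)."""
    return np.polyfit(np.log(P), np.log(delay), 1)[0]

# Synthetic panel (assumption): 101 cities observed over 33 years.
rng = np.random.default_rng(1)
n_cities, n_years = 101, 33
years = np.arange(1982, 1982 + n_years)            # 1982 ... 2014
P0 = np.logspace(5, 7, n_cities)                   # initial commuter counts
growth = rng.uniform(1.0, 1.03, n_cities)          # yearly growth factors
P = P0[:, None] * growth[:, None] ** np.arange(n_years)[None, :]
delay = 1e-3 * P**1.2 * np.exp(rng.normal(0.0, 0.3, P.shape))

# One exponent per year: fit across cities at fixed t.
beta_t = np.array([loglog_slope(P[:, k], delay[:, k]) for k in range(n_years)])

# One "global" exponent: all city-years pooled into a single regression.
beta_pooled = loglog_slope(P.ravel(), delay.ravel())

print(f"yearly exponents range from {beta_t.min():.2f} to {beta_t.max():.2f}")
print(f"pooled exponent over {P.size} points: {beta_pooled:.2f}")
```

In this synthetic panel every city follows the same law by construction, so the yearly and pooled exponents agree up to noise; the question raised in the text is what happens when that construction does not hold.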
The Dynamics of Individual Cities
In Fig. 3, Bottom, we show the same plot as in Fig. 3, Top, but where we now distinguish cities (one color corresponds to one city). This allows us to compare the evolution of the delay due to congestion in each city when its population grows. The first striking observation is that for all cities in our dataset, the evolution of the congestion delay does not behave as predicted by the global trend. They have their own trend which depends on their particular history. In this respect, it is natural to ask, what are the individual city dynamics and what do they have in common with the global scaling? In what follows we thus focus on this individual behavior and discuss its relation with the global power-law exponent.
Absence of a Single Scaling.
With this dataset, we can monitor the evolution of each city when its population grows. We first observe in the examples in Fig. 4, Top, that the annual delay $\delta\tau$ is not a simple function of $P$ only. The value of the number of drivers (or the population) is not enough to determine the delay. We also note in Fig. 4 that the slopes are different (a power-law fit gives different exponents for Bakersfield and for Sarasota), showing that even when a power law exists it is not with the same exponent (see Type-1 Cities: Power-Law Growth for a further analysis of this point). To test further the existence of a scaling of the form $\delta\tau \sim P^{\beta}$, we plot in Fig. 4, Bottom, $\delta\tau(t)/\delta\tau(t_0)$ for all cities vs. $P(t)/P(t_0)$, where $t_0$ is the first available time. Even if the prefactor changes from one city to another, this rescaling allows us to test the existence of a unique power-law scaling. As we can see in Fig. 4, Bottom, the curves for different cities do not collapse, signaling the absence of a scaling form governed by a single exponent. In the following we focus on the different behaviors observed for this set of cities.
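The rescaling test described above can be sketched in a few lines. Dividing each series by its value at the first available time removes the city-dependent prefactor, so that cities sharing one exponent fall on the same curve while cities with different exponents do not. The three synthetic cities, their prefactors and exponents, and the helper names `rescale` and `apparent_exponent` are hypothetical.

```python
import numpy as np

def rescale(P, delay):
    """Divide both series by their value at the first available time."""
    P = np.asarray(P, dtype=float)
    delay = np.asarray(delay, dtype=float)
    return P / P[0], delay / delay[0]

def apparent_exponent(x, y):
    """Slope through the origin of log(y) vs. log(x) for rescaled series."""
    lx, ly = np.log(x), np.log(y)
    return np.sum(lx * ly) / np.sum(lx * lx)

# Three synthetic cities (assumption): A and B share an exponent but have
# very different prefactors and sizes; C follows a different exponent.
P_a = np.linspace(1.0e5, 1.5e5, 33); d_a = 2.0 * P_a**2.0
P_b = np.linspace(1.0e6, 1.4e6, 33); d_b = 0.01 * P_b**2.0
P_c = np.linspace(3.0e5, 4.0e5, 33); d_c = 5.0 * P_c**3.5

exp_a = apparent_exponent(*rescale(P_a, d_a))
exp_b = apparent_exponent(*rescale(P_b, d_b))
exp_c = apparent_exponent(*rescale(P_c, d_c))
print(exp_a, exp_b, exp_c)  # A and B collapse onto one curve, C does not
```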
Different Categories of Cities.
We analyze the behavior of each of the 101 cities in the dataset and we observe a variety of behaviors. More precisely, there are two main categories characterized by different time evolutions:
- The delay increases with $P$ and in most cases can be fitted by a power law (Fig. 5, Top). We refer to this set as “type-1” cities, which comprise 31 of the 101 cities. We note that for the dataset studied here, the time range (from 1982 to 2014) does not allow us to have a very large variation of the number of drivers, and a much larger dataset would be needed to have better accuracy for these exponent values.
- The other cities display two regimes separated by a change of slope that is in general abrupt. The second regime for these “type-2” cities can be in some cases a “saturation” where the delay stays constant. We show in Fig. 5, Bottom, an example of such a city that displays saturation with zero slope in the second regime.
- The remaining cities do not display a common behavior (for instance, some present two or three changes of slope, etc.).
In most cases, however, the individual behavior of a city does not correspond to the global scaling $\delta\tau \sim P^{\beta}$. In the following we focus on each of these classes and try to characterize them more precisely.
Type-1 Cities: Power-Law Growth.
This particular class comprises cities that display an individual scaling law that can be fitted by a power law of the form $\delta\tau_i(t) \sim P_i(t)^{\beta_i}$, where $P_i(t)$ is the number of commuters at time $t$ and $\delta\tau_i(t)$ the corresponding annual congestion-induced delay. The quantity $\beta_i$ depends in general on the city and we show in Fig. 6 the histogram for this exponent computed for all type-1 cities. We clearly see that very few cities behave as the “global trend” predicted: Only 2 cities of 31 have an exponent $\beta_i$ comparable to the global one, while 13 cities have a much larger exponent (we give in SI Appendix, section 2 the list of values for $\beta_i$). This result shows that when we observe a power-law behavior at the individual city level, it is generally with an exponent that is much larger than 1 and much larger than the result found for the global scaling. In other words, there seems to be no correlation between the global observation made on all cities and the individual behavior of cities when their population evolves.
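The mismatch between individual and global exponents can be reproduced on a toy example. In the sketch below, each synthetic city starts on a cross-sectional law with exponent 1.2 and then grows along its own power law with exponent 3. All numbers (31 cities, a 50% growth over 33 y, the two exponents) are hypothetical and chosen only to make the effect visible; this is not a model of the UMR data.

```python
import numpy as np

def loglog_slope(P, delay):
    """Least-squares slope of log(delay) against log(P)."""
    return np.polyfit(np.log(P), np.log(delay), 1)[0]

n_cities, n_years = 31, 33
P0 = np.logspace(5, 7, n_cities)             # initial sizes span two decades
delay0 = 1e-3 * P0**1.2                      # initial states on a shallow law
growth = np.linspace(1.0, 1.5, n_years)      # each city grows by 50% in total
beta_city = 3.0                              # exponent of each trajectory

P = P0[:, None] * growth[None, :]
delay = delay0[:, None] * growth[None, :] ** beta_city

# Longitudinal fits (one per city) vs. a single fit on the pooled points.
individual = np.array([loglog_slope(P[i], delay[i]) for i in range(n_cities)])
pooled = loglog_slope(P.ravel(), delay.ravel())

print(f"individual exponents: {individual.min():.2f} to {individual.max():.2f}")
print(f"pooled exponent: {pooled:.2f}")
```

Because the spread of sizes across cities is much larger than the growth of any single city, the pooled regression is dominated by the cross-sectional trend and says little about the trajectories.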
Type-2 Cities: Existence of Two Regimes.
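For the type-2 cities, the delay vs. the number of car commuters displays a change of slope and is, in logarithmic scale, a piecewise linear function of $P$. Formally, denoting by $t^*$ the time at which the slope changes, one could write
$$\delta\tau(t) \sim \begin{cases} P(t)^{\beta_1} & \text{for } t < t^* \\ P(t)^{\beta_2} & \text{for } t > t^* \end{cases} \tag{2}$$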
This behavior indicates that the dynamics of the traffic congestion in those cities followed successively two different scaling laws with two different exponents $\beta_1$ and $\beta_2$, and we plot the histograms for both these exponents in Fig. 7 (we give in SI Appendix, section 2 the list of values of $\beta_1$ and $\beta_2$). We note that the average of $\beta_2$ is lower than the average of $\beta_1$ and closer to the “global exponent” (but with a large dispersion around this value). Beyond averages, we have that for almost every case, $\beta_1 > \beta_2$ (we also show in SI Appendix, Fig. S4 that there are no correlations between $\beta_1$ and $\beta_2$). Almost all of the exponents of the first regime are above 2 (indicating a strong superlinearity) while the second exponents are mostly lower. For this second regime, some cities do not exhibit superlinear behavior. Indeed, for some cities the exponent $\beta_2$ is very close to $1$, indicating a linear behavior and equivalently a constant delay per capita, which we refer to as saturation. The cities of Akron (Fig. 8) or Birmingham, for instance, fall into that subcategory. We also observe that in some cases a crossing between the curves corresponding to different cities can occur (such as Akron and Albuquerque in Fig. 8). This crossing is another sign that the subsequent evolution of a city is not uniquely determined by the population and the delay at a certain time (if it were, the evolution after the crossing would be identical for the two cities).
In other cases, the exponent $\beta_2$ is clearly smaller than $1$, which indicates sublinearity and that the delay per capita decreases with the population. We show the example of the city of Albuquerque, NM in Fig. 8. This phenomenon is very counterintuitive, even if we can point out some elements of explanation. Indeed, in addition to the congestion-induced delay, we also have the data for the total driven length $L$ (in miles commuters) for each city and each year. We can check whether this quantity can explain, even partially, the behavior of the total delay. For some type-2 cities with two regimes, we plot the driven length per commuter against the number of drivers and we observe that this curve displays a change of regime at exactly the same point as the delay. In Fig. 9, Top, we see that for the case of Birmingham, from 1998, the delay remains almost constant, whereas it increased steadily at a high rate before that. In Fig. 9, Bottom, we observe that in the same year, the curve for the total driven length per capita experienced a change of slope: The length per capita increased before 1998 and slowly decreases after that date. This could explain partially why the delay does not evolve after this date: There are certainly more people on the road after 1998, and therefore congestion is more likely, but each commuter drives less on average, which decreases the occurrence of traffic jams. These two effects can compensate each other. This is one possible partial explanation, which, however, does not hold for all of the cities. The change of slope in $L/P$ vs. $P$ is common in this dataset and in most cases happens simultaneously with the change of regime of the delay, pointing to the existence of correlations between these quantities, even if not in a causal manner. The simultaneous change of regime for these two quantities might also be the sign that the city experienced a large-scale structural change.
For this category of cities, beyond the two exponents $\beta_1$ and $\beta_2$, we can also study (i) at what time $t^*$ the change of slope happened, (ii) what the population of the city was when it happened ($P^*$), and (iii) what the delay per capita was when it happened ($\delta\tau^*/P^*$). The histograms for these quantities are shown in SI Appendix, Fig. S5. The distribution of $t^*$ is difficult to interpret and does not display a typical date at which the slope changes. The change of slope therefore does not occur at the same time for these cities, which would have been the case, for instance, if there had been a national plan in the United States to rebuild the whole road system or any other federal decision. The histogram for $P^*$ is easier to interpret, with a clear maximum and a quick decay for larger values. Finally, the delay per capita displays a histogram that has a relatively small compact support. This relatively small variation of $\delta\tau^*/P^*$ suggests that it is the congestion that triggers the change of regime signaled by different exponents. Further studies are, however, certainly needed to clarify this important point.
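One simple way to extract two exponents and a change point from a single city's series is to scan all candidate break positions and keep the one that minimizes the total squared error of two separate log-log fits. The sketch below illustrates this idea; it is not necessarily the procedure used to produce the figures, and the function name `fit_two_regimes`, the synthetic city, its break year, and its exponents are hypothetical.

```python
import numpy as np

def fit_two_regimes(P, delay, min_pts=4):
    """Scan break positions; return (index, beta_1, beta_2) of the best split."""
    x, y = np.log(P), np.log(delay)
    best = None
    for k in range(min_pts, len(x) - min_pts + 1):
        b1, c1 = np.polyfit(x[:k], y[:k], 1)   # first regime: points 0..k-1
        b2, c2 = np.polyfit(x[k:], y[k:], 1)   # second regime: points k..end
        sse = (np.sum((y[:k] - (b1 * x[:k] + c1)) ** 2)
               + np.sum((y[k:] - (b2 * x[k:] + c2)) ** 2))
        if best is None or sse < best[0]:
            best = (sse, k, b1, b2)
    _, k, b1, b2 = best
    return k, b1, b2

# Synthetic noise-free city (assumption): 2% yearly growth, exponent 2.5
# up to the break and 1.0 afterwards (constant delay per capita).
years = np.arange(1982, 2015)
P = 2.0e5 * 1.02 ** np.arange(years.size)
k_true = 16
delay = np.where(np.arange(years.size) <= k_true,
                 50.0 * (P / P[0]) ** 2.5,
                 50.0 * (P[k_true] / P[0]) ** 2.5 * (P / P[k_true]) ** 1.0)

k, beta_1, beta_2 = fit_two_regimes(P, delay)
print(f"break near {years[k]}: beta_1 = {beta_1:.2f}, beta_2 = {beta_2:.2f}")
print(f"commuters at break: {P[k]:.0f}, delay per capita: {delay[k] / P[k]:.4f}")
```

With real, noisy data the position of the break is only determined up to a few years, and whether two regimes fit better than one should be checked against the extra parameters introduced.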
Discussion
We focused in this paper on the dataset for congestion-induced delay in some US cities. This is a particularly interesting dataset as it is both transversal (it contains many cities) and longitudinal (for each city we have the temporal evolution of the delay). This is a rather rare case at the moment, but this type of data will certainly become more abundant in the future and will allow us to test our results on other quantities. Our observations about scaling might therefore have far-reaching consequences for the quantitative study of urban systems, well beyond the case of congestion-induced delays.
The general scaling form $Y \sim P^{\beta}$ indicates that if the population is multiplied by a factor $\lambda$, the quantity $Y$ is then multiplied by a factor $\lambda^{\beta}$. This scaling form relies, however, on a strong implicit assumption, which is the “logarithmic population translation” invariance. In other words, this scaling form implies that for any times $t$ and $t'$ we have $Y(t')/Y(t) = [P(t')/P(t)]^{\beta}$, which depends on the ratio of populations only (or the difference of logarithms). As we observed in this study, there is no such scaling at the individual city level but a variety of behaviors. In the language of statistical physics, the quantity $Y$ (here equal to $\delta\tau$) is not a state function determined by the population only and displays some sort of aging effect where the delay in a city depends not only on the population but also on the time and probably on the whole history of the city. In any case, we cannot make for a given city a prediction for a later time $t' > t$ knowing only its state at time $t$. This idea of path dependency is natural for many complex systems, and in statistical physics, we know that spin glasses (24), for example, display aging, which means that some features of the system (for instance, the relaxation time) evolve with the age of the system and do not depend on the state of the system only. This in particular implies that we do not have time translation invariance: most functions of two times $t$ and $t'$ do not depend on $t - t'$ only. This aging theory has been applied to many other complex systems, from “soft materials” (25) to superparamagnets (26), and it would be interesting to understand it in the framework of the evolution of urban systems. An interesting direction for future research would be to investigate the relation between the growth rate of a city and the importance of aging. We could, for example, test the naive expectation that a slow enough “adiabatic” growth would imply that the size of the city is very important, while a rapid growth would imply that the state of the system at previous times becomes relevant.
The results presented in this paper, illustrated in the case of congestion-induced delays, could in principle be applied to any other quantity. They highlight the risk of agglomerating data for different cities and of considering that cities are scaled-up versions of each other (as questioned in ref. 27, for example): Doing so is allowed only under strong constraints, such as path independence, which is apparently not satisfied in the case of congestion and which should be checked in each case.
Beyond scaling, these results also pose the challenging problem of using transversal data (i.e., for different cities) to get some information about the longitudinal series for individual cities. This is a fundamental problem that needs to be clarified when looking for generic properties of cities.
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1718690115/-/DCSupplemental.
References
- 1. Batty M. The New Science of Cities. MIT Press; Cambridge, MA: 2013.
- 2. Barthelemy M. The Structure and Dynamics of Cities. Cambridge Univ Press; Cambridge, UK: 2016.
- 3. Pumain D. 2004 Scaling laws and urban systems (Santa Fe Institute, Santa Fe, NM). Available at https://www.santafe.edu/research/results/working-papers/scaling-laws-and-urban-systems. Accessed February 14, 2018.
- 4. Bettencourt LMA, Lobo J, Helbing D, Kuhnert C, West GB. Growth, innovation, scaling, and the pace of life in cities. Proc Natl Acad Sci USA. 2007;104:7301–7306. doi: 10.1073/pnas.0610172104.
- 5. Youn H, et al. Scaling and universality in urban economic diversification. J R Soc Interface. 2016;13:20150937. doi: 10.1098/rsif.2015.0937.
- 6. Patterson-Lomba O, Goldstein E, Gomez-Liévano A, Castillo-Chavez C, Towers S. Per-capita incidence of sexually transmitted infections increases systematically with urban population size: A cross-sectional study. Sex Transm Infect. 2015;91:610–614. doi: 10.1136/sextrans-2014-051932.
- 7. Samaniego H, Moses ME. Cities as organisms: Allometric scaling of urban road networks. J Transp Land Use. 2008;1:21–39.
- 8. Glaeser EL, Kahn ME. The greenness of cities: Carbon dioxide emissions and urban development. J Urban Econ. 2010;67:404–418.
- 9. Fragkias M, Lobo J, Strumsky D, Seto KC. Does size matter? Scaling of CO2 emissions and US urban areas. PLoS One. 2013;8:e64727. doi: 10.1371/journal.pone.0064727.
- 10. Louf R, Barthelemy M. Scaling: Lost in the smog. Environ Plan B Plan Des. 2014;41:767–769.
- 11. Oliveira EA, Andrade JS, Jr, Makse HA. Large cities are less green. Sci Rep. 2014;4:4235. doi: 10.1038/srep04235.
- 12. Rybski D, et al. Cities as nuclei of sustainability? Environ Plan B Urban Anal City Sci. 2017;44:425–440.
- 13. Shalizi CR. 2011. Scaling and hierarchy in urban economies. arXiv:1102.4101.
- 14. Arcaute E, et al. Constructing cities, deconstructing scaling laws. J R Soc Interface. 2015;12:20140745. doi: 10.1098/rsif.2014.0745.
- 15. Leitao JC, Miotto JM, Gerlach M, Altmann EG. Is this scaling nonlinear? R Soc Open Sci. 2016;3:150649. doi: 10.1098/rsos.150649.
- 16. Cottineau C, Hatna E, Arcaute E, Batty M. Diverse cities or the systematic paradox of urban scaling laws. Comput Environ Urban Syst. 2017;63:80–94.
- 17. Barthelemy M. A global take on congestion in urban areas. Environ Plan B Plan Des. 2016;43:800–804.
- 18. Louf R, Barthelemy M. How congestion shapes cities: From mobility patterns to scaling. Sci Rep. 2014;4:5561. doi: 10.1038/srep05561.
- 19. Texas A&M Transportation Institute (2015) Excel spreadsheet - 101 urban areas. Available at tti.tamu.edu/documents/ums/congestion-data/complete-data.xlsx. Accessed February 12, 2018.
- 20. Lomax T, Schrank D, Eisele B. Appendix A – Methodology for the 2015 Urban Mobility Scorecard (Texas A&M Transportation Institute, College Station, TX). 2015. Available at https://mobility.tamu.edu/ums/methodology/. Accessed February 14, 2018.
- 21. Chang YS, Lee YJ, Choi SS. Is there more traffic congestion in larger cities? Scaling analysis of the 101 largest U.S. urban centers. Transp Policy. 2017;59:54–63.
- 22. Bettencourt LMA. The origins of scaling in cities. Science. 2013;340:1438–1441. doi: 10.1126/science.1235823.
- 23. Bettencourt LMA, Lobo J, Strumsky D, West GB. Urban scaling and its deviations: Revealing the structure of wealth, innovation and crime across cities. PLoS One. 2010;5:e13541. doi: 10.1371/journal.pone.0013541.
- 24. Bouchaud JP, Cugliandolo LF, Kurchan J, Mézard M. Out of equilibrium dynamics in spin-glasses and other glassy systems. In: Young AP, editor. Spin Glasses and Random Fields. World Scientific; Singapore: 1997. pp. 161–223.
- 25. Fielding SM, Sollich P, Cates ME. Aging and rheology in soft materials. J Rheol. 2000;44:323–369.
- 26. Sasaki M, Jonsson PE, Takayama H, Mamiya H. Aging and memory effects in superparamagnets and superspin glasses. Phys Rev B. 2005;71:104405.
- 27. Thisse JF. The new science of cities by Michael Batty: The opinion of an economist. J Econ Lit. 2014;52:805–819.