Tomography of scaling

Marc Barthelemy

doi:10.1098/rsif.2019.0602

2019 Nov 27;16(160):20190602. doi: 10.1098/rsif.2019.0602

PMCID: PMC6893487 PMID: 31771453

Abstract

Scaling describes how a given quantity Y that characterizes a system varies with its size P. For most complex systems, it is of the form $Y \sim P^{β}$ with a non-trivial value of the exponent β, usually determined by regression methods. The presence of noise can make it difficult to conclude about the existence of a nonlinear behaviour with β ≠ 1 and we propose here to circumvent fitting problems by investigating how two different systems of sizes P₁ and P₂ are related to each other. This leads us to define a local scaling exponent β_loc that we study versus the ratio P₂/P₁ and provides some sort of ‘tomography scan’ of scaling across different values of the size ratio, allowing us to assess the relevance of nonlinearity in the system and to identify an effective exponent that minimizes the error for predicting the value of Y. We illustrate this method on various real-world datasets for cities and show that our method reinforces in some cases the standard analysis, but is also able to provide new insights in inconclusive cases and to detect problems in the scaling form such as the absence of a single scaling exponent or the presence of threshold effects.

Keywords: scaling, complex systems, nonlinearity, cities

1. Introduction

Scaling laws and associated scaling exponents are fundamental objects. Used in biology in order to understand how the metabolic rate varies with body size [1,2], scaling was widely used in physics to understand polymers [3], phase transitions [4], fluid dynamics and turbulence [5]. Scaling also became a central tool for describing macroscopic properties of complex systems [6–8] for two reasons: first, the existence of a scaling law points to self-similarity: the system reproduces itself as the scales change. Second, these exponents also constitute important guides for identifying critical factors and mechanisms in complex systems. In particular, when they cannot be deduced from simple dimensional considerations, they point to relevant scales and ingredients.

Given the simplicity of scaling measures, it is tempting to use this approach to obtain an understanding of the behaviour of complex systems that in general comprise a large number of constituents that interact with each other over various spatial and temporal scales. This is particularly true for urban systems for which we have now an abundance of data but are still lacking quantitative models for many aspects [9–11]. For cities, the scaling problem is to understand how extensive quantities vary with the size of the city, usually measured by its population [7,8]. Although our theoretical discussion is very general and could, in principle, be applied to any system, we will use here the language of cities and apply our method to urban data. We thus consider a macroscopic quantity Y that describes a given aspect of cities which can be socio-economical, about infrastructures, etc., and ask how it varies with the population P of the city (according to a given definition of the city, see [12] and below for a discussion about this point). Empirical results for various quantities in cities were compiled for the first time in [8] and provided evidence that many quantities follow the scaling relation

$$Y = a P^{β}, \tag{1.1}$$

where a is a prefactor and where the exponent β is in general positive. This relation implies that the quantity per capita behaves as $Y/P \sim P^{β - 1}$, and in the linear case (β = 1) the quantity per capita is independent of the size of the city. This is in contrast to all the other cases (β ≠ 1), where Y/P depends on P, which means that there is a (nonlinear) effect of interactions in the city. It is therefore crucial to distinguish the case β = 1 from β ≠ 1 as it will determine how we model and understand the city. In the seminal paper [8], it was shown that we have three different classes of quantities according to the value of β and that these correspond to different processes. As we just noted, β = 1 is the linear case for which the size of the city has no impact (think of human-related quantities, for example), while for β < 1 we mostly have infrastructure quantities denoting an economy of scale and for β > 1 a positive effect of interactions (as expected for creative processes, social interaction-dependent quantities such as innovations, or unfortunately negative aspects such as crimes or epidemic spreading). This study triggered a very large number of subsequent works that are difficult to cite exhaustively here, but we can mention scaling for the properties of roads [13], for green space areas [14], for urban supply networks [15], for CO₂ emissions in cities [16–19], for interaction activity [20], wealth, innovation and crimes [21–27], etc. These different results motivated the search for a theoretical understanding and modelling that can explain these values [28–34].

2. Problems with fitting

The usual (and simplest) way to determine an estimator $\hat{β}$ of the scaling exponent β is the ordinary least-squares regression, which essentially consists of plotting Y versus P in loglog and finding a power-law fit (that is linear in loglog) such that the error (measured as the sum of squared differences) is minimum. This is the classical method used throughout many different fields and poses no problem if (i) there are enough decades on both axes and (ii) there is not too much noise. If we are interested in the existence of nonlinearity (which is the case for cities), we can also add the constraint (iii) that the exponent should be clearly different from one. These conditions are unfortunately not always met. As an example, we plot the GDP for each city in the USA (for the year 2010) versus population; the result is shown in figure 1.
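As a minimal sketch of the fitting procedure just described, the following standard-library Python computes the least-squares line in loglog. The function name `fit_power_law` and the synthetic data are illustrative assumptions and are not taken from the paper or from [36].

```python
import math

def fit_power_law(populations, values):
    """Least-squares fit of log Y = log a + beta * log P.

    Returns (a, beta, r_squared). Hypothetical helper for illustration only.
    """
    xs = [math.log(p) for p in populations]
    ys = [math.log(y) for y in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    beta = sxy / sxx                      # slope in loglog = scaling exponent
    log_a = mean_y - beta * mean_x        # intercept in loglog = log of the prefactor
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return math.exp(log_a), beta, r_squared

# Synthetic, noise-free data built with a = 2 and beta = 1.13 (made-up values).
P = [1e4, 1e5, 1e6, 1e7]
Y = [2.0 * p ** 1.13 for p in P]
a_hat, beta_hat, r2 = fit_power_law(P, Y)
```

On noise-free data the fit recovers the exponent used to generate it; the difficulties discussed in this section only appear once noise is added and the range of P is restricted to a couple of decades.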

Figure 1. — GDP (in millions of current dollars) in 2010 for Metropolitan Statistical Areas versus their population. We show here both the linear and the nonlinear fits. At this point, it is difficult to conclude about a possible nonlinear behaviour. Data from the Bureau of Economic Analysis [35]. (Online version in colour.)

We basically have two decades of variation on both axes (which, roughly speaking, is the minimum in order to determine a power-law exponent) and a reasonable amount of noise, leading to a power-law fit that gives $\hat{β} \approx 1.13$ (with r² = 0.98). We see here that conditions (i) and (iii) are not met and we can only place limited confidence in this value 1.13. Indeed, a linear fit, which has one parameter less than the nonlinear fit, is also good (figure 1). More generally, more involved statistical methods are then needed to decide whether or not the linear assumption can be rejected, and this was the point of the excellent paper [36]. These authors tested the hypothesis that observations are compatible with a nonlinear behaviour and their conclusion for various quantities is that the estimate of β, together with its confidence intervals, depends strongly on fluctuations in the data and how they are modelled. It is thus difficult to get a clear-cut answer to the fundamental question of whether β is different from one or not. These fitting problems were also discussed in [37] on GDP and income in the USA, where it was argued that other scaling forms could be used and that non-trivial scaling exponent values could be an artefact of using extensive quantities instead of intensive ones (per capita rates).

These problems were reinforced by other studies [12,18,38] that showed the importance of the definition of cities: the authors of [12] developed a framework for defining cities using commuter numbers and population density thresholds and could show on a UK dataset that many urban indicators scale linearly with population size, independently of the definition of urban boundaries. For quantities that display a nonlinear behaviour, the scaling exponent value fluctuates considerably and, more importantly, can be either larger or smaller than one according to the definition used (a problem also observed in the case of CO₂ emissions by transport [39]). In addition to these empirical problems, we also mention a study on congestion-induced delays in cities, which seem to scale with an exponent that varies in time, ultimately posing the problem of mixing different cities at different stages of their evolution [40].

From an Ockham’s razor perspective [41], choosing between a linear behaviour independent of urban boundaries and a nonlinear scaling exponent whose value fluctuates considerably leads to the conclusion that many socio-economical indicators are described by a linear behaviour with β = 1. This is, however, not a scientific proof and, as our capacity for understanding cities relies crucially on this exponent value, the question is still somewhat open and calls for a more satisfying answer. As most statistical frameworks and approaches lead to conclusions that depend critically on assumptions, especially in ‘grey cases’ (with large noise, few decades for fitting, exponent value close to one, etc.), it would be useful to get other evidence of nonlinearities and to somehow circumvent the fitting problem. A value of β different from one is not only a matter of numerical value, but essentially points to nonlinear effects that are in general relevant at a large scale. In particular, nonlinearities could probably be seen in the dynamics of these systems and this search could constitute an interesting direction for future (urban) studies. Here, we will focus on the much more pragmatic question that lies at the core of the idea of (urban) scaling: knowing some quantity Y₁ for a city of size P₁, what can we say about the corresponding quantity Y₂ for a city of size P₂? In other words, if we accept the idea of scaling, what is the exponent β that we should use in order to compute Y₂ according to $Y_{2} = Y_{1} {(P_{2} / P_{1})}^{β}$ ? This simple question is at the core of the analysis presented here. In the next section, we present in more detail the tools and the method developed here and in the following sections we apply them to different quantities from various datasets.

3. Scaling: simple tools for a thorough examination

3.1. Local exponent across sizes

We will focus on the ‘practical’ aspect of scaling: instead of fitting the data with all the problems discussed above, we consider two cities 1 and 2 with populations P₁ and P₂. Assuming the scaling form equation (1.1) to be correct (with the standard assumption of a constant prefactor, see for example [42] for a generalization of the scaling form), knowing P₁, P₂ and Y₁, we obtain Y₂ as

$$Y_{2} = Y_{1} {\left(\frac{P_{2}}{P_{1}}\right)}^{β} . \tag{3.1}$$

The scaling assumption and the value of β thus allow us to predict what will happen to a scaled-up version of a given city. Conversely, we could also ask what would be the ‘local’ exponent that allows us to predict correctly Y₂. Obviously, we have from equation (3.1)

$$β_{loc} = \frac{\log (Y_{2} / Y_{1})}{\log (P_{2} / P_{1})}, \tag{3.2}$$

which has the simple geometric interpretation of being the slope of the straight line joining the points (P₁, Y₁) and (P₂, Y₂) in the loglog representation. If there is no noise and all data points are aligned, we obtain only one value for β_loc for all pairs of cities and which also corresponds to the value $\hat{β}$ obtained by the direct fit (in the following, we will denote the population ratio by r = P₂/P₁ where we consider that P₂ is the largest population so that we always have r ≥ 1). In the general case, studying this value β_loc tells us how different cities are related to each other giving a representation of scaling across different values of the size ratio, akin to some sort of ‘tomography’ scan of scaling. Plotting β_loc versus r is what we will call in this paper the ‘tomography plot’ as it allows us to explore scaling for various cross-sections of the size ratio.
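The construction of the tomography plot can be sketched as follows, assuming the data are given as two parallel lists of populations and values. The function name `local_exponents` and the toy data are hypothetical; pairs with exactly equal populations are skipped, since equation (3.2) is undefined for r = 1.

```python
import math
from itertools import combinations

def local_exponents(populations, values):
    """Return a list of (r, beta_loc) over all pairs of cities, with r = P2/P1 > 1."""
    pairs = []
    for (p_a, y_a), (p_b, y_b) in combinations(zip(populations, values), 2):
        if p_a == p_b:
            continue  # r = 1 gives log r = 0, so beta_loc is undefined
        # Order the pair so that city 2 is the more populated one (r > 1).
        (p1, y1), (p2, y2) = sorted([(p_a, y_a), (p_b, y_b)])
        r = p2 / p1
        beta_loc = math.log(y2 / y1) / math.log(r)   # equation (3.2)
        pairs.append((r, beta_loc))
    return pairs

# Toy, noise-free data with exponent 0.85 (made-up values).
P = [1e4, 1e5, 1e6]
Y = [3.0 * p ** 0.85 for p in P]
tomography = local_exponents(P, Y)
```

Plotting the second component against the first (with a logarithmic r-axis) would give the tomography plot; for N cities the list contains at most N(N − 1)/2 points.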

If we assume that $Y_{2} = Y_{1} {(P_{2} / P_{1})}^{β} (1 + η)$ where η is due to noise, we obtain for P₂/P₁ > 1 the general expression

$$β_{loc} = β + \frac{\log (1 + η)}{\log (P_{2} / P_{1})} . \tag{3.3}$$

This expression shows that when the noise is not too large, the effective exponent converges for large P₂/P₁ to the theoretical one and to its estimate via fitting: $β_{loc} ≃ \hat{β} ≃ β$ . This expression also shows that a plot of β_loc versus log(P₂/P₁) for all pairs of cities should display a hyperbolic envelope and that β_loc → β for large size ratio values. For similar populations P₂ = P₁(1 + ɛ) (with ɛ ≪ 1), we obtain at lowest order in ɛ

$$β_{loc} ≃ β + \frac{\log (1 + η)}{ε} . \tag{3.4}$$

We see here that for small ɛ we can observe arbitrarily large values of β_loc for non-zero fluctuations η. For similar cities, noise is therefore relevant and their comparison cannot help us much in determining the scaling exponent.
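To give a feeling for the orders of magnitude implied by equation (3.3), the short sketch below evaluates the deviation β_loc − β for a fixed multiplicative noise and increasing size ratios. The values β = 1 and η = 0.3 are arbitrary illustrative choices, not estimates from any dataset.

```python
import math

def beta_loc_with_noise(beta, r, eta):
    """Local exponent of equation (3.3) for a size ratio r > 1 and noise eta > -1."""
    return beta + math.log(1.0 + eta) / math.log(r)

beta, eta = 1.0, 0.3   # illustrative values only
deviations = {
    r: beta_loc_with_noise(beta, r, eta) - beta
    for r in (1.01, 2, 10, 100, 1e4)
}
# The deviation decays as 1/log r: it is very large for nearly equal cities
# (r = 1.01) and of order a few percent only for r = 10^4.
```

With these numbers, a 30% fluctuation shifts the local exponent by more than 20 for r = 1.01 but by less than 0.03 for r = 10⁴, which is the hyperbolic envelope in the variable log r.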

In the case of a non-multiplicative noise, we could imagine an expression of the form $Y_{2} = Y_{1} {(P_{2} / P_{1})}^{β} + η$ . The noise η cannot be too large, otherwise the scaling assumption is not correct and Y₂/Y₁ would not depend on the ratio P₂/P₁ only. If we however accept this form, a simple calculation shows that

$$β_{loc} = β + \frac{1}{\log r} \log \left(1 + \frac{η}{r^{β} Y_{1}}\right) . \tag{3.5}$$

We thus see that for large r there is a convergence towards β for a large class of noise η. In particular, for r large enough we have

$$β_{loc} ≃ β + \frac{1}{r^{β} \log r} \frac{η}{Y_{1}}, \tag{3.6}$$

which shows that even in this case, if the scaling assumption is correct, there should be a convergence of β_loc towards β.
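For completeness, the ‘simple calculation’ behind equations (3.5) and (3.6) can be written out as follows. This is only a sketch, assuming β > 0 and $|η| ≪ r^{β} Y_{1}$ for the last step, so that the logarithm can be expanded to first order.

```latex
\begin{align}
\frac{Y_{2}}{Y_{1}} &= r^{\beta} + \frac{\eta}{Y_{1}}
  = r^{\beta}\left(1 + \frac{\eta}{r^{\beta} Y_{1}}\right), \\
\beta_{loc} = \frac{\log (Y_{2}/Y_{1})}{\log r}
  &= \beta + \frac{1}{\log r}\,\log\left(1 + \frac{\eta}{r^{\beta} Y_{1}}\right), \\
\log (1 + x) \simeq x \ \text{for} \ |x| \ll 1
  \quad\Rightarrow\quad
  \beta_{loc} &\simeq \beta + \frac{1}{r^{\beta}\log r}\,\frac{\eta}{Y_{1}} .
\end{align}
```

The correction therefore decays as $1/(r^{β} \log r)$, faster than the $1/\log r$ decay found for multiplicative noise in equation (3.3).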

We end this part by noting some similarities with multiscaling. Indeed we can rewrite the relation equation (3.2) as

$$Y_{2} = Y_{1} r^{β_{loc} (r)}, \tag{3.7}$$

and multiscaling is here encoded in the function β_loc(r). It is, however, unclear at this stage whether (and how) we can connect the behaviour of this function to the existence of multiple scales, as is the case in growth kinetics [43], and this could constitute an interesting question for future research.

3.2. Identifying a benchmark city and defining an effective exponent

This local exponent allows us to define and identify a ‘benchmark city’ that can serve as a reference value for computing quantities for other cities. More precisely, for a city i we first compute the corresponding local exponents for all other cities j versus the ratio r_ij = P_j/P_i as

$$β_{loc} (i, j) = \frac{\log (Y_{j} / Y_{i})}{\log r_{i j}} . \tag{3.8}$$

We then compute the average and the variance of the local exponent when varying j

$$⟨ β_{loc} (i) ⟩ = \frac{1}{N - 1} \sum_{j} β_{loc} (i, j) \tag{3.9}$$

and

$$σ^{2} (i) = ⟨ β_{loc}^{2} (i) ⟩ - {⟨ β_{loc} (i) ⟩}^{2}, \tag{3.10}$$

where the brackets denote here $⟨ O ⟩ = \sum_{j} O (j) / (N - 1)$ (N is the number of cities). We then define the benchmark city such that the variance σ²(i) is the smallest possible and we denote it by i_min. For this city, the fluctuations of the local exponent are the smallest possible around its average β_eff ≡ 〈β_loc(i_min)〉. This city can then serve as a benchmark in the sense that we can use it for ‘reliably’ computing properties of other cities through the formula

$$Y (j) = Y (i_{min}) {\left(\frac{P_{j}}{P_{i_{min}}}\right)}^{β_{eff}}, \tag{3.11}$$

and justifies the denomination ‘effective exponent’ as it can be used for practical predictions. Other choices for an effective exponent are of course possible but in the spirit of practical applications we are interested in picking a single value of β for computing the quantity Y for all cities. In this respect, minimizing the variance of β_loc is a simple sensible answer to this question, although probably not the only one.
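The selection of the benchmark city can be sketched in a few lines of Python. The names `benchmark_city` and `predict` are hypothetical, the data are synthetic, and one assumption is added here that is not discussed in the text: cities whose population is exactly equal to that of city i are left out of the average, since the local exponent is undefined for them.

```python
import math

def benchmark_city(populations, values):
    """Return (i_min, beta_eff, sigma_min) in the spirit of equations (3.8)-(3.10)."""
    n = len(populations)
    best = None
    for i in range(n):
        # Local exponents of city i against every other city j, equation (3.8).
        # Assumption: cities with the same population as city i are skipped.
        betas = [
            math.log(values[j] / values[i]) / math.log(populations[j] / populations[i])
            for j in range(n)
            if j != i and populations[j] != populations[i]
        ]
        if not betas:
            continue
        mean = sum(betas) / len(betas)                             # equation (3.9)
        var = sum(b * b for b in betas) / len(betas) - mean ** 2   # equation (3.10)
        var = max(var, 0.0)  # guard against tiny negative round-off
        if best is None or var < best[2]:
            best = (i, mean, var)
    if best is None:
        raise ValueError("need at least two cities with different populations")
    i_min, beta_eff, var = best
    return i_min, beta_eff, math.sqrt(var)

def predict(p, p_ref, y_ref, beta_eff):
    """Equation (3.11): scale the benchmark value y_ref to a city of population p."""
    return y_ref * (p / p_ref) ** beta_eff

# Synthetic, noise-free example (made-up numbers): every city is a perfect benchmark.
P = [2e4, 1e5, 8e5, 5e6]
Y = [0.5 * p ** 1.1 for p in P]
i_min, beta_eff, sigma = benchmark_city(P, Y)
y_hat = predict(P[3], P[i_min], Y[i_min], beta_eff)
```

The loop is quadratic in the number of cities, which should remain manageable for datasets of a few thousand cities.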

We note here that this discussion is different from the one about SAMIs (scale-adjusted metropolitan indicators) defined in [21,23] as being the variation of a given city with respect to the fit given by $\hat{β}$

$$ξ_{i} = \log \frac{Y_{i}}{Y_{0} P_{i}^{\hat{β}}} . \tag{3.12}$$

We will however consider a similar quantity. Knowing β_eff, we compute the fraction f(ɛ₁, ɛ₂) of cities for which

$$ε_{1} Y_{data} < Y_{predicted} < ε_{2} Y_{data}, \tag{3.13}$$

where Y_data is the actual value for a given city of population P and

$$Y_{predicted} = Y (i_{min}) {\left(\frac{P}{P_{i_{min}}}\right)}^{β_{eff}} . \tag{3.14}$$

In particular, we will focus on the case ɛ₁ = 1/ɛ₂ for different values of ɛ₂. We will systematically give the value of f(1/2) for ɛ₂ = 2 as it gives a good idea of the accuracy of the prediction computed with β_eff. Additional information can be provided by plotting the function f(ɛ) ≡ f(1/ɛ, ɛ) for ɛ > 1 and we will show it in a few cases.
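The fraction f(1/ɛ, ɛ) of equation (3.13) is straightforward to compute once the predictions are available. The sketch below uses a hypothetical helper `fraction_within` and made-up numbers purely to illustrate the definition.

```python
def fraction_within(y_data, y_predicted, eps):
    """Fraction of cities with y_data/eps < y_predicted < eps*y_data, for eps > 1."""
    hits = sum(
        1 for yd, yp in zip(y_data, y_predicted)
        if yd / eps < yp < eps * yd
    )
    return hits / len(y_data)

# Made-up data and predictions, for illustration only.
y_data = [10.0, 20.0, 40.0, 80.0]
y_pred = [12.0, 45.0, 30.0, 15.0]
f_2 = fraction_within(y_data, y_pred, 2.0)   # two of the four predictions are within a factor 2
f_6 = fraction_within(y_data, y_pred, 6.0)   # all four are within a factor 6
```

Evaluating the function on a grid of ɛ > 1 gives the non-decreasing curve f(ɛ), which reaches 1 once ɛ exceeds the largest misprediction factor in the dataset.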

Finally, we note that it might be possible to construct a more general framework that includes these different definitions and objects, exhibiting possible relations between these tools and we leave this question for future research.

4. Applications to real-world datasets

We now apply these tools to the different datasets discussed in [36]. These datasets concern different areas of the world (Europe, USA, Organization for Economic Co-operation and Development (OECD), Brazil) and various socio-economical quantities and were analysed with standard statistical methods. They represent therefore an interesting benchmark dataset for testing other methods. We note that for most of these datasets cities have to be understood as urban areas, except for Brazilian datasets where administrative boundaries were used (all the details can be found in [36]). We first discuss clear cases for which there is no or little ambiguity about the scaling behaviour and see how it is confirmed with the tools proposed here. We then focus on less clear cases for which we find the results are not completely consistent with the classical analysis and also on the datasets where the statistical analysis in [36] was ‘inconclusive’, meaning that the result depended on the assumption taken for the disorder. Our main goal here will be to show how our tools can shed new light on these problematic or inconclusive cases.

4.1. Simple cases

4.1.1. Income and patents in the UK

We first consider the case of the total weekly income in the UK considered in [36]. In this case, the behaviour seems to be linear and the fit gives $\hat{β} = 1.01$ (r² = 0.99). We note here that this result is consistent with those found in [12] for various definitions of cities. The fit is shown in figure 2.

Figure 2. — Total weekly income for cities in the UK (see [12,36] for a description of the data) versus population. The power-law fit gives β = 1.01 and r² = 0.99. (Online version in colour.)

We compute the local exponent β_loc versus P₂/P₁ in this case and obtain the result shown in figure 3. We also show the average of β_loc in each r-bin and the corresponding error bar computed as the standard dispersion, together with the values corresponding to the linear case (β = 1) and to the power-law fit ($\hat{β} = 1.01$).

We observe on this plot large fluctuations for small r = P₂/P₁ as expected: in this regime, the local exponent is governed by fluctuations among cities of similar population sizes. For larger r, we observe a quick convergence to 1, and for most pairs of cities with $r ≳ 100$ we observe β_loc = 1.0 ± 0.1. We also note that for very large ratios $r ≳ 10^{4}$ , the local exponent is slightly smaller than one (figure 3b).
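The binned averages and error bars shown on a tomography plot can be obtained along the following lines. This is only a sketch: the use of logarithmic bins in r, the number of bins per decade and the use of the population standard deviation as ‘standard dispersion’ are assumptions made here, and the helper name `binned_local_exponent` is hypothetical.

```python
import math
from collections import defaultdict

def binned_local_exponent(pairs, bins_per_decade=2):
    """Average beta_loc and its dispersion in logarithmic bins of r.

    `pairs` is a list of (r, beta_loc) with r > 1. Returns a list of
    (bin_center_r, mean, std, count), sorted by increasing r.
    """
    groups = defaultdict(list)
    for r, b in pairs:
        groups[math.floor(math.log10(r) * bins_per_decade)].append(b)
    summary = []
    for k in sorted(groups):
        vals = groups[k]
        mean = sum(vals) / len(vals)
        std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
        center = 10 ** ((k + 0.5) / bins_per_decade)  # geometric centre of the bin
        summary.append((center, mean, std, len(vals)))
    return summary

# Made-up (r, beta_loc) pairs: wide scatter at small r, narrow scatter at large r.
pairs = [(2.0, 1.4), (2.5, 0.6), (150.0, 1.05), (200.0, 0.95)]
summary = binned_local_exponent(pairs, bins_per_decade=1)
```

In this toy example both bins have the same average but the dispersion shrinks from 0.4 to 0.05 as r grows, mimicking the convergence described above.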

In order to complete this picture, we now identify the ‘benchmark’ city, defined above as the city which allows the most reliable prediction (i.e. with the smallest fluctuations), and obtained as the minimization of the variance given by equation (3.10). The ‘effective’ exponent is the one associated with this benchmark city as it can be used for reliable predictions. Instead of plotting β and σ for each city, we directly show in figure 4a the dispersion σ versus β and we find that it is the smallest city of the dataset that is the benchmark. The corresponding effective exponent is β_eff = 0.97 ± 0.03.

Figure 4. — (a) Dispersion σ versus the value of the exponent β (we show here the zoom on the part with σ < 40). (b) Variation of f(1/ɛ, ɛ) with ɛ > 1 for β = β_eff. We observe that for ɛ > 2, the ratio Y_predicted/Y_data is at least in the range [0.5, 2.0] for all cities. (c) Using the benchmark city, we compare predictions and data with the ratio Y_predicted/Y_data. We observe here that for all pairs of cities this ratio is in the range [0.5, 2.0]. (Online version in colour.)

For this value of β_eff, we study the evolution of the function f(ɛ), which represents the fraction of cities for which the prediction lies in the range [Y_data/ɛ, ɛY_data], and show it in figure 4b. We also display in figure 4c the ratio Y_predicted/Y_data versus the population P_i, which shows that the fraction f(1/2) of cities with a ratio Y_predicted/Y_data in the range [0.5, 2.0] is $f (1 / 2) = 100\%$ (we note here that for another value such as $β = \hat{β}$ the corresponding value is a bit smaller, $f (1 / 2) = 97\%$).

We thus see on this example a convergence of evidence: the naive fit gives $\hat{β} \approx 1$ , the quantity β_loc converges quickly towards 1 and most pairs of cities are related to each other via a linear relation. Finally, the most reliable way to compute the income for a city i is to use an effective exponent 0.97, demonstrating a slight sublinearity (which comes from pairs of cities with a large population ratio, as can be seen in figure 3b).

We observe the same sort of behaviour for patents in the UK (plots not shown): the fit gives $\hat{β} = 1.06$ (r² = 0.88) and the effective exponent is β_eff = 0.96. We compare for this case the functions f(1/ɛ, ɛ) obtained in the different cases β = β_eff and $β = \hat{β}$ (figure 5).

Figure 5. — Patents in the UK: variation of f(1/ɛ, ɛ) with ɛ > 1 for β = β_eff and $β = \hat{β}$ . We observe here that in terms of prediction the value β_eff is a better choice than the value obtained by fitting. (Online version in colour.)

We see that the effective exponent is always better in terms of predictions, with in particular a value $f (1 / 2) = 62\%$ consistent with a linear behaviour. We note that this linear result for the patents in the UK is in contrast with the result obtained for the US in [8] and more recently in [21] with a nonlinear behaviour characterized by β = 1.28 while in [12] the results for patents in UK cities seem to strongly depend on the definition of cities. We will discuss in more detail this case of US patents below in the ‘Problematic cases’ section.

4.1.2. Clear nonlinear behaviour: the USA case

We now consider the two datasets for the USA that were studied in [36]. The first one is about the GDP of cities and the second one about the number of miles of roads (in each city). The nonlinear fits for these two quantities are shown in figure 6. In the first case, the GDP displays a clear superlinear behaviour with $\hat{β} = 1.11$ , while for infrastructure the expected sublinear behaviour is observed with $\hat{β} = 0.85$ . We now inspect in more detail these cases with the help of the local exponent β_loc (figure 7). We observe on these plots that the ‘naive’ nonlinear fitting is confirmed: for most pairs of cities, the local exponent is different from one and is equal to $\hat{β}$ (within error bars). If we now compute the effective exponents we obtain for the GDP β_eff = 1.13 ± 0.07 and for the number of miles β_eff = 0.80 ± 0.1. We note that in both cases the benchmark city is New York City (NYC), the largest urban area in the USA (at this point, it seems that there is no clear rule for identifying the benchmark city). We see here that all the evidence points to the same conclusion of a nonlinear behaviour. Even if β = 1.11 is only slightly different from one, the tomography plot (figure 7a) clearly shows the reality of this nonlinear exponent, and this is confirmed by the value of the effective exponent β_eff = 1.13. Using these effective exponents, the fraction f(1/2) is 98% for the GDP and 91% for the number of miles. In other words, using NYC as the benchmark city and the effective exponents, we get excellent predictions for almost all the other cities.

Figure 6. — (a) GDP (in millions of current US dollars) for cities in the USA for the year 2013 [36]. The (red) line is a power-law fit with $\hat{β} = 1.11$ and r² = 0.98. (b) Number of miles for US cities (year 2013) versus population. The power-law fit gives β = 0.85 (r² = 0.91). (Online version in colour.)

Figure 7. — Local exponent versus population ratio r for (a) the GDP for cities in the USA (year 2013) and (b) the total number of miles in the USA (year 2013). (Online version in colour.)

Finally, in order to address the problem discussed in [37], we redo the analysis in the US case for the GDP per capita. In this case, the nonlinear fit is indeed poorer, with an exponent 0.11 (r² = 0.42) that nevertheless corresponds to the value $\hat{β} - 1$ . The tomography plot constructed for this quantity is shown in figure 8 and seems to be free of any ambiguity: for most values of r the local exponent is equal on average to 0.11 and converges quickly towards this value when r increases, and is strictly positive (within error bars). We note that by construction this tomography plot is obtained from figure 7a by a shift of –1 on the y-axis and the convergence properties are therefore the same. In addition, the effective exponent is in this case equal to β_eff = 0.13 ± 0.07 and leads to an impressive value of the fraction of cities $f (1 / 2) = 98\%$ whose value is correctly predicted. All these elements suggest that there is indeed a nonlinear behaviour for the GDP in US cities, even if we work on the GDP per capita that should exclude effects due to the extensivity of this quantity as claimed in [37].

Figure 8. — Tomography plot for the GDP *per capita* for US cities (year 2013). (Online version in colour.)
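The shift by −1 between the tomography plots of the GDP and of the GDP per capita follows directly from equation (3.2). Writing y = Y/P for the per capita quantity (a notation introduced here only for this remark), we have for any pair of cities:

```latex
\begin{align}
\beta_{loc}^{(y)}
  = \frac{\log (y_{2}/y_{1})}{\log (P_{2}/P_{1})}
  = \frac{\log (Y_{2}/Y_{1}) - \log (P_{2}/P_{1})}{\log (P_{2}/P_{1})}
  = \beta_{loc}^{(Y)} - 1 .
\end{align}
```

Since the shift is the same for every pair, the averages over j in equation (3.9) are shifted by −1 while the variances of equation (3.10) are unchanged, consistent with the values β_eff = 1.13 ± 0.07 and β_eff = 0.13 ± 0.07 quoted for the two quantities.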

4.1.3. Nonlinear behaviour with large fluctuations: museums and libraries in Europe

We now consider two other datasets that were also studied in [36]: the attendance of museums (in the year 2011), and the number of public libraries in each city (also for the year 2011). The authors found that there is a superlinear behaviour for the museum case, and a sublinear one for the number of libraries. Both these results are sensible: we expect that the number of libraries scales sublinearly with population as it would be the case for many other facilities [10]. Also, it is not difficult to accept that the attendance of museums can largely benefit from positive interaction effects in cities, leading to a superlinear behaviour. These quantities versus populations are shown in figure 9. In both cases, we observe that there are large fluctuations and not much more than one decade over which the fit is made. For libraries, the power-law fit gives $\hat{β} = 0.80 (r^{2} = 0.59)$ and for museum usage $\hat{β} = 1.42 (r^{2} = 0.69)$ . The tomography plots for these cases are shown in figure 10, and provide further information. First for libraries, there is no convergence of β_loc towards $\hat{β}$ consistent with the fact that the power-law fit is indeed not reliable. In addition, we find that β_eff = 0.17 ± 0.32 (very different from $\hat{β}$ ) and $f (1 / 2) = 55\%$ showing that even the most reliable power-law exponent accounts for about half of the data only. This is a case where our analysis actually weakens the conclusions obtained with standard statistical tools. The situation is obviously improved if we remove outliers. For example, if we remove cities with small population (P < 10⁴) or with a large number of libraries (Y > 10²), the naive fit remains the same with a value $\hat{β} = 0.807$ (r² = 0.73), and the tomography plot is shown in figure 11. We observe that the range of x-axis is obviously smaller (as we removed small size cities), but that the qualitative behaviour remains the same, namely with a relative consistency towards a sublinear behaviour. As expected, the sublinear behaviour is better supported here as outliers are removed.

Figure 9. — (a) Number of libraries in European cities (2011). The (red) line is a power-law fit with $\hat{β} = 0.80 (r^{2} = 0.59)$ . (b) Yearly attendance of museums in European cities (2011). The (red) line is a power-law fit with $\hat{β} = 1.42 (r^{2} = 0.69)$ . (Online version in colour.)

Figure 11. — Tomography plot for the number of libraries in European cities with outliers removed (P < 10⁴ or Y > 10²). (Online version in colour.)

The situation is very different for museum usage as shown in figure 10b: there is a clear convergence of β_loc towards $\hat{β} = 1.42$ , a superlinear behaviour confirmed by β_eff = 1.64 ± 0.47, but with $f (1 / 2) = 45\%$ signalling the presence of very large fluctuations. We also see the effect of these large fluctuations in the slowly increasing fraction f(1/ɛ, ɛ) for increasing ɛ > 1 shown in figure 10c. We thus see on these two examples how our analysis can bring further insights about the quality of the fit.

4.2. Problematic cases

We consider here datasets for which the analysis in [36] did not apparently pose too many problems but for which our tools revealed some difficulties. These datasets are the UK railroads, AIDS cases in Brazil, and the number of patents in cities belonging to OECD countries.

4.2.1. UK railroads and AIDS cases in Brazil: existence of a threshold

The authors of [36] studied the number of train stations in UK cities and found a linear behaviour $\hat{β} = 1.0$ . However, if we plot this number versus the population we obtain the result shown in figure 12a. We first observe that there is a lot of noise and the quality of any fit will likely be very poor. Also, we note that there is a large number of cities with exactly one station, which will potentially impact any fitting method. Given all these problems, the linear fit is not too bad, in agreement with the result of the analysis of [36]. However, the plot of the local exponent versus r shown in figure 12b signals the existence of important problems. Indeed, this plot seems to indicate a sublinear behaviour, far from the linear prediction, but also with very large fluctuations (the different hyperbolas appear because of cities with the same number of stations, such as 1 or 2 stations, etc.; as a test we added noise to the data in order to destroy this effect and observed that the tomography plot is robust). This inconsistency suggests the existence of a problem in this dataset. The presence of large fluctuations could be a reason for the discrepancy observed between the linear behaviour and β_loc(r), but it could also signal another scaling form. In particular, the data are not inconsistent with a fit of the form a + bP where a < 0 (figure 12a) implying a threshold effect: for P < P_c ≈ 30 084 we have no stations while for P ≫ P_c we observe a linear behaviour. In the power-law scaling assumption, we can compute the effective exponent and find β_eff = 0.12 ± 0.17 with $f (1 / 2) = 85\%$ , but given the high level of noise and the high likelihood of another scaling form, we do not assign a high confidence to this result.

Figure 12. — (a) Number of rail stations versus population in UK cities [36]. We show here the linear fit aP with a ≈ 4.67 × 10⁻⁵ (r² = 0.98), the fit a + bP where a ≈ −1.42 and b ≈ 4.72 × 10⁻⁵ (r² = 0.98), and the power-law fit $a P^{β}$ where a ≈ 0.01 and β ≈ 0.50 (r² = 0.76). (b) Tomography plot for the number of rail stations. (Online version in colour.)
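The threshold population implied by a fit of the form a + bP with a < 0 is simply the point where the fitted line crosses zero, P_c = −a/b. The short sketch below, with a hypothetical helper `threshold_population`, applies this to the rounded coefficients quoted for the UK rail stations.

```python
def threshold_population(a, b):
    """Population P_c at which the affine fit Y = a + b*P crosses zero.

    A threshold at positive population requires a < 0 and b > 0.
    """
    if not (a < 0 < b):
        raise ValueError("a threshold requires a < 0 and b > 0")
    return -a / b

# Rounded coefficients quoted for the number of rail stations in UK cities.
p_c_rail = threshold_population(-1.42, 4.72e-5)
```

With these rounded coefficients the result is close to 3.0 × 10⁴, in line with the value P_c ≈ 30 084 quoted in the text.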

The situation for the number of AIDS cases in Brazil (for the year 2010) is similar to the previous case. The plot of this number versus population is shown in figure 13a. The power-law fit gives an exponent $\hat{β} = 0.74$ consistent with the sublinear conclusion of [36], but given the large fluctuations a fit of the form a + bP is also consistent with the data. This fit predicts a threshold effect with P_c ≈ 10 090 and a linear behaviour for P ≫ P_c, similar to the previous case of UK rail stations. The tomography plot (figure 13b) shows that the scaling behaviour is not clear around r ∼ 10³, for which the local exponent is close to 1, but for other values of r we observe a sublinear exponent. The effective exponent is β_eff = 1.03 with a fraction $f (1 / 2) = 67\%$ . It thus seems here that the sublinear conclusion of [36] could actually be challenged by a threshold function and/or a linear behaviour.

Figure 13. — (a) Number of AIDS cases versus city population in Brazil (for 2010). We show both the power-law fit with exponent $\hat{β} = 0.74$ (r² = 0.81) and the fit of the form $a + b P^{\hat{β}}$ with $\hat{β} = 0.99$, a = −1.009 and b = 0.00010 (r² = 0.93). (b) Corresponding tomography plot. (Online version in colour.)

4.2.2. Patents from cities in the Organization for Economic Co-operation and Development: not a simple scaling function?

In the case of patents in OECD cities, Leitao et al. [36] found a linear behaviour. The plot of this number versus the population of cities is shown in figure 14a. We observe that there are large fluctuations and that both the linear and the nonlinear fit are consistent with the data (this is obviously due to the noise and the small number of available decades over which we can fit the data). The r² value for the linear fit is better and in agreement with the results of [36], suggesting that the data follow a linear behaviour. However, if we plot the local exponent (figure 14b), it seems that the superlinear behaviour with $\hat{β} = 1.28$ has a possible relevance to the data. The effective exponent β_eff = 1.43 is consistent with this superlinear behaviour, but the fraction of cities with a correct prediction is small, at about 37% (figure 14c). At this stage, our analysis suggests that the behaviour of OECD patents is neither linear nor superlinear, and probably not well represented by a simple scaling form. This might be due to the fact that we mix here different countries, with different economies, which prevents a description in terms of a simple scaling function characterized by a single exponent.

Figure 14. — (a) Number of patents versus population for cities in OECD countries [36]. We show here the linear fit aP with a ≈ 1.9 × 10⁻⁴ (r² = 0.80), and the power-law fit $a P^{β}$ where a ≈ 1.23 × 10⁻⁶ and β ≈ 1.28 (r² = 0.53). (b) Tomography plot for the number of patents in cities belonging to OECD countries. (c) Ratio of the predicted value over the real value. The grey area represents ratios that are in the range [0.5, 2]. (Online version in colour.)
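To illustrate how a linear fit aP and a power-law fit aP^β can both look acceptable on noisy data spanning few decades, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the data are synthetic, the power law is fitted by least squares on logarithms, and r² is computed in the space where each fit is performed, which may differ from how the r² values quoted in the caption were obtained.

```python
import math
import random

def r_squared(observed, predicted):
    """Coefficient of determination 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def fit_proportional(P, Y):
    """Least-squares fit of Y = a * P (no intercept); returns (a, r2)."""
    a = sum(p * y for p, y in zip(P, Y)) / sum(p * p for p in P)
    return a, r_squared(Y, [a * p for p in P])

def fit_power_law(P, Y):
    """Least-squares fit of ln Y = ln a + beta * ln P; returns (a, beta, r2)."""
    x = [math.log(p) for p in P]
    z = [math.log(y) for y in Y]
    mx, mz = sum(x) / len(x), sum(z) / len(z)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxz = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
    beta = sxz / sxx
    ln_a = mz - beta * mx
    return math.exp(ln_a), beta, r_squared(z, [ln_a + beta * xi for xi in x])

# Synthetic data (hypothetical): linear scaling with strong multiplicative
# noise over about 1.5 decades of population.
rng = random.Random(0)
P = [10 ** rng.uniform(5.0, 6.5) for _ in range(200)]
Y = [2e-4 * p * math.exp(rng.gauss(0.0, 0.8)) for p in P]

a_lin, r2_lin = fit_proportional(P, Y)
a_pow, beta_hat, r2_pow = fit_power_law(P, Y)
print(f"linear fit:    a = {a_lin:.3g}, r2 = {r2_lin:.2f}")
print(f"power-law fit: a = {a_pow:.3g}, beta = {beta_hat:.2f}, r2 = {r2_pow:.2f}")
```

The two r² values are measured in different spaces (linear and logarithmic) and are therefore not directly comparable, which is one more reason not to rely on them alone when choosing between the two forms.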

4.2.2.1. A note on scaling for patents

The number of patents is an important indicator for the productivity and innovation in cities and its study is therefore of great importance for understanding cities and the critical factors for innovation [24]. We saw in previous sections that our analysis for patents in the UK shows a linear/slightly sublinear behaviour and for OECD countries that the scaling form could be more complex than a single power-law form.

The case of US patents was not considered in [36] but was studied in particular in [21]. The power-law fit for the 2005 US data (see [21] for a detailed description of the dataset) gives $\hat{β} = 1.35$ (r² = 0.85). However, a linear fit of the form a + bP with a = −19.4 and b = 0.0018 (r² = 0.85) is also consistent with the data (figure 15a). This last fit points to the possible existence of a threshold effect with a value P_c ≃ 10 500, an effect that might have some economic explanation. The tomography plot for this case is shown in figure 15b and seems to confirm the superlinear behaviour with a convergence of β_loc towards $\hat{β}$, in agreement with results discussed in [24]. The effective exponent is β_eff = 1.19 ± 0.24 and f(1/2) = 55%, confirming this superlinearity.

Figure 15. — (a) Number of patents versus population for US cities (year 2005) [21]. We show here both the power-law fit ($\hat{β} = 1.35$, r² = 0.85) and the linear fit of the form a + bP (a = −19.4 and b = 0.0018, r² = 0.85). (b) Tomography plot for this quantity. (Online version in colour.)

The UK, OECD countries and the USA therefore display very different behaviour for the scaling of the number of patents and we summarize the results in table 1.

Table 1.

Results for patents for different regions: UK, USA and OECD countries. For each case, we give the fitting exponent $\hat{β}$ (and the corresponding r²-value), the effective exponent β_eff together with the value f(1/2), and the conclusions of our analysis.

region	fit $\hat{β}$ (r²)	β_eff (f(1/2))	conclusion
UK	1.06 (0.88)	0.96 (62%)	linear/slightly sublinear
USA	1.35 (0.85)	1.19 (55%)	superlinear
OECD	1.285 (0.53)	1.43 (37%)	superlinear/other scaling form?

A possible reason for these different behaviours is that the level of aggregation is not the same for these three cases: the OECD is a collection of very different countries, the USA is composed of states with various levels of activity, and the UK is a much smaller set of countries. A study focused on the scaling of this quantity across different countries, at various levels of aggregation, might reveal more information and is certainly an interesting direction for future research.

4.3. Inconclusive cases

For some of the datasets studied in [36], standard tools could not lead to a clear conclusion about whether the scaling is linear or not. There are mainly two reasons for this. The first one is that β can be larger or smaller than one depending on the assumption used for describing the fluctuations. The second reason is that for some cases the best model for fluctuations only marginally improves the statistics compared to the linear fit. The datasets of [36] in question are the following: for Europe, the cinema capacity and usage (second reason) and the number of theatres (first reason); for Brazil, the number of deaths by external causes (second reason). These cases therefore represent interesting playgrounds for testing other methods. We will use the tools developed in this paper and show that our method can bring some new conclusions or a new perspective, such as the existence of a threshold.

4.3.1. Cinema capacity and usage (Europe)

We start with cinema capacity (total number of seats) in European cities. The naive fit gives a linear behaviour with $\hat{β} = 0.99$ (r² = 0.71). The tomography plot confirms this: there is a convergence of β_loc to 1 (figure 16a). The calculation of the effective exponent gives β_eff = 0.98 and for this value the fraction of cities with a prediction in [0.5, 2.0] is f(1/2) = 74%. All these results point in favour of a linear behaviour. Even if the statistical evidence found in [36] for this behaviour seemed to be insufficient, we have here an objective 74% of cities whose cinema capacity is correctly predicted using an exponent equal to 0.98.

Figure 16. — Cinema capacity in European cities. (a) Tomography plot and (b) ratio Y_predicted/Y_data versus the population. The grey area represents the fraction of cities with ratio in [0.5, 2.0] and which is about 74% here. (Online version in colour.)
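The fraction f(1/2) used above can be computed in a few lines. The sketch below assumes that the prediction for each city is extrapolated from a reference (benchmark) city through Y_pred = Y_ref (P/P_ref)^β; the helper names and the toy numbers are hypothetical and are not the cinema data of [36].

```python
def predict_from_reference(p_ref, y_ref, populations, beta):
    """Assumed prediction rule: Y_pred = Y_ref * (P / P_ref) ** beta."""
    return [y_ref * (p / p_ref) ** beta for p in populations]

def fraction_within(predicted, observed, eps=0.5):
    """Fraction of cities whose ratio predicted/observed lies in [eps, 1/eps]."""
    ratios = [yp / yo for yp, yo in zip(predicted, observed) if yo > 0]
    inside = sum(1 for q in ratios if eps <= q <= 1.0 / eps)
    return inside / len(ratios)

# Toy data (hypothetical): five cities with their cinema capacity.
pops = [50_000, 100_000, 200_000, 400_000, 800_000]
seats = [900, 2_100, 3_800, 21_000, 15_500]

# Use the second city as reference; it trivially has ratio 1 and is counted.
pred = predict_from_reference(pops[1], seats[1], pops, beta=0.98)
print(f"f(1/2) = {fraction_within(pred, seats):.0%}")
```

Evaluating `fraction_within` for several values of `eps` gives the curve f as a function of ɛ, which shows how quickly the predictions concentrate around the data.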

In the case of cinema usage, computed as the attendance in cinemas in the year 2011, the power-law fit gives the exponent $\hat{β} = 1.46$ (r² = 0.64), indicating a strongly nonlinear behaviour. The tomography plot is shown in figure 17a, and shows that for most pairs of cities the local exponent is larger than one, except for very large ratios $r ≳ 60$. This suggests that there is a tendency towards a nonlinear behaviour, in agreement with the value of the effective exponent that we find, β_eff = 1.17. For this value, however, the ratio Y_predicted/Y_data (shown in figure 17b) indicates large fluctuations with only about 50% of cities with a ratio Y_predicted/Y_data in the range [0.5, 2.0]. The other 50% display a ratio either in [0.1, 0.5] or much larger, up to 10² (a picture that is confirmed by the slow increase of the function f(1/ɛ, ɛ) with ɛ, not shown). In this respect, with such large fluctuations it is indeed a bit hard to conclude, although the superlinear behaviour with β_eff = 1.17 accounts for half of the cities.

Figure 17. — Cinema usage in European cities for the year 2011. (a) Tomography plot. (b) Ratio Y_predicted/Y_data versus the population. The grey area represents the fraction of cities with ratio in [0.5, 2.0] and which is here about 50%. (Online version in colour.)

4.3.2. Theatres in Europe

This dataset contains the number of theatres in European cities (for the year 2011). This case was classified as inconclusive in [36] as the exponent value for β could be either larger or smaller than one depending on the assumptions about the fluctuations. Despite large fluctuations, we can try a power-law fit and the corresponding exponent is $\hat{β} = 0.91$ (r² = 0.74) (figure 18a). The tomography plot (figure 18b) confirms indeed that this is a difficult case: for $r ≲ 40$, the local exponent is around 1 while for larger values we observe local exponents smaller than 1 and even smaller than $\hat{β}$. There is therefore no clear convergence towards the fitting value and this might explain why this case, despite a relatively clear sublinearity, was considered as inconclusive in [36]. This forces us to reconsider the validity of the power-law fit, knowing that we have essentially one decade of variations, which is far from being enough for a good fit. We note here that a fit of the form a + bP (or obviously a more complex one of the form a + bP^β) with a = −0.51 and b = 2.5 × 10⁻⁵ (r² = 0.68) is also consistent with the data. This last fit implies a threshold value P_c ≈ 20 400 above which the number of theatres is non-zero. We note that a threshold effect is here somewhat expected: indeed the appearance of theatres only occurs in large cities. If we nevertheless compute the effective exponent we obtain β_eff = 0.95 and the corresponding fraction is f(1/2) = 71%, suggesting here a slightly sublinear behaviour. This effective exponent together with the tomography plot therefore suggests a slightly sublinear behaviour, but we cannot exclude the possibility of a threshold effect (the two are not mutually exclusive).

Figure 18. — (a) Number of theatres in European cities versus their population. The red line is the power-law fit with exponent $\hat{β} = 0.91$ (r² = 0.74). The green line is the fit of the form a + bP with a = −0.51 and b = 2.5 × 10⁻⁵ (r² = 0.68). This fit implies a threshold value P_c ≈ 20 400. (b) Tomography plot: local exponent versus population ratio r for the number of theatres in European cities. (Online version in colour.)

4.3.3. Brazil: death by external causes

This database is provided by Brazil’s Health Ministry for the year 2010 and gives the number of deaths by external causes. In this case too, the authors of [36] found that there was not enough statistical evidence to conclude. We show this number versus the population in figure 19a for various fits. The power-law fit is not too bad and predicts a linear behaviour $\hat{β} = 1.03$. The forms a + bP and $a + b P^{β}$ however do not produce consistent results about the existence of a threshold effect: for the linear fit, there is no threshold while for the second fit (of the form a + bP^β′) there would be a small threshold value P_c = 6 500 (and β′ = 0.90). It is therefore hard to conclude at this stage but the tomography plot shown in figure 19b is rather clear and points to a linear behaviour: the local exponent converges quickly towards 1 and its average is equal to one for all values of r (within error bars). The effective exponent computed for this case is β_eff = 0.99 and the fraction of cities whose number of deaths is correctly predicted with this value is about 82%. Despite the difficulties with fitting the original data we have here an interesting case where the local exponent analysis is clear and all evidence points to a linear scaling.

Figure 19. — Death by external causes in Brazil (year 2010). (a) Number of deaths versus the population. We show here various fits including the power-law fit $a P^{β}$ with β = 1.03 (r² = 0.91), and fits of the form a + bP (with a = 2.31, b = 0.00068, r² = 0.97) or $a + b P^{β}$ (with a = −9.40, b = 0.0035, β = 0.90, r² = 0.98). (b) Tomography plot showing a convergence towards an exponent equal to 1. (Online version in colour.)
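The threshold values quoted in this section follow from setting the fitted form to zero: a + bP_c^β = 0 gives P_c = (−a/b)^{1/β}, which reduces to P_c = −a/b for the form a + bP. A minimal sketch, using the fit parameters given in the text and captions (the function name `threshold_population` is ours):

```python
def threshold_population(a, b, beta=1.0):
    """Population P_c at which a + b * P**beta crosses zero.

    Returns None when a >= 0, i.e. when the fit implies no threshold.
    """
    if b <= 0:
        raise ValueError("expected a positive coefficient b")
    if a >= 0:
        return None
    return (-a / b) ** (1.0 / beta)

# Fit parameters as quoted in the text and figure captions.
fits = {
    "UK rail stations (a + bP)": (-1.42, 4.72e-5, 1.0),
    "European theatres (a + bP)": (-0.51, 2.5e-5, 1.0),
    "Brazil external deaths (a + bP)": (2.31, 0.00068, 1.0),
    "Brazil external deaths (a + bP^beta)": (-9.40, 0.0035, 0.90),
}

for name, (a, b, beta) in fits.items():
    pc = threshold_population(a, b, beta)
    print(name, "->", "no threshold" if pc is None else f"P_c ~ {pc:,.0f}")
```

The rounded fit parameters reproduce the quoted thresholds up to rounding: about 30 085 for rail stations, 20 400 for theatres, and roughly 6 460 (quoted as 6 500) for the second Brazilian fit, while the positive intercept of the linear Brazilian fit gives no threshold.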

5. Discussion

We summarize all our results in table 2 where we compare them to the conclusions of [36]. We proposed here simple tools for analysing data that could help to understand their scaling behaviour. Although these tools do not replace the standard statistical analysis, they enable a more practical view of the system’s behaviour: if we had to use the scaling form for making predictions, what would be the most reliable exponent? One advantage of this approach is that the answer to this question does not depend on assumptions such as the nature of the noise. In cases where noise is small and the number of available decades is large, our analysis simply confirms standard tools such as fitting methods. It is in more complex cases, where it is difficult to decide which model best describes the data, that our method could be of some help. The analysis of the local exponent gives a precise picture of how different systems of different sizes are related to each other. In some cases, it enables more confident conclusions about the nonlinear or linear behaviour, but in other cases it also signals the failure of a simple scaling. This failure could happen due to a threshold effect, for example, but more generally we could expect that the system is described by a more complex function with more than one exponent. It would be interesting to apply this method at various levels of aggregation for a given quantity, but also to test the temporal evolution of a system as it might reveal some information about its dynamics.
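The question of the most reliable exponent for prediction can be made concrete with a short sketch. It is not the exact procedure of §3.2: we assume here, purely for illustration, that each city in turn serves as reference, that the prediction is Y_pred = Y_ref (P/P_ref)^β, and that the error is the sum of squared logarithmic deviations, for which the minimizing β has a closed form. The mean and dispersion over reference cities then play the role of an effective exponent and its uncertainty.

```python
import math
import statistics

def best_exponent_for_reference(i, populations, values):
    """Exponent minimizing sum_j [ln(Y_i (P_j/P_i)**beta) - ln Y_j]**2.

    Assumes all populations are distinct and all values are positive.
    """
    num = den = 0.0
    for j, (p, y) in enumerate(zip(populations, values)):
        if j == i:
            continue
        x = math.log(p / populations[i])
        z = math.log(y / values[i])
        num += x * z
        den += x * x
    return num / den

def effective_exponent(populations, values):
    """Mean and standard deviation of the best exponent over reference cities."""
    betas = [best_exponent_for_reference(i, populations, values)
             for i in range(len(populations))]
    return statistics.mean(betas), statistics.stdev(betas)

# Toy data (hypothetical): roughly sublinear quantity with some scatter.
pops = [30_000, 75_000, 150_000, 420_000, 1_100_000]
vals = [12.0, 31.0, 40.0, 130.0, 260.0]

beta_eff, spread = effective_exponent(pops, vals)
print(f"beta_eff = {beta_eff:.2f} +/- {spread:.2f}")
```

Because each reference city yields its own optimal exponent, the spread of these values gives a direct, noise-model-free indication of how well a single exponent summarizes the data.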

Table 2.

Exponent $\hat{β}$ obtained by fitting the data with the corresponding r²-value and if needed the value of a possible threshold value P_c; effective exponent obtained by minimizing the error in predicting quantities for cities; fraction f(1/2) of cities for which the ratio predicted value/actual value is in the range [0.5, 2]; conclusions of the statistical analysis of [36] and our conclusions.

data	fit $\hat{β}$ (r²)	β_eff	f(1/2)	conclusions of [36]	our conclusions
UK
income	1.01 (0.99)	0.966 ± 0.035	100%	linear	linear/slightly sublinear
railroads	0.50 (0.76), P_c = 30 084	0.12 ± 0.17	85%	linear	threshold effect
patents	1.06 (0.88)	0.96 ± 0.13	62%	linear	slightly sublinear
USA
GDP	1.11 (0.98)	1.13 ± 0.07	98%	superlinear	superlinear
roads	0.85 (0.91)	0.80 ± 0.1	91%	sublinear	sublinear
Europe
cinema (capacity)	0.99 (0.71)	0.98 ± 0.29	74%	inconclusive	linear
cinema (usage)	1.46 (0.64)	1.17 ± 0.55	50%	inconclusive	superlinear (large fluctuations)
museum (usage)	1.42 (0.69)	1.64 ± 0.47	45%	superlinear	superlinear (large fluctuations)
theatres	0.91 (0.74), P_c = 20 400	0.95 ± 0.45	71%	inconclusive	slightly sublinear/threshold effect
libraries	0.80 (0.59)	0.17 ± 0.32	55%	sublinear	fluctuations too large
OECD
GDP	1.12 (0.91)	1.06 ± 0.16	90%	superlinear	superlinear
patents	1.285 (0.53)	1.43 ± 0.71	37%	linear	superlinear/no simple form
Brazil
GDP	1.04 (0.86)	1.21 ± 0.11	63%	superlinear	superlinear
AIDS	0.74 (0.81), P_c = 10 080	1.03 ± 0.13	67%	sublinear	linear/threshold effect
external	1.03 (0.91)	0.99 ± 0.08	82%	inconclusive	linear

We could summarize this analysis by proposing the following set of necessary conditions in order to trust the fitting value $\hat{β}$:

(i) We need the convergence of β_loc towards $\hat{β}$.
(ii) The value of the effective exponent β_eff should be consistent with $\hat{β}$. In general, the value β_eff should be preferred over $\hat{β}$, in particular if the value f(1/2) is large (see (iii)).
(iii) The value of f(1/2) should be at least 50%. This value could of course be debated but at least we should observe a rapid increase of f(ɛ, 1/ɛ) with decreasing ɛ.

If these conditions are not satisfied, we can safely reject the value $\hat{β}$ obtained by the power-law fit. In this case, it suggests, for example, that either fluctuations are too large or that the simple power-law scaling form is not valid.
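Conditions (ii) and (iii) lend themselves to a simple automated check; condition (i) requires inspecting the tomography plot itself and is left out. In the sketch below, reading "consistent" as "the fitted exponent lies within the quoted error bar of β_eff" is our assumption, as is the function name; the example rows are copied from table 2.

```python
def check_fitted_exponent(beta_hat, beta_eff, beta_eff_err, f_half,
                          min_fraction=0.5):
    """Evaluate conditions (ii) and (iii) for trusting a fitted exponent.

    (ii)  assumed reading: |beta_hat - beta_eff| <= beta_eff_err
    (iii) f(1/2) >= min_fraction
    """
    consistent = abs(beta_hat - beta_eff) <= beta_eff_err
    enough = f_half >= min_fraction
    return {"ii": consistent, "iii": enough, "trust": consistent and enough}

# Rows from table 2: (beta_hat, beta_eff, error on beta_eff, f(1/2)).
rows = {
    "UK railroads": (0.50, 0.12, 0.17, 0.85),
    "USA GDP": (1.11, 1.13, 0.07, 0.98),
    "OECD patents": (1.285, 1.43, 0.71, 0.37),
}

for name, row in rows.items():
    print(name, check_fitted_exponent(*row))
```

With this reading, the UK railroads fit fails condition (ii), the OECD patents fit fails condition (iii), and the USA GDP fit passes both, in line with the conclusions listed in table 2.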

The discussion here was based on urban data, but this method could obviously be applied to any system that displays scaling. In addition, we could probably envision other measures here, but we believe that this sort of bootstrapping could help to better understand the scaling in complex systems and to circumvent lengthy and often non-convergent debates about the quality of a fit.

Acknowledgements

I thank Luis Bettencourt, José Lobo, Scott Ortman and Michael Smith for having organized a workshop entitled ‘Integrating Views on Urban Scaling: Foundations, Criticisms, and Extensions’, at the Santa Fe Institute on 15–17 May 2019. During this workshop, I had the chance to discuss with Elsa Arcaute, Luis Bettencourt, Markus Hamilton, José Lobo, Scott Ortman, Celine Rozenblat, Diego Rybski, Michael Smith, Deborah Strumsky, Geoffrey West and David White, and I thank them for stimulating and challenging debates. Finally, I thank Elsa Arcaute, Luis Bettencourt, Diego Rybski and Deborah Strumsky for having shared their data with me.

Data accessibility

All the data used in this paper are available from [36].

Funding

No funding has been received for this article.

References

1. Kleiber M. 1932. Body size and metabolism. Hilgardia 6, 315–353. (doi:10.3733/hilg.v06n11p315)
2. Kleiber M. 1947. Body size and metabolic rate. Physiol. Rev. 27, 511–541. (doi:10.1152/physrev.1947.27.4.511)
3. De Gennes PG. 1979. Scaling concepts in polymer physics. Ithaca, NY: Cornell University Press.
4. Goldenfeld N. 1992. Lectures on phase transitions and the renormalization group. Boca Raton, FL: CRC Press.
5. Barenblatt GI. 2003. Scaling. Cambridge, UK: Cambridge University Press.
6. West GB, Brown JH, Enquist BJ. 1997. A general model for the origin of allometric scaling laws in biology. Science 276, 122–126. (doi:10.1126/science.276.5309.122)
7. Pumain D. 2000. Scaling laws and urban systems. Working paper SFI: 2004-02-002. See https://sfi-edu.s3.amazonaws.com/sfi-edu/production/uploads/sfi-com/dev/uploads/filer/23/4d/234dc29d-faf5-4715-bd7e-a418c8cc9e9a/04-02-002.pdf.
8. Bettencourt LM, Lobo J, Helbing D, Kühnert C, West GB. 2007. Growth, innovation, scaling, and the pace of life in cities. Proc. Natl Acad. Sci. USA 104, 7301–7306. (doi:10.1073/pnas.0610172104)
9. Batty M. 2013. The new science of cities. New York, NY: MIT Press.
10. Barthelemy M. 2016. The structure and dynamics of cities. Cambridge, UK: Cambridge University Press.
11. Barthelemy M. 2019. The statistical physics of cities. Nat. Rev. Phys. 1, 406–415. (doi:10.1038/s42254-019-0054-2)
12. Arcaute E, Hatna E, Ferguson P, Youn H, Johansson A, Batty M. 2015. Constructing cities, deconstructing scaling laws. J. R. Soc. Interface 12, 20140745. (doi:10.1098/rsif.2014.0745)
13. Samaniego H, Moses ME. 2008. Cities as organisms: allometric scaling of urban road networks. J. Transp. Land Use 1, 21–39. (doi:10.5198/jtlu.v1i1.29)
14. Fuller RA, Gaston KJ. 2009. The scaling of green space coverage in European cities. Biol. Lett. 5, 352–355. (doi:10.1098/rsbl.2009.0010)
15. Kühnert C, Helbing D, West GB. 2006. Scaling laws in urban supply networks. Physica A 363, 96–103. (doi:10.1016/j.physa.2006.01.058)
16. Fragkias M, Lobo J, Strumsky D, Seto KC. 2013. Does size matter? Scaling of CO₂ emissions and US urban areas. PLoS ONE 8, e64727. (doi:10.1371/journal.pone.0064727)
17. Rybski D, Sterzel T, Reusser DE, Winz AL, Fichtner C, Kropp JP. 2013. Cities as nuclei of stability? ArXiv (https://arxiv.org/abs/1304.4406).
18. Oliveira EA, Andrade JS, Makse HA. 2014. Large cities are less green. Sci. Rep. 4, 4235. (doi:10.1038/srep04235)
19. Ribeiro HV, Rybski D, Kropp JP. 2019. Effects of changing population or density on urban carbon dioxide emissions. Nat. Commun. 10, 1–9. (doi:10.1038/s41467-019-11184-y)
20. Rybski D, Buldyrev SV, Havlin S, Liljeros F, Makse HA. 2009. Scaling laws of human interaction activity. Proc. Natl Acad. Sci. USA 106, 12 640–12 645. (doi:10.1073/pnas.0902667106)
21. Bettencourt LM, Lobo J, Strumsky D, West GB. 2010. Urban scaling and its deviations: revealing the structure of wealth, innovation and crime across cities. PLoS ONE 5, e13541. (doi:10.1371/journal.pone.0013541)
22. Alves LGA, Ribeiro HV, Lenzi EK, Mendes RS. 2013. Distance to the scaling law: a useful approach for unveiling relationships between crime and urban metrics. PLoS ONE 8, e69580. (doi:10.1371/journal.pone.0069580)
23. Lobo J, Bettencourt LM, Strumsky D, West GB. 2013. Urban scaling and the production function for cities. PLoS ONE 8, e58407. (doi:10.1371/journal.pone.0058407)
24. Bettencourt LM, Lobo J, Strumsky D. 2007. Invention in the city: increasing returns to patenting as a scaling function of metropolitan size. Res. Policy 36, 107–120. (doi:10.1016/j.respol.2006.09.026)
25. Nomaler O, Frenken K, Heimeriks G. 2014. On scaling of scientific knowledge production in U.S. Metropolitan areas. PLoS ONE 9, e110805. (doi:10.1371/journal.pone.0110805)
26. Strano E, Sood V. 2016. Rich and poor cities in Europe. An urban scaling approach to mapping the European economic transition. PLoS ONE 11, e0159465. (doi:10.1371/journal.pone.0159465)
27. Caminha C, Furtado V, Pequeno TH, Ponte C, Melo HP, Oliveira EA, Andrade JS Jr. 2017. Human mobility in large cities as a proxy for crime. PLoS ONE 12, e0171609. (doi:10.1371/journal.pone.0171609)
28. Pumain D, Paulus F, Vacchiani-Marcuzzo C, Lobo J. 2006. An evolutionary theory for interpreting Urban scaling laws. Cybergeo: Eur. J. Geogr. vol. 343. (doi:10.4000/cybergeo.2519)
29. Bettencourt LM. 2013. The origins of scaling in cities. Science 340, 1438–1441. (doi:10.1126/science.1235823)
30. Bettencourt LMA, Lobo J, Youn H. 2013. The hypothesis of urban scaling: formalization, implications and challenges. ArXiv (https://arxiv.org/abs/1301.5919).
31. Louf R, Barthelemy M. 2013. Modeling the polycentric transition of cities. Phys. Rev. Lett. 111, 198702. (doi:10.1103/PhysRevLett.111.198702)
32. Verbavatz V, Barthelemy M. 2019. Critical factors for mitigating car traffic in cities. PLoS ONE 14, e0219559. (doi:10.1371/journal.pone.0219559)
33. Molinero C, Thurner S. 2019. How the geometry of cities explains urban scaling laws and determines their exponents. ArXiv (https://arxiv.org/abs/1908.07470).
34. Louf R, Barthelemy M. 2014. How congestion shapes cities: from mobility patterns to scaling. Sci. Rep. 4, 5561. (doi:10.1038/srep05561)
35. Bureau of Economic Analysis: https://apps.bea.gov/iTable/iTable.cfm?isuri=1&reqid=70&step=1#isuri=1&reqid=70&step=1 (accessed 20 August 2019).
36. Leitao JC, Miotto JM, Gerlach M, Altmann EG. 2016. Is this scaling nonlinear? R. Soc. open sci. 3, 150649. (doi:10.1098/rsos.150649)
37. Shalizi CR. 2011. Scaling and hierarchy in urban economies. ArXiv (https://arxiv.org/abs/1102.4101).
38. Cottineau C, Hatna E, Arcaute E, Batty M. 2017. Diverse cities or the systematic paradox of urban scaling laws. Comput. Environ. Urban Syst. 63, 80–94. (doi:10.1016/j.compenvurbsys.2016.04.006)
39. Louf R, Barthelemy M. 2014. Scaling: lost in the smog. Environ. Plann. B: Plann. Des. 41, 767–769. (doi:10.1068/b4105c)
40. Depersin J, Barthelemy M. 2018. From global scaling to the dynamics of individual cities. Proc. Natl Acad. Sci. USA 115, 2317–2322. (doi:10.1073/pnas.1718690115)
41. Ockham W. 1990. Philosophical writings. Indianapolis, IN: Hackett Publishing.
42. Bettencourt L. 2019. Complex networks and fundamental urban processes. Mansueto Institute for Urban Innovation Research Paper.
43. Coniglio A, Zannetti M. 1989. Multiscaling in growth kinetics. Europhys. Lett. 10, 575–580. (doi:10.1209/0295-5075/10/6/012)

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

All the data used in this paper are available from [36].

[RSIF20190602C1] 1.Kleiber M. 1932. Body size and metabolism. Hilgardia 6, 315–353. ( 10.3733/hilg.v06n11p315) [DOI] [Google Scholar]

[RSIF20190602C2] 2.Kleiber M. 1947. Body size and metabolic rate. Physiol. Rev. 27, 511–541. ( 10.1152/physrev.1947.27.4.511) [DOI] [PubMed] [Google Scholar]

[RSIF20190602C3] 3.De Gennes PG, Gennes PG. 1979. Scaling concepts in polymer physics. Ithaca, NY: Cornell University Press. [Google Scholar]

[RSIF20190602C4] 4.Goldenfeld N. 1992. Lectures on phase transitions and the renormalization group. Boca Raton, FL: CRC Press. [Google Scholar]

[RSIF20190602C5] 5.Barenblatt GI. 2003. Scaling. Cambridge, UK: Cambridge University Press. [Google Scholar]

[RSIF20190602C6] 6.West GB, Brown JH, Enquist BJ. 1997. A general model for the origin of allometric scaling laws in biology. Science 276, 122–126. ( 10.1126/science.276.5309.122) [DOI] [PubMed] [Google Scholar]

[RSIF20190602C7] 7.Pumain D. 2000. Scaling laws and urban systems. Working paper SFI: 2004-02-002. See https://sfi-edu.s3.amazonaws.com/sfi-edu/production/uploads/sfi-com/dev/uploads/filer/23/4d/234dc29d-faf5-4715-bd7e-a418c8cc9e9a/04-02-002.pdf.

[RSIF20190602C8] 8.Bettencourt LM, Lobo J, Helbing D, Khnert C, West GB. 2007. Growth, innovation, scaling, and the pace of life in cities. Proc. Natl Acad. Sci. USA 104, 7301–7306. ( 10.1073/pnas.0610172104) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20190602C9] 9.Batty M. 2013. The new science of cities. New York, NY: MIT Press. [Google Scholar]

[RSIF20190602C10] 10.Barthelemy M. 2016. The structure and dynamics of cities. Cambridge, UK: Cambridge University Press. [Google Scholar]

[RSIF20190602C11] 11.Barthelemy M. 2019. The statistical physics of cities. Nat. Rev. Phys. 1, 406–415. ( 10.1038/s42254-019-0054-2) [DOI] [Google Scholar]

[RSIF20190602C12] 12.Arcaute E, Hatna E, Ferguson P, Youn H, Johansson A, Batty M. 2015. Constructing cities, deconstructing scaling laws. J. R. Soc. Interface 12, 20140745 ( 10.1098/rsif.2014.0745) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20190602C13] 13.Samaniego H, Moses ME. 2008. Cities as organisms: allometric scaling of urban road networks. J. Transp. Land Use 1, 21–39. ( 10.5198/jtlu.v1i1.29) [DOI] [Google Scholar]

[RSIF20190602C14] 14.Fuller RA, Gaston KJ. 2009. The scaling of green space coverage in European cities. Biol. Lett. 5, 352–355. ( 10.1098/rsbl.2009.0010) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20190602C15] 15.Kuhnert C, Helbing D, West GB. 2006. Scaling laws in urban supply networks. Physica A 363, 96–103. ( 10.1016/j.physa.2006.01.058) [DOI] [Google Scholar]

[RSIF20190602C16] 16.Fragkias M, Lobo J, Strumsky D, Seto KC. 2013. Does size matter? Scaling of CO₂ emissions and US urban areas. PLoS ONE 8, e64727 ( 10.1371/journal.pone.0064727) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20190602C17] 17.Rybski D, Sterzel T, Reusser DE, Winz AL, Fichtner C, Kropp JP. 2013. Cities as nuclei of stability? ArXiv (https://arxiv.org/abs/1304.4406).

[RSIF20190602C18] 18.Oliveira EA, Andrade JS, Makse HA. 2014. Large cities are less green. Sci. Rep. 4, 4235 ( 10.1038/srep04235) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20190602C19] 19.Ribeiro HV, Rybski D, Kropp JP. 2019. Effects of changing population or density on urban carbon dioxide emissions. Nat. Commun. 10, 1–9. ( 10.1038/s41467-019-11184-y) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20190602C20] 20.Rybski D, Buldyrev SV, Havlin S, Liljeros F, Makse HA. 2009. Scaling laws of human interaction activity. Proc. Natl Acad. Sci. USA 106, 12 640–12 645. ( 10.1073/pnas.0902667106) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20190602C21] 21.Bettencourt LM, Lobo J, Strumsky D, West GB. 2010. Urban scaling and its deviations: revealing the structure of wealth, innovation and crime across cities. PLoS ONE 5, e13541 ( 10.1371/journal.pone.0013541) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20190602C22] 22.Alves LGA, Ribeiro HV, Lenzi EK, Mendes RS. 2013. Distance to the scaling law: a useful approach for unveiling relationships between crime and urban metrics. PLoS ONE 8, e69580 ( 10.1371/journal.pone.0069580) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20190602C23] 23.Lobo J, Bettencourt LM, Strumsky D, West GB. 2013. Urban scaling and the production function for cities. PLoS ONE 8, e58407 ( 10.1371/journal.pone.0058407) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20190602C24] 24.Bettencourt LM, Lobo J, Strumsky D. 2007. Invention in the city: increasing returns to patenting as a scaling function of metropolitan size. Res. Policy 36, 107–120. ( 10.1016/j.respol.2006.09.026) [DOI] [Google Scholar]

[RSIF20190602C25] 25.Nomaler O, Frenken K, Heimeriks G. 2014. On scaling of scientific knowledge production in U.S. Metropolitan areas. PLoS ONE 9, e110805 ( 10.1371/journal.pone.0110805) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20190602C26] 26.Strano E, Sood V. 2016. Rich and poor cities in Europe. An urban scaling approach to mapping the European economic transition. PLoS ONE 11, e0159465 ( 10.1371/journal.pone.0159465) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20190602C27] 27.Caminha C, Furtado V, Pequeno TH, Ponte C, Melo HP, Oliveira EA, Andrade JS Jr. 2017. Human mobility in large cities as a proxy for crime. PLoS ONE 12, e0171609 ( 10.1371/journal.pone.0171609) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20190602C28] 28.Pumain D, Paulus F, Vacchiani-Marcuzzo C, Lobo J. 2006. An evolutionary theory for interpreting Urban scaling laws. Cybergeo: Eur. J. Geogr. vol. 343. ( 10.4000/cybergeo.2519) [DOI] [Google Scholar]

[RSIF20190602C29] 29.Bettencourt LM. 2013. The origins of scaling in cities. Science 340, 1438–1441. ( 10.1126/science.1235823) [DOI] [PubMed] [Google Scholar]

[RSIF20190602C30] 30.Bettencourt LMA, Lobo J, Youn H. 2013. The hypothesis of urban scaling: formalization, implications and challenges. ArXiv (https://arxiv.org/abs/1301.5919).

[RSIF20190602C31] 31.Louf R, Barthelemy M. 2013. Modeling the polycentric transition of cities. Phys. Rev. Lett. 111, 198702 ( 10.1103/PhysRevLett.111.198702) [DOI] [PubMed] [Google Scholar]

[RSIF20190602C32] 32.Verbavatz V, Barthelemy M. 2019. Critical factors for mitigating car traffic in cities. PLoS ONE 14, e0219559 ( 10.1371/journal.pone.0219559) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20190602C33] 33.Molinero C, Thurner S. 2019. How the geometry of cities explains urban scaling laws and determines their exponents. ArXiv (https://arxiv.org/abs/1908.07470).

[RSIF20190602C34] 34.Louf R, Barthelemy M. 2014. How congestion shapes cities: from mobility patterns to scaling. Sci. Rep. 4, 5561 ( 10.1038/srep05561) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20190602C35] 35.Bureau of Economic Analysis: https://apps.bea.gov/iTable/iTable.cfm?isuri=1&reqid=70&step=1#isuri=1&reqid=70&step=1 (accessed 20 August 2019).

36. Leitao JC, Miotto JM, Gerlach M, Altmann EG. 2016. Is this scaling nonlinear? R. Soc. open sci. 3, 150649. (doi:10.1098/rsos.150649)

37. Shalizi CR. 2011. Scaling and hierarchy in urban economies. ArXiv (https://arxiv.org/abs/1102.4101).

38. Cottineau C, Hatna E, Arcaute E, Batty M. 2017. Diverse cities or the systematic paradox of urban scaling laws. Comput. Environ. Urban Syst. 63, 80–94. (doi:10.1016/j.compenvurbsys.2016.04.006)

39. Louf R, Barthelemy M. 2014. Scaling: lost in the smog. Environ. Plann. B: Plann. Des. 41, 767–769. (doi:10.1068/b4105c)

40. Depersin J, Barthelemy M. 2018. From global scaling to the dynamics of individual cities. Proc. Natl Acad. Sci. USA 115, 2317–2322. (doi:10.1073/pnas.1718690115)

41. Ockham W. 1990. Philosophical writings. Indianapolis, IN: Hackett Publishing.

42. Bettencourt L. 2019. Complex networks and fundamental urban processes. Mansueto Institute for Urban Innovation Research Paper.

43. Coniglio A, Zannetti M. 1989. Multiscaling in growth kinetics. Europhys. Lett. 10, 575–580. (doi:10.1209/0295-5075/10/6/012)
