PLOS One. 2020 Jul 30;15(7):e0236765. doi: 10.1371/journal.pone.0236765

New framework of Getis-Ord’s indexes associating spatial autocorrelation with interaction

Yanguang Chen 1,*
Editor: Bing Xue 2
PMCID: PMC7392341  PMID: 32730303

Abstract

Spatial autocorrelation and spatial interaction are two important analytical processes for geographical analyses. However, the internal relations between the two types of models have not been brought to light. This paper is devoted to integrating spatial autocorrelation analysis and spatial interaction analysis into a logical framework by means of Getis-Ord’s indexes. Based on mathematical derivation and transformation, the spatial autocorrelation measurements of Getis-Ord’s indexes are reconstructed in a new and simple form. A finding is that the local Getis-Ord’s indexes of spatial autocorrelation are equivalent to the rescaled potential energy indexes of spatial interaction theory based on power-law distance decay. The normalized scatterplot is introduced into the spatial analysis based on Getis-Ord’s indexes, and the potential energy indexes are proposed as a complementary measurement. The global Getis-Ord’s index proved to be the weighted sum of the potential energy indexes and the direct sum of total potential energy. The empirical analysis of the system of Chinese cities is taken as an example to illustrate the effect of the improved methods and measurements. The mathematical framework newly derived from Getis-Ord’s work is helpful for further developing the methodology of geographical spatial modeling and quantitative analysis.

1 Introduction

Spatial autocorrelation and spatial interaction models represent two theoretical cornerstones and classic contents of geographical analyses. Spatial autocorrelation is based on the concept of correlation coefficient, and the main measurements include Moran’s index [1], Geary’s coefficient [2], and Getis-Ord’s indexes [3, 4]. Spatial interaction is based on the gravity concept, and the chief models and methods include the gravity model [5–7], potential energy formulae [8, 9], and the entropy-maximizing model family [10–12]. However, the mathematical links between spatial autocorrelation and spatial interaction have not been revealed at present. In fact, there are significant similarities and differences between the two methods. The similarities between spatial autocorrelation and interaction are as follows. First, both of them are based on size measurements and distance decay effect. Second, both of them can be used to describe strength patterns of spatial association between different geographical elements. The principal difference between the two methods rests with the correlation properties. Spatial autocorrelation is focused on the intra-correlation or self-correlation of a group of elements, while spatial interaction is focused on the inter-correlation or cross-correlation between many different elements, especially two elements. If we examine the same elements in a geographical system by using the same size and distance measurements, auto-correlation and cross-correlation are often woven into one another. Thus, spatial autocorrelation analysis may be combined with spatial interaction modeling. If so, we can find a new way of spatial analysis for characterizing geographical patterns and processes.

In a sense, spatial autocorrelation analyses are more widely made than spatial interaction analyses in scientific studies. The former is a theory of spatial statistics, while the latter is a geographical theoretical model. The methods of spatial autocorrelation have been developing [4, 13–23]. The statistics of spatial autocorrelation such as Moran’s I and Ripley’s K have been applied to spatial association processes in various fields, for example, man-land relationships [24], human diseases [25–29], animal disease transmission [30], human fertility and mortality [31, 32], human genome [33], spatial pattern of urbanization [34], ecological patterns [35–37], maritime anomaly detection [38], and spatial sampling and data analysis [39–43]. In contrast, spatial interaction analysis is mainly confined to geographical research. A discovery will be made in this work that the Getis-Ord’s indexes can be used to connect spatial autocorrelation and spatial interaction based on the power-law decay. If we can express the inherent correlation between them by mathematical equations, we will be able to advance the methodology of spatial analysis. This paper is devoted to reconstructing the mathematical expressions of Getis-Ord’s indexes and thus integrating the spatial interaction into spatial autocorrelation analysis using Getis-Ord’s indexes. Solving this problem results in a series of improvements to the models and measurements based on the Getis-Ord’s indexes. The remaining parts are organized as follows. First, a new mathematical framework of spatial autocorrelation based on Getis-Ord’s indexes is proposed, and a scatterplot is introduced into the new framework to visualize the analytical process. Then, the local Getis-Ord’s indexes based on the power-law distance decay are proved to be the rescaled potential energy indexes, and the global Getis-Ord’s index is proved to be the weighted sum of the local indexes. Finally, the system of the main Chinese cities is taken as an example to illustrate how to use the new analytical framework of spatial autocorrelation process.

2 Theoretical results

2.1 Reconstructing formulae of Getis-Ord’s indexes

In spatial autocorrelation analysis, Getis-Ord’s indexes are an important complement to Moran’s indexes and Geary’s coefficients. Using Getis-Ord’s indexes, we can reveal the inherent relationship between spatial autocorrelation and spatial interaction. First, the mathematical expression of Getis-Ord’s indexes should be reconstructed in a new form. Then, we can reveal the mathematical relationships between Getis-Ord’s indexes and potential indexes. Suppose that there are n geographical elements (e.g., cities) in a regional system (e.g., a network of cities) which can be measured by a size variable x (e.g., city population). A vector of the element sizes is as follows

$$\mathbf{x} = [x_1 \;\; x_2 \;\; \cdots \;\; x_n]^{\mathrm{T}}, \tag{1}$$

where xi is the size measurement of the ith element (i = 1,2,…,n). The sum of xi is as below:

$$S = \sum_{i=1}^{n} x_i. \tag{2}$$

The unitized vector of x can be given by y = x/S = [y1, y2, …, yn]T, in which the ith entry is

$$y_i = \frac{x_i}{S} = x_i \Big/ \sum_{i=1}^{n} x_i = \frac{x_i}{n\bar{x}}, \tag{3}$$

in which $\bar{x}$ denotes the mean of xi. The unitization depends on the mean of the size variable, and the average value represents the characteristic length of a sample. The concept of unitization based on the sum is often confused with the notion of normalization based on the range in the literature. The variable y meets the condition of unitization, namely

$$\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} \Big( x_i \Big/ \sum_{i=1}^{n} x_i \Big) = \frac{1}{S} \sum_{i=1}^{n} x_i = 1. \tag{4}$$

Thus, Getis-Ord’s index G can be re-expressed in a simple way by means of the unitized variable. Based on a spatial contiguity matrix (SCM), we can construct a spatial weight matrix (SWM). Suppose that there is an n-by-n unitized spatial weights matrix (USWM) such as

$$\mathbf{W} = [w_{ij}]_{n \times n}, \tag{5}$$

where i, j = 1,2,…,n. The three properties of the matrix are as follows: (1) Symmetry, i.e., wij = wji; (2) Zero diagonal elements, namely, |wii| = 0; (3) Unitization condition, that is

$$\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} = 1. \tag{6}$$

Thus the global Getis-Ord’s index G can be expressed in a quasi-quadratic form as follows

$$G = \mathbf{y}^{\mathrm{T}} \mathbf{W} \mathbf{y}, \tag{7}$$

which is simpler and more convenient than the conventional expression of Getis-Ord’s index. In fact, G is not really a quadratic form because W is not a positive definite matrix. Expanding Eq (7) yields the original formula of Getis-Ord’s index [3, 4]

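$$G = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} y_i y_j = \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} x_i x_j}{\sum_{i=1}^{n} \sum_{j=1}^{n} x_i x_j}, \tag{8}$$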

where wij denotes the elements of a spatial weight matrix, W [16, 44]. Eq (8) is the common mathematical expression of the global Getis-Ord’s index. The local Getis’s G can be re-written as

$$\mathbf{G} = \mathbf{W} \mathbf{y}, \tag{9}$$

where G = [G1, G2,…, Gn]T. Accordingly, the expanded form is

$$G_i = \sum_{j=1}^{n} w_{ij} \Big( x_j \Big/ \sum_{j=1}^{n} x_j \Big) = \sum_{j=1}^{n} w_{ij} y_j, \tag{10}$$

which represents an important measurement of local spatial autocorrelation.
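
As a minimal sketch of Eqs (3), (7), (8) and (10), the following pure-Python fragment computes the unitized sizes, the global index in both forms, and the local indexes. The sizes `x`, the distances `r`, and the use of reciprocal distances as contiguity values are hypothetical illustration choices, not data from this study.

```python
# Hypothetical toy data: three places with sizes x and pairwise distances r.
x = [2.0, 3.0, 5.0]
r = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 4.0],
     [2.0, 4.0, 0.0]]
n = len(x)

# Eq (3): unitize the size vector so that its entries sum to 1.
S = sum(x)
y = [xi / S for xi in x]

# Assumed contiguity v_ij = 1 / r_ij with a zero diagonal, then unitized so
# that all weights sum to 1 (Eq 6).
v = [[0.0 if i == j else 1.0 / r[i][j] for j in range(n)] for i in range(n)]
v0 = sum(sum(row) for row in v)
w = [[v[i][j] / v0 for j in range(n)] for i in range(n)]

# Eq (7): global index as the quasi-quadratic form y^T W y.
G = sum(w[i][j] * y[i] * y[j] for i in range(n) for j in range(n))

# Eq (8): the same index computed from the raw sizes.
G_raw = (sum(w[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
         / sum(x[i] * x[j] for i in range(n) for j in range(n)))

# Eq (10): local indexes, i.e. the entries of the vector W y.
G_local = [sum(w[i][j] * y[j] for j in range(n)) for i in range(n)]
```

The two global values `G` and `G_raw` agree up to rounding, which is the content of Eq (8).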

Now, we can investigate the association of spatial autocorrelation with spatial interaction. If we use the reciprocals of distances between geographical elements (locations) to construct a spatial contiguity matrix, Eq (10) proves to be equivalent to the formula of potential energy. Proposed by Stewart [9, 45, 46], potential energy is a useful measurement in urban geography [47]. In fact, the local Getis’s G reflects a kind of normalized potential energy, and this will be demonstrated next. A normalized potential energy can be defined as follows

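$$E_i = \Big( x_i \Big/ \sum_{i=1}^{n} x_i \Big) \sum_{j=1}^{n} \Big( w_{ij} x_j \Big/ \sum_{j=1}^{n} x_j \Big) = y_i \sum_{j=1}^{n} w_{ij} y_j, \tag{11}$$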

which bears an analogy with local Moran’s index in form. It can be termed the Local Indicators of Spatial Interaction (LISI), which bears an analogy with the local indicators of spatial association (LISA) [48, 49]. The G value is a relative measurement, while the E value is an absolute measurement for spatial association. It can be proved that

$$G = \sum_{i=1}^{n} E_i = \sum_{i=1}^{n} y_i \sum_{j=1}^{n} w_{ij} y_j = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} y_i y_j, \tag{12}$$

which indicates that the global Getis-Ord’s index G equals the sum of the total potential energy Ei.
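
A short numerical check of Eqs (11) and (12) may help here. The sketch below uses hypothetical sizes and distances and assumes reciprocal-distance contiguity; it only illustrates that the normalized potential energies add up to the global index.

```python
# Hypothetical toy data (not from the paper).
x = [2.0, 3.0, 5.0]
r = [[0.0, 1.0, 2.0], [1.0, 0.0, 4.0], [2.0, 4.0, 0.0]]
n = len(x)

# Unitized sizes and unitized reciprocal-distance weights (assumed weight function).
y = [xi / sum(x) for xi in x]
v = [[0.0 if i == j else 1.0 / r[i][j] for j in range(n)] for i in range(n)]
v0 = sum(map(sum, v))
w = [[vij / v0 for vij in row] for row in v]

# Local indexes G_i (Eq 10) and normalized potential energies E_i = y_i * G_i (Eq 11).
G_local = [sum(w[i][j] * y[j] for j in range(n)) for i in range(n)]
E = [y[i] * G_local[i] for i in range(n)]

# Eq (12): the E_i sum to the global index G = y^T W y.
G = sum(w[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
```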

Scientific description based on mathematical theory utilizes characteristic scales, which can be represented by eigenvalues in linear algebra. The theoretical eigen equation of Getis’s index can be derived from the abovementioned definitions. Left-multiplying both sides of Eq (7) by the vector y yields

$$\mathbf{M}^{*} \mathbf{y} = \mathbf{y} \mathbf{y}^{\mathrm{T}} \mathbf{W} \mathbf{y} = G \mathbf{y}, \tag{13}$$

where

$$\mathbf{M}^{*} = \mathbf{y} \mathbf{y}^{\mathrm{T}} \mathbf{W} \tag{14}$$

can be termed the Ideal Spatial Correlation Matrix (ISCM) in a theoretical sense. ISCM is the outer product correlation matrix (OPCM). In Eq (13), y is the eigenvector (characteristic vector) of M* and Getis-Ord’s index G is just the corresponding maximum eigenvalue (characteristic root). Expanding Eq (13) yields

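$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix} \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1n} \\ w_{21} & w_{22} & \cdots & w_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nn} \end{bmatrix} = \begin{bmatrix} y_1 \sum_{j=1}^{n} w_{1j} y_j & y_1 \sum_{j=1}^{n} w_{2j} y_j & \cdots & y_1 \sum_{j=1}^{n} w_{nj} y_j \\ y_2 \sum_{j=1}^{n} w_{1j} y_j & y_2 \sum_{j=1}^{n} w_{2j} y_j & \cdots & y_2 \sum_{j=1}^{n} w_{nj} y_j \\ \vdots & \vdots & \ddots & \vdots \\ y_n \sum_{j=1}^{n} w_{1j} y_j & y_n \sum_{j=1}^{n} w_{2j} y_j & \cdots & y_n \sum_{j=1}^{n} w_{nj} y_j \end{bmatrix}, \tag{15}$$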

which is important for the autocorrelation analysis based on Getis-Ord’s indexes. Comparing Eq (15) with Eq (11) shows that the elements in the diagonal of M* give the normalized potential energy of a geographical system. The trace of M* is equal to the global Getis-Ord’s index, G. The sum of each column of M* yields the local Getis’ G, that is

$$E_k = \sum_{i=1}^{n} y_i \sum_{j=1}^{n} w_{kj} y_j, \tag{16}$$

where i, j, k = 1,2,…,n. Please note that Eq (16) is different from Eq (12). The sum of each row of M* gives the product of yi and the sum of Gi, namely,

$$y_i \sum_{k=1}^{n} \sum_{j=1}^{n} w_{kj} y_j = y_i \sum_{i=1}^{n} G_i, \tag{17}$$

which implies

$$\sum_{i=1}^{n} G_i = \sum_{k=1}^{n} \sum_{j=1}^{n} w_{kj} y_j = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} y_j, \tag{18}$$

where i, j, k = 1,2,…,n. Eqs (16), (17) and (18) can be verified by a simple example. This suggests that we can calculate the normalized potential energy, potential energy indexes, global Getis-Ord’s index, and local Getis-Ord’s indexes by means of the matrix M*.
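
The "simple example" mentioned above can be sketched as follows. The fragment builds M* entry by entry for hypothetical data (sizes, distances, and reciprocal-distance weights are all assumptions made for illustration) and reads the diagonal, the trace, the column sums, and the row sums from it.

```python
# Hypothetical toy data (not from the paper).
x = [2.0, 3.0, 5.0]
r = [[0.0, 1.0, 2.0], [1.0, 0.0, 4.0], [2.0, 4.0, 0.0]]
n = len(x)
y = [xi / sum(x) for xi in x]
v = [[0.0 if i == j else 1.0 / r[i][j] for j in range(n)] for i in range(n)]
v0 = sum(map(sum, v))
w = [[vij / v0 for vij in row] for row in v]

# Ideal spatial correlation matrix M* = y y^T W (Eq 14):
# entry (i, k) is y_i * sum_l y_l * w_lk.
M_star = [[y[i] * sum(y[l] * w[l][k] for l in range(n)) for k in range(n)]
          for i in range(n)]

G_local = [sum(w[i][j] * y[j] for j in range(n)) for i in range(n)]

diagonal = [M_star[i][i] for i in range(n)]          # normalized potential energies E_i
trace = sum(diagonal)                                # global index G
col_sums = [sum(M_star[i][k] for i in range(n)) for k in range(n)]  # local indexes
row_sums = [sum(M_star[i]) for i in range(n)]        # y_i times the sum of all G_i
```

Because the weight matrix is symmetric and the entries of y sum to 1, the column sums reproduce the local indexes, while the row sums reproduce Eq (17).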

2.2 Actual spatial correlation matrix

The practical spatial correlation matrix is different from the ideal spatial correlation matrix. In empirical studies, the outer product yyT in Eq (13) can be substituted with the inner product yTy. In fact, the result of yTy is a constant. So we have

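$$\mathbf{y} \mathbf{y}^{\mathrm{T}} \mathbf{y} = \mathbf{y}^{\mathrm{T}} \mathbf{y} \, \mathbf{y} = \lambda \mathbf{y}, \tag{19}$$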

which suggests that the parameter λ = yTy is the maximum eigenvalue of the outer product matrix yyT, and the unitized size vector y is the corresponding eigenvector. Developing Eq (19) yields

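$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} y_1 \sum_{i=1}^{n} y_i^2 \\ y_2 \sum_{i=1}^{n} y_i^2 \\ \vdots \\ y_n \sum_{i=1}^{n} y_i^2 \end{bmatrix} = \lambda \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}. \tag{20}$$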

Further, it can be shown that λ = yTy is the maximum eigenvalue of yyT. For a square matrix, the trace of yyT is

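$$\mathrm{Tr}(\mathbf{y} \mathbf{y}^{\mathrm{T}}) = \sum_{i=1}^{n} y_i^2 = \lambda = \lambda_1 + \lambda_2 + \cdots + \lambda_n, \tag{21}$$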

where Tr refers to “finding the trace (of yyT)”. If λ1 = λmax = yTy, then we will have

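$$\lambda = \begin{cases} \mathbf{y}^{\mathrm{T}} \mathbf{y}, & \lambda = \lambda_{\max} \\ 0, & \lambda \neq \lambda_{\max} \end{cases}. \tag{22}$$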

For arbitrary λ, the extended form of yyT is as below:

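$$\mathbf{y} \mathbf{y}^{\mathrm{T}} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix} = \begin{bmatrix} y_1 y_1 & y_1 y_2 & \cdots & y_1 y_n \\ y_2 y_1 & y_2 y_2 & \cdots & y_2 y_n \\ \vdots & \vdots & \ddots & \vdots \\ y_n y_1 & y_n y_2 & \cdots & y_n y_n \end{bmatrix}. \tag{23}$$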

The eigenvalues of any n-by-n matrix are the roots of its characteristic polynomial, which is the polynomial that the matrix itself satisfies according to the Cayley-Hamilton theorem. The characteristic polynomial results from the determinant of the matrix λE − yyT, that is

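$$|\lambda \mathbf{E} - \mathbf{y} \mathbf{y}^{\mathrm{T}}| = \begin{vmatrix} \lambda - y_1 y_1 & -y_1 y_2 & \cdots & -y_1 y_n \\ -y_2 y_1 & \lambda - y_2 y_2 & \cdots & -y_2 y_n \\ \vdots & \vdots & \ddots & \vdots \\ -y_n y_1 & -y_n y_2 & \cdots & \lambda - y_n y_n \end{vmatrix} = 0, \tag{24}$$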

where E denotes the identity/unit matrix. Finding the characteristic roots of Eq (24) yields λ1 = λmax = yTy = y12+y22+…+yn2 and λ2 = λ3 = … = λn = 0.
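
The eigen-structure of the outer product matrix can be checked numerically without an eigenvalue solver. The sketch below takes a hypothetical unitized vector `y` and verifies that yyT maps y to λy with λ = yTy, that the trace equals λ, and that a vector orthogonal to y is mapped to zero (an eigenvector for the eigenvalue 0). The helper `matvec` is introduced here only for illustration.

```python
# Hypothetical unitized size vector (entries sum to 1).
y = [0.2, 0.3, 0.5]
n = len(y)

# Outer product matrix y y^T (Eq 23) and the inner product lambda = y^T y.
outer = [[y[i] * y[j] for j in range(n)] for i in range(n)]
lam = sum(yi * yi for yi in y)

def matvec(a, u):
    """Multiply the square matrix a by the vector u."""
    return [sum(a[i][j] * u[j] for j in range(len(u))) for i in range(len(a))]

# Eq (19): (y y^T) y = lambda * y, so y is an eigenvector with eigenvalue lambda.
image_of_y = matvec(outer, y)

# Eq (21): the trace equals the sum of all eigenvalues, which is lambda here.
trace = sum(outer[i][i] for i in range(n))

# A vector orthogonal to y is sent to the zero vector (eigenvalue 0).
u = [y[1], -y[0], 0.0]
image_of_u = matvec(outer, u)
```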

Now, a practical autocorrelation expression based on the global Getis-Ord’s index can be given by matrixes and vectors. Substituting the maximum eigenvalue λ for the corresponding matrix yyT in Eq (13) produces a new mathematical relation. The precondition for Eq (7) to hold is

$$\lambda \mathbf{W} \mathbf{y} = G \mathbf{y}. \tag{25}$$

In fact, left-multiplying Eq (25) by yT gives λyTWy = GyTy = Gλ, which is Eq (7). This implies that we can derive Eq (7) from Eq (25). Obviously, Getis-Ord’s index is the maximum eigenvalue of the weight matrix λW, and y is the corresponding eigenvector, which can be normalized as y/√λ. Eq (25) can be re-expressed as a matrix scaling relation such as

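$$\mathbf{M} \mathbf{y} = \lambda \mathbf{W} \mathbf{y} = \mathbf{y}^{\mathrm{T}} \mathbf{y} \mathbf{W} \mathbf{y} = G \mathbf{y}, \tag{26}$$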

where

$$\mathbf{M} = \lambda \mathbf{W} = \mathbf{y}^{\mathrm{T}} \mathbf{y} \mathbf{W}. \tag{27}$$

In this equation, M can be termed the Real Spatial Correlation Matrix (RSCM) in the sense of application. RSCM is the inner product correlation matrix (IPCM). The trace of the matrix λW is the eigenvalue with the minimum absolute value, i.e. Tr(λW) = 0. Normalizing the eigenvector yields

$$\mathbf{y}^{\circ} = \frac{\mathbf{y}}{\|\mathbf{y}\|} = \frac{\mathbf{y}}{\sqrt{\lambda}}. \tag{28}$$

If we use mathematical software such as Matlab to calculate the eigenvector of yyTW or λW, the result will be y° rather than y. Comparing Eq (25) with Eq (13) shows

$$\mathbf{y} \mathbf{y}^{\mathrm{T}} \mathbf{W} \mathbf{y} = \lambda \mathbf{W} \mathbf{y}. \tag{29}$$

This indicates that the eigenvector G = Wy is still the eigenvector of the outer product matrix yyT, and the corresponding eigenvalue is λ = yTy. Substituting Eq (9) into Eq (29) yields

$$\mathbf{y} \mathbf{y}^{\mathrm{T}} \mathbf{G} = \lambda \mathbf{G}, \tag{30}$$

which suggests that the vector of local Getis-Ord’s index is the eigenvector of yyT corresponding to the eigenvalue λ. Thus we have

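$$(\lambda \mathbf{E} - \mathbf{y} \mathbf{y}^{\mathrm{T}}) \mathbf{W} \mathbf{y} = (\lambda \mathbf{W} - \mathbf{y} \mathbf{y}^{\mathrm{T}} \mathbf{W}) \mathbf{y} = \mathbf{0}, \tag{31}$$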

in which 0 refers to the zero/null vector. However, Eqs (29) and (31) cannot occur unless the spatial contiguity matrix is a unit matrix. In other words, the vector G is not really an eigenvector of yyT. In empirical analysis, the null vector should be replaced by a residual vector. An approximation relation is as follows

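$$\mathbf{M} \mathbf{y} = \lambda \mathbf{W} \mathbf{y} \rightarrow \mathbf{y} \mathbf{y}^{\mathrm{T}} \mathbf{W} \mathbf{y} = \mathbf{M}^{*} \mathbf{y}, \tag{32}$$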

where the arrow “→” denotes “infinitely approach to” or “be theoretically equal to”. There are always errors between the inner product correlation matrix M = yTyW and the outer product correlation matrix M* = yyTW. Based on the error vector, we can define an index to measure the degree of spatial autocorrelation. The stronger the spatial autocorrelation is, the closer the vector My will be to the vector M*y. A finding is that, according to the Eqs (13) and (26), the global Getis-Ord’s index proved to be the eigenvalue of spatial correlation matrixes. As indicated above, an eigenvalue of a matrix is the characteristic root of the corresponding multinomial of the determinant of the matrix. It represents a characteristic length of spatial analysis. This suggests that, like Moran’s I, Getis-Ord’s G is also a characteristic parameter of geographical spatial modeling.

2.3 Getis-Ord’s scatterplot

The spatial analytical process based on Getis-Ord’s index can be visualized by scatter plots. In order to find new approaches to evaluating Getis-Ord’s indexes and introducing Getis-Ord’s scatterplot into spatial autocorrelation analysis, two vectors based on spatial correlation matrixes should be defined. One is the outer product vector as below

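$$\mathbf{f}^{*} = \mathbf{M}^{*} \mathbf{y} = \mathbf{y} \mathbf{y}^{\mathrm{T}} \mathbf{W} \mathbf{y} = G \mathbf{y}, \tag{33}$$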

which is based on Eq (13). The other is the inner product vector as follows

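$$\mathbf{f} = \mathbf{M} \mathbf{y} = \mathbf{y}^{\mathrm{T}} \mathbf{y} \mathbf{W} \mathbf{y} = G \mathbf{y}, \tag{34}$$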

which is based on Eq (26). The relationship between y and f* gives the theoretical autocorrelation trend line, and the dataset of y and f gives the scatter points of the actual autocorrelation pattern. The residuals of spatial autocorrelation can be defined as

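$$\mathbf{e}_f = \mathbf{f} - \mathbf{f}^{*} = \mathbf{M} \mathbf{y} - \mathbf{M}^{*} \mathbf{y} = (\lambda \mathbf{E} - \mathbf{y} \mathbf{y}^{\mathrm{T}}) \mathbf{W} \mathbf{y}, \tag{35}$$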

where ef refers to the errors of the Getis-Ord’s spatial autocorrelation. The squared sum of the residuals Sf is

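$$S_f = \mathbf{e}_f^{\mathrm{T}} \mathbf{e}_f = \mathbf{y}^{\mathrm{T}} \mathbf{W} (\lambda \mathbf{E} - \mathbf{y} \mathbf{y}^{\mathrm{T}}) (\lambda \mathbf{E} - \mathbf{y} \mathbf{y}^{\mathrm{T}}) \mathbf{W} \mathbf{y} \rightarrow 0. \tag{36}$$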

The value of ef fluctuates around 0; therefore, the Sf value approaches zero.

By analogy with Moran’s scatterplot, we can employ scatter point graphs to carry out local spatial autocorrelation analysis based on Getis-Ord’s indexes. If the unitized vector y represents the x-axis, and the corresponding vector λWy represents the y-axis, a Getis-Ord’s scatterplot will be generated. Further, a “trend line” can be added to the plot: the x-axis is still the unitized vector, y, but the y-axis is yyTWy. In other words, the relationship between y and λWy forms the scatter points, while the relationship between y and yyTWy makes the trend line. Differing from Moran’s index which comes between -1 and 1, Getis-Ord’s index ranges from 0 to 1. That is to say, G≥0. As a result, the trend line based on yyTWy does not always match the scatter points based on λWy. In fact, for positive spatial autocorrelation (Moran’s I>0), a Getis-Ord’s trend line is consistent with its scatter points; however, for negative spatial autocorrelation (Moran’s I<0), a Getis-Ord’s trend line is inconsistent with its scatter points. In many cases, a trend line of Getis-Ord’s scatter plot serves as a dividing line, and the data points fall into two categories. By means of the scatter points and trend line, we can divide the geographical elements into two groups.
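
The coordinates of such a scatterplot can be prepared as in the sketch below, which is only an illustration under stated assumptions: the sizes and distances are hypothetical, reciprocal distances are assumed as contiguity values, and splitting the elements by the sign of the residual is one plausible way to read "above" and "below" the trend line.

```python
# Hypothetical toy data (not from the paper).
x = [2.0, 3.0, 5.0]
r = [[0.0, 1.0, 2.0], [1.0, 0.0, 4.0], [2.0, 4.0, 0.0]]
n = len(x)
y = [xi / sum(x) for xi in x]
v = [[0.0 if i == j else 1.0 / r[i][j] for j in range(n)] for i in range(n)]
v0 = sum(map(sum, v))
w = [[vij / v0 for vij in row] for row in v]

lam = sum(yi * yi for yi in y)                       # lambda = y^T y
Wy = [sum(w[i][j] * y[j] for j in range(n)) for i in range(n)]
G = sum(y[i] * Wy[i] for i in range(n))              # global index, Eq (7)

f = [lam * g for g in Wy]                            # scatter ordinates, lambda * W y
f_star = [G * yi for yi in y]                        # trend-line ordinates, y y^T W y = G y

points = list(zip(y, f))                             # scatter points (y_i, f_i)
trend = list(zip(y, f_star))                         # points on the line through the origin with slope G

# Residuals (Eq 35) and their squared sum (Eq 36).
residuals = [f[i] - f_star[i] for i in range(n)]
S_f = sum(e * e for e in residuals)

# Two groups of elements: above the trend line and on or below it.
above = [i for i, e in enumerate(residuals) if e > 0]
below = [i for i, e in enumerate(residuals) if e <= 0]
```

For any data the residual vector is orthogonal to y, because yTf and yTf* both equal λG.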

3 Discussion

3.1 Association of autocorrelation with interaction

So far, a series of improvements to the spatial autocorrelation analysis based on Getis-Ord’s indexes have been made. Using the improved expressions of Getis-Ord’s indexes, we can associate spatial autocorrelation analysis with spatial interaction analysis. The main findings and innovations of this work are as follows. First, the computational formulae of Getis-Ord’s indexes are simplified and normalized. Unitizing the size vector and the spatial weight matrix, we can express Getis-Ord’s index in a simpler way so that the calculations become easier. Second, a scatter plot can be introduced into the analytical process. By analogy with Moran’s scatter plot, we can draw a scatter plot for Getis-Ord’s autocorrelation analysis. Using the scatter plot, we can visualize the spatial patterns and divide geographical elements into several groups. Third, Getis-Ord’s index proved to be an eigenvalue of a spatial correlation matrix. This suggests that Getis-Ord’s index is actually a characteristic length of spatial autocorrelation. Fourth, if we use the reciprocals of geographical distances to define spatial contiguity, Getis-Ord’s index is demonstrated to be equivalent to potential energy. Suppose that the spatial contiguity matrix is generated using power-law decay and the distance decay exponent equals 1. Then Getis-Ord’s index can be converted into local potential energy. Thus, spatial autocorrelation is mathematically associated with spatial interaction.

The precondition of the abovementioned innovations is reconstruction of Getis-Ord’s index formula with matrixes and vectors. It is easy to prove the following relation:

$$\sum_{i=1}^{n} \sum_{j=1}^{n} x_i x_j = \sum_{i=1}^{n} x_i \sum_{j=1}^{n} x_j, \tag{37}$$

where

$$\sum_{i=1}^{n} x_i = \sum_{j=1}^{n} x_j = \mathrm{const}, \tag{38}$$

in which const denotes a constant. Thus, re-expressing Eq (8) yields

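$$G = \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} x_i x_j}{\sum_{i=1}^{n} x_i \sum_{j=1}^{n} x_j} = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \left( \frac{x_i}{\sum_{i=1}^{n} x_i} \cdot \frac{x_j}{\sum_{j=1}^{n} x_j} \right) = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} y_i y_j, \tag{39}$$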

which is equivalent to Eq (7). The relation between the global Getis-Ord’s index and the local Getis-Ord’s index is

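$$G = \sum_{i=1}^{n} \frac{x_i}{\sum_{i=1}^{n} x_i} \sum_{j=1}^{n} w_{ij} \left( \frac{x_j}{\sum_{j=1}^{n} x_j} \right) = \sum_{i=1}^{n} y_i \sum_{j=1}^{n} w_{ij} y_j = \sum_{i=1}^{n} y_i G_i, \tag{40}$$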

in which Gi is defined by Eq (9). It is obvious that Eq (40) is equivalent to Eqs (12) and (16). This suggests that the global Getis-Ord’s index is the weighted sum of local Getis-Ord’s index based on the unitized size vector.

By comparison, the relationships and differences between Getis-Ord’s indexes, Moran’s indexes, and potential energy indexes can be made clearer. Getis-Ord’s indexes are different from Moran’s indexes. Getis and Ord proposed the indexes to make up for the deficiencies of Moran’s indexes [3]. However, there is an analogy between Getis-Ord’s G and Moran’s I. The similarities are as follows. First, the method of improving the mathematical expressions of Getis-Ord’s index is similar to that of improving the mathematical expressions of Moran’s index. Second, both Moran’s I and Getis-Ord’s G proved to be the eigenvalues of spatial correlation matrixes. Third, both computational processes depend on the variable transformation based on average values. The eigenvalues represent the characteristic length of spatial correlation, while average values represent the characteristic length of size samples. A comparison between the two measurements is tabulated as follows (Table 1). Both the new forms of the Getis-Ord’s indexes and Moran’s indexes are based on the unitized spatial contiguity matrix, W. But the size vector is different in form. The Moran’s indexes are based on the standardized size vector, while the corresponding Getis-Ord’s indexes are based on the unitized size vector. So, Moran’s index I comes between -1 and 1 (-1≤I≤1), while Getis-Ord’s index G varies from 0 to 1 (0≤G≤1).

Table 1. A comparison of form and structure between Moran’s index, I, and Getis-Ord’s index, G.

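| Parameter | Formula: global index | Formula: local index | Definition of variable |
|---|---|---|---|
| Moran’s index, I | $I = \mathbf{z}^{\mathrm{T}} \mathbf{W} \mathbf{z}$ | $I_i = z_i \sum_{j=1}^{n} w_{ij} z_j$ | $z_i = (x_i - \bar{x})/s$ |
| Getis-Ord’s index, G | $G = \mathbf{y}^{\mathrm{T}} \mathbf{W} \mathbf{y}$ | $G_i = \sum_{j=1}^{n} w_{ij} y_j$ | $y_i = x_i / \sum_{i=1}^{n} x_i = x_i/(n\bar{x})$ |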

Next, let’s investigate the relationship between Getis-Ord’s indexes for spatial autocorrelation and the potential energy indexes for spatial interaction. The classical gravity model of geographical spatial interaction is as below [6]:

$$I_{ij} = K \frac{x_i x_j}{r_{ij}^{b}}, \tag{41}$$

where xi and xj are two size measures (e.g., city population), rij is the distance between the i location and the j location, Iij denotes the attraction force between xi and xj, the parameter K refers to the gravity coefficient, and b to the distance decay exponent (b>1). The distance exponent proved to be a kind of fractal dimension [50]. Thus the mutual energy between the i location and the j location can be defined as [9, 46, 51]

$$I_{ij} r_{ij} = K \frac{x_i x_j}{r_{ij}^{b-1}}. \tag{42}$$

Thus, the gravitational potential can be defined as sj = Iijrij/xi [51]. The total mutual energy (TME) between the i location and other locations can be given by

$$E_i = \sum_{j=1}^{n} I_{ij} r_{ij} = K x_i \sum_{j=1}^{n-1} \frac{x_j}{r_{ij}^{b-1}} = K x_i \sum_{j=1}^{n-1} \frac{x_j}{r_{ij}^{q}}. \tag{43}$$

where q = b-1 denotes distance scaling exponent. The value of Ei reflects the influence power of an element at the ith location in a regional network. Accordingly, the potential energy index (PEI) indicating the total gravitational potential of the i location in a geographical system can be defined as [47]

$$V_i = \frac{E_i}{x_i} = K \sum_{j=1}^{n-1} \frac{x_j}{r_{ij}^{b-1}} = K \sum_{j=1}^{n-1} \frac{x_j}{r_{ij}^{q}}, \tag{44}$$

which reflects the traffic accessibility of location i. Without loss of generality, let K = 1 and b = 2, then we have q = 1. Suppose that the spatial proximity function (SPF) is vij = 1/rij and xi and xj are replaced by yi and yj. Unitizing the spatial contiguity matrix, we can convert Eq (44) into Eq (10), and transform Eq (43) into Eq (11). This suggests that Getis-Ord’s index is actually normalized potential energy, and spatial autocorrelation analysis and spatial interaction modeling reach the same goal by different routes.
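
The conversion from Eq (44) to Eq (10) can be traced numerically. The following sketch uses hypothetical sizes and distances with K = 1 and b = 2; the helper names `attraction` and `mutual_energy` are illustrative only. It computes the gravity quantities of Eqs (41) to (44) and then shows that dividing the potential by the total size and by the total of the reciprocal distances gives the local Getis-Ord’s index.

```python
# Hypothetical toy data (not from the paper).
x = [2.0, 3.0, 5.0]
r = [[0.0, 1.0, 2.0], [1.0, 0.0, 4.0], [2.0, 4.0, 0.0]]
n = len(x)
K, b = 1.0, 2.0
q = b - 1.0                                   # distance scaling exponent

def attraction(i, j):
    """Gravity model, Eq (41)."""
    return K * x[i] * x[j] / r[i][j] ** b

def mutual_energy(i, j):
    """Mutual energy between two locations, Eq (42)."""
    return attraction(i, j) * r[i][j]

# Total mutual energy (Eq 43) and potential energy index (Eq 44),
# summing over the other n - 1 locations.
E = [sum(mutual_energy(i, j) for j in range(n) if j != i) for i in range(n)]
V = [E[i] / x[i] for i in range(n)]

# Local Getis-Ord indexes from unitized sizes and unitized weights r_ij^(-q).
y = [xi / sum(x) for xi in x]
v0 = sum(r[i][j] ** -q for i in range(n) for j in range(n) if i != j)
G_local = [sum(y[j] * r[i][j] ** -q for j in range(n) if j != i) / v0
           for i in range(n)]

# Rescaling V_i by the total size and the total contiguity recovers G_i.
V_rescaled = [V[i] / (sum(x) * v0) for i in range(n)]
```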

3.2 Equivalence of Getis-Ord’s G to potential energy

In order to further reveal the association of spatial autocorrelation with spatial interaction, a clearer and more exact relation between Getis-Ord’s indexes and potential energy should be shown. Now, let us examine them from a different angle. In fact, by rescaling the potential energy of geographical elements, we can obtain local Getis-Ord’s indexes. By mathematical derivation, we can find practical links between the two approaches of spatial modeling. To make a spatial autocorrelation analysis, a spatial contiguity matrix must be created by applying a weight function to a spatial proximity matrix [52, 53]. For n elements in a geographic system, a spatial contiguity matrix, V, can be expressed as

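$$\mathbf{V} = [v_{ij}]_{n \times n} = \begin{bmatrix} v_{11} & v_{12} & \cdots & v_{1n} \\ v_{21} & v_{22} & \cdots & v_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ v_{n1} & v_{n2} & \cdots & v_{nn} \end{bmatrix}, \tag{45}$$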

in which vij is a measure used to reflect the contiguity relationships between location i and location j (i, j = 1,2,…,n). If i = j as given, then vii≡0. This indicates that the diagonal elements must be converted into zero. Thus a unitized spatial weights matrix, W, can be given by

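$$\mathbf{W} = \frac{\mathbf{V}}{V_0} = \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1n} \\ w_{21} & w_{22} & \cdots & w_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nn} \end{bmatrix}, \tag{46}$$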

where

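$$V_0 = \sum_{i=1}^{n} \sum_{j=1}^{n} v_{ij}, \quad w_{ij} = \frac{v_{ij}}{\sum_{i=1}^{n} \sum_{j=1}^{n} v_{ij}}, \quad \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} = 1.$$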

In the above equations, the value vii≡0 results in the value wii≡0. Compared with the spatial contiguity matrix V, the unitized spatial weights matrix W makes the mathematical form of spatial autocorrelation simple and graceful. If the spatial contiguity matrix is unitized by row, the result will violate the well-known distance axiom [54]. There are three types of spatial weight function that can be used to construct a spatial contiguity matrix, that is, the inverse power function, the negative exponential function, and staircase functions [52]. Among these weight functions, the inverse power function is the common one [55]. This function stemmed from the impedance function of the gravity model [6]. Generally speaking, the inverse power function is as below

$$v_{ij} = \begin{cases} r_{ij}^{-q}, & i \neq j \\ 0, & i = j \end{cases}, \tag{47}$$

where rij refers to the distance between location i and location j, and q denotes the distance scaling exponent. Generally, we have q = 1 for spatial autocorrelation [56]. A total quantity of spatial contiguity can be defined as

$$S = \sum_{i=1}^{n} \sum_{j=1}^{n} r_{ij}^{-q}. \tag{48}$$

Then, we can rescale the spatial distances as follows

$$d_{ij} = (r_{ij}^{q} S)^{1/q}. \tag{49}$$

Based on the unitized size measure yj and rescaled distances dij, the potential energy is

$$V_i^{*} = \sum_{j=1}^{n} \frac{y_j}{d_{ij}^{q}}, \tag{50}$$

which can be regarded as rescaled potential energy. Based on the rescaled distances, the unitized weight is as below

$$w_{ij} = d_{ij}^{-q} = \frac{1}{r_{ij}^{q} S} = \frac{r_{ij}^{-q}}{\sum_{i=1}^{n} \sum_{j=1}^{n} r_{ij}^{-q}}. \tag{51}$$

Substituting Eq (51) into Eq (50) yields the normalized potential energy index

$$V_i^{*} = \sum_{j=1}^{n} \frac{y_j}{d_{ij}^{q}} = \sum_{j=1}^{n} w_{ij} y_j = G_i, \tag{52}$$

which suggests that the rescaled potential energy index Vi* equals local Getis-Ord’s index Gi. Accordingly, the mutual energy index is Ei* = yiVi* = yiGi. That is to say, Getis-Ord’s indexes for spatial autocorrelation are equivalent to the potential energy indexes for spatial interaction based on the gravity model under certain conditions.
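
The equivalence in Eq (52) holds for any positive exponent q, which the following sketch checks for q = 1 and q = 2. The two function names and the toy inputs are hypothetical; diagonal terms are skipped because vii = 0.

```python
def local_g_via_rescaled_distance(y, r, q=1.0):
    """Rescaled potential energy V_i* from Eqs (48) to (50)."""
    n = len(y)
    # Eq (48): total quantity of spatial contiguity over all pairs i != j.
    s = sum(r[i][j] ** -q for i in range(n) for j in range(n) if i != j)

    def d(i, j):
        # Eq (49): rescaled distance.
        return (r[i][j] ** q * s) ** (1.0 / q)

    # Eq (50): potential energy based on unitized sizes and rescaled distances.
    return [sum(y[j] / d(i, j) ** q for j in range(n) if j != i)
            for i in range(n)]


def local_g_via_weights(y, r, q=1.0):
    """Local Getis-Ord indexes G_i = sum_j w_ij y_j with w_ij from Eq (51)."""
    n = len(y)
    s = sum(r[i][j] ** -q for i in range(n) for j in range(n) if i != j)
    return [sum((r[i][j] ** -q / s) * y[j] for j in range(n) if j != i)
            for i in range(n)]


# Hypothetical toy data (not from the paper).
y_demo = [0.2, 0.3, 0.5]
r_demo = [[0.0, 1.0, 2.0], [1.0, 0.0, 4.0], [2.0, 4.0, 0.0]]
v_star_q1 = local_g_via_rescaled_distance(y_demo, r_demo, q=1.0)
g_local_q1 = local_g_via_weights(y_demo, r_demo, q=1.0)
```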

This is a theoretical and methodological study for spatial autocorrelation and spatial interaction. Compared with pure autocorrelation measurements based on Getis-Ord’s indexes, the new framework can yield more systematic outputs of calculations and analyses. The equivalence relationship between Getis-Ord’s indexes and potential energy indexes is useful for spatial modeling. We can employ the gravity analysis of a regional network to estimate the value of the distance scaling exponent q of spatial autocorrelation. What is more, we can use spatial autocorrelation analysis to complement the spatial interaction analysis and vice versa. Getis-Ord’s indexes are abstract and thus difficult to understand, but it is easy to understand the potential energy concept based on the gravity model. The chief shortcomings of this work are as follows. First, the method relies heavily on linear algebra theory. For readers who are not familiar with linear algebra, especially matrix knowledge, it is hard to understand the methodology developed in this work. Second, the spatial autocorrelation and cross-correlation analyses are not integrated into one framework. The spatial autocorrelation measures can be generalized to spatial cross-correlation measures [44]. Using total potential energy, we can associate spatial interaction with spatial autocorrelation and spatial cross-correlation. Due to the limited space, the problem remains to be solved in a companion paper.

4 Materials and methods

4.1 Approaches to Getis-Ord’s indexes

It is difficult for learners of spatial autocorrelation and spatial interaction to compute Getis-Ord’s index using the complex formulae. Students can calculate Getis-Ord’s G by means of professional software such as ArcGIS. However, the computational process is a black box for them. Only if a student knows how to carry out the complete set of calculation steps of a measurement will he/she really understand the principle of the mathematical method. Based on the new framework of Getis-Ord’s spatial autocorrelation expressed by linear algebra, a number of approaches to computing global and local Getis-Ord’s indexes are proposed in this section. Each approach has its own advantages and disadvantages (Table 2). Using the calculation results, we can make an analysis of spatial interaction with the potential energy values (Fig 1). Among these approaches, three bear an analogy with those for Moran’s index [16]. In other words, all the approaches to calculating Moran’s index can be employed to compute the global Getis-Ord’s index. The difference lies in the way the size measurements are processed. However, the local Getis-Ord’s indexes should be computed in ways that differ from those for local Moran’s indexes.

Table 2. Comparison of the advantages and disadvantages of different approaches to global and local Getis-Ord’s indexes.

| Level | Method | Simplicity | Result | Eq |
|---|---|---|---|---|
| Local | Conventional formula | Detailed | Directly yield | Eq (10) |
| Local | Matrix manipulation | Simple | Directly yield | Eq (9) |
| Local | Spatial correlation matrix | Simple | Directly yield | Eqs (15) and (16) |
| Local | Potential energy | Moderate | Indirectly yield | Eqs (47)–(50) |
| Global | Conventional formula | Detailed | Directly yield | Eq (8) |
| Global | Three-step calculation | Very simple | Directly yield | Eqs (3), (5), and (7) |
| Global | Matrix scaling | Simple | Directly yield | Eqs (13) or (26) |
| Global | Linear regression | Moderate | Directly yield | Eqs (33) or (34) |
| Global | Local weighting | Moderate | Indirectly yield | Eq (40) |
| Global | Spatial correlation matrix | Simple | Indirectly yield | Eqs (15) and (16) |
| Global | Outer product sum | Simple | Directly yield | Eqs (33) and (54) |

If the unitized variable y is replaced by the standardized variable z, the seven approaches can be employed to evaluate global Moran’s I, for which the seventh method can also be termed the standard deviation method.

Fig 1. A flow chart of data processing, parameter estimation, and autocorrelation analysis based on Getis-Ord’s indexes.

The analytical process is similar to that based on Moran’s index and Geary’s coefficient. However, the measurements and conclusions are different.

The main approaches to computing the local Getis-Ord’s indexes are as follows. (1) Conventional formula method. Using Eq (10), we can calculate local Getis-Ord’s indexes step by step. This is the traditional approach used in the literature. (2) Matrix manipulation method. The sizes and weights must be unitized by Eqs (3) and (46). Then, in terms of Eq (9), left-multiplying the unitized size vector y by the unitized weight matrix W yields the vector of local Getis-Ord’s indexes G. The process is very simple and can be carried out in MS Excel. (3) Spatial correlation matrix method. Suppose that we obtain the ideal spatial correlation matrix, M* = yyTW. According to Eq (16), the sums of the columns of matrix M* give the local Getis indexes. (4) Potential energy method. Local Getis-Ord’s indexes are equal to the rescaled potential energy measurements. Using Eq (3) to unitize size measurements, using Eqs (48) and (49) to rescale the distance matrix, and using Eq (52) to calculate the potential energy based on the special distance scaling exponent q = 1, we can obtain the local Getis-Ord’s indexes.

At least seven approaches are available for calculating the global Getis-Ord’s index; they are summarized as follows. (1) Conventional formula method. Using Eq (8), we can compute the global Getis-Ord’s index by the traditional method. (2) Three-step calculation method. This approach is very simple and beginners in spatial autocorrelation analysis can master it easily. The three steps of calculating Getis-Ord’s index are as follows. Step 1: unitize the size variable x. In other words, convert the initial variable x based on Eq (1) into the unitized variable in Eq (3). Step 2: compute the unitized spatial weight matrix. The weights matrix is defined in Eqs (5) and (6) and can be calculated by Eqs (45) and (46). Step 3: calculate Getis-Ord’s index. According to Eq (7), the unitized spatial weight matrix is first left multiplied by the transpose of y, and then the vector yTW is right multiplied by y. The final product of the continued multiplication is the global Getis-Ord’s index. (3) Matrix scaling method. This approach is to find the maximum characteristic value of the spatial correlation matrix. If we work out the maximum eigenvalue of the matrix M* = yyTW or M = λW by using Eq (13) or Eq (26), we will obtain the global Getis-Ord’s index. (4) Regression analysis method. Based on Eq (13) or Eq (26), a linear regression analysis can be employed to evaluate Getis-Ord’s G. The unitized vector y is treated as an independent variable (i.e., argument), and f* = M*y or f = My as the corresponding dependent variable (response variable). If the constant term (intercept) is fixed to zero, the regression coefficient (slope) will be equal to the global Getis-Ord’s index. (5) Local weighting method. After calculating the local Getis-Ord’s indexes, we can figure out the global index using Eq (40). The elements of the unitized size vector, y, can serve as weight numbers. The global Getis-Ord’s index equals the weighted sum of the local indexes. (6) Spatial correlation matrix method. Using Eq (16), we can generate the ISCM, M* = yyTW. The trace, i.e., the sum of the diagonal elements of matrix M*, gives the global Getis-Ord’s index. (7) Outer product sum method. In terms of Eq (4), the sum of y’s elements is 1. According to Eq (33), we have

$$\sum_{i=1}^{n} (f^{*})_i = G \sum_{i=1}^{n} (y)_i = G \sum_{i=1}^{n} y_i = G. \tag{53}$$

Thus the value of Getis-Ord’s index can be calculated using the elements in the vector f*, that is

$$G = \sum_{i=1}^{n} f_i^{*} = \sum_{i=1}^{n} (y y^{T} W y)_i, \tag{54}$$

which indicates an alternative approach to working out global Getis-Ord’s index.
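
The agreement among several of these approaches can be checked numerically. The sketch below is a hedged illustration on hypothetical toy data, again assuming y_i = x_i / Σx, inverse-distance weights unitized to sum to 1, and λ = y^Ty; it is not the author's MS Excel or Matlab procedure. It evaluates the three-step product, the weighted sum of local indexes, the trace of M*, the sum of f*, and the zero-intercept regression slope of f = λWy on y, and it verifies that (G, y) is an eigenpair of M*.

```python
# Toy data (hypothetical): sizes x and symmetric pairwise distances r.
x = [50.0, 30.0, 15.0, 5.0]
r = [
    [0.0, 10.0, 20.0, 40.0],
    [10.0, 0.0, 25.0, 30.0],
    [20.0, 25.0, 0.0, 15.0],
    [40.0, 30.0, 15.0, 0.0],
]
n = len(x)

# Assumed unitization: y sums to 1, and all entries of W sum to 1.
y = [xi / sum(x) for xi in x]
v = [[0.0 if i == j else 1.0 / r[i][j] for j in range(n)] for i in range(n)]
v0 = sum(sum(row) for row in v)
W = [[vij / v0 for vij in row] for row in v]

Wy = [sum(W[i][j] * y[j] for j in range(n)) for i in range(n)]    # local indexes
yTW = [sum(y[k] * W[k][j] for k in range(n)) for j in range(n)]  # row vector y^T W

# (2) Three-step method: G = (y^T W) y.
g_three_step = sum(yTW[j] * y[j] for j in range(n))

# (5) Local weighting method: sum of y_i times the local index G_i.
g_local_weighted = sum(y[i] * Wy[i] for i in range(n))

# (6) Trace of M* = y y^T W.
M_star = [[y[i] * yTW[j] for j in range(n)] for i in range(n)]
g_trace = sum(M_star[i][i] for i in range(n))

# (7) Outer product sum: f* = M* y, then G = sum of the elements of f*.
f_star = [sum(M_star[i][j] * y[j] for j in range(n)) for i in range(n)]
g_outer_sum = sum(f_star)

# (4) Regression through the origin of f = lambda W y on y (lambda = y^T y).
lam = sum(yi * yi for yi in y)
f = [lam * Wy[i] for i in range(n)]
g_slope = sum(y[i] * f[i] for i in range(n)) / sum(yi * yi for yi in y)

# (3) Eigenpair check: M* y should equal G y element by element.
eigen_residual = max(abs(f_star[i] - g_three_step * y[i]) for i in range(n))

print(g_three_step, g_local_weighted, g_trace, g_outer_sum, g_slope, eigen_residual)
```

Because M* = yy^TW has rank one, its only nonzero eigenvalue is y^TWy, which is why the eigenvalue route and the trace route lead to the same number in this sketch.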

4.2 Empirical analysis

The new framework of spatial autocorrelation based on Getis-Ord’s indexes can be applied to China’s cities as a case study. The study area is the whole of mainland China, and the time points are 2000 and 2010 (see S1 Data and S2 Data). For an example that illustrates a methodology, the simpler the better. Therefore, only the capital cities of the 31 provinces, autonomous regions, and municipalities directly under the Central Government of China (CCC) are taken into account. The urban populations from the fifth census in 2000 and the sixth census in 2010 serve as the two size variables (x_i), and the railway mileage between any two cities is used as the spatial proximity measurement (r_ij). Because the cities of Haikou and Lhasa were not connected to the Chinese network of cities by railway for a long time, only 29 cities are actually considered in the spatial analysis, and thus the size of the spatial sample is n = 29.

Using the methods shown above and the datasets of city sizes and spatial distances, we can calculate the Getis-Ord’s indexes and potential energy measurements of the Chinese system of cities. The global Getis-Ord’s index can be computed by any one of the seven approaches described above. For example, using the three-step method based on the formula G = y^TWy, we obtain G = 0.001299 for 2000 and G = 0.001345 for 2010. The local Getis-Ord’s indexes can be computed by any one of the four approaches described above. On the other hand, using the formulae of the potential energy index and the mutual energy index, Eqs (43) and (44), with K = 1 and q = 1, we can compute the potential energy indexes and mutual energy indexes (see S1 File and S1 Code). If K = 1 and q = 1, the potential energy indexes equal the corresponding local Getis-Ord’s indexes, and the mutual energy indexes are just the products of the unitized size variable and the local Getis-Ord’s indexes. In short, the local Getis-Ord’s indexes equal the normalized potential energy indexes, and the sum of the mutual energy indexes equals the global Getis-Ord’s index (Table 3).

Table 3. The main computational results of spatial autocorrelation and spatial interaction based on Getis-Ord’s indexes (2000 & 2010).

| City | y_i (2000) | Local G_i & PEI V_i (2000) | y_iG_i & MEI E_i (2000) | y_i (2010) | Local G_i & PEI V_i (2010) | y_iG_i & MEI E_i (2010) |
|---|---|---|---|---|---|---|
| Beijing | 0.096014 | 0.001774 | 0.000170 | 0.109598 | 0.001831 | 0.000201 |
| Changchun | 0.027262 | 0.001172 | 0.000032 | 0.023185 | 0.001162 | 0.000027 |
| Changsha | 0.021463 | 0.001403 | 0.000030 | 0.020274 | 0.001346 | 0.000027 |
| Chengdu | 0.038637 | 0.000938 | 0.000036 | 0.041530 | 0.000938 | 0.000039 |
| Chongqing | 0.057390 | 0.000907 | 0.000052 | 0.061105 | 0.000898 | 0.000055 |
| Fuzhou | 0.020029 | 0.000925 | 0.000019 | 0.018852 | 0.000915 | 0.000017 |
| Guangzhou | 0.069445 | 0.000784 | 0.000054 | 0.065137 | 0.000776 | 0.000051 |
| Guiyang | 0.018497 | 0.001008 | 0.000019 | 0.017128 | 0.001009 | 0.000017 |
| Hangzhou | 0.024784 | 0.001985 | 0.000049 | 0.031087 | 0.001969 | 0.000061 |
| Harbin | 0.034932 | 0.000931 | 0.000033 | 0.032845 | 0.000911 | 0.000030 |
| Hefei | 0.014790 | 0.001580 | 0.000023 | 0.021679 | 0.001594 | 0.000035 |
| Hohhot | 0.010019 | 0.001082 | 0.000011 | 0.010124 | 0.001106 | 0.000011 |
| Jinan | 0.026145 | 0.001690 | 0.000044 | 0.023697 | 0.001751 | 0.000042 |
| Kunming | 0.025059 | 0.000705 | 0.000018 | 0.022152 | 0.000704 | 0.000016 |
| Lanzhou | 0.018354 | 0.000931 | 0.000017 | 0.016780 | 0.000934 | 0.000016 |
| Nanchang | 0.016881 | 0.001512 | 0.000026 | 0.013512 | 0.001490 | 0.000020 |
| Nanjing | 0.034852 | 0.001766 | 0.000062 | 0.039725 | 0.001785 | 0.000071 |
| Nanning | 0.013695 | 0.000812 | 0.000011 | 0.017085 | 0.000798 | 0.000014 |
| Shanghai | 0.128610 | 0.001205 | 0.000155 | 0.124315 | 0.001278 | 0.000159 |
| Shenyang | 0.043929 | 0.001130 | 0.000050 | 0.039929 | 0.001139 | 0.000045 |
| Shijiazhuang | 0.019519 | 0.002036 | 0.000040 | 0.019428 | 0.002084 | 0.000040 |
| Taiyuan | 0.025663 | 0.001529 | 0.000039 | 0.021558 | 0.001565 | 0.000034 |
| Tianjin | 0.053723 | 0.002228 | 0.000120 | 0.062410 | 0.002345 | 0.000146 |
| Urumqi | 0.017468 | 0.000420 | 0.000007 | 0.019647 | 0.000420 | 0.000008 |
| Wuhan | 0.066318 | 0.001269 | 0.000084 | 0.051300 | 0.001277 | 0.000066 |
| Xi'an | 0.036855 | 0.001200 | 0.000044 | 0.034418 | 0.001204 | 0.000041 |
| Xining | 0.008639 | 0.000890 | 0.000008 | 0.008041 | 0.000883 | 0.000007 |
| Yinchuan | 0.005847 | 0.000938 | 0.000005 | 0.007895 | 0.000937 | 0.000007 |
| Zhengzhou | 0.025183 | 0.001665 | 0.000042 | 0.025565 | 0.001660 | 0.000042 |
| Sum | 1.000000 | 0.036414 | 0.001299 | 1.000000 | 0.036710 | 0.001345 |
| Mean | 0.034483 | 0.001256 | 0.000045 | 0.034483 | 0.001266 | 0.000046 |

The sum of the Ei values is equal to the global Getis-Ord’s index.
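
The relation between the columns of Table 3 can be reproduced with a short script. The following sketch copies the 2000 values of y_i and the local index G_i from Table 3, forms the mutual energy indexes as E_i = y_i·G_i, and sums them. The variable names are illustrative, and because the inputs are rounded to six decimals, the sum matches the reported global index only approximately.

```python
# (city, y_i, local G_i) for 2000, copied from Table 3.
table3_2000 = [
    ("Beijing", 0.096014, 0.001774),
    ("Changchun", 0.027262, 0.001172),
    ("Changsha", 0.021463, 0.001403),
    ("Chengdu", 0.038637, 0.000938),
    ("Chongqing", 0.057390, 0.000907),
    ("Fuzhou", 0.020029, 0.000925),
    ("Guangzhou", 0.069445, 0.000784),
    ("Guiyang", 0.018497, 0.001008),
    ("Hangzhou", 0.024784, 0.001985),
    ("Harbin", 0.034932, 0.000931),
    ("Hefei", 0.014790, 0.001580),
    ("Hohhot", 0.010019, 0.001082),
    ("Jinan", 0.026145, 0.001690),
    ("Kunming", 0.025059, 0.000705),
    ("Lanzhou", 0.018354, 0.000931),
    ("Nanchang", 0.016881, 0.001512),
    ("Nanjing", 0.034852, 0.001766),
    ("Nanning", 0.013695, 0.000812),
    ("Shanghai", 0.128610, 0.001205),
    ("Shenyang", 0.043929, 0.001130),
    ("Shijiazhuang", 0.019519, 0.002036),
    ("Taiyuan", 0.025663, 0.001529),
    ("Tianjin", 0.053723, 0.002228),
    ("Urumqi", 0.017468, 0.000420),
    ("Wuhan", 0.066318, 0.001269),
    ("Xi'an", 0.036855, 0.001200),
    ("Xining", 0.008639, 0.000890),
    ("Yinchuan", 0.005847, 0.000938),
    ("Zhengzhou", 0.025183, 0.001665),
]

# Mutual energy index: E_i = y_i * G_i (unitized size times local index).
mutual_energy = {city: y * g for city, y, g in table3_2000}

# The global index is the sum of the mutual energy indexes.
global_g = sum(mutual_energy.values())

# Cities ranked by mutual energy, largest first.
top3 = sorted(mutual_energy, key=mutual_energy.get, reverse=True)[:3]

print("G (2000), from rounded table values:", global_g)
print("Top three cities by mutual energy:", top3)
```

With these inputs the sum agrees with the reported G = 0.001299 for 2000 to within rounding error, and the three largest mutual energy indexes belong to Beijing, Shanghai, and Tianjin.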

Furthermore, we can draw the Getis-Ord’s scatterplots by means of the scaling relation between the unitized size vectors and the spatial correlation matrixes. Using Eqs (33) and (34), we have two variables, f = λWy with λ = y^Ty, and f* = yy^TWy (Table 4). The relationship between y and f gives the scattered points, and the relationship between y and f* yields the trend line in the scatterplot (Fig 2). The scatterplot has at least three uses. First, it can be used to estimate the global Getis-Ord’s index: the slope of the trend line is equal to the global Getis-Ord’s G. Second, it can be used to reflect the spatial distribution features of a geographical system. Third, it can be used to make a simple classification of the research objects. If a point is above the trend line, the actual value of the potential energy index is greater than the expected value; if a point is below the trend line, the actual value is less than the expected value. In particular, if a point lies on the trend line, the actual value coincides with the expected value of the potential energy index. A discriminant index for this simple classification can be defined as

$$h_i = \frac{f_i}{f_i^{*}} = \frac{(y^{T} y W y)_i}{(y y^{T} W y)_i}, \tag{55}$$

where h_i denotes the discriminant index. If h_i > 1, the ith point is above the trend line; if h_i < 1, the point is beneath the trend line. Note that the trend line represents the conditional mean, and that the potential energy indexes are equal to the local Getis-Ord’s indexes and indicate accessibility.

Table 4. The computational results of spatial autocorrelation for Getis-Ord’s scatterplots (2000 & 2010).

| City | y (2000) | f = y^TyWy (2000) | f* = yy^TWy (2000) | y (2010) | f = y^TyWy (2010) | f* = yy^TWy (2010) |
|---|---|---|---|---|---|---|
| Beijing | 0.096014 | 0.000098 | 0.000125 | 0.109598 | 0.000103 | 0.000147 |
| Changchun | 0.027262 | 0.000065 | 0.000035 | 0.023185 | 0.000065 | 0.000031 |
| Changsha | 0.021463 | 0.000078 | 0.000028 | 0.020274 | 0.000076 | 0.000027 |
| Chengdu | 0.038637 | 0.000052 | 0.000050 | 0.041530 | 0.000053 | 0.000056 |
| Chongqing | 0.057390 | 0.000050 | 0.000075 | 0.061105 | 0.000050 | 0.000082 |
| Fuzhou | 0.020029 | 0.000051 | 0.000026 | 0.018852 | 0.000051 | 0.000025 |
| Guangzhou | 0.069445 | 0.000044 | 0.000090 | 0.065137 | 0.000044 | 0.000088 |
| Guiyang | 0.018497 | 0.000056 | 0.000024 | 0.017128 | 0.000057 | 0.000023 |
| Hangzhou | 0.024784 | 0.000110 | 0.000032 | 0.031087 | 0.000110 | 0.000042 |
| Harbin | 0.034932 | 0.000052 | 0.000045 | 0.032845 | 0.000051 | 0.000044 |
| Hefei | 0.014790 | 0.000088 | 0.000019 | 0.021679 | 0.000089 | 0.000029 |
| Hohhot | 0.010019 | 0.000060 | 0.000013 | 0.010124 | 0.000062 | 0.000014 |
| Jinan | 0.026145 | 0.000094 | 0.000034 | 0.023697 | 0.000098 | 0.000032 |
| Kunming | 0.025059 | 0.000039 | 0.000033 | 0.022152 | 0.000040 | 0.000030 |
| Lanzhou | 0.018354 | 0.000052 | 0.000024 | 0.016780 | 0.000052 | 0.000023 |
| Nanchang | 0.016881 | 0.000084 | 0.000022 | 0.013512 | 0.000084 | 0.000018 |
| Nanjing | 0.034852 | 0.000098 | 0.000045 | 0.039725 | 0.000100 | 0.000053 |
| Nanning | 0.013695 | 0.000045 | 0.000018 | 0.017085 | 0.000045 | 0.000023 |
| Shanghai | 0.128610 | 0.000067 | 0.000167 | 0.124315 | 0.000072 | 0.000167 |
| Shenyang | 0.043929 | 0.000063 | 0.000057 | 0.039929 | 0.000064 | 0.000054 |
| Shijiazhuang | 0.019519 | 0.000113 | 0.000025 | 0.019428 | 0.000117 | 0.000026 |
| Taiyuan | 0.025663 | 0.000085 | 0.000033 | 0.021558 | 0.000088 | 0.000029 |
| Tianjin | 0.053723 | 0.000124 | 0.000070 | 0.062410 | 0.000132 | 0.000084 |
| Urumqi | 0.017468 | 0.000023 | 0.000023 | 0.019647 | 0.000024 | 0.000026 |
| Wuhan | 0.066318 | 0.000070 | 0.000086 | 0.051300 | 0.000072 | 0.000069 |
| Xi'an | 0.036855 | 0.000067 | 0.000048 | 0.034418 | 0.000068 | 0.000046 |
| Xining | 0.008639 | 0.000049 | 0.000011 | 0.008041 | 0.000050 | 0.000011 |
| Yinchuan | 0.005847 | 0.000052 | 0.000008 | 0.007895 | 0.000053 | 0.000011 |
| Zhengzhou | 0.025183 | 0.000092 | 0.000033 | 0.025565 | 0.000093 | 0.000034 |
| Sum | 1.000000 | 0.002020 | 0.001299 | 1.000000 | 0.002059 | 0.001345 |
| Mean | 0.034483 | 0.000070 | 0.000045 | 0.034483 | 0.000071 | 0.000046 |

The sum of the fi* values is equal to the global Getis-Ord’s index.
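
The discriminant index of Eq (55) can be evaluated directly from the f and f* columns of Table 4. The sketch below does so for five cities in 2000; the function names and the tolerance `tol` used to decide whether a point lies on the trend line are hypothetical choices made for this illustration.

```python
# (city, f, f*) for 2000, a few rows copied from Table 4.
rows_2000 = [
    ("Beijing", 0.000098, 0.000125),
    ("Tianjin", 0.000124, 0.000070),
    ("Shanghai", 0.000067, 0.000167),
    ("Zhengzhou", 0.000092, 0.000033),
    ("Guangzhou", 0.000044, 0.000090),
]


def discriminant(f, f_star):
    """Return h = f / f*, the discriminant index of Eq (55)."""
    return f / f_star


def position(h, tol=1e-9):
    """Locate a point relative to the trend line (tol is an illustrative choice)."""
    if h > 1.0 + tol:
        return "above"
    if h < 1.0 - tol:
        return "below"
    return "on"


h_values = {city: discriminant(f, fs) for city, f, fs in rows_2000}
result = {city: position(h) for city, h in h_values.items()}

for city, h in h_values.items():
    print(f"{city}: h = {h:.2f}, {result[city]} the trend line")
```

For these rows, Tianjin and Zhengzhou fall above the trend line, while Beijing, Shanghai, and Guangzhou fall below it.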

Fig 2. The scatterplots of spatial autocorrelation based on Getis-Ord’s measurement for the main cities of China ((A) 2000 & (B) 2010). The trend line is added to the trend points based on the outer product correlation, yy^TWy, and the fit is perfect, R^2 = 1. This implies that the line connecting the points yielded by the linear relation between y and yy^TWy is just the trend line.

Two aspects of the Getis-Ord’s scatterplot need explanation. First, generally speaking, the scattered points do not fall on the trend line. If we fit Eq (34) to the dataset based on the relationship between λWy and y, the slope of the trend line gives the regression coefficient, which represents the expected global Getis-Ord’s index. Second, there is an alternative form of the scatterplot. If we replace the original x-axis variable y with f* = yy^TWy, the pattern of the scattered points does not change. In other words, we can use the relationship between f* and f in place of the relationship between y and f (Fig 3). The relative positions of the scattered points are unaffected by the variable substitution. The difference is that the trend line is superseded by the diagonal line from the lower left corner to the upper right corner (f* = f). The scatterplots show that five or six points are prominent. In 2000, five points are significantly below the trend line, representing Beijing, Chongqing, Guangzhou, Shanghai, and Wuhan; in 2010, six cities are significantly below the trend line, namely Beijing, Chongqing, Guangzhou, Shanghai, Chengdu, and Urumqi. Among these cities, three are municipalities directly under the CCC: Beijing, Chongqing, and Shanghai. Among the four municipalities directly under the CCC, Tianjin is the exception: the point representing Tianjin is significantly above the trend line, indicating the highest potential energy index.

Fig 3. The alternative forms of the scatterplots of spatial autocorrelation based on Getis-Ord’s measurement for the main cities of China ((A) 2000 & (B) 2010). These scatterplots are equivalent to the ones displayed in Fig 4, but the variable y on the horizontal axis is replaced by the new variable f* = yy^TWy. In this case, the original trend line is replaced by a diagonal line.

Fig 4. The normal parameter values and abnormal goodness of fit in the scatterplots of spatial autocorrelation based on Getis-Ord’s indexes for the main cities of China ((A) 2000 & (B) 2010). The trend line is added to the scattered points based on the inner product correlation, λWy, and the intercept is set to 0. The slope of the trend line gives the global Getis-Ord’s index, and the goodness of fit, R^2, is defined by cosine instead of Pearson correlation. The horizontal line represents the absolute average line.

The abovementioned trend line represents the conditional mean, whereas the arithmetic mean represents the absolute mean. The absolute mean forms a horizontal average line. The scatterplot can be divided into four “quadrants” by the conditional mean and the absolute mean. In 2000, the absolute mean (the mean of f in Table 4) is about 0.000070, and in 2010 it is around 0.000071. If we add the average line reflecting the absolute mean to a Getis-Ord’s scatterplot, the 29 main cities of China fall into four sub-regions. The meanings of the four sub-regions are as follows. (I) The first region is the upper right part, representing the high-high (H-H) quadrant. The potential energy index of a city is high, and so are the potential energy indexes of the surrounding cities. The typical city is Beijing, the national capital of China. (II) The second region is the upper left part, representing the high-low (H-L) quadrant. The potential energy index of a city is high, and there are cities with low potential energy indexes around it. The typical cities are Tianjin and Hangzhou. (III) The third region is the lower left part, representing the low-low (L-L) quadrant. The potential energy index of a city is low, and there are cities with low potential energy indexes around it. The typical cities are Kunming and Nanning. (IV) The fourth region is the lower right part, representing the low-high (L-H) quadrant. The potential energy index of a city is low, and there are cities with high potential energy indexes around it. The typical cities are Chongqing and Guangzhou. Of course, high and low potential energy indexes are relative to one another. From 2000 to 2010, only Shanghai, Wuhan, Chengdu, and Urumqi changed their positions. In fact, Chengdu and Urumqi are near the trend line, and their h values are close to 1, which means that their category characteristics are not obvious. Nevertheless, this classification outlines a clear map of the urban locations and spatial correlation of cities in Mainland China (Table 5).

Table 5. Chinese city classification based on conditional mean (trend line) and absolute mean (average line) (2000 & 2010).

| Quadrant | 2000 | 2010 |
|---|---|---|
| I (H-H) | Beijing, Wuhan | Beijing, Shanghai |
| II (H-L) | Tianjin, Shijiazhuang, Hangzhou, Nanjing, Jinan, Zhengzhou, Hefei, Taiyuan, Nanchang, Changsha | Tianjin, Shijiazhuang, Hangzhou, Nanjing, Jinan, Zhengzhou, Hefei, Taiyuan, Nanchang, Changsha, Wuhan |
| III (L-L) | Xi'an, Changchun, Shenyang, Hohhot, Guiyang, Chengdu, Yinchuan, Lanzhou, Harbin, Fuzhou, Xining, Nanning, Kunming, Urumqi | Xi'an, Changchun, Shenyang, Hohhot, Guiyang, Yinchuan, Lanzhou, Harbin, Fuzhou, Xining, Nanning, Kunming |
| IV (L-H) | Shanghai, Chongqing, Guangzhou | Chongqing, Guangzhou, Chengdu, Urumqi |
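
The four-way classification can be sketched in a few lines of Python. The decision rule below is an assumption inferred from the description of the quadrants and from Table 5, not a formula given in the paper: a city counts as "upper" when its f value exceeds the absolute mean of f, and as "right" when it lies below the trend line (f < f*). The input rows and the mean for 2000 are copied from Table 4; the function name is hypothetical.

```python
# Absolute mean of f for 2000, taken from the Mean row of Table 4.
MEAN_F_2000 = 0.000070

# (city, f, f*) for 2000, a few rows copied from Table 4.
rows_2000 = [
    ("Beijing", 0.000098, 0.000125),
    ("Tianjin", 0.000124, 0.000070),
    ("Hangzhou", 0.000110, 0.000032),
    ("Kunming", 0.000039, 0.000033),
    ("Guangzhou", 0.000044, 0.000090),
    ("Shanghai", 0.000067, 0.000167),
]


def quadrant(f, f_star, mean_f):
    """Assign a quadrant type under the assumed rule described above."""
    upper = f > mean_f    # above the horizontal average line
    right = f < f_star    # below, i.e. to the right of, the trend line
    if upper and right:
        return "H-H"      # upper right
    if upper:
        return "H-L"      # upper left
    if not right:
        return "L-L"      # lower left (points exactly on the line land here)
    return "L-H"          # lower right


classes = {city: quadrant(f, fs, MEAN_F_2000) for city, f, fs in rows_2000}

for city, label in classes.items():
    print(city, label)
```

For these six cities the assumed rule gives the same assignments as the 2000 column of Table 5. Cities whose rounded f value equals the rounded mean, or whose f and f* coincide after rounding (for example Wuhan and Urumqi in 2000), cannot be classified reliably from the table values alone.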

The locational properties and the spatial association of the 29 Chinese cities can be evaluated by the potential energy indexes and mutual energy indexes. The local Getis-Ord’s indexes are equivalent to the normalized potential energy indexes, and the sum of the mutual energy indexes equals the global Getis-Ord’s index. The concepts of potential and mutual energy thus help us understand Getis-Ord’s statistics more deeply. Using the local Getis-Ord’s indexes or potential energy indexes of Chinese cities, we can evaluate the traffic accessibility of these cities. The main features are as follows. First, if a city is relatively small but there are big cities near it, then its potential energy index is high. The typical cities are Tianjin, Shijiazhuang, Hangzhou, and Nanjing. Tianjin and Shijiazhuang are adjacent to the megacity Beijing, while Hangzhou and Nanjing are adjacent to the megacity Shanghai. Second, if a city is in the center of the network of cities, then its potential energy index is relatively high to some extent. The typical city is Zhengzhou. The location of Wuhan is also superior, but its size is too large to increase its potential energy index. Third, cities in remote areas bear lower potential energy indexes because they are far from the city network of the Chinese mainland. The typical city is Urumqi in Xinjiang, northwestern China, which has the lowest potential energy index. The second lowest is Kunming in Yunnan, southwestern China. Although Guangzhou is an economically developed city, its potential energy index is also near the bottom because of its location on the southern sea coast. Fourth, from 2000 to 2010, the potential energy indexes of these cities show no significant change. This suggests that the potential energy indexes of the main Chinese cities are very stable (Fig 5). An interesting phenomenon is that because there are no other large cities around Urumqi, it turned into a high-low type of city in 2000.

Fig 5. The potential energy indexes and local Getis-Ord’s indexes of the main cities in Mainland China (2000 & 2010).

The potential energy index depends on the location of a city in an urban network, but it has nothing to do directly with the size of the city itself. So the potential energy indexes, and thus the local Getis-Ord’s indexes, reflect spatial association rather than spatial influence. The mutual energy indexes, which reflect the influence of a city in a network of cities, are a function of city size and the potential energy indexes. As indicated above, the potential energy index reflects a city’s transport accessibility and the superiority of its geographical location in an urban network. Using the mutual energy indexes of the 29 Chinese cities, we can illustrate the absolute positions of these cities in the urban network (Fig 6). The top cities in spatial influence are Beijing, Shanghai, and Tianjin, which are the old municipalities directly under the Central Government of China. From 2000 to 2010, the mutual energy indexes of the three municipalities changed significantly. After the three old municipalities, the cities with higher mutual energy indexes include Nanjing, Hangzhou, Wuhan, and Chongqing, which have superior geographical locations and large city sizes. Cities in marginal areas, such as Xining, Yinchuan, Hohhot, Nanning, Kunming, Lanzhou, Fuzhou, and Guiyang, bear lower mutual energy indexes because of their small sizes and locations away from the center of the urban network. Cities such as Xi’an, Shijiazhuang, Chengdu, and Harbin have intermediate mutual energy indexes owing to an advantage in either city size or geographical location. The mutual energy index of Hefei rose quickly because its population size doubled from 2000 to 2010.

Fig 6. The mutual energy indexes based on census population of the main cities in Mainland China (2000 & 2010).

5 Conclusions

Scientific research involves two elements, that is, description and understanding. Getis-Ord’s indexes are a type of statistical measurement for spatial description, so geographical explanation is not the main aim of this study. As a work of methodological research, this paper is devoted to normalizing, developing, and improving the analytical process and techniques of spatial autocorrelation modeling based on Getis-Ord’s indexes. The chief contributions of this work to geographical spatial analysis lie in four aspects: (1) the computational process is significantly simplified and diversified, (2) the scatterplot is introduced into the analytical process, (3) the parameter characters of the global and local Getis-Ord’s indexes are illustrated, and (4) the relationship between Getis-Ord’s index and potential energy is revealed. If the spatial contiguity matrix is generated using a power-law decay function, the local Getis-Ord’s indexes prove to be equivalent to potential energy measurements. Based on these results and findings, we can reach the main conclusions as follows. First, the prerequisite for the effective use of Getis-Ord’s indexes is that the spatial distributions and size distributions possess characteristic scales. The global Getis-Ord’s index, which is a weighted sum of the local indexes, is an eigenvalue of the spatial correlation matrix, and the local indexes form an eigenvector of the outer product matrix of the unitized size vector. This suggests that the global index is a characteristic length of spatial correlation. For scale-free geographical processes and patterns, the Getis-Ord’s index is no longer valid. What is more, the unitization of the size variable depends on the average value, which represents the characteristic length of statistical analysis. This implies that we need new measurements for scale-free spatial autocorrelation. Second, spatial autocorrelation and spatial interaction can be integrated into one analytical framework. The Getis-Ord’s indexes are measurements of spatial autocorrelation, while the potential energy indexes are measurements based on spatial interaction. However, the two kinds of measurements are equivalent to one another if the distance decay function is an inverse power law. By unitizing the size vector and rescaling the spatial distances, we can obtain Getis-Ord’s indexes by calculating potential energy indexes. This indicates that we can unify spatial autocorrelation and spatial interaction to a degree by means of spatial correlation functions. Third, the spatial analytical processes based on Getis-Ord’s indexes can be visualized by the normalized scatterplot. The scatterplot, which is similar to Moran’s plot, can be employed to make both spatial autocorrelation and spatial interaction analyses in the new framework, and it provides a visual pattern for spatial modeling results. Using the scattered points indicating observed values, the trend line indicating predicted values, and the average line indicating the absolute mean of the local potential energy indexes, we can make a simple spatial clustering of the geographical elements in a study area. In practice, different researchers may obtain different types of geographical information from the scatterplots and the related clustering results.

Supporting information

S1 Data. Datasets of urban population and railway distances for calculating Getis-Ord’s indexes in 2000.

This file contains the original or preliminarily processed data of 2000 used in this paper. It provides four complete processes of computing the Getis-Ord’s indexes based on power-law decay and exponential decay, respectively.

(XLSX)

S2 Data. Datasets of urban population and railway distances for calculating Getis-Ord’s indexes in 2010.

This file contains the original or preliminarily processed data of 2010 used in this paper.

(XLSX)

S1 File. A simple approach to calculating Getis-Ord’s indexes using MS Excel.

It illustrates how to use new methods to calculate Getis-Ord’s indexes step by step using MS Excel.

(PDF)

S1 Code. Two Matlab programs for spatial potential energy analysis based on Getis-Ord indexes.

It provides two Matlab programs for calculating Getis-Ord’s indexes: one is based on the datasets of 2000, and the other is based on the datasets of 2010. Readers can employ the programs to carry out spatial potential energy analysis by substituting their own data for the author’s data.

(M)

Acknowledgments

I would like to thank the two anonymous reviewers whose constructive comments were helpful in improving the quality of this paper.

Data Availability

Supporting Information files.

Funding Statement

This research was sponsored by the National Natural Science Foundation of China (Grant No. 41671167; see http://isisn.nsfc.gov.cn/egrantweb/).


Associated Data


Supplementary Materials

S1 Data. Datasets of urban population and railway distances for calculating Getis-Ord’s indexes in 2000.

This file contains the original or preliminarily processed data for 2000 used in this paper. It provides four complete calculation procedures for Getis-Ord’s indexes, based on power-law decay and exponential decay.

(XLSX)

S2 Data. Datasets of urban population and railway distances for calculating Getis-Ord’s indexes in 2010.

This file contains the original or preliminarily processed data for 2010 used in this paper.

(XLSX)

S1 File. A simple approach to calculating Getis-Ord’s indexes using MS Excel.

This file illustrates, step by step, how to calculate Getis-Ord’s indexes with the new methods in MS Excel.

(PDF)

S1 Code. Two MATLAB programs for spatial potential energy analysis based on Getis-Ord’s indexes.

This file provides two MATLAB programs for calculating Getis-Ord’s indexes: one is based on the datasets of 2000, and the other on the datasets of 2010. Readers can use the programs to carry out spatial potential energy analysis by substituting their own data for the author’s data.

(M)

Data Availability Statement

The data are available in the Supporting Information files.


Articles from PLoS ONE are provided here courtesy of PLOS
