PLOS ONE. 2014 Apr 23;9(4):e94985. doi: 10.1371/journal.pone.0094985

Interrelations of Graph Distance Measures Based on Topological Indices

Matthias Dehmer 1,2,*, Frank Emmert-Streib 3,*, Yongtang Shi 4,5,*
Editor: Matjaz Perc 6
PMCID: PMC3997355  PMID: 24759679

Abstract

In this paper, we derive interrelations of graph distance measures by means of inequalities. For this investigation we are using graph distance measures based on topological indices that have not been studied in this context. Specifically, we are using the well-known Wiener index, Randić index, eigenvalue-based quantities and graph entropies. In addition to this analysis, we present results from numerical studies exploring various properties of the measures and aspects of their quality. Our results could find application in chemoinformatics and computational biology where the structural investigation of chemical components and gene networks is currently of great interest.

Introduction

Methods to determine the structural similarity or distance between graphs have been applied in many areas of science, for example in mathematics [1], [2], [3], in biology [4], [5], [6], in chemistry [7], [8] and in chemoinformatics [9]. Other application-oriented areas where graph comparison techniques have been employed can be found in [10], [11], [12]. Note that the terms ‘graph similarity’ and ‘graph distance’ are not uniquely defined and depend strongly on the underlying concept. The two main concepts which have been explored extensively are exact and inexact graph matching, see [13], [3]. Exact graph matching [2], [3] compares graphs based on isomorphism relations. An important example is the so-called Zelinka distance [3], which requires computing the maximum common subgraphs of two graphs with the same number of vertices. However, this technique is computationally demanding, as the subgraph isomorphism problem is NP-complete [14]. In contrast, inexact or approximate techniques match graphs in an error-tolerant way, see [13]. A highlight of this development has been the well-known graph edit distance (GED) due to Bunke [15]. String-based techniques also fit into the scheme of approximate graph comparison [1], [16]. This approach aims to derive string representations which capture structural information of the underlying networks. By using string alignment techniques, one can compute similarity scores of the derived strings instead of matching the graphs with classical techniques. Concrete examples can be found in [1], [16].

As mentioned, numerous graph similarity and distance measures have been explored. However, a mathematical framework for exploring interrelations of these measures is still lacking. Let Inline graphic and Inline graphic be two comparative graph measures (i.e., graph similarity or distance measures) which are defined on the graph class Inline graphic. A typical problem in this area is to prove interrelations of the measures by means of inequalities such as Inline graphic. For instance, inequalities involving graph complexity measures have been inferred by Dehmer et al. [17], [18].

The main contribution of this paper is to infer interrelations of graph distance measures. To the best of our knowledge, this problem has not been tackled so far for graph distance measures. However, interrelations of topological indices interpreted as complexity measures have been studied, see [7], [19], [20], [17], [18]. For instance, Bonchev and his co-workers investigated interrelations of branching measures by means of inequalities [7], [19], [20]. Dehmer [17] examined relations between information-theoretic measures which are based on information functionals, and between classical and parametric graph entropies [18]. Here we put the emphasis on graph distance measures which are based on so-called topological indices. These measures themselves have not yet been studied. Note that we only consider distance measures (without loss of generality), as they can be easily transformed into graph similarity measures [21]. To define these measures concretely, we employ an existing distance measure (see Eq. (6)) and the well-known Randić index [22], the Wiener index [23], eigenvalue-based measures [24], and graph entropies [17], [25]. We also discuss quality aspects of the measures and state conjectures supported by numerical results.

Methods and Results

Topological Indices and Preliminaries

In this section, we introduce the topological indices which are used in the paper. A topological index [23] is a graph invariant, defined by

graphic file with name pone.0094985.e005.jpg (1)

Simple invariants are for instance the number of vertices, the number of edges, vertex degrees, degree sequences, the matching number, the chromatic number and so forth, see [26].

We emphasize that topological indices are graph invariants which characterize the topology of a graph. They have been used extensively for examining quantitative structure-activity relationships (QSARs), in which the biological activity or other properties of molecules are correlated with their chemical structures [27]. Topological graph measures have also been applied in ecology [28], biology [29] and network physics [30], [31]. Note that various properties of topological graph measures, such as their uniqueness and correlation ability, have also been examined [32], [33].

Suppose $G = (V, E)$ is a connected graph. The distance between the vertices $u$ and $v$ of $G$ is denoted by $d(u,v)$. The Wiener index of $G$ is denoted by $W(G)$ and defined by

$$W(G) = \sum_{\{u,v\} \subseteq V} d(u,v). \qquad (2)$$

The name Wiener index or Wiener number for this quantity is common in the chemical literature, since Wiener [34] seems to have been the first to consider it, in 1947. For more results on the Wiener index of trees, we refer to [35].
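As a concrete illustration of definition (2), the following minimal Python sketch computes the Wiener index by breadth-first search. The adjacency-dictionary representation and the helper names (`bfs_distances`, `wiener_index`, `path_graph`, `star_graph`) are illustrative assumptions and are not taken from the paper.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path distances from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def wiener_index(adj):
    """Sum of d(u, v) over all unordered vertex pairs of a connected graph."""
    total = 0
    for u in adj:
        dist = bfs_distances(adj, u)
        if len(dist) != len(adj):
            raise ValueError("graph must be connected")
        total += sum(dist.values())
    return total // 2  # every unordered pair was counted twice


def path_graph(n):
    """Path on vertices 0..n-1 (hypothetical helper)."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def star_graph(n):
    """Star with center 0 and leaves 1..n-1 (hypothetical helper)."""
    adj = {0: list(range(1, n))}
    adj.update({i: [0] for i in range(1, n)})
    return adj
```

For trees with 8 vertices this gives 49 for the star and 84 for the path, which are the smallest and largest Wiener indices among trees of that order.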

In 1975, Randić [36] proposed the topological index $R$ ($R_{-1}$ and $R_{-1/2}$) under the name branching index or connectivity index, suitable for measuring the extent of branching of the carbon-atom skeleton of saturated hydrocarbons. Nowadays this index is also called the Randić index. In 1998, Bollobás and Erdős [37] generalized this index by replacing $-1/2$ by any real number $\alpha$, which yields the general Randić index. The Randić index and the general Randić index have become the most popular and most frequently employed structure descriptors in structural chemistry [38]. For a graph $G = (V, E)$, the Randić index $R(G)$ of $G$ is defined as the sum of $1/\sqrt{d(u)d(v)}$ over all edges $uv$ of $G$, i.e.,

$$R(G) = \sum_{uv \in E} \frac{1}{\sqrt{d(u)\,d(v)}}, \qquad (3)$$

where $d(u)$ is the degree of a vertex $u$ of $G$. The zeroth-order Randić index due to Kier and Hall [6] is

$${}^{0}R(G) = \sum_{u \in V} \frac{1}{\sqrt{d(u)}}. \qquad (4)$$

For more results on the Randić index and the zeroth-order Randić index, we refer to [39], [22], [38].
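The two degree-based indices in Eqs. (3) and (4) can be sketched in a few lines of Python. The function names and the three small example graphs below are illustrative assumptions. Vertex labels are assumed to be integers so that each undirected edge is visited once.

```python
import math


def randic_index(adj):
    """Sum of 1/sqrt(d(u) d(v)) over all edges uv (Eq. (3))."""
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    total = 0.0
    for u, nbrs in adj.items():
        for v in nbrs:
            if u < v:  # visit each undirected edge exactly once
                total += 1.0 / math.sqrt(deg[u] * deg[v])
    return total


def zeroth_order_randic_index(adj):
    """Sum of 1/sqrt(d(u)) over all vertices u (Eq. (4)); requires d(u) >= 1."""
    return sum(1.0 / math.sqrt(len(nbrs)) for nbrs in adj.values())


# Small example graphs on 4 vertices.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star4 = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}

r_path = randic_index(path4)    # equals sqrt(2) + 1/2 up to rounding
r_star = randic_index(star4)    # equals sqrt(3) up to rounding
r_cycle = randic_index(cycle4)  # equals 2, i.e. n/2 for a regular graph
```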

For a given graph $G$ with $n$ vertices, let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of $G$ (that is, of its adjacency matrix). The energy of a graph $G$, denoted by $E(G)$, has been defined by

$$E(G) = \sum_{i=1}^{n} |\lambda_i| \qquad (5)$$

due to Gutman in 1977 [40]. For more results on the graph energy, we refer to [41], [24], [42].
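To make the energy in Eq. (5) tangible without a numerical eigenvalue solver, the sketch below uses the well-known closed-form adjacency spectra of the path and the star. The function names are illustrative assumptions. A general graph would need an eigenvalue routine, which is deliberately left out here.

```python
import math


def path_spectrum(n):
    """Adjacency eigenvalues of the path P_n: 2 cos(k pi / (n + 1)), k = 1..n."""
    return [2.0 * math.cos(k * math.pi / (n + 1)) for k in range(1, n + 1)]


def star_spectrum(n):
    """Adjacency eigenvalues of the star S_n (n >= 2): +sqrt(n-1), -sqrt(n-1), and 0 (n-2 times)."""
    r = math.sqrt(n - 1)
    return [r, -r] + [0.0] * (n - 2)


def energy(eigenvalues):
    """Graph energy: sum of the absolute values of the eigenvalues (Eq. (5))."""
    return sum(abs(x) for x in eigenvalues)


e_star8 = energy(star_spectrum(8))  # 2 * sqrt(7)
e_path8 = energy(path_spectrum(8))  # larger than the star's energy
```

For 8 vertices the star has the smaller energy and the path the larger one, matching the roles these two trees play as extremal graphs in Theorem 1.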

Novel Graph Distance Measures

Now we define the distance measure [21]

graphic file with name pone.0094985.e037.jpg (6)

which is a mapping Inline graphic. Obviously it holds Inline graphic, Inline graphic, and Inline graphic. In order to translate this concept to graphs, we employ topological indices and obtain

graphic file with name pone.0094985.e042.jpg (7)

Further we infer a relation between the maximum value of Inline graphic and the extremal values of Inline graphic.

Observation 1

Let Inline graphic be a class of graphs. Suppose Inline graphic, then Inline graphic are the two graphs attaining the maximum value of Inline graphic if and only if Inline graphic are the graphs attaining the maximum and minimum value of Inline graphic, respectively.

Proof. Let Inline graphic, then Inline graphic is a monotone increasing function on Inline graphic. Therefore, the maximum value of Inline graphic is attained if and only if the maximum value of Inline graphic is attained. Inline graphic
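Observation 1 can be checked numerically on a small example. The equation images for (6) and (7) are not preserved in this copy, so the sketch below assumes the Gaussian-type form $d(x,y) = 1 - e^{-((x-y)/\sigma)^2}$ with a free parameter $\sigma$. This form, the function names, and the label "spur" for the third tree on 5 vertices are all assumptions made for illustration.

```python
import math


def d(x, y, sigma=1.0):
    """Assumed form of Eq. (6): 1 - exp(-((x - y) / sigma)^2)."""
    return 1.0 - math.exp(-((x - y) / sigma) ** 2)


def most_distant_pair(index_values, sigma=1.0):
    """Return the pair of graph names whose index values are farthest apart under d."""
    names = list(index_values)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    return max(pairs, key=lambda ab: d(index_values[ab[0]], index_values[ab[1]], sigma))


# Wiener indices of the three non-isomorphic trees on 5 vertices.
wiener = {"star": 16, "spur": 18, "path": 20}
best = most_distant_pair(wiener, sigma=4.0)  # the star/path pair
```

Because the assumed $d$ grows with $|x - y|$, the maximizing pair is the one with the smallest and the largest index value, here the star and the path.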

From Observation 1 and some existing extremal results of topological indices, we obtain some sharp upper bounds of Inline graphic for some classes of graphs. As an example, we list some of those results for trees.

Theorem 1

Let Inline graphic and Inline graphic be two trees with Inline graphic vertices. Denote by Inline graphic and Inline graphic the star graph and path graph with Inline graphic vertices, respectively.

Inline graphic. The maximum value of Inline graphic is attained when Inline graphic and Inline graphic are Inline graphic and Inline graphic, respectively.

Inline graphic. The maximum value of Inline graphic is attained when Inline graphic and Inline graphic are Inline graphic and Inline graphic, respectively.

Inline graphic. The maximum value of Inline graphic is attained when Inline graphic and Inline graphic are Inline graphic and Inline graphic, respectively.

Inline graphic. The maximum value of Inline graphic is attained when Inline graphic and Inline graphic are Inline graphic and Inline graphic, respectively.

Interrelations of Graph Distance Measures

Observe that Inline graphic, which implies that Inline graphic. Some trivial properties of Inline graphic are as follows. Let Inline graphic be a class of graphs and Inline graphic. We get

graphic file with name pone.0094985.e093.jpg (8)
graphic file with name pone.0094985.e094.jpg (9)
graphic file with name pone.0094985.e095.jpg (10)

However, Inline graphic is not a metric graph distance measure, since the triangle inequality Inline graphic for Inline graphic, does not hold generally. Actually, we obtain a modified version of the triangle inequality.
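A small numerical counterexample illustrates why the triangle inequality can fail. As before, the form $d(x,y) = 1 - e^{-((x-y)/\sigma)^2}$ is an assumption, since Eq. (6) is not preserved in this copy. The value $\sigma = 20$ and the variable names are chosen purely for illustration. The index values are the Wiener indices of the three trees on 5 vertices.

```python
import math


def d(x, y, sigma):
    """Assumed form of Eq. (6): 1 - exp(-((x - y) / sigma)^2)."""
    return 1.0 - math.exp(-((x - y) / sigma) ** 2)


sigma = 20.0
w_star, w_spur, w_path = 16, 18, 20  # Wiener indices of the trees on 5 vertices

direct = d(w_star, w_path, sigma)                             # 1 - exp(-0.04)
detour = d(w_star, w_spur, sigma) + d(w_spur, w_path, sigma)  # 2 * (1 - exp(-0.01))
triangle_fails = direct > detour
```

For small arguments the assumed measure behaves like a squared difference, and squared differences do not satisfy the triangle inequality. With $\sigma = 1$ the same three values do satisfy it, so whether it holds depends on the index values and on $\sigma$.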

Theorem 2

Let Inline graphic be a topological index. Let Inline graphic be a class of graphs and Inline graphic. If

graphic file with name pone.0094985.e102.jpg (11)

then we have Inline graphic.

Proof. We now suppose Inline graphic, since the proof of the other case is similar.

From the inequality Inline graphic, we get

graphic file with name pone.0094985.e106.jpg (12)

Since Inline graphic, together with Eq. (12), we have

graphic file with name pone.0094985.e108.jpg (13)

Therefore, we have the following inequality,

graphic file with name pone.0094985.e109.jpg (14)

i.e., Inline graphic. Inline graphic

We emphasize that the modified triangle inequality holds whenever the inequalities (11) are satisfied. In practice, the triangle inequality may not be strictly necessary (e.g., for clustering and classification problems), but it is often required to prove properties of the measures.

Theorem 3

Let Inline graphic and Inline graphic be two topological indices. Let Inline graphic be a class of graphs and Inline graphic. If

graphic file with name pone.0094985.e116.jpg (15)

then

graphic file with name pone.0094985.e117.jpg (16)

where Inline graphic is a constant.

Proof. Since

graphic file with name pone.0094985.e119.jpg (17)

we obtain

graphic file with name pone.0094985.e120.jpg (18)

Thus

graphic file with name pone.0094985.e121.jpg (19)

i.e.,

graphic file with name pone.0094985.e122.jpg (20)

Thus,

graphic file with name pone.0094985.e123.jpg (21)

The proof is complete. Inline graphic

Suppose Inline graphic is also a topological index. Then if

graphic file with name pone.0094985.e126.jpg (22)

we derive similarly

graphic file with name pone.0094985.e127.jpg (23)

where Inline graphic is a constant. Therefore, we obtain the following theorem.

Theorem 4

Let Inline graphic and Inline graphic be three topological indices. Let Inline graphic be a class of graphs and Inline graphic. If

graphic file with name pone.0094985.e133.jpg (24)

then we infer

graphic file with name pone.0094985.e134.jpg (25)

where Inline graphic are constants.

Theorem 5

Let Inline graphic and Inline graphic be two topological indices. Let Inline graphic be a class of graphs and Inline graphic. If

graphic file with name pone.0094985.e140.jpg (26)

then we get

graphic file with name pone.0094985.e141.jpg (27)

where Inline graphic is a constant.

Proof. Since

graphic file with name pone.0094985.e143.jpg (28)

we infer

graphic file with name pone.0094985.e144.jpg (29)

And therefore,

graphic file with name pone.0094985.e145.jpg (30)
graphic file with name pone.0094985.e146.jpg (31)
graphic file with name pone.0094985.e147.jpg (32)

Hence,

graphic file with name pone.0094985.e148.jpg (33)

From the definition of Inline graphic, i.e.,

graphic file with name pone.0094985.e150.jpg (34)

we obtain that

graphic file with name pone.0094985.e151.jpg (35)

Finally, by substituting (35) into (33), we get the desired result. Inline graphic

Suppose Inline graphic is also a topological index. Then if

graphic file with name pone.0094985.e154.jpg (36)

we have

graphic file with name pone.0094985.e155.jpg (37)

where Inline graphic is a constant. Therefore, we obtain the following theorem.

Theorem 6

Let Inline graphic and Inline graphic be three topological indices. Let Inline graphic be a class of graphs and Inline graphic. If

graphic file with name pone.0094985.e161.jpg (38)

then we have

graphic file with name pone.0094985.e162.jpg (39)

and

graphic file with name pone.0094985.e163.jpg (40)

where Inline graphic are constants.

Theorem 7

Let Inline graphic and Inline graphic be three topological indices. Let Inline graphic be a class of graphs and Inline graphic. If

graphic file with name pone.0094985.e169.jpg (41)

then we infer

graphic file with name pone.0094985.e170.jpg (42)

Proof. Since

graphic file with name pone.0094985.e171.jpg (43)

we derive

graphic file with name pone.0094985.e172.jpg (44)

And therefore,

graphic file with name pone.0094985.e173.jpg (45)

i.e., Inline graphic. Hence we obtain

graphic file with name pone.0094985.e175.jpg (46)

which implies that

graphic file with name pone.0094985.e176.jpg (47)

By substituting (35) into (47), we easily obtain the assertion of the theorem. Inline graphic

By a proof similar to that of Theorem 7, we obtain a more general result.

Theorem 8

Let Inline graphic be topological indices. Let Inline graphic be a class of graphs and Inline graphic. If

graphic file with name pone.0094985.e181.jpg (48)

we infer

graphic file with name pone.0094985.e182.jpg (49)

Theorem 9

Let Inline graphic and Inline graphic be three topological indices. Let Inline graphic be a class of graphs and Inline graphic. If

graphic file with name pone.0094985.e187.jpg (50)

where Inline graphic, then we get

graphic file with name pone.0094985.e189.jpg (51)

Proof. Since

graphic file with name pone.0094985.e190.jpg (52)

we derive

graphic file with name pone.0094985.e192.jpg (53)

Therefore,

graphic file with name pone.0094985.e193.jpg (54)
graphic file with name pone.0094985.e194.jpg (55)

which implies

graphic file with name pone.0094985.e195.jpg (56)

By applying the substitutions

graphic file with name pone.0094985.e196.jpg (57)

and

graphic file with name pone.0094985.e197.jpg (58)

into (56), we obtain the final result. Inline graphic

By a proof similar to that of Theorem 9, we again obtain a more general result.

Theorem 10

Let Inline graphic be topological indices. Let Inline graphic be a class of graphs and Inline graphic. If

graphic file with name pone.0094985.e202.jpg (59)

where Inline graphic for Inline graphic, then we infer

graphic file with name pone.0094985.e205.jpg (60)

Graph Distance Measures Based on Randić Index

In this section, we consider the values of the graph distance measure based on the Randić index and other topological indices for some classes of graphs. Denote by Inline graphic and Inline graphic the Wiener index and Randić index, respectively.

Theorem 11

Let $\mathcal{G}$ be a class of regular graphs with $n$ vertices and let $I$ be an arbitrary topological index. For two graphs $G, H \in \mathcal{G}$, we infer

$$d_R(G,H) \le d_I(G,H). \qquad (61)$$

Proof. Let $G$ and $H$ be two regular graphs of order $n$. By the definition of the Randić index, we obtain that $R(G) = R(H) = n/2$, which implies that $R(G) - R(H) = 0$. Therefore, we infer $d_R(G,H) = 0$. Since $d_I(G,H) \ge 0$ for any topological index $I$, we obtain the desired inequality.
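The fact used in this proof, that every regular graph of order $n$ has Randić index $n/2$, follows directly from definition (3). The short derivation below assumes a $k$-regular graph with $k \ge 1$, where the symbol $k$ is introduced here only for illustration:

```latex
% k-regular graph G on n vertices: every vertex has degree k, and G has nk/2 edges.
R(G) = \sum_{uv \in E} \frac{1}{\sqrt{d(u)\,d(v)}}
     = \sum_{uv \in E} \frac{1}{\sqrt{k \cdot k}}
     = \frac{nk}{2} \cdot \frac{1}{k}
     = \frac{n}{2}.
```

The value does not depend on $k$, so any two regular graphs of the same order have the same Randić index.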

By using the definition of the zeroth-order Randić index for two graphs with the same degree sequences, we obtain that Inline graphic. Therefore, we get the following theorem.

Theorem 12

Let Inline graphic be a class of graphs with the same degree sequence and let Inline graphic be an arbitrary topological index. Then for two graphs Inline graphic, we infer

graphic file with name pone.0094985.e224.jpg (62)

For a given graph Inline graphic of order Inline graphic, we get Inline graphic (see [39]). Thus,

graphic file with name pone.0094985.e228.jpg (63)

From (63), we infer an upper bound for Inline graphic.

Theorem 13

Let Inline graphic and Inline graphic be two connected graphs of order Inline graphic. Then we get

graphic file with name pone.0094985.e233.jpg (64)

The equality holds if and only if Inline graphic and Inline graphic are Inline graphic and a regular graph, respectively.

A path Inline graphic is pendent if Inline graphic, Inline graphic and Inline graphic for all Inline graphic. Especially, a vertex Inline graphic is pendent if Inline graphic. Suppose Inline graphic and Inline graphic are two pendent vertices, and Inline graphic the unique neighbor of Inline graphic. We define an operation as follows: deleting the edge Inline graphic and adding the edge Inline graphic. We call this operation “transfer Inline graphic to Inline graphic”.
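The transfer operation can be sketched as a small graph rewrite. The symbols in the definition above are not preserved in this copy, so the sketch assumes the reading: $u$ and $v$ are pendent vertices, $w$ is the unique neighbor of $u$, the edge $uw$ is deleted, and the edge $uv$ is added. The function name `transfer` and the adjacency-set representation are illustrative assumptions.

```python
def transfer(adj, u, v):
    """Return a copy of `adj` in which the pendent vertex u is re-attached to the pendent vertex v.

    `adj` maps each vertex to the set of its neighbors. The input is not modified.
    """
    if u == v:
        raise ValueError("u and v must be distinct")
    if len(adj[u]) != 1 or len(adj[v]) != 1:
        raise ValueError("both u and v must be pendent (degree 1)")
    (w,) = adj[u]  # the unique neighbor of u
    if w == v:
        raise ValueError("u and v are already adjacent")
    new_adj = {x: set(nbrs) for x, nbrs in adj.items()}
    new_adj[u].remove(w)  # delete the edge uw
    new_adj[w].remove(u)
    new_adj[u].add(v)     # add the edge uv
    new_adj[v].add(u)
    return new_adj


# Star with center 0: transferring leaf 1 to leaf 2 yields the path 1-2-0-3.
star4 = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
path_like = transfer(star4, 1, 2)
```

The number of vertices and edges is unchanged by the operation. Only the attachment point of $u$ moves, which is why repeated transfers can turn one tree into another of the same order.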

Theorem 14

Let Inline graphic be a graph with Inline graphic vertices. Denote by Inline graphic and Inline graphic the two pendent paths attaching to the same vertex such that Inline graphic. Denote by Inline graphic the graph obtained by transferring the pendent vertex of Inline graphic to the pendent vertex of Inline graphic. Then we have

graphic file with name pone.0094985.e260.jpg (65)

Proof. Let Inline graphic be a graph with Inline graphic vertices. Suppose Inline graphic and Inline graphic with Inline graphic. Since Inline graphic and Inline graphic are two pendent paths attaching to the same vertex, then we get

graphic file with name pone.0094985.e268.jpg (66)

By using the definition of Inline graphic, we infer Inline graphic. By using the definition of Inline graphic, we only need to show

graphic file with name pone.0094985.e272.jpg (67)

Observe that Inline graphic. We will discuss the difference of the distances between two vertices in Inline graphic and Inline graphic. Let Inline graphic and Inline graphic be two vertices of Inline graphic. If Inline graphic, then we have Inline graphic. Now we suppose Inline graphic. If Inline graphic, then

graphic file with name pone.0094985.e283.jpg (68)

Observe that

graphic file with name pone.0094985.e284.jpg (69)

Therefore, we have

graphic file with name pone.0094985.e285.jpg (70)

i.e.,

graphic file with name pone.0094985.e286.jpg (71)

For Inline graphic, it is easy to verify Inline graphic. Therefore Inline graphic holds.

For Inline graphic, from (66), we have Inline graphic and Inline graphic. By performing some elementary calculations, we get

graphic file with name pone.0094985.e293.jpg (72)

i.e.,

graphic file with name pone.0094985.e294.jpg (73)

for Inline graphic and each value of Inline graphic. Therefore, from (63), we infer Inline graphic.

For Inline graphic, from (66), we have Inline graphic and Inline graphic. By performing some elementary calculations, we obtain

graphic file with name pone.0094985.e301.jpg (74)

i.e.,

graphic file with name pone.0094985.e302.jpg (75)

for Inline graphic and each value of Inline graphic. Therefore, from (63), we infer Inline graphic. The proof is complete. Inline graphic

This theorem can be used to compare the values of the distance measure by using trees. Let Inline graphic be the set of trees with Inline graphic vertices and

graphic file with name pone.0094985.e309.jpg (76)

Observe that for every Inline graphic, there must be a tree Inline graphic such that Inline graphic can be obtained from Inline graphic by repeatedly transferring pendent vertices. Therefore, we obtain the following corollary.

Corollary 1

For every Inline graphic, there exists a tree Inline graphic such that Inline graphic.

Actually, numerical experiments show that for any two trees Inline graphic, the inequality Inline graphic holds. We state the result as a conjecture.

Conjecture 1

Let Inline graphic and Inline graphic be any two trees with Inline graphic vertices. Then

graphic file with name pone.0094985.e322.jpg (77)

holds.

As an example, we consider (all) Inline graphic trees with 8 vertices and calculate all possible values of Inline graphic (blue) and Inline graphic (red) as shown in Figure 1. From Figure 1, we observe that Inline graphic holds for each pair of trees Inline graphic and Inline graphic.

Figure 1. All the values of Inline graphic (blue) and Inline graphic (red). The Y-axis denotes the values of the distance measure and the X-axis denotes the graph pairs.

Graph Distance Measures Based on Graph Entropy

In this section, we consider graph distance measures which are based on graph entropy and other topological indices for some classes of graphs.

To begin, we recall the definition of Shannon's entropy [43]. Let $p = (p_1, p_2, \ldots, p_n)$ be a probability vector, namely, $0 \le p_i \le 1$ and $\sum_{i=1}^{n} p_i = 1$. The Shannon entropy of $p$ is defined by

$$I(p) = -\sum_{i=1}^{n} p_i \log p_i. \qquad (78)$$

We denote by Inline graphic the graph distance measure based on Inline graphic.
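A minimal Python sketch of the Shannon entropy in Eq. (78) follows. The base of the logarithm is not visible in this copy, so it is exposed as a parameter with base 2 as an assumed default. The function name and the usual convention $0 \log 0 = 0$ are likewise assumptions.

```python
import math


def shannon_entropy(p, base=2.0):
    """Shannon entropy -sum p_i log p_i of a probability vector, with 0 log 0 := 0."""
    if any(x < 0 for x in p) or abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("p must be a probability vector")
    return -sum(x * math.log(x, base) for x in p if x > 0)


h_uniform = shannon_entropy([0.25, 0.25, 0.25, 0.25])  # log2(4) = 2 bits
h_skewed = shannon_entropy([0.5, 0.5, 0.0])            # 1 bit
```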

In the following, we infer an upper bound for Inline graphic.

Theorem 15

Let Inline graphic and Inline graphic be two graphs with the same vertex set. Denote by Inline graphic and Inline graphic the probability vectors of Inline graphic and Inline graphic, respectively. If Inline graphic for each Inline graphic, then we infer

graphic file with name pone.0094985.e347.jpg (79)

where Inline graphic.

Proof. Since Inline graphic for each Inline graphic, then we obtain Inline graphic and Inline graphic. Then we have

graphic file with name pone.0094985.e353.jpg (80)
graphic file with name pone.0094985.e354.jpg (81)
graphic file with name pone.0094985.e355.jpg (82)
graphic file with name pone.0094985.e356.jpg (83)

Therefore, we get the inequality,

graphic file with name pone.0094985.e357.jpg (84)

i.e., Inline graphic. Hence,

graphic file with name pone.0094985.e359.jpg (85)

The desired inequality holds. Inline graphic

In [25], Dehmer and Mowshowitz generalized the definition of graph entropy by using information functionals. Let $G = (V, E)$ be a connected graph. For a vertex $v_i \in V$, we define

$$p(v_i) := \frac{f(v_i)}{\sum_{j=1}^{|V|} f(v_j)}, \qquad (86)$$

where $f$ represents an arbitrary information functional. By substituting $p(v_i)$ into (78), we have

$$I_f(G) := -\sum_{i=1}^{|V|} \frac{f(v_i)}{\sum_{j=1}^{|V|} f(v_j)} \log \frac{f(v_i)}{\sum_{j=1}^{|V|} f(v_j)}. \qquad (87)$$
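The construction in (86) and (87) amounts to normalizing the functional values into a probability vector and taking its Shannon entropy. The sketch below assumes the usual normalization $p(v_i) = f(v_i) / \sum_j f(v_j)$ from [25], a base-2 logarithm, and nonnegative functional values. The function name and the sample values are illustrative.

```python
import math


def functional_entropy(values, base=2.0):
    """Entropy of the distribution obtained by normalizing functional values f(v_i)."""
    total = sum(values)
    if total <= 0 or any(f < 0 for f in values):
        raise ValueError("functional values must be nonnegative with a positive sum")
    probs = [f / total for f in values]
    return -sum(p * math.log(p, base) for p in probs if p > 0)


h_equal = functional_entropy([1, 1, 1, 1])  # uniform distribution: log2(4) = 2
h_mixed = functional_entropy([2, 1, 1])     # distribution (1/2, 1/4, 1/4): 1.5
```

Whenever the functional assigns the same value to every vertex, the entropy attains its maximum $\log |V|$, regardless of what that common value is.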

We denote by Inline graphic the graph distance measure based on Inline graphic.

Relations between Inline graphic and Inline graphic

Denote by Inline graphic the eigenvalues of a graph Inline graphic. By setting Inline graphic in (87), we obtain a new expression of the graph entropy namely

graphic file with name pone.0094985.e374.jpg (88)

Recall that the energy of Inline graphic is defined as Inline graphic. Then we infer

graphic file with name pone.0094985.e377.jpg (89)
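The link between the eigenvalue-based entropy and the energy is a one-line expansion of the logarithm. The derivation below assumes that the information functional is $f(v_i) = |\lambda_i|$, writes $E = E(G) = \sum_i |\lambda_i|$, and uses the convention $0 \log 0 = 0$. Since the original equations are not preserved in this copy, the notation $I_\lambda$ is an assumption:

```latex
I_\lambda(G)
  = -\sum_{i=1}^{n} \frac{|\lambda_i|}{E} \log \frac{|\lambda_i|}{E}
  = -\frac{1}{E} \sum_{i=1}^{n} |\lambda_i| \bigl( \log |\lambda_i| - \log E \bigr)
  = \log E - \frac{1}{E} \sum_{i=1}^{n} |\lambda_i| \log |\lambda_i| .
```

As a sanity check, the star $S_n$ has nonzero eigenvalues $\pm\sqrt{n-1}$ only, so its normalized spectrum is $(1/2, 1/2, 0, \ldots, 0)$ and the entropy equals $\log 2$ for every $n \ge 2$.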

From the definition of Inline graphic, it is interesting to investigate the relation between the graph distance measures Inline graphic and Inline graphic.

Theorem 16

Let Inline graphic and Inline graphic be two graphs of order Inline graphic with Inline graphic. Denote by Inline graphic and Inline graphic the eigenvalues of Inline graphic and Inline graphic, respectively. Let Inline graphic and Inline graphic. Then we get

graphic file with name pone.0094985.e391.jpg (90)

where Inline graphic is a constant.

Proof. Let Inline graphic and Inline graphic be two graphs of order Inline graphic. Let Inline graphic and Inline graphic with Inline graphic. Then we get

graphic file with name pone.0094985.e399.jpg (91)
graphic file with name pone.0094985.e400.jpg (92)
graphic file with name pone.0094985.e401.jpg (93)
graphic file with name pone.0094985.e402.jpg (94)

where Inline graphic. Thus,

graphic file with name pone.0094985.e404.jpg (95)
graphic file with name pone.0094985.e405.jpg (96)
graphic file with name pone.0094985.e406.jpg (97)
graphic file with name pone.0094985.e407.jpg (98)

i.e.,

graphic file with name pone.0094985.e408.jpg (99)

Taking logarithms of both sides of the above inequality, we have

graphic file with name pone.0094985.e409.jpg (100)

The required inequality holds. Inline graphic

Actually, numerical experiments show that for any two distinct trees Inline graphic, Inline graphic holds. See Figure 2 as an example, in which we consider (all) Inline graphic trees with 8 vertices and calculate all possible values of Inline graphic (red) and Inline graphic (blue). We state this observation as a conjecture.

Figure 2. Values of Inline graphic (red) and Inline graphic (blue). The Y-axis denotes the values of the distance measure and the X-axis denotes the graph pairs.

Conjecture 2

Let Inline graphic and Inline graphic be any two distinct trees with Inline graphic vertices. Then

graphic file with name pone.0094985.e421.jpg (101)

holds.

Using a proof similar to that of Theorem 16, we can obtain a generalization for the distance measure based on Inline graphic (see Eq. (87)). Let Inline graphic be an arbitrary information functional and Inline graphic be a topological index.

Theorem 17

Let Inline graphic and Inline graphic be two graphs of order Inline graphic with Inline graphic. Let Inline graphic and Inline graphic. Then we have

graphic file with name pone.0094985.e431.jpg (102)

where Inline graphic is a constant.

Dehmer and Mowshowitz [44] introduced a new class of measures (here called generalized measures) that derive from functions such as those defined by Rényi's entropy and Daróczy's entropy. Let Inline graphic be a graph of order Inline graphic. Then

graphic file with name pone.0094985.e435.jpg (103)

If we let Inline graphic, then we can obtain the new generalized entropy based on eigenvalues. We denote the entropy by

graphic file with name pone.0094985.e437.jpg (104)

For a given graph Inline graphic with Inline graphic vertices, denote by Inline graphic the eigenvalues of Inline graphic. By substituting Inline graphic into equality (104), we have

graphic file with name pone.0094985.e443.jpg (105)
graphic file with name pone.0094985.e444.jpg (106)
graphic file with name pone.0094985.e445.jpg (107)

The last equality holds since Inline graphic. By the following theorem, we study the relation between Inline graphic and Inline graphic.
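A standard spectral identity that ties the eigenvalues to the number of edges $m$ is $\sum_i \lambda_i^2 = 2m$. Whether this is exactly the identity lost from the sentence above is an assumption, but it is the usual way an edge count enters an eigenvalue-based expression. For a simple graph with adjacency matrix $A$ it follows from the trace of $A^2$:

```latex
\sum_{i=1}^{n} \lambda_i^2
  = \operatorname{tr}(A^2)
  = \sum_{i=1}^{n} (A^2)_{ii}
  = \sum_{i=1}^{n} d(v_i)
  = 2m .
```

Here $(A^2)_{ii}$ counts the closed walks of length 2 at $v_i$, which is the degree $d(v_i)$, and the degrees sum to twice the number of edges.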

Theorem 18

Let Inline graphic be a class of graphs with Inline graphic vertices and Inline graphic edges. For two graphs Inline graphic, let Inline graphic and Inline graphic. Then we get

graphic file with name pone.0094985.e455.jpg (108)

and

graphic file with name pone.0094985.e456.jpg (109)

where Inline graphic is a constant.

Proof. Let Inline graphic and Inline graphic be two graphs with Inline graphic vertices and Inline graphic edges. Without loss of generality, we suppose Inline graphic.

To show the first inequality, it suffices to prove

graphic file with name pone.0094985.e463.jpg (110)

Then from (107), we derive

graphic file with name pone.0094985.e464.jpg (111)

If we want to prove

graphic file with name pone.0094985.e465.jpg (112)

we only need to show

graphic file with name pone.0094985.e466.jpg (113)

From a well-known bound of energy Inline graphic, we have Inline graphic and Inline graphic. Therefore, Inline graphic holds.

Now we show the second inequality. From (111), we have

graphic file with name pone.0094985.e471.jpg (114)
graphic file with name pone.0094985.e472.jpg (115)
graphic file with name pone.0094985.e473.jpg (116)
graphic file with name pone.0094985.e474.jpg (117)

Therefore, we have

graphic file with name pone.0094985.e475.jpg

From the definition of the distance measure, by some elementary calculations, we finally infer

graphic file with name pone.0094985.e476.jpg (118)
graphic file with name pone.0094985.e477.jpg (119)
graphic file with name pone.0094985.e478.jpg (120)
graphic file with name pone.0094985.e479.jpg (121)

where Inline graphic is a constant.

The proof is complete. Inline graphic

Relations between Inline graphic and Inline graphic

Let Inline graphic be a connected graph with Inline graphic vertices, Inline graphic edges and degree sequence Inline graphic, where Inline graphic for Inline graphic. By setting Inline graphic in (87), we can obtain the new entropy based on degree powers, denoted by Inline graphic

graphic file with name pone.0094985.e492.jpg (122)

For Inline graphic, the expression Inline graphic is just the zeroth-order Randić index Inline graphic. Then by using Theorem 17, we obtain the following result.

Theorem 19

Let Inline graphic and Inline graphic be two graphs of order Inline graphic with Inline graphic. Let

graphic file with name pone.0094985.e500.jpg (123)

Then we have

graphic file with name pone.0094985.e501.jpg (124)

where Inline graphic is a constant.

For Inline graphic, we get

graphic file with name pone.0094985.e504.jpg (125)
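The degree-power entropy and its special case can be checked numerically. The sketch assumes that the functional is $f(v_i) = d_i^k$ and that the special case in (125) is the exponent $k = 1$, for which $\sum_i d_i = 2m$ gives the closed form $\log(2m) - \frac{1}{2m}\sum_i d_i \log d_i$. Both readings are inferred from the surrounding text, because the inline symbols are not preserved in this copy. The function names and the base-2 logarithm are also assumptions.

```python
import math


def degree_power_entropy(degrees, k, base=2.0):
    """Entropy (87) with the assumed functional f(v_i) = d_i ** k (all d_i >= 1)."""
    weights = [deg ** k for deg in degrees]
    total = sum(weights)
    return -sum((w / total) * math.log(w / total, base) for w in weights)


def entropy_k1_closed_form(degrees, base=2.0):
    """log(2m) - (1 / (2m)) * sum d_i log d_i, using sum d_i = 2m."""
    two_m = sum(degrees)
    return math.log(two_m, base) - sum(deg * math.log(deg, base) for deg in degrees) / two_m


star4_degrees = [3, 1, 1, 1]
direct = degree_power_entropy(star4_degrees, k=1)
closed = entropy_k1_closed_form(star4_degrees)  # agrees with `direct` up to rounding
```

Since the value depends only on the degree sequence, two graphs with the same degree sequence receive the same entropy for every exponent.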

Furthermore, by the definition of Inline graphic, for two graphs with the same degree sequences, we obtain that Inline graphic. Therefore, we get the following result.

Theorem 20

Let Inline graphic be a class of graphs with the same degree sequence and let Inline graphic be an arbitrary topological index. Then for two graphs Inline graphic, we infer

graphic file with name pone.0094985.e510.jpg (126)

By using a proof method similar to that of Theorem 14, we obtain a weaker result.

Theorem 21

Let Inline graphic be a tree with Inline graphic vertices. Denote by Inline graphic and Inline graphic two pendent paths attaching to the same vertex such that Inline graphic. Denote by Inline graphic the tree obtained by transferring the pendent vertex of Inline graphic to the pendent vertex of Inline graphic. Then we have

graphic file with name pone.0094985.e519.jpg (127)

Proof. Let Inline graphic be a tree with Inline graphic vertices. Suppose Inline graphic and Inline graphic with Inline graphic. Denote by Inline graphic the degree of Inline graphic, i.e., Inline graphic. Since Inline graphic and Inline graphic are two pendent paths attaching to the same vertex, then we have Inline graphic. By using the definition of Inline graphic, we have Inline graphic. By using the definition of Inline graphic, we only need to show

graphic file with name pone.0094985.e534.jpg (128)

For a tree Inline graphic with Inline graphic vertices, we get Inline graphic. By performing elementary calculations, we get

graphic file with name pone.0094985.e538.jpg (129)

Observe that Inline graphic. We first discuss the difference of the distances between two vertices in Inline graphic and Inline graphic. Let Inline graphic and Inline graphic be two vertices of Inline graphic. If Inline graphic, then we have Inline graphic. Now we suppose Inline graphic. If Inline graphic, then Inline graphic Observe that

graphic file with name pone.0094985.e550.jpg (130)

Therefore, we get Inline graphic

For Inline graphic, it is easy to verify that Inline graphic, i.e., Inline graphic. Then,

graphic file with name pone.0094985.e555.jpg (131)

In the following, we suppose Inline graphic.

We obtain Inline graphic and Inline graphic. By performing elementary calculations, we get

graphic file with name pone.0094985.e559.jpg (132)

for Inline graphic and each value of Inline graphic. Therefore, Inline graphic

To prove the other inequality, we need a more detailed discussion. By using the definition of graph entropy, we get

graphic file with name pone.0094985.e563.jpg (133)

Let Inline graphic be the set of the neighbors of vertex Inline graphic, which does not contain Inline graphic and Inline graphic. Denote by Inline graphic the degree of a vertex in Inline graphic, where Inline graphic. If Inline graphic, then

graphic file with name pone.0094985.e572.jpg (134)

By performing some calculations, we can show that for Inline graphic and Inline graphic,

graphic file with name pone.0094985.e575.jpg (135)

i.e., Inline graphic for Inline graphic. For smaller Inline graphic, we verify this inequality directly. If Inline graphic, then we have

graphic file with name pone.0094985.e580.jpg (136)

We can show that for Inline graphic and Inline graphic,

graphic file with name pone.0094985.e583.jpg (137)

i.e., Inline graphic for Inline graphic. For smaller Inline graphic, we verify this inequality directly. Now suppose Inline graphic, then there is only one vertex in Inline graphic whose degree is at most Inline graphic. Therefore by using (133) and (136), we get

graphic file with name pone.0094985.e590.jpg (138)

and

graphic file with name pone.0094985.e591.jpg (139)

We can verify

graphic file with name pone.0094985.e592.jpg (140)

for each Inline graphic, i.e., Inline graphic. Inline graphic

From Theorems 14 and 21, we obtain the following corollary.

Corollary 2

Let Inline graphic be a tree with Inline graphic vertices. Denote by Inline graphic and Inline graphic two pendent paths attached to the same vertex such that Inline graphic. Denote by Inline graphic the tree obtained by transferring the pendent vertex of Inline graphic to the pendent vertex of Inline graphic. Then we have

graphic file with name pone.0094985.e604.jpg (141)

Therefore, we obtain a similar result for comparing the values of the distance measures on trees.

Corollary 3

Let Inline graphic. Then there exists a tree Inline graphic such that Inline graphic.

Actually, our numerical results (see section ‘Numerical Results’) show that for any two trees Inline graphic, the following inequality may hold.

Conjecture 3

Let Inline graphic and Inline graphic be any two trees with Inline graphic vertices. Then

graphic file with name pone.0094985.e612.jpg (142)

holds.

By way of example, we consider all Inline graphic trees of 8 vertices and calculate all possible values of Inline graphic (blue) and Inline graphic (red), respectively, as shown in Figure 3. From Figure 3, we observe that

graphic file with name pone.0094985.e618.jpg (143)

holds for each pair of trees Inline graphic and Inline graphic.

Figure 3. Values of Inline graphic (blue) and Inline graphic (red).

The Y-axis denotes the values of the distance measure and the X-axis denotes the graph pairs.

Numerical Results

In this section, we interpret the numerical results. First, we consider all trees with Inline graphic vertices. The number of trees is Inline graphic and the number of pairs is Inline graphic (see [45]). From the curves shown in Figure 1, we see that both measures Inline graphic (blue) and Inline graphic (red) satisfy the inequality Eq. (77). From the curves shown in Figure 2, we observe that both measures Inline graphic (red) and Inline graphic (blue) satisfy the inequality Eq. (101). From the curves shown in Figure 3, we also learn that both measures Inline graphic (blue) and Inline graphic (red) fulfill the inequality Eq. (143). By using this method, several other inequalities could be generated and verified graphically.

Figures 4 and 5 show the numerical results obtained with the graph distance measures based on graph energy Inline graphic, the Wiener index Inline graphic and the Randić index Inline graphic, respectively. We consider all trees with Inline graphic vertices. The number of trees is Inline graphic and the number of pairs is Inline graphic (see [45]). In Figure 4, we depict the distributions of the ranked distance values, that is, Inline graphic (red), Inline graphic (blue), and Inline graphic (yellow). First and foremost, we see that the values of all three measures cover the entire interval Inline graphic. This indicates that the measures are generally useful as they are well defined. By considering Inline graphic, we observe that only a relatively small number of pairs have a measured value Inline graphic 0.8, whereas a large number of pairs possess distance values Inline graphic 0.8. When considering Inline graphic, the situation is reversed. The distance values of Inline graphic seem to increase slightly, with some upturns and downturns. However, Figure 4 does not indicate whether the graph distance measures can classify graphs efficiently. This needs to be examined in the future and would go far beyond the scope of this paper.
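The index computations behind such an experiment can be sketched as follows. This is a minimal illustration, not the authors' code: it covers only the Wiener index and the Randić index (graph energy needs an eigenvalue solver and is omitted), and it assumes the distance measure has the Gaussian-type form 1 − exp(−((I(G) − I(H))/σ)²). The defining equation is not reproduced in this section, so both that form and the value of σ should be treated as assumptions. The helper names (`wiener_index`, `randic_index`, `index_distance`, `path_graph`, `star_graph`) are hypothetical.

```python
import math
from collections import deque


def wiener_index(adj):
    """Sum of shortest-path distances over all unordered vertex pairs."""
    total = 0
    for source in adj:
        # Breadth-first search gives the distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total // 2  # every pair was counted twice


def randic_index(adj):
    """Sum of 1/sqrt(deg(u) * deg(v)) over all edges uv."""
    total = 0.0
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each edge once
                total += 1.0 / math.sqrt(len(adj[u]) * len(adj[v]))
    return total


def index_distance(value_g, value_h, sigma=1.0):
    """Assumed form of the measure: 1 - exp(-((I(G) - I(H)) / sigma)^2)."""
    return 1.0 - math.exp(-(((value_g - value_h) / sigma) ** 2))


def path_graph(n):
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def star_graph(n):
    adj = {0: list(range(1, n))}
    for i in range(1, n):
        adj[i] = [0]
    return adj


path, star = path_graph(8), star_graph(8)
d_wiener = index_distance(wiener_index(path), wiener_index(star), sigma=10.0)
d_randic = index_distance(randic_index(path), randic_index(star), sigma=1.0)
```

The path and the star are used here because they are the extremal trees discussed in the paper; enumerating all trees with a given number of vertices would require a tree generator, which is outside this sketch.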

Figure 4. Distributions of the ranked values of the distance measure Inline graphic (red), Inline graphic (blue), Inline graphic (yellow).


The X-axis denotes the values of the distance measure. The Y-axis denotes the number of graph pairs.

Figure 5. The X-axis denotes the values of the distance measures Inline graphic (red), Inline graphic (blue), Inline graphic (yellow).


The Y-axis represents the percentage rate of all graphs studied.

Furthermore, we have computed the cumulative distributions by using the measures Inline graphic (red), Inline graphic (blue), Inline graphic (yellow), respectively, as shown in Figure 5. In general, the computation of the cumulative distribution may serve as a preprocessing step when analyzing graphs structurally. In fact, we see what percentage of the 235 graphs have a distance value which is less than or equal to Inline graphic. Also, Figure 5 shows that the value distributions are quite different. From Figure 5, we see that the curve for Inline graphic strongly differs from those for Inline graphic and Inline graphic. When considering Inline graphic, we also observe that about 80% of the Inline graphic trees have a distance value approximately Inline graphic 0.5. That means most of the trees are quite dissimilar according to Inline graphic. For Inline graphic, the situation is completely reversed. Here 80% of the trees have a distance value approximately Inline graphic 0.98. Finally, evaluating the graph distance measure Inline graphic on these trees reveals that about 80% of the trees possess a distance value approximately Inline graphic 0.85. In summary, we conclude from Figure 5 that all three measures capture the distance between the graphs quite differently. Nevertheless, this does not imply that the quality of one measure is worse than that of another. Again, an important issue of quality is fulfilled as the measures turned out to be well defined, see Figure 4. Another crucial issue is evaluating the classification ability, which is left for future work.
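A cumulative distribution of this kind can be sketched in a few lines. The example below is illustrative only: the four index values are hypothetical toy numbers rather than data from the supporting files, the distance is again assumed to have the form 1 − exp(−((I(G) − I(H))/σ)²) with an arbitrarily chosen σ, and the fraction is taken over graph pairs.

```python
import math
from bisect import bisect_right
from itertools import combinations


def pairwise_distances(index_values, sigma=1.0):
    """Sorted distances over all unordered pairs of index values (assumed form)."""
    return sorted(
        1.0 - math.exp(-(((a - b) / sigma) ** 2))
        for a, b in combinations(index_values, 2)
    )


def cumulative_fraction(sorted_distances, threshold):
    """Fraction of pairs whose distance is less than or equal to `threshold`."""
    return bisect_right(sorted_distances, threshold) / len(sorted_distances)


# Hypothetical index values for four graphs (toy data, not from the paper).
values = [49, 84, 60, 71]
distances = pairwise_distances(values, sigma=20.0)
fraction_below_half = cumulative_fraction(distances, 0.5)
```

Evaluating `cumulative_fraction` on a grid of thresholds between 0 and 1 yields the points of a curve like those in Figure 5.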

Summary and Conclusion

In this paper, we have studied interrelations of graph distance measures which are based on distinct topological indices. In order to do so, we employed the Wiener index, the Randić index, the zeroth-order Randić index, the graph energy, and certain graph entropies [25]. In particular, we have obtained inequalities involving the novel graph distance measures. Supported by a numerical analysis, we also formulated three conjectures dealing with relations between the distance measures on trees.

From Theorem 1, we see that the star graph and the path graph maximize Inline graphic among all trees with a given number of vertices, for any topological index we considered here. Actually, this also holds for some other topological indices, such as the Hosoya index [46], [47], the Merrifield-Simmons index [48], [49], [47], the Estrada index [50], [51], [52], and the Szeged index [53], [54]. All other theorems we have proved in this paper shed light on the problem of proving interrelations of the measures. We believe that such statements help to understand the measures more thoroughly and, finally, they are useful to establish new applications employing quantitative graph theory [55]. We emphasize that the star graph and the path graph are apparently the two most dissimilar trees among all trees. Similar observations can also be obtained for unicyclic graphs or bicyclic graphs. Therefore, in the future, we would like to explore which classes of graphs have this property, i.e., identifying graphs (such as the path graph and the star graph) which maximize or minimize Inline graphic.

Another direction for future work is to compare the values of Inline graphic where Inline graphic are general graphs. For example, we could assume that Inline graphic and Inline graphic are obtained by only one graph edit operation, i.e., GED(Inline graphic) = 1, see [15]. Then, all graphs which fulfill this equation are (by definition) similar. This construction could help to study the sensitivity of the measures thoroughly. Note that similar properties of topological indices have already been investigated, see [56]. As a concluding remark, we mention that dynamical models on spatial graphs have been studied by Perc and Wang and other researchers, see [57], [58]. It would be interesting to study the distance measures in this mathematical framework as well.
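As a small illustration of this sensitivity question, the sketch below compares the Randić index of the cycle on six vertices with that of the path obtained by deleting one edge. It assumes that a single edge deletion counts as one edit operation at unit cost, so that the two graphs have graph edit distance 1; the function names are hypothetical and the example is not taken from the paper.

```python
import math


def randic_index(edges):
    """Sum of 1/sqrt(deg(u) * deg(v)) over all edges uv of an edge list."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sum(1.0 / math.sqrt(degree[u] * degree[v]) for u, v in edges)


def delete_edge(edges, edge):
    """Return a copy of the edge list without `edge` (in either orientation)."""
    u, v = edge
    return [e for e in edges if e != (u, v) and e != (v, u)]


# Cycle on six vertices and the path obtained by one edge deletion.
cycle = [(i, (i + 1) % 6) for i in range(6)]
path = delete_edge(cycle, (5, 0))

# Absolute change of the index under a single edit operation.
change = abs(randic_index(cycle) - randic_index(path))
```

Feeding such index differences into a distance measure shows how strongly the measure reacts to one edit operation, which is the kind of sensitivity the paragraph above proposes to study.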

Supporting Information

Supporting Information S1

CSV file containing descriptor values of 235 trees by using the Randić index.

(CSV)

Supporting Information S2

CSV file containing descriptor values of 235 trees by using graph energy.

(CSV)

Supporting Information S3

CSV file containing descriptor values of 235 trees by using the Wiener index.

(CSV)

Funding Statement

Matthias Dehmer and Yongtang Shi thank the Austrian Science Funds for supporting this work (project P26142). Yongtang Shi has also been supported by the National Science Foundation of China. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Dehmer M, Mehler A (2007) A new method of measuring similarity for a special class of directed graphs. Tatra Mountains Mathematical Publications 36: 39–59.
  • 2. Sobik F (1982) Graphmetriken und Klassifikation strukturierter Objekte, ZKI-Informationen. Akad Wiss DDR 2: 63–122.
  • 3. Zelinka B (1975) On a certain distance between isomorphism classes of graphs. Časopis pro pěstování matematiky 100: 371–373.
  • 4. Emmert-Streib F (2007) The chronic fatigue syndrome: A comparative pathway analysis. J Comput Biology 14: 961–972.
  • 5. Junker B, Schreiber F (2008) Analysis of Biological Networks. Wiley-Interscience. Berlin.
  • 6. Kier L, Hall L (2002) The meaning of molecular connectivity: A bimolecular accessibility model. Croat Chem Acta 75: 371–382.
  • 7. Bonchev D, Trinajstić N (1977) Information theory, distance matrix and molecular branching. J Chem Phys 67: 4517–4533.
  • 8. Skvortsova M, Baskin I, Stankevich I, Palyulin V, Zefirov N (1998) Molecular similarity. 1. Analytical description of the set of graph similarity measures. J Chem Inf Comput Sci 38: 785–790.
  • 9. Varmuza K, Scsibrany H (2000) Substructure isomorphism matrix. J Chem Inf Comput Sci 40: 308–313.
  • 10. Mehler A, Weiß P, Lücking A (2010) A network model of interpersonal alignment. Entropy 12: 1440–1483.
  • 11. Hsieh S, Hsu C (2008) Graph-based representation for similarity retrieval of symbolic images. Data Knowl Eng 65: 401–418.
  • 12. Dehmer M, Mehler A (2007) A new method of measuring similarity for a special class of directed graphs. Tatra Mountains Mathematical Publications 36: 39–59.
  • 13. Bunke H (2000) Recent developments in graph matching. In: 15-th International Conference on Pattern Recognition. pp. 117–124.
  • 14. Garey MR, Johnson DS (1979) Computers and Intractability: A Guide to the Theory of NP Completeness. Series of Books in the Mathematical Sciences. W. H. Freeman.
  • 15. Bunke H (1983) What is the distance between graphs? Bulletin of the EATCS 20: 35–39.
  • 16. Robles-Kelly A, Hancock R (2003) Edit distance from graph spectra. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 234–241.
  • 17. Dehmer M (2008) Information processing in complex networks: Graph entropy and information functionals. Appl Math Comput 201: 82–94.
  • 18. Dehmer M, Mowshowitz A, Emmert-Streib F (2011) Connections between classical and parametric network entropies. PLoS ONE 6: e15733.
  • 19. Polansky OE, Bonchev D (1986) The Wiener number of graphs. I. General theory and changes due to graph operations. MATCH Communications in Mathematical and in Computer Chemistry 21: 133–186.
  • 20. Polansky OE, Bonchev D (1990) Theory of the Wiener number of graphs. II. Transfer graphs and some of their metric properties. MATCH Communications in Mathematical and in Computer Chemistry 25: 3–40.
  • 21. Schädler C (1999) Die Ermittlung struktureller Ähnlichkeit und struktureller Merkmale bei komplexen Objekten: Ein konnektionistischer Ansatz und seine Anwendungen. Ph.D. thesis, Technische Universität Berlin.
  • 22. Li X, Shi Y, Wang L (2008) An updated survey on the Randić index. In: Gutman I, Furtula B, editors, Recent Results in the Theory of Randić Index, University of Kragujevac and Faculty of Science Kragujevac. pp. 9–47.
  • 23. Todeschini R, Consonni V, Mannhold R (2002) Handbook of Molecular Descriptors. Wiley-VCH. Berlin.
  • 24. Gutman I, Li X, Zhang J (2009) Graph energy. In: Dehmer M, Emmert-Streib F, editors, Analysis of Complex Networks. From Biology to Linguistics, Wiley-VCH. pp. 145–174. Weinheim.
  • 25. Dehmer M, Mowshowitz A (2011) A history of graph entropy measures. Inform Sciences 181: 57–78.
  • 26. Harary F (1969) Graph Theory. Addison Wesley Publishing Company. Reading, MA, USA.
  • 27. Dehmer M, Varmuza K, Bonchev D, editors (2012) Statistical Modelling of Molecular Descriptors in QSAR/QSPR. Quantitative and Network Biology. Wiley-Blackwell.
  • 28. Ulanowicz RE (2001) Information theory in ecology. Computers and Chemistry 25: 393–399.
  • 29. Emmert-Streib F, Dehmer M (2011) Networks for systems biology: Conceptual connection of data and function. IET Systems Biology 5: 185–207.
  • 30. Wilhelm T, Hollunder J (2007) Information theoretic description of networks. Physica A 388: 385–396.
  • 31. Solé RV, Valverde S (2004) Information theory of complex networks: On evolution and architectural constraints. In: Lecture Notes in Physics. volume 650, pp. 189–207.
  • 32. Bonchev D, Mekenyan O, Trinajstić N (1981) Isomer discrimination by topological information approach. J Comp Chem 2: 127–148.
  • 33. Dehmer M, Emmert-Streib F, Tripathi S (2013) Large-scale evaluation of molecular descriptors by means of clustering. PLoS ONE 8: e83956.
  • 34. Wiener H (1947) Structural determination of paraffin boiling points. J Amer Chem Soc 69: 17–20.
  • 35. Dobrynin A, Entringer R, Gutman I (2001) Wiener index of trees: theory and application. Acta Appl Math 66: 211–249.
  • 36. Randić M (1975) On characterization of molecular branching. J Amer Chem Soc 97: 6609–6615.
  • 37. Bollobás B, Erdös P (1998) Graphs of extremal weights. Ars Combin 50: 225–233.
  • 38. Li X, Gutman I (2006) Mathematical Aspects of Randić-Type Molecular Structure Descriptors. University of Kragujevac and Faculty of Science Kragujevac.
  • 39. Li X, Shi Y (2008) A survey on the Randić index. MATCH Commun Math Comput Chem 59: 127–156.
  • 40. Gutman I (1977) Acyclic systems with extremal Hückel π-electron energy. Theor Chim Acta 45: 79–87.
  • 41. Gutman I (2001) The energy of a graph: old and new results. In: Betten A, Kohnert A, Laue R, Wassermann A, editors, Algebraic Combinatorics and Applications, Springer-Verlag. pp. 196–211.
  • 42. Li X, Shi Y, Gutman I (2012) Graph Energy. Springer. New York.
  • 43. Shannon C, Weaver W (1949) The Mathematical Theory of Communication. University of Illinois Press. Urbana, USA.
  • 44. Dehmer M, Mowshowitz A (2011) Generalized graph entropies. Complexity 17: 45–50.
  • 45. Read R, Wilson R (1998) An Atlas of Graphs. Clarendon Press. Oxford.
  • 46. Hosoya H (1971) Topological index. A newly proposed quantity characterizing the topological nature of structural isomers of saturated hydrocarbons. Bull Chem Soc Jpn 44: 2332–2339.
  • 47. Wagner S, Gutman I (2010) Maxima and minima of the Hosoya index and the Merrifield-Simmons index: A survey of results and techniques. Acta Appl Math 112: 323–346.
  • 48. Merrifield R, Simmons H (1980) The structure of molecular topological spaces. Theor Chim Acta 55: 55–75.
  • 49. Merrifield R, Simmons H (1989) Topological Methods in Chemistry. Wiley. New York.
  • 50. Estrada E (2000) Characterization of 3D molecular structure. Chem Phys Lett 319: 713–718.
  • 51. Deng H, Radenković S, Gutman I (2009) The Estrada index. In: Cvetković D, Gutman I, editors, Applications of Graph Spectra, Math. Inst. pp. 123–140. Belgrade.
  • 52. Gutman I, Deng H, Radenković S (2011) The Estrada index: An updated survey. In: Cvetković D, Gutman I, editors, Selected Topics on Applications of Graph Spectra, Math. Inst. pp. 155–174.
  • 53. Gutman I (1994) A formula for the Wiener number of trees and its extension to graphs containing cycles. Graph Theory Notes of New York 27: 9–15.
  • 54. Simić S, Gutman I, Baltić V (2000) Some graphs with extremal Szeged index. Math Slovaca 50: 1–15.
  • 55. Dehmer M, Emmert-Streib F (2014) Quantitative Graph Theory. Theory and Applications. CRC Press. In press.
  • 56. Furtula B, Gutman I, Dehmer M (2013) On structure-sensitivity of degree-based topological indices. Applied Mathematics and Computation 219: 8973–8978.
  • 57. Perc M, Wang Z (2010) Heterogeneous aspirations promote cooperation in the prisoner's dilemma game. PLOS ONE 5: e15117.
  • 58. Jin Q, Wang L, Xia C, Wang Z (2014) Spontaneous symmetry breaking in interdependent networked game. Scientific Reports 4: 4095.



Articles from PLoS ONE are provided here courtesy of PLOS
