The heat kernel as the pagerank of a graph

Fan Chung

Proc Natl Acad Sci U S A. 2007 Dec 5;104(50):19735–19740. doi: 10.1073/pnas.0708838104


PMCID: PMC2148367

Abstract

The concept of pagerank was first introduced as a way for Web search engines to determine the ranking of Web pages. Based on relations in interconnected networks, pagerank has become a major tool for addressing fundamental problems arising in general graphs, especially for large information networks with hundreds of thousands of nodes. A notable notion of pagerank, introduced by Brin and Page and denoted by PageRank, is based on random walks and is expressed as a geometric sum. In this paper, we consider a notion of pagerank that is based on the (discrete) heat kernel and can be expressed as an exponential sum of random walks. The heat kernel satisfies the heat equation and can be used to analyze many useful properties of random walks in a graph. A local Cheeger inequality is established, which implies that, by focusing on cuts determined by linear orderings of vertices using the heat kernel pageranks, the resulting partition is within a quadratic factor of the optimum. This is true even if we restrict the volume of the small part separated by the cut to be close to some specified target value. This leads to a graph partitioning algorithm for which the running time is proportional to the size of the targeted volume (instead of the size of the whole graph).

In the development of quantitative ranking for Web pages, many mathematical methods have come into play. The Hub-and-Authority algorithm by Kleinberg (1) uses eigenvectors. The PageRank introduced by Brin and Page (2) uses random walks. These pagerank algorithms rely mainly on the network structure of the Web. The viewpoint is to regard the Web as a graph, with Web pages as vertices and links between pairs of Web pages as edges. Various notions of pagerank are computed on this Webgraph and are then used for numerous applications, such as identifying communities or finding hot spots in various information networks. Another example is the use of PageRank to derive a local graph partitioning algorithm (3), which is efficient in the sense that its cost is proportional to the size of the small part of the partition, in contrast with generic partitioning algorithms, whose cost depends on the size of the whole graph.

In this paper, we introduce a notion of pagerank by using the heat kernel of a graph. Similar to PageRank, the heat kernel pagerank is based on random walks but has the extra benefit of satisfying the heat equation. Originally rooted in spectral geometry (4), the heat equation for graphs involves a parameter t, the temperature, which allows additional control of the rate of diffusion (see detailed definitions later). Using the heat equation, the heat kernel pagerank is amenable to various mathematical analyses of the graph. A key isoperimetric invariant of a graph is the Cheeger constant, which measures how good a cut can be found. The classical Cheeger inequality concerns the relationship between the Cheeger constant and eigenvalues of the (normalized) Laplacian of a graph. (A graph can be viewed as a discrete version of a manifold where the original Cheeger inequality applies, ref. 5.) Here we will prove several variations of the Cheeger inequality, establishing relationships between the Cheeger constant and the heat kernel pagerank. One of the consequences of the local Cheeger inequality is that, for a given value s, the minimum Cheeger ratio of subsets of volume at most s can be approximated up to a quadratic factor by focusing on subsets obtained by using heat kernel pageranks.

A byproduct of the classical Cheeger inequality is a fast partition algorithm using eigenvectors. Here, we will show that the heat kernel pagerank leads to efficient local partitioning algorithms with several advantages over previous algorithms. Instead of partitioning graphs into almost equal parts for divide-and-conquer approaches, in a large graph, a local partition algorithm seeks a set with volume bounded by some target size near some specified seed. Recently, there has been progress on developing local partitioning algorithms. Spielman and Teng (6) gave a local partitioning algorithm based on a result of Lovász and Simonovits on rapidly mixing random walks (7, 8). By using PageRank, an improved local partitioning algorithm was given in refs. 3 and 9. All of these partitioning algorithms are so-called “one-sweep” algorithms that focus on the subsets consisting of the highest j vertices, for some j, according to some linear ordering. In this paper, we will give a local partitioning algorithm, which, for a given target size s, is a one-sweep algorithm using a truncated version of heat kernel pagerank with support at most s, and thus further improves previous work. The algorithm is based on a local version of the Cheeger inequality.

Preliminaries

The starting point of the heat kernel pagerank is a typical random walk. The transition probability matrix W of a typical random walk on a graph G = (V,E) is a matrix with rows and columns indexed by V, defined by

$$W(u, v) = \begin{cases} 1/d_u & \text{if } \{u, v\} \in E, \\ 0 & \text{otherwise,} \end{cases}$$

where d_u denotes the degree of u.

Before we proceed to define heat kernel pagerank, we will first describe PageRank, as defined by Brin and Page (2). The PageRank involves a preference vector f [which can be viewed as the probabilistic distribution of the seed(s)] and a jumping constant α. For example, if we have one starting seed denoted by vertex u, then f can be written as the (0, 1)-indicator function χ_u of u. Another example is to take f to be the constant function with value 1/n at every vertex as in the original definition in Brin and Page (2). The version of PageRank we discuss here is often called the personalized PageRank. Throughout this paper, a real-valued function f : V → ℝ is taken to be a row vector so that W can act on f from the right by matrix multiplication. The PageRank pr(α,f), with the scale parameter α and the preference vector f, is defined by

$$\mathrm{pr}(\alpha, f) = \alpha \sum_{k=0}^{\infty} (1 - \alpha)^k f W^k. \tag{1}$$

The heat kernel pagerank also has two parameters: t, a nonnegative value (the temperature), and f (a preference vector), defined as follows:

$$\rho_{t,f} = e^{-t} \sum_{k=0}^{\infty} \frac{t^k}{k!} f W^k. \tag{2}$$

If we compare Eq. 1 with Eq. 2, we see that ρ is an exponential sum, whereas pr is a geometric sum. It is often the case (e.g., in dealing with generating functions) that the exponential sum converges more rapidly.
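To make the comparison of the two sums concrete, the following minimal sketch evaluates truncated versions of the geometric sum (PageRank) and the exponential sum (heat kernel pagerank) on a small graph. The example graph `ADJ`, the helper names, and the choice α = 0.15, t = 3 are assumptions made for illustration only; the parameters α and t are not directly comparable, so the printed tails indicate the qualitative behavior rather than a general rate.

```python
import math

# Small example graph (an assumption for illustration): two triangles joined by an edge.
ADJ = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]


def walk_step(adj, f):
    """Return the row vector f W, where W(u, v) = 1/d_u if u ~ v and 0 otherwise."""
    out = [0.0] * len(adj)
    for u, nbrs in enumerate(adj):
        share = f[u] / len(nbrs)
        for v in nbrs:
            out[v] += share
    return out


def pagerank_series(adj, alpha, f, terms):
    """Truncated geometric sum: alpha * sum_{k < terms} (1 - alpha)^k f W^k."""
    total = [0.0] * len(adj)
    walk, weight = list(f), alpha
    for _ in range(terms):
        total = [a + weight * w for a, w in zip(total, walk)]
        walk = walk_step(adj, walk)
        weight *= 1.0 - alpha
    return total


def heat_kernel_series(adj, t, f, terms):
    """Truncated exponential sum: e^{-t} * sum_{k < terms} (t^k / k!) f W^k."""
    total = [0.0] * len(adj)
    walk, weight = list(f), math.exp(-t)
    for k in range(terms):
        total = [a + weight * w for a, w in zip(total, walk)]
        walk = walk_step(adj, walk)
        weight *= t / (k + 1)
    return total


if __name__ == "__main__":
    seed = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # indicator function of vertex 0
    for terms in (5, 10, 20):
        pr = pagerank_series(ADJ, 0.15, seed, terms)
        hk = heat_kernel_series(ADJ, 3.0, seed, terms)
        # 1 - sum(...) is the probability mass still missing after truncation.
        print(terms, 1.0 - sum(pr), 1.0 - sum(hk))
```

Because each step of the walk preserves total mass, the mass missing from a truncated sum equals the tail of the corresponding scalar series, which is what the loop prints.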

We note that an equivalent definition of the PageRank is given by the following recurrence:

$$\mathrm{pr}(\alpha, f) = \alpha f + (1 - \alpha)\, \mathrm{pr}(\alpha, f)\, W.$$

In contrast, the heat kernel pagerank as defined in Eq. 2 satisfies the following heat equation:

$$\frac{\partial}{\partial t} \rho_{t,f} = -\rho_{t,f} (I - W).$$

Let us define L = I − W. Then the definition of the heat kernel pagerank in Eq. 2 can be rewritten as

$$\rho_{t,f} = f e^{-tL} = f \sum_{k=0}^{\infty} \frac{(-t)^k}{k!} L^k = f H_t, \qquad \text{where } H_t = e^{-tL}.$$

For a vertex v in G, the degree of v, denoted by d_v, is the number of vertices to which v is adjacent. Let A denote the adjacency matrix of G and let D represent the diagonal degree matrix. We can write $W = D^{-1}A$ and $H_t = e^{-t(I - W)}$. The discrete heat kernel first introduced in ref. 10 is a symmetric version of H_t.
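The recurrence for PageRank and the heat equation for the heat kernel pagerank can be checked numerically. The sketch below assumes the forms pr = αf + (1 − α) pr W and ∂ρ/∂t = −ρ(I − W), the latter obtained by differentiating Eq. 2 term by term. The graph `ADJ`, the function names, the truncation lengths, and the finite-difference step `h` are illustrative assumptions.

```python
import math

# Example graph (an assumption): two triangles joined by an edge.
ADJ = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]


def walk_step(adj, f):
    """Return f W for the random walk with W(u, v) = 1/d_u on edges."""
    out = [0.0] * len(adj)
    for u, nbrs in enumerate(adj):
        share = f[u] / len(nbrs)
        for v in nbrs:
            out[v] += share
    return out


def series(adj, f, weights):
    """Return sum_k weights[k] * f W^k for a finite list of weights."""
    total = [0.0] * len(adj)
    walk = list(f)
    for w in weights:
        total = [a + w * x for a, x in zip(total, walk)]
        walk = walk_step(adj, walk)
    return total


def pagerank(adj, alpha, f, terms=400):
    return series(adj, f, [alpha * (1.0 - alpha) ** k for k in range(terms)])


def heat_kernel_pagerank(adj, t, f, terms=60):
    weights, w = [], math.exp(-t)
    for k in range(terms):
        weights.append(w)
        w *= t / (k + 1)
    return series(adj, f, weights)


def recurrence_residual(adj, alpha, f):
    """Largest entry of |pr - (alpha f + (1 - alpha) pr W)|."""
    pr = pagerank(adj, alpha, f)
    rhs = [alpha * a + (1.0 - alpha) * b for a, b in zip(f, walk_step(adj, pr))]
    return max(abs(a - b) for a, b in zip(pr, rhs))


def heat_equation_residual(adj, t, f, h=1e-4):
    """Largest entry of |d rho / dt + rho (I - W)|, with a central difference in t."""
    plus = heat_kernel_pagerank(adj, t + h, f)
    minus = heat_kernel_pagerank(adj, t - h, f)
    rho = heat_kernel_pagerank(adj, t, f)
    lhs = [(a - b) / (2.0 * h) for a, b in zip(plus, minus)]
    rhs = [-(r - rw) for r, rw in zip(rho, walk_step(adj, rho))]
    return max(abs(a - b) for a, b in zip(lhs, rhs))


if __name__ == "__main__":
    seed = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    print(recurrence_residual(ADJ, 0.15, seed))
    print(heat_equation_residual(ADJ, 2.0, seed))
```

Both residuals should be small: the first is limited by the series truncation, the second by the finite-difference error of order h².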

It is known that a random walk converges to a unique stationary distribution if it is irreducible and aperiodic. In graph-theoretical terms, a random walk on a graph G converges to a stationary distribution π if G is connected and nonbipartite, with π satisfying π(u) = d_u/Σ_v d_v.

From the above definition, we have the following immediate facts for $\rho_{t,f}$. Namely, $\rho_{0,f} = f$, $\rho_{t,\pi} = \pi$, and $\rho_{t,f}\mathbf{1}^* = f\mathbf{1}^* = 1$ if f satisfies Σ_v f(v) = 1. Here, 1 denotes the all 1's function and x* denotes the transpose of x. Also, we have $DH_t = H_t^* D$ and $H_t = H_{t/2}H_{t/2} = H_{t/2}D^{-1}H_{t/2}^* D$.
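These facts are easy to confirm numerically on a small example. The sketch below builds the rows of H_t as heat kernel pageranks of indicator seeds and checks that π is fixed, that total mass is preserved, and that d_u H_t(u, v) = d_v H_t(v, u), which is the entrywise form of DH_t = H_t* D. The graph, function names, and the 60-term truncation are assumptions for illustration.

```python
import math

# Example graph (an assumption): two triangles joined by an edge.
ADJ = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]


def walk_step(adj, f):
    """Return f W for the random walk with W(u, v) = 1/d_u on edges."""
    out = [0.0] * len(adj)
    for u, nbrs in enumerate(adj):
        share = f[u] / len(nbrs)
        for v in nbrs:
            out[v] += share
    return out


def heat_kernel_pagerank(adj, t, f, terms=60):
    """Truncated version of Eq. 2: e^{-t} sum_{k < terms} (t^k / k!) f W^k."""
    total = [0.0] * len(adj)
    walk, weight = list(f), math.exp(-t)
    for k in range(terms):
        total = [a + weight * w for a, w in zip(total, walk)]
        walk = walk_step(adj, walk)
        weight *= t / (k + 1)
    return total


def stationary(adj):
    """pi(u) = d_u / sum_v d_v."""
    total = sum(len(nbrs) for nbrs in adj)
    return [len(nbrs) / total for nbrs in adj]


def heat_kernel_matrix(adj, t):
    """Row u of H_t is the heat kernel pagerank with seed chi_u."""
    n = len(adj)
    rows = []
    for u in range(n):
        chi = [0.0] * n
        chi[u] = 1.0
        rows.append(heat_kernel_pagerank(adj, t, chi))
    return rows


if __name__ == "__main__":
    pi = stationary(ADJ)
    print(heat_kernel_pagerank(ADJ, 1.5, pi))  # should reproduce pi
    H = heat_kernel_matrix(ADJ, 1.5)
    print(len(ADJ[0]) * H[0][3], len(ADJ[3]) * H[3][0])  # should agree
```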

To approximate the heat kernel pagerank, one might choose an additive approximation by taking a finite sum (cf. Eq. 2). If one prefers a multiplicative approximation, there is a formula, given by Euler (11), as a sum of two infinite products:

graphic file with name zpq05007-8298-m07.jpg

Isoperimetric Properties of the Heat Kernel

For a subset S of vertices in G, the volume of S, denoted by vol(S), is $\sum_{u \in S} d_u$. Similarly, the volume of the graph G is $\mathrm{vol}(G) = \sum_u d_u$. The edge boundary of S, denoted by ∂S, is defined by

$$\partial S = \{\{u, v\} \in E : u \in S \text{ and } v \notin S\}.$$

Let S̄ = V \S denote the complement of S. Clearly, ∂S = ∂S̄. The Cheeger ratio of S, denoted by h_S, is defined by

$$h_S = \frac{|\partial S|}{\min\{\mathrm{vol}(S), \mathrm{vol}(\bar{S})\}},$$

and the Cheeger constant of a graph G is $h_G = \min_{S \subseteq V} h_S$.

For a given set S, we consider the distribution f_S with f_S(u) = d_u/vol(S) if u ∈ S, and 0 otherwise. Note that f_S can be written as $\frac{1}{\mathrm{vol}(S)} \chi_S D$, where χ_S is the indicator function for S. For any function g : V → ℝ, we define $g(S) = \sum_{v \in S} g(v)$.
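The quantities just defined translate directly into code. The following sketch computes the volume, edge boundary, and Cheeger ratio of a set, the distribution f_S, and (by brute force over all proper nonempty subsets, feasible only for tiny graphs) the Cheeger constant. The graph and all function names are assumptions for illustration; the Cheeger ratio is taken as |∂S| divided by the smaller of vol(S) and vol(S̄).

```python
from itertools import combinations

# Example graph (an assumption): two triangles joined by the edge {2, 3}.
ADJ = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]


def volume(adj, S):
    """vol(S): sum of degrees of the vertices in S."""
    return sum(len(adj[u]) for u in S)


def edge_boundary(adj, S):
    """Edges with exactly one endpoint in S, listed as (inside, outside) pairs."""
    S = set(S)
    return [(u, v) for u in S for v in adj[u] if v not in S]


def cheeger_ratio(adj, S):
    """h_S = |boundary of S| / min(vol(S), vol(complement of S))."""
    S = set(S)
    vol_S = volume(adj, S)
    vol_rest = volume(adj, range(len(adj))) - vol_S
    return len(edge_boundary(adj, S)) / min(vol_S, vol_rest)


def cheeger_constant(adj):
    """Brute-force minimum of h_S over proper nonempty subsets (exponential time)."""
    n = len(adj)
    return min(
        cheeger_ratio(adj, S)
        for size in range(1, n)
        for S in combinations(range(n), size)
    )


def seed_distribution(adj, S):
    """f_S(u) = d_u / vol(S) for u in S, and 0 otherwise."""
    S = set(S)
    vol_S = volume(adj, S)
    return [len(adj[u]) / vol_S if u in S else 0.0 for u in range(len(adj))]


if __name__ == "__main__":
    print(cheeger_ratio(ADJ, {0, 1, 2}), cheeger_constant(ADJ))
    print(seed_distribution(ADJ, {0, 1, 2}))
```

On this example graph the single bridge edge separates the two triangles, so the set {0, 1, 2} has one boundary edge and volume 7.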

We will use the heat equation to derive the following isoperimetric inequality for the heat kernel pagerank:

Lemma 1.

For a subset S with vol(S) ≤ vol(G)/2, we have

where the sum is over all unordered pairs of vertices {u, v} in E and

To prove this, we see that

graphic file with name zpq05007-8298-m12.jpg

Here, we use the fact (see ref. 12) that, for any f, g : V → ℝ, we have

In a similar way, it can be easily checked that $\frac{\partial^2}{\partial t^2} \rho_{t,f_S}(S) \ge 0$. Therefore,

as desired.
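The displayed inequalities of Lemma 1 are not reproduced in this text, so the following check rests on a stated assumption: that the convexity argument yields −h_S ≤ ∂/∂t ρ_{t,f_S}(S) ≤ 0 for t ≥ 0, with equality on the left at t = 0. The sketch evaluates the derivative through the heat equation, as −(ρ_{t,f_S}(I − W))(S), on a small example. The graph, the function names, and the sampled values of t are illustrative.

```python
import math

# Example graph (an assumption): two triangles joined by an edge.
ADJ = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]


def walk_step(adj, f):
    """Return f W for the random walk with W(u, v) = 1/d_u on edges."""
    out = [0.0] * len(adj)
    for u, nbrs in enumerate(adj):
        share = f[u] / len(nbrs)
        for v in nbrs:
            out[v] += share
    return out


def heat_kernel_pagerank(adj, t, f, terms=80):
    """Truncated version of Eq. 2."""
    total = [0.0] * len(adj)
    walk, weight = list(f), math.exp(-t)
    for k in range(terms):
        total = [a + weight * w for a, w in zip(total, walk)]
        walk = walk_step(adj, walk)
        weight *= t / (k + 1)
    return total


def lemma_quantities(adj, S):
    """Return (h_S, f_S) for the set S."""
    S = set(S)
    vol_S = sum(len(adj[u]) for u in S)
    vol_G = sum(len(nbrs) for nbrs in adj)
    cut = sum(1 for u in S for v in adj[u] if v not in S)
    h_S = cut / min(vol_S, vol_G - vol_S)
    f_S = [len(adj[u]) / vol_S if u in S else 0.0 for u in range(len(adj))]
    return h_S, f_S


def mass_derivative(adj, S, t):
    """d/dt of rho_{t, f_S}(S), computed as -(rho (I - W))(S) via the heat equation."""
    _, f_S = lemma_quantities(adj, S)
    rho = heat_kernel_pagerank(adj, t, f_S)
    rho_w = walk_step(adj, rho)
    return -sum(rho[v] - rho_w[v] for v in S)


if __name__ == "__main__":
    S = {0, 1, 2}
    h_S, _ = lemma_quantities(ADJ, S)
    for t in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(t, mass_derivative(ADJ, S, t), -h_S)
```

At t = 0 the derivative is exactly the fraction of f_S-mass that leaves S in one step of the walk, which equals h_S when vol(S) ≤ vol(G)/2.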

A Mixing Inequality for the Heat Kernel Pagerank

For a function f : V → ℝ, we order the vertices of G so that

$$\frac{f(v_1)}{d_{v_1}} \ge \frac{f(v_2)}{d_{v_2}} \ge \cdots \ge \frac{f(v_n)}{d_{v_n}}.$$

Let S_i denote the set consisting of v₁, …, v_i. Let h_f denote the least Cheeger ratio h_{S_i} over all S_i. We say that h_f is the Cheeger ratio determined by a sweep of f. Our goal is to establish a rapid mixing estimate for the heat kernel pagerank in terms of the associated Cheeger ratios. For a vertex u, we consider $\rho_{t,\chi_u}$, which will also be written as $\rho_{t,u}$.
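A sweep can be sketched as follows: sort the vertices by f(v)/d_v in decreasing order, grow the prefix sets S_i one vertex at a time while updating the volume and the number of boundary edges incrementally, and keep the prefix with the smallest Cheeger ratio. The graph, the function names, and the decision to skip the full prefix S_n = V (whose complement is empty) are assumptions made for this illustration.

```python
import math

# Example graph (an assumption): two triangles joined by an edge.
ADJ = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]


def walk_step(adj, f):
    """Return f W for the random walk with W(u, v) = 1/d_u on edges."""
    out = [0.0] * len(adj)
    for u, nbrs in enumerate(adj):
        share = f[u] / len(nbrs)
        for v in nbrs:
            out[v] += share
    return out


def heat_kernel_pagerank(adj, t, f, terms=60):
    """Truncated version of Eq. 2."""
    total = [0.0] * len(adj)
    walk, weight = list(f), math.exp(-t)
    for k in range(terms):
        total = [a + weight * w for a, w in zip(total, walk)]
        walk = walk_step(adj, walk)
        weight *= t / (k + 1)
    return total


def sweep(adj, f):
    """Return (h_f, S_i) for the best prefix of the ordering by f(v)/d_v."""
    n = len(adj)
    order = sorted(range(n), key=lambda v: f[v] / len(adj[v]), reverse=True)
    total_vol = sum(len(nbrs) for nbrs in adj)
    in_set = [False] * n
    vol = cut = 0
    best_ratio, best_set = math.inf, []
    for i, v in enumerate(order[:-1]):  # skip S_n = V, whose complement is empty
        inside = sum(1 for w in adj[v] if in_set[w])
        in_set[v] = True
        vol += len(adj[v])
        # Edges from v to the outside join the cut; edges from S to v leave it.
        cut += len(adj[v]) - 2 * inside
        ratio = cut / min(vol, total_vol - vol)
        if ratio < best_ratio:
            best_ratio, best_set = ratio, order[: i + 1]
    return best_ratio, best_set


if __name__ == "__main__":
    seed = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    rho = heat_kernel_pagerank(ADJ, 3.0, seed)
    print(sweep(ADJ, rho))
```

The incremental update keeps the cost of the sweep, after sorting, proportional to the total degree of the vertices visited.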

Theorem 1.

In a graph G, for t ≥ 0, the heat kernel pagerank satisfies

where κ_t is the minimum of the Cheeger ratios obtained by sweeps of $\rho_{t,w}$ over all vertices w in G.

The theorem follows from the following, slightly stronger, statement:

Theorem 2.

where $\kappa_{t,u}$ is the minimum Cheeger ratio determined by a sweep of $\rho_{t,u}$.

Proof: We note that

graphic file with name zpq05007-8298-m18.jpg

because $\rho_{t,u}(u) - \pi(u) = d_u \sum_w \left(\rho_{t/2,u}(w) - \pi(w)\right)^2 d_w^{-1}$.

It is enough to show that

We use Lemma 1 and consider the following:

graphic file with name zpq05007-8298-m20.jpg

Now we relabel all the vertices so that $\rho_{t/2,u}(v_1)/d_{v_1} \ge \rho_{t/2,u}(v_2)/d_{v_2} \ge \cdots \ge \rho_{t/2,u}(v_n)/d_{v_n}$. Let τ be the largest integer such that vol(S_τ) ≤ vol(G)/2. We note that

graphic file with name zpq05007-8298-m21.jpg

Therefore, we have

graphic file with name zpq05007-8298-m22.jpg

We consider two functions, f₊ and f₋, defined by

graphic file with name zpq05007-8298-m23.jpg

We also define

graphic file with name zpq05007-8298-m24.jpg

Then we have

graphic file with name zpq05007-8298-m25.jpg

Note that each of the above two sums, involving f₊ and f₋, respectively, is nontrivial. Without loss of generality, we may assume that the minimum is achieved by the sum involving f₊. Therefore, we have

graphic file with name zpq05007-8298-m26.jpg

Let $\widetilde{\mathrm{vol}}(S_i)$ denote the minimum of vol(S_i) and vol(S̄_i). Then we have

graphic file with name zpq05007-8298-m28.jpg

by using the convention that S₀ = ∅. This implies that

By solving the above equation, we have

Because $\rho_{t,u}(u) \le 1$ and $\lim_{t \to \infty} \rho_{t,u}(u) - \pi(u) = 0$, we can choose c₁ = 0 and c₂ = 1. This completes the proof of Theorems 1 and 2.

A Cheeger Inequality Using Heat Kernel Pagerank

The classical Cheeger inequality states that

$$2 h_G \ge \lambda_1 \ge \frac{\alpha_G^2}{2} \ge \frac{h_G^2}{2},$$

where α_G denotes the minimum Cheeger ratio obtained by a sweep over an eigenvector associated with the spectral gap λ₁ of the (normalized) Laplacian (12). Here we will give a local version of the Cheeger inequality, which relates the Cheeger ratio of a subset to the heat kernel pagerank with seeds as the vertices in the subset.

Theorem 3.

In a graph G, for a subset S of vertices in G with vol(S) ≤ vol(G)/2 and a real value t ≥ 0, the Cheeger ratio of S satisfies the following:

where $\kappa_{t,S}$ denotes the minimum Cheeger ratio over all sweeps of $\rho_{t/2,u}$ for all u ∈ S and log is the natural logarithm.

Proof: From Theorem 2, we have

graphic file with name zpq05007-8298-m33.jpg

Next, we wish to establish a lower bound for ρ_{t,f_S} (S) − π (S). We want to show that

This implies that the first derivative of −log(ρ_{t,f_S}(S) − π(S)) is decreasing. If this is true, we can use Lemma 1 to get

graphic file with name zpq05007-8298-m35.jpg

This implies

Combining this with the lower bound in Eq. 8, we have

as claimed.

It remains to prove Eq. 9. We consider

graphic file with name zpq05007-8298-m38.jpg

It suffices to show that

This can be proved by using the Cauchy–Schwarz inequality as follows:

graphic file with name zpq05007-8298-m40.jpg

The proof is complete.

A Local Cheeger Inequality

In a large graph, given a seed and a target volume s of a set, the goal is to find a good cut separating a subset of volume at most s near the seed. It is desirable to have a local algorithm with a running time proportional to the target size s rather than to the total number of vertices in the graph. To do so, we cannot afford to consider the minimum Cheeger ratio of a full sweep of a function defined on all vertices of G. Instead, we define the s-local Cheeger ratio of a sweep f, denoted by $h_{f,s}$, to be the minimum Cheeger ratio of the segments S_i with 0 ≤ vol(S_i) ≤ 2s. If no such segment exists, then we set $h_{f,s}$ to be 0. We note that to compute the s-local Cheeger ratio, we can ignore most of the entries of f except for those with the largest values of f(u)/d_u with total volume not exceeding 2s. We will prove the following local Cheeger inequality, which is weaker than the previous Cheeger inequality by a small constant factor.
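The s-local Cheeger ratio of a sweep can be sketched with a function that only ever touches the support of f. In this illustration f is stored as a dictionary from vertices to values (a representation chosen here, not specified in the text), prefixes are grown until the volume would exceed 2s, and the value 0 is returned when no prefix qualifies, following the convention above. The graph, the function name, and the `total_volume` argument (passed in so the whole graph is never scanned) are assumptions.

```python
# Example graph (an assumption): two triangles joined by an edge; vol(G) = 14.
ADJ = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]


def local_sweep(adj, f, s, total_volume):
    """Return (h_{f,s}, best prefix) using only prefixes S_i with vol(S_i) <= 2s.

    f is a dict {vertex: value}; vertices missing from f are never visited.
    """
    order = sorted(f, key=lambda v: f[v] / len(adj[v]), reverse=True)
    in_set = set()
    vol = cut = 0
    best_ratio, best_set, found = 0.0, [], False
    for i, v in enumerate(order):
        d = len(adj[v])
        if vol + d > 2 * s:
            break  # the next prefix would exceed the volume budget
        inside = sum(1 for w in adj[v] if w in in_set)
        in_set.add(v)
        vol += d
        cut += d - 2 * inside
        denom = min(vol, total_volume - vol)
        if denom <= 0:
            break  # prefix covers the whole graph
        ratio = cut / denom
        if not found or ratio < best_ratio:
            best_ratio, best_set, found = ratio, order[: i + 1], True
    return best_ratio, best_set


if __name__ == "__main__":
    f = {0: 0.5, 1: 0.3, 2: 0.15, 3: 0.05}  # hypothetical sparse vector
    for s in (0.5, 3, 3.5):
        print(s, local_sweep(ADJ, f, s, total_volume=14))
```

With a budget of 2s = 6 the sweep stops before the whole triangle {0, 1, 2} (volume 7) is included, which shows how the volume cap changes the answer.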

Theorem 4.

In a graph G with a subset S of volume s, with s ≤ vol(G)/4, for any vertex u in G, we have

$$\rho_{t,u}(S) - \pi(S) \le \sqrt{\frac{s}{d_u}}\; e^{-t \kappa_{t,u,s}^2 / 4},$$

where $\kappa_{t,u,s}$ denotes the minimum s-local Cheeger ratio of cuts over a sweep of $\rho_{t,u}$ that separate sets of volume between 0 and 2s.

Proof: For a function f: V → ℝ, we define f(u, v) = f(u)/d_u if u is adjacent to v and 0 otherwise. For an integer x, 0 ≤ x ≤ vol(G)/2, we define

$$f(x) = \max_{\substack{T \subseteq \{(u,v) \,:\, \{u,v\} \in E\} \\ |T| = x}} \; \sum_{(u,v) \in T} f(u, v).$$

We can extend f to all real x = k + r, with 0 ≤ r < 1, k ∈ ℤ⁺ by defining f(x) = (1 − r)f(k) + rf(k + 1). If x = vol(S_i), where S_i consists of vertices with the i highest values of f(u)/d_u, then it follows from the definition that $f(x) = \sum_{u \in S_i} f(u)$. Also, f(x) is concave in x.
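The curve f(x) can be tabulated from its breakpoints. The sketch below assumes the reading in which each vertex u contributes d_u ordered pairs of value f(u)/d_u, so that f(vol(S_i)) equals the sum of f over S_i and the curve is linear between consecutive breakpoints. The graph, the example vector, and the function names are illustrative assumptions.

```python
# Example graph (an assumption): two triangles joined by an edge.
ADJ = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]


def curve_breakpoints(adj, f):
    """Return the points (vol(S_i), f(S_i)) for i = 0, ..., n along the sweep order."""
    order = sorted(range(len(adj)), key=lambda v: f[v] / len(adj[v]), reverse=True)
    points = [(0, 0.0)]
    vol, mass = 0, 0.0
    for v in order:
        vol += len(adj[v])
        mass += f[v]
        points.append((vol, mass))
    return points


def curve_value(points, x):
    """Evaluate the piecewise linear curve at a real x by linear interpolation."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            r = (x - x0) / (x1 - x0)
            return (1.0 - r) * y0 + r * y1
    raise ValueError("x lies outside [0, vol(G)]")


def slopes(points):
    """Slopes of consecutive segments; concavity means they never increase."""
    return [(y1 - y0) / (x1 - x0) for (x0, y0), (x1, y1) in zip(points, points[1:])]


if __name__ == "__main__":
    f = [0.4, 0.25, 0.2, 0.1, 0.05, 0.0]  # hypothetical distribution
    pts = curve_breakpoints(ADJ, f)
    print(pts)
    print(slopes(pts))
```

Each segment has slope f(v_i)/d_{v_i}, so sorting by that ratio is exactly what makes the curve concave.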

We consider the lazy walk $\mathcal{W} = (I + W)/2$. Then

graphic file with name zpq05007-8298-m43.jpg

This can be straightforwardly extended to real x with 0 ≤ x ≤ vol(G)/2. In particular, we focus on x satisfying 0 ≤ x ≤ 2s ≤ vol(G)/2 and we choose $f_t = \rho_{t,u} - \pi$. Then

We now consider for x ∈ [0, 2s],

graphic file with name zpq05007-8298-m45.jpg

by the concavity of f_t. Suppose g_t(x) is a solution of the equation in Eq. 11 satisfying f₀(x) ≤ g₀(x), f_t(0) = g_t(0), and $\frac{\partial}{\partial t} f_t(x)|_{t=0} \le \frac{\partial}{\partial t} g_t(x)|_{t=0}$. Then, we have f_t(x) ≤ g_t(x). It is easy to check that $g_t(x) \le e^{-t \kappa_{t,u,s}^2/4} \sqrt{x/d_u}$ using $-2 + \sqrt{1 + x} + \sqrt{1 - x} \le -x^2/4$. Thus,

$$\rho_{t,u}(S) - \pi(S) \le f_t(s) \le \sqrt{\frac{s}{d_u}}\; e^{-t \kappa_{t,u,s}^2/4},$$

as desired.
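The elementary inequality −2 + √(1 + x) + √(1 − x) ≤ −x²/4 used in this proof holds for 0 ≤ x ≤ 1, since the Taylor expansion of the left side is −x²/4 − 5x⁴/64 − …, with every further term negative. A quick numerical check on a grid is sketched below; the function names, the grid resolution, and the floating-point tolerance are assumptions.

```python
import math


def lhs(x):
    """-2 + sqrt(1 + x) + sqrt(1 - x), defined for 0 <= x <= 1."""
    return -2.0 + math.sqrt(1.0 + x) + math.sqrt(1.0 - x)


def rhs(x):
    """-x^2 / 4."""
    return -x * x / 4.0


def max_violation(samples=1000):
    """Largest value of lhs - rhs on an even grid over [0, 1]; should not exceed 0."""
    return max(lhs(k / samples) - rhs(k / samples) for k in range(samples + 1))


if __name__ == "__main__":
    # Up to rounding, the maximum is attained at x = 0, where both sides vanish.
    print(max_violation())
```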

Let h_s denote the minimum Cheeger ratio h_S over all subsets S with 0 ≤ vol(S) ≤ 2s. Also, let $\kappa_{t,2s}$ denote the minimum of $\kappa_{t,u,2s}$ over all u. Combining Theorem 4 and Eq. 10, we have

As an immediate consequence, we have

Theorem 5.

For s ≤ vol(G)/4, we have

A Local Partition Algorithm

The Cheeger inequalities are closely associated with graph partition algorithms, which have applications in a wide range of areas, in particular for divide-and-conquer approaches (ref. 13, see also ref. 14). The spectral partition algorithm using eigenvectors has a long history and is widely used. However, it has several disadvantages. For example, the spectral partition algorithm exercises no control over the size of the small part of the partition (although it can be used recursively to achieve a partition of a desired proportion). In a large graph with hundreds of thousands of nodes, it is prohibitively costly to compute eigenvectors. For very large graphs, it is imperative to develop local partition algorithms that can reduce the cost to be proportional to the volume of the smaller separated part of the cut.

A local partition algorithm has inputs including a vertex as the seed, the volume s of the target set, and a target value φ for the Cheeger ratio of the target set. The local Cheeger inequality in Theorem 4 suggests the following local partition algorithm. To find the set achieving the minimum s-local Cheeger ratio, one can simply consider a sweep of heat kernel pageranks with further restrictions to the cuts with smaller parts of volume between 0 and 2s.

How fast is the above local partition algorithm? The running time is basically dominated by the running time of computing the heat kernel pagerank with a seed. Indeed, it is enough to find an approximation of the pagerank with a finite support (no more than 2s).
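The pipeline described above can be sketched end to end. The text only states that a finite-support approximation of the heat kernel pagerank suffices; the particular approximation used here (a truncated series stored sparsely, so that only vertices within `terms` steps of the seed are touched) is an assumption of this sketch and not the paper's prescribed method. The graph, function names, and parameter values are likewise illustrative.

```python
import math

# Example graph (an assumption): two triangles joined by an edge; vol(G) = 14.
ADJ = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]


def sparse_heat_kernel_pagerank(adj, seed, t, terms):
    """Truncated series of Eq. 2 from a single seed, stored as {vertex: value}."""
    rho = {}
    walk = {seed: 1.0}
    weight = math.exp(-t)
    for k in range(terms):
        for v, mass in walk.items():
            rho[v] = rho.get(v, 0.0) + weight * mass
        nxt = {}
        for u, mass in walk.items():
            share = mass / len(adj[u])
            for v in adj[u]:
                nxt[v] = nxt.get(v, 0.0) + share
        walk = nxt
        weight *= t / (k + 1)
    return rho


def local_partition(adj, seed, s, t, total_volume, terms=30):
    """Sweep the approximate pagerank over prefixes of volume at most 2s."""
    rho = sparse_heat_kernel_pagerank(adj, seed, t, terms)
    order = sorted(rho, key=lambda v: rho[v] / len(adj[v]), reverse=True)
    in_set = set()
    vol = cut = 0
    best_ratio, best_set = math.inf, []
    for i, v in enumerate(order):
        d = len(adj[v])
        if vol + d > 2 * s:
            break
        inside = sum(1 for w in adj[v] if w in in_set)
        in_set.add(v)
        vol += d
        cut += d - 2 * inside
        denom = min(vol, total_volume - vol)
        if denom <= 0:
            break
        if cut / denom < best_ratio:
            best_ratio, best_set = cut / denom, order[: i + 1]
    if not best_set:
        return 0.0, []  # convention: ratio 0 when no prefix fits the budget
    return best_ratio, best_set


if __name__ == "__main__":
    print(local_partition(ADJ, seed=0, s=3.5, t=3.0, total_volume=14))
```

A production version would also bound the support during the series itself, for instance by discarding negligible entries; that refinement is deliberately left out of this sketch.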

How good is the above local partition algorithm? The following theorem shows that there are many seeds (with total volume at least half of S) so that the heat kernel pagerank with such a seed will find a partition with Cheeger ratio at most of order $\sqrt{h_s \log s}$. We omit the proof here.

Theorem 6.

In a graph G, for a set S with volume s ≤ vol(G)/4, and Cheeger ratio h_S ≤ φ², there is a subset S′ ⊆ S with vol(S′) ≥ vol(S)/2 such that for any u ∈ S′, the sweep by using the heat kernel pagerank $\rho_{t,u}$, with t = [φ⁻²/4], will find a set T with s-local Cheeger ratio at most $φ \sqrt{\log s}$.

We remark that another version of a local algorithm involves restriction to a specified subset and its boundary, which is usually called the Dirichlet boundary problem. A variation of a local Cheeger inequality involving Dirichlet eigenvalues is examined in ref. 15. In this paper, we considered heat kernel pagerank without any specified boundary condition.

Summary

We introduced the heat kernel pagerank for a graph and established a local Cheeger inequality. This local Cheeger inequality establishes the relations between the Cheeger ratio of a set and the local Cheeger ratios over the sweeps of heat kernel pageranks. Consequently, it leads to a local partition algorithm using heat kernel pagerank with cost proportional to the volume of the smaller separated part. If there is a subset S of vertices with volume s and Cheeger ratio h_S, our algorithm using heat kernel pagerank generates a set with volume between 0 and 2s and Cheeger ratio at most of order $\sqrt{h_S \log s}$. This local partition algorithm can also be used as a subroutine for declustering algorithms or for finding balanced partitions.

Acknowledgments

This work was supported in part by National Science Foundation Grants DMS 0457215 and ITR 0426858.

Footnotes

The author declares no conflict of interest.

This article is a PNAS Direct Submission.

References

1. Kleinberg J. J Assoc Comput Machinery. 1999;46:604–632.
2. Brin S, Page L. Comput Networks ISDN Syst. 1998;30:107–117.
3. Andersen R, Chung F, Lang K. Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2006); New York: IEEE; 2006. pp. 475–486.
4. Schoen RM, Yau S-T. Differential Geometry. Cambridge, MA: International; 1994.
5. Cheeger J. In: Problems in Analysis. Gunning RC, editor. Princeton: Princeton Univ. Press; 1970. pp. 195–199.
6. Spielman D, Teng S-H. Proceedings of the 36th Annual ACM Symposium on Theory of Computing; New York: ACM; 2004. pp. 81–90.
7. Lovász L, Simonovits M. 31st IEEE Annu Symp Found Comput Sci. 1990:346–354.
8. Lovász L, Simonovits M. Random Struct Algorithms. 1993;4:359–412.
9. Andersen R, Chung F, Lang K. Theory and Applications of Models of Computation, Proceedings of TAMC 2007; Berlin: Springer; 2007. pp. 1–12.
10. Chung F, Yau S-T. Electronic J Combinatorics. 1999;6:R12.
11. Euler L. Introductio in Analysin Infinitorum. Lausanne: Bousquet; 1748.
12. Chung F. Spectral Graph Theory. New York: AMS; 1997.
13. Chung F. Linear Algebra Appl. 2007;423:22–32.
14. Chung F. Proceedings of ICCM; Boston: International; 2007. in press.
15. Kannan R, Vempala S, Vetta A. J Assoc Comput Machinery. 2004;51:497–515.
