Author manuscript; available in PMC: 2018 Mar 1.
Published in final edited form as: IEEE Trans Automat Contr. 2016 Nov 9;62(9):4675–4682. doi: 10.1109/TAC.2016.2626796

Robust transport over networks

Yongxin Chen 1, Tryphon Georgiou 2, Michele Pavon 3, Allen Tannenbaum 4
PMCID: PMC5600536  NIHMSID: NIHMS851694  PMID: 28924302

Abstract

We consider transportation over a strongly connected, directed graph. The scheduling amounts to selecting transition probabilities for a discrete-time Markov evolution which is designed to be consistent with initial and final marginal constraints on mass transport. We address the situation where initially the mass is concentrated on certain nodes and needs to be transported in a certain time period to another set of nodes, possibly disjoint from the first. The random evolution is selected to be closest to a prior measure on paths in the relative entropy sense–such a construction is known as a Schrödinger bridge between the two given marginals. It may be viewed as an atypical stochastic control problem where the control consists in suitably modifying the prior transition mechanism. The prior can be chosen to incorporate constraints and costs for traversing specific edges of the graph, but it can also be selected to allocate equal probability to all paths of equal length connecting any two nodes (i.e., a uniform distribution on paths). This latter choice for prior transitions relies on the so-called Ruelle-Bowen random walker and gives rise to scheduling that tends to utilize all paths as uniformly as the topology allows. Thus, this Ruelle-Bowen law (𝔐RB) taken as prior, leads to a transportation plan that tends to lessen congestion and ensures a level of robustness. We also show that the distribution 𝔐RB on paths, which attains the maximum entropy rate for the random walker given by the topological entropy, can itself be obtained as the time-homogeneous solution of a maximum entropy problem for measures on paths (also a Schrödinger bridge problem, albeit with prior that is not a probability measure). Finally we show that the paradigm of Schrödinger bridges as a mechanism for scheduling transport on networks can be adapted to graphs that are not strongly connected, as well as to weighted graphs. 
In the latter case, our approach may be used to design a transportation plan which effectively compromises between robustness and other criteria such as cost. Indeed, we explicitly provide a robust transportation plan which assigns maximum probability to minimum cost paths and therefore compares favourably with Optimal Mass Transportation strategies.

I. Introduction

Transport over networks has been the focus of a rapidly expanding literature due to its intrinsic relevance in a wide range of applications that include power transmission, traffic, financial transactions, biological systems and so on [1], [2], [3], [4]. Furthermore, the topic relates to a host of other questions pertaining to the connectivity of graphs and the relative significance of their nodes as in the Google PageRank problem [5] and the study of interaction between genes in biological networks [6].

Our starting point is an important insight on the relation between the topological structure of a network and the entropy rate of a random walker on the graph [7], [8]. As it turns out, there is a unique way to specify transition probabilities at each node in such a way so that all paths of equal length joining any two particular nodes have equal probability. Thereby, a measure is placed on the family of paths between graph nodes that maximizes the entropy rate of a random walker, and this is a characteristic of the network. So far, the use of this concept has been to assign significance to each node in relation to the corresponding occupancy stationary distribution (centrality measures).

The focus in our paper is on how to schedule transportation plans across a network. The novel framework that we propose is that of the so-called Schrödinger bridge problem, where a flow is specified in agreement with an initial and a final marginal distribution on the nodes while, at the same time, the probability law on the paths is the closest possible to a prior in the relative entropy sense. The Ruelle-Bowen random walk provides a natural notion of “uniform” prior which gives equal importance to all paths. As a result, the transportation flow that is selected to agree with specified initial and final marginals tends to spread across all available paths as much as possible given the topological structure of the network. Thereby, such a flow leads to relatively low probability of conflict and congestion, and ensures a certain degree of inherent robustness of the transport plan. It is well appreciated that, typically, robustness, efficiency and cost are conflicting criteria when designing networks.

By extending our approach to weighted graphs, we show that the choice of a prior distribution may be used to ensure that the resulting transportation attains a satisfactory compromise between robustness and other criteria such as cost. Indeed, we exhibit a robust transportation plan which assigns maximum probability to all minimum cost paths. It appears attractive when compared to Optimal Mass Transportation strategies which are not necessarily robust and where the minimum cost of transportation between any two nodes is supposed to be given. Thus, the approach to scheduling transport based on Schrödinger bridges affords great flexibility. Moreover, it appears computationally attractive in view of the iterative algorithm proposed in [9].

The paper is outlined as follows. In Section II, we present the solution to a general Schrödinger bridge problem (SBP), where the prior measure is not necessarily a probability measure, as a straightforward extension of the results in [10], [9]. Section III is devoted to solutions of the SBP with equal initial and final marginals which have a time-invariant transition mechanism so that they admit invariant measures. We establish the surprising result (Theorem 3.4) that there is only one such bridge. This measure on paths can be constructed generalizing a classical result by Parry [11]. In Section IV, considering the special case of a prior transition given by the adjacency matrix, we describe the most important features of the Ruelle-Bowen random walker along the lines of [7]. We observe that this measure 𝔐RB on trajectories can be viewed as a solution to a “time-homogeneous” Schrödinger bridge problem where the prior transition mechanism is given by the adjacency matrix. Section V describes our procedure to produce a robust transportation plan over a given strongly connected network: We take the Ruelle-Bowen distribution 𝔐RB as prior in a Schrödinger bridge problem with prescribed initial and final marginals. We also prove that the optimal transportation can also be obtained in one step by taking the rescaled adjacency matrix as prior transition mechanism (Proposition 5.2). In Section VI, we outline the extension of our approach to the cases of weighted and not strongly connected graphs. Finally, in Section VII we illustrate our approach on a simple unweighted and weighted graph.

II. The Discrete Schrödinger bridge problem

We first describe the “ingredients” of the discrete Schrödinger Bridge problem (SBP) considered in [10], [9]. In fact, we will consider a slight generalization, where the “prior” is not necessarily a probability law. The goal is to determine a time-evolution of probability distributions νt(·) having support on a discrete space

$$\mathcal{X} = \{1, \ldots, n\},$$

e.g., the nodes of a network, over a time-indexing set

$$\mathcal{T} = \{0, 1, \ldots, N\}$$

in a way such that it matches the specified marginal distributions ν0(·) and νN(·) and the resulting random evolution is closest to the “prior” in a suitable sense. Regarding notation, we use μt(·), νt(·) for distributions, where typically, μ relates to a “prior” law while ν represents a “new” distribution with end-points specified and obtained by solving the SBP.

The prior law is induced by the Markovian evolution

$$\mu_{t+1}(x_{t+1}) = \sum_{x_t \in \mathcal{X}} \mu_t(x_t)\, m_{x_t x_{t+1}} \tag{1}$$

for nonnegative distributions μt(·) over 𝒳 with t ∈ 𝒯. Throughout, we assume that mij ≥ 0 for all indices i, j ∈ 𝒳 and for simplicity, for the most part, that the matrix

$$M = [m_{ij}]_{i,j=1}^{n}$$

does not depend on t. In this case, we will often assume that all entries of M^N are positive. The rows of the transition matrix M do not necessarily sum up to one, in which case the “total transported mass” is not necessarily preserved. This is the case, in particular, of a Markov chain with “creation” and “killing”. In fact, M may simply encode the topological structure of a directed network with mij being zero or one, depending on whether a certain transition is allowed.

The evolution (1), together with measure μ0(·), which we assume positive on 𝒳, i.e.,

$$\mu_0(x) > 0 \quad \text{for all } x \in \mathcal{X}, \tag{2}$$

induces a measure 𝔐 on 𝒳^{N+1} as follows. It assigns to a path x = (x0, x1, … , xN) ∈ 𝒳^{N+1} the value

$$\mathfrak{M}(x_0, x_1, \ldots, x_N) = \mu_0(x_0)\, m_{x_0 x_1} \cdots m_{x_{N-1} x_N}, \tag{3}$$

and gives rise to a flow of one-time marginals

$$\mu_t(x_t) = \sum_{x_\ell,\ \ell \neq t} \mathfrak{M}(x_0, x_1, \ldots, x_N), \quad t \in \mathcal{T}.$$

The “prior” distribution 𝔐 on the space of paths may be at odds with a pair of specified marginals ν0 and νN in that one or possibly both of the following hold:

$$\mu_0(x_0) \neq \nu_0(x_0), \quad \mu_N(x_N) \neq \nu_N(x_N).$$

We denote by 𝒫(ν0, νN) the family of probability distributions on 𝒳^{N+1} having the prescribed marginals. We seek a distribution in this set which is closest to the prior 𝔐 in a suitable entropic sense. To this end, let us first recall the definition of relative entropy for probability distributions.

Definition 2.1

The Relative Entropy between the probability distributions P and Q is

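$$\mathbb{D}(P \| Q) := \begin{cases} \sum_x P(x) \log \dfrac{P(x)}{Q(x)}, & \mathrm{Supp}(P) \subseteq \mathrm{Supp}(Q), \\ +\infty, & \mathrm{Supp}(P) \not\subseteq \mathrm{Supp}(Q), \end{cases}$$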

where, by definition, 0 · log 0 = 0 and the summation is over the common set where they are supported.
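For concreteness, Definition 2.1 can be evaluated directly on finite distributions. The following is a minimal Python sketch; the function name `relative_entropy` and the list-based representation are illustrative assumptions and not part of the original text. The second argument is allowed to be an unnormalized nonnegative measure, in line with the generalized prior considered in this section.

```python
import math


def relative_entropy(p, q):
    """D(P||Q) for finite measures given as equal-length lists.

    Returns +inf when P puts mass where Q has none. Terms with P(x) = 0
    contribute nothing, following the convention 0 * log 0 = 0.
    """
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0.0:
            continue            # 0 * log 0 = 0
        if qx == 0.0:
            return math.inf     # Supp(P) is not contained in Supp(Q)
        total += px * math.log(px / qx)
    return total
```

With the unnormalized reference Q = (1, 1) and P = (1/2, 1/2), this function returns −log 2, a small instance of the negative values that can occur when Q is not a probability distribution.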

As is well known, 𝔻(P||Q) is not symmetric and does not satisfy the triangle inequality. It does, however, satisfy 𝔻(P||Q) ≥ 0 and 𝔻(P||Q) = 0 if and only if Q = P, see, e.g., [12]. It can also be extended to positive measures that are not probability distributions. In fact, it is quite common to consider Q to be a uniform measure that may not be a probability measure, such as the Lebesgue measure or the stationary Wiener measure [13]. Naturally, while the value of 𝔻(P||Q) may turn out negative due to a mismatch of scaling, the relative entropy is always jointly convex. We view the prior 𝔐 (specified by M and μ0) in a similar manner, and consider the Schrödinger Bridge problem:

Problem 2.2

Determine

$$\mathfrak{M}^*[\nu_0, \nu_N] = \operatorname{argmin}\{\mathbb{D}(P \| \mathfrak{M}) \mid P \in \mathcal{P}(\nu_0, \nu_N)\}. \tag{4}$$

Provided all entries of M^N are positive, the problem has a solution, which is unique due to strict convexity. This is stated next.

Theorem 2.3

Assume that M^N has all positive elements. There exist nonnegative functions φ(·) and φ̂(·) on [0,N]×𝒳 satisfying for t ∈ [0,N − 1] the system

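$$\varphi(t, i) = \sum_j m_{ij}\, \varphi(t+1, j), \tag{5a}$$
$$\hat{\varphi}(t+1, j) = \sum_i m_{ij}\, \hat{\varphi}(t, i), \tag{5b}$$

with the boundary conditions

$$\varphi(0, x_0) \cdot \hat{\varphi}(0, x_0) = \nu_0(x_0), \tag{5c}$$
$$\varphi(N, x_N) \cdot \hat{\varphi}(N, x_N) = \nu_N(x_N), \tag{5d}$$

for all x0, xN ∈ 𝒳. Moreover, the solution 𝔐*[ν0, νN] to Problem 2.2 is unique and obtained by

$$\mathfrak{M}^*[\nu_0, \nu_N](x_0, \ldots, x_N) = \nu_0(x_0)\, \pi_{x_0 x_1}(0) \cdots \pi_{x_{N-1} x_N}(N-1),$$

where

$$\pi_{ij}(t) := m_{ij}\, \frac{\varphi(t+1, j)}{\varphi(t, i)}. \tag{6}$$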

Equation (6) specifies one-step transition probabilities that are well defined.

Proof

The argument in [10, Theorem 4.1] and [9, Section III] applies verbatim to this setting which is slightly more general in that M does not prescribe a probability kernel. The system (5a–5d) is known as a Schrödinger system. The existence of solution is shown in [9, Section III] by establishing that the composition

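$$\hat{\varphi}(0, x_0) \xrightarrow{(M^T)^N} \hat{\varphi}(N, x_N) \xrightarrow{(5d)} \varphi(N, x_N) \xrightarrow{M^N} \varphi(0, x_0) \xrightarrow{(5c)} \big(\hat{\varphi}(0, x_0)\big)_{\text{next}} \tag{7}$$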

is contractive in the Hilbert metric [14], [15], [16], [17]. The fact that πij(t) in (6) satisfy Σj πij(t) = 1 follows from (5a).

Notice that φ and φ̂ are unique up to multiplication of φ by a positive constant and division of φ̂ by the same constant. This is a direct consequence of the proof above as the Hilbert metric is a metric on the projective space. The statement of the theorem is analogous to results for the classical Schrödinger system (5) of diffusions that have been established by Fortet, Beurling, Jamison and Föllmer [18], [19], [20], [13]. The requirement that M^N has only positive entries can be slightly relaxed and replaced by a suitable condition that guarantees existence of solution for the particular ν0 and νN. The case when M is time-varying can also be readily established along the lines of [10, Theorem 4.1] and [9, Theorem 2].

Finally, to simplify the notation, let φ(t) and φ̂(t) denote the column vectors with components φ(t, i) and φ̂(t, i), respectively, with i ∈ 𝒳. In matrix form, (5a), (5b) and (6) read

$$\varphi(t) = M \varphi(t+1), \quad \hat{\varphi}(t+1) = M^T \hat{\varphi}(t), \tag{8a}$$

and

$$\Pi(t) = [\pi_{ij}(t)] = \operatorname{diag}(\varphi(t))^{-1} M \operatorname{diag}(\varphi(t+1)). \tag{8b}$$
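The fixed-point iteration behind the composition (7) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: M^N has positive entries, both marginals are strictly positive, and a fixed number of sweeps replaces a proper convergence test. The function names (`solve_bridge`, `push_forward`) are hypothetical and this is not the implementation of [9].

```python
def mat_vec(M, x):
    """Return M x for a square matrix M given as a list of rows."""
    n = len(M)
    return [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]


def mat_t_vec(M, x):
    """Return M^T x."""
    n = len(M)
    return [sum(M[i][j] * x[i] for i in range(n)) for j in range(n)]


def solve_bridge(M, nu0, nuN, N, sweeps=200):
    """Iterate the composition (7); return the transition matrices (8b).

    Assumes M^N has positive entries and nu0, nuN are strictly positive.
    """
    n = len(M)
    phi_hat0 = [1.0] * n
    for _ in range(sweeps):
        phi_hat = phi_hat0
        for _ in range(N):                                   # (5b)
            phi_hat = mat_t_vec(M, phi_hat)
        phi = [[nuN[x] / phi_hat[x] for x in range(n)]]      # (5d)
        for _ in range(N):                                   # (5a)
            phi.insert(0, mat_vec(M, phi[0]))
        phi_hat0 = [nu0[x] / phi[0][x] for x in range(n)]    # (5c)
    # phi[t] holds phi(t) for t = 0..N; build Pi(t) entrywise from (6).
    return [[[M[i][j] * phi[t + 1][j] / phi[t][i] for j in range(n)]
             for i in range(n)] for t in range(N)]


def push_forward(Pi, mu):
    """Propagate a distribution through the time-varying transitions."""
    n = len(mu)
    for P in Pi:
        mu = [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]
    return mu
```

For example, with M = [[1, 1], [1, 0]] and N = 3, pushing ν0 = (0.7, 0.3) through the returned matrices should reproduce a prescribed νN such as (0.4, 0.6) up to numerical tolerance, and every row of every Π(t) sums to one by (5a).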

III. Time-homogeneous bridges

In this section, we consider the case of Schrödinger bridge problems when the marginals are identical, namely, v0 = vN = v. In particular, we are interested in the case when the solution of the SBP corresponds to a time-homogeneous Markov evolution. Note that, from Theorem 2.3, 𝔐*[v, v] is in general time inhomogeneous. We first recall the following celebrated result on the spectral properties of nonnegative matrices [21].

Theorem 3.1 (Perron-Frobenius)

Let A = (aij) be an n × n matrix with nonnegative elements. Suppose there exists N such that A^N has only positive elements, and let λA be its spectral radius. Then

  1. λA > 0 is an eigenvalue of A;

  2. λA is a simple eigenvalue;

  3. there exists an eigenvector v corresponding to λA with strictly positive entries;

  4. v is the only non-negative eigenvector of A;

  5. let B = [bij] be an n × n matrix with nonnegative elements. If aij ≤ bij for all i, j ≤ n and A ≠ B, then λA < λB.

Since the nonnegative matrix M is such that M^N has only positive elements, by the above Perron-Frobenius Theorem, M has a unique positive eigenvalue λM which is equal to the spectral radius. Let ϕ and ϕ̂ be the corresponding right and left eigenvectors and denote their entries by ϕ(x) and ϕ̂(x) with x ∈ 𝒳, respectively. Then both have only positive entries. We normalize ϕ and ϕ̂ so that

$$\sum_{x \in \mathcal{X}} \phi(x)\, \hat{\phi}(x) = 1.$$

This leads to a special probability distribution

$$\bar{\nu}(x) = \phi(x)\, \hat{\phi}(x). \tag{9}$$

It turns out that ν̄ is the only probability measure such that the associated SBP has a time-homogeneous solution; we shall name it the time-homogeneous bridge associated with M. It admits the following variational characterization.

Proposition 3.2

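Let M be a nonnegative matrix such that M^N has only positive elements, and 𝔐 the measure on 𝒳^{N+1} given by (3) with μ0 satisfying (2). Then the solution to the Schrödinger bridge problem

$$\mathfrak{M}^*[\bar{\nu}, \bar{\nu}] = \operatorname{argmin}\{\mathbb{D}(P \| \mathfrak{M}) \mid P \in \mathcal{P}(\bar{\nu}, \bar{\nu})\}, \tag{10}$$

where ν̄ is as in (9), has the time-invariant transition matrix

$$\bar{\Pi} = \lambda_M^{-1} \operatorname{diag}(\phi)^{-1} M \operatorname{diag}(\phi) \tag{11}$$

and invariant measure ν̄.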

Proof

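Since ϕ and ϕ̂ are the right and left eigenvectors of M associated with eigenvalue λM, the nonnegative functions φ and φ̂ defined by

$$\varphi(t, x) = \lambda_M^{-t}\, \phi(x), \quad \hat{\varphi}(t, x) = \lambda_M^{t}\, \hat{\phi}(x)$$

satisfy the Schrödinger system (5): indeed, Mφ(t + 1) = λM^{−(t+1)} Mϕ = λM^{−t} ϕ = φ(t), and similarly for φ̂, while φ(t, x)φ̂(t, x) = ν̄(x) for every t. By Theorem 2.3, the solution 𝔐*[ν̄, ν̄] of the Schrödinger bridge problem (10) then has the transition matrix (see (8b))

$$\bar{\Pi} = \operatorname{diag}(\varphi(0))^{-1} M \operatorname{diag}(\varphi(1)) = \lambda_M^{-1} \operatorname{diag}(\phi)^{-1} M \operatorname{diag}(\phi),$$

which is exactly (11). Moreover, since

$$\bar{\Pi}^T \bar{\nu} = \lambda_M^{-1} \operatorname{diag}(\phi)\, M^T \hat{\phi} = \bar{\nu},$$

it follows that ν̄ is the corresponding invariant measure.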
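Proposition 3.2 is easy to check numerically on a small matrix. The sketch below approximates the Perron eigenvectors by power iteration, a choice made here purely for illustration and valid under the standing assumption that some power of M is entrywise positive. The names `perron` and `homogeneous_bridge` are hypothetical.

```python
def perron(M, iters=500):
    """Power iteration for a nonnegative matrix with a positive power.

    Returns an approximation of the spectral radius and of the positive
    right eigenvector, normalized to sum to one.
    """
    n = len(M)
    x = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        y = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = sum(y)              # x sums to one, so sum(M x) estimates lambda
        x = [yi / lam for yi in y]
    return lam, x


def homogeneous_bridge(M):
    """Return (lambda_M, Pi_bar, nu_bar) following (9) and (11)."""
    n = len(M)
    Mt = [[M[j][i] for j in range(n)] for i in range(n)]
    lam, phi = perron(M)          # right eigenvector
    _, phi_hat = perron(Mt)       # left eigenvector
    scale = sum(phi[x] * phi_hat[x] for x in range(n))
    phi_hat = [p / scale for p in phi_hat]      # sum of phi * phi_hat is 1
    nu_bar = [phi[x] * phi_hat[x] for x in range(n)]
    Pi_bar = [[M[i][j] * phi[j] / (lam * phi[i]) for j in range(n)]
              for i in range(n)]
    return lam, Pi_bar, nu_bar
```

For M = [[1, 1], [1, 0]], whose spectral radius is the golden ratio, the rows of the returned matrix sum to one and ν̄ is left unchanged by one step of the chain, up to floating-point error.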

In particular, notice that 𝔐*[v̄, v̄], and its extension to infinite paths x = (x0, x1, x2, …) through (11), is stationary. Indeed, we have the following more general result which is of independent interest.

Proposition 3.3

Let P ∈ 𝒫(ν, ν) be a Markovian measure on 𝒳^{N+1} having time-invariant transition matrix Π. Then ν is invariant for Π, i.e. Π^T ν = ν.

Proof

Let Π^T ν = m. Then

$$d_H(\nu, m) = d_H\big((\Pi^T)^N \nu, (\Pi^T)^N m\big) \le \lambda\, d_H(\nu, m)$$

where dH is the Hilbert distance [15], [9] and λ < 1 is the contraction ratio of the map (Π^T)^N. Since both ν and m are probability distributions, it follows that m = ν and ν is invariant.

We show next that, under mild assumptions, there is only one time-homogeneous bridge between equal marginals. In the following result, we shall use the following notation. As before, let 𝔐 be given by (3) with μ0 satisfying (2). We denote by 𝔐(2N) the unique extension of 𝔐 to all of the discrete interval [0, 2N] by its time-invariant transition mechanism. We also denote by 𝔐*[v, v](2N) the Schrödinger bridge with prior 𝔐(2N) and equal marginals v at times t = 0 and t = 2N.

Theorem 3.4

Let M be a nonnegative matrix such that M^N has only positive elements. Let ν be a probability measure. Suppose N > 1 and that the transition matrix Πν of 𝔐*[ν, ν](2N) does not depend on time. Then ν = ν̄ and Πν = Π̄.

Proof

Let φν(t) = Mφν(t + 1) be the space-time harmonic function associated to the minimizer 𝔐*[ν, ν]. Suppose first that M has only positive elements and consider times t = N − 2, N − 1, N. By (8) and the time invariance of Πν, we must have

$$\Pi_\nu = \operatorname{diag}(\varphi_\nu(N-2))^{-1} M \operatorname{diag}(\varphi_\nu(N-1)) = \operatorname{diag}(\varphi_\nu(N-1))^{-1} M \operatorname{diag}(\varphi_\nu(N)).$$

It follows that

$$M = D_\nu(N-1)\, M\, D_\nu(N)^{-1},$$

where

$$D_\nu(t) = \operatorname{diag}(\varphi_\nu(t)) \operatorname{diag}(\varphi_\nu(t-1))^{-1} = \operatorname{diag}\big(d_1^\nu(t), \ldots, d_n^\nu(t)\big)$$

is diagonal for all t. Hence,

$$m_{ij} = e_i^T M e_j = d_i^\nu(N-1)\, m_{ij}\, d_j^\nu(N)^{-1}, \quad \forall i, j.$$

Varying j for a fixed i, since mij ≠ 0, we get that D(N) is a scalar matrix, say λI, not dependent on t and φ(N) is a right eigenvector of M. By the Perron-Frobenius Theorem, it follows that φ(N) corresponds to λM. It readily follows that φ̂(0) is an eigenvector of M^T with positive components corresponding to the same eigenvalue λM. By (5c)–(5d), ν is equal to ν̄.

A similar argument establishes the result when M has merely nonnegative elements. Indeed, looking at the N-step transition matrix Πν^{(N)} = Πν^N on the time intervals [0, N] and [N, 2N], the same argument as in the full positive case gives that φ(2N) is a right eigenvector of M^N with positive components. But so is φ. By Theorem 3.1, iv), they can be taken to be equal.

Consider now the following special case. We have a strongly connected, aperiodic directed graph (𝒱, ℰ) with vertex set 𝒱 = {1, 2, …, n} and edge set ℰ ⊆ 𝒱 × 𝒱. Let A be the adjacency matrix of the graph so that aij = 1 if there is an edge from i to j and aij = 0 otherwise. Then, there exists N such that A^N has all positive elements. As we shall see in the next section, the Schrödinger bridge problem (10) just considered with M = A as prior transition turns out to have as solution the Ruelle-Bowen measure 𝔐RB [7, Section III]. This probability measure has a number of useful properties, in particular it gives the same probability to paths of the same length between any two given nodes. All of this is discussed in the next section.

IV. The Ruelle-Bowen random walk

In this section, we follow closely the beautiful paper [7] by Delvenne and Libert, which explains the Ruelle-Bowen (RB) random walk. The RB random walk amounts to a Markovian evolution on a directed graph that assigns equal probabilities to all paths of equal length between any two nodes. The motivation of [7] was to assign a natural invariant probability to nodes based on relations that are encoded by a graph, and thereby determine a centrality measure, akin to Google PageRank, yet more robust and discriminating. Our motivation is quite different. The RB random walk provides a uniform distribution on paths. Therefore, it represents a natural distribution to serve as prior in the SBP in order to achieve a maximum spreading of the mass transported over the available paths. In this section, besides reviewing basics on the RB random walk, we show that the RB distribution is itself a solution to the Schrödinger bridge Problem 2.2.

We consider a strongly connected, directed graph

$$\mathcal{G} = (\mathcal{V}, \mathcal{E}).$$

Google PageRank ranks the nodes by means of a random walk in which a jump takes place from one node to any of its neighbors with equal probability. The alternative proposed in [7] is an entropy ranking, based on the stationary distribution of the RB random walk [11], [22]. The transition mechanism is such that it induces a uniform distribution on paths of equal length joining any two nodes. This distribution is characterized as the one maximizing the entropy rate [12] for the random walker. Let us briefly recall the relevant concept. The Shannon entropy for paths of length t is at most

$$\log |\{\text{paths of length } t\}|.$$

Hence, the entropy rate is bounded by the topological entropy rate

$$H_{\mathcal{G}} = \limsup_{t \to \infty} \big[\log |\{\text{paths of length } t\}| / t\big].$$

Here |{·}| denotes the cardinality of a set. Notice that H𝒢 only depends on the graph 𝒢 and not on the probability distribution on paths. More specifically, if A denotes the adjacency matrix of the graph, the number of paths of length t is the sum of all the entries of A^t. Thus, it follows that H𝒢 is the logarithm of the spectral radius of A, namely the maximum of the absolute values of the eigenvalues of A, that is

$$H_{\mathcal{G}} = \log(\lambda_A). \tag{12}$$
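The relation between path counts and the spectral radius in (12) can be observed numerically. The sketch below counts paths of length t as the sum of the entries of A^t using exact integer arithmetic, and compares log(count)/t with log λA for a two-node graph whose spectral radius is the golden ratio. The function names are illustrative assumptions.

```python
import math


def count_paths(A, t):
    """Number of paths of length t, i.e. the sum of all entries of A^t."""
    n = len(A)
    # counts[i] = number of paths of the current length starting at node i
    counts = [1] * n
    for _ in range(t):
        counts = [sum(A[i][j] * counts[j] for j in range(n))
                  for i in range(n)]
    return sum(counts)


def entropy_rate_estimate(A, t):
    """Finite-t approximation log|{paths of length t}| / t of H_G."""
    return math.log(count_paths(A, t)) / t
```

For A = [[1, 1], [1, 0]] the counts are Fibonacci numbers, and the estimate at t = 200 agrees with log((1 + √5)/2) to within 0.01. The remaining gap decays like 1/t, which is why the definition uses a limit.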

We next construct the Ruelle-Bowen random walk. Let A be as in the Perron-Frobenius Theorem 3.1 and let u and v be its left and right eigenvectors with positive components corresponding to λA, so that

$$A^T u = \lambda_A u, \quad A v = \lambda_A v. \tag{13}$$

Suppose u and v are chosen so that

$$\langle u, v \rangle := \sum_i u_i v_i = 1.$$

As in the previous section, it is readily seen that their componentwise multiplication

$$\nu_{RB}(i) = u_i v_i \tag{14}$$

defines a probability distribution which is invariant under the transition matrix

$$R = [r_{ij}], \quad r_{ij} = \frac{v_j}{\lambda_A v_i}\, a_{ij}, \tag{15}$$

that is,

$$R^T \nu_{RB} = \nu_{RB}. \tag{16}$$

If A in (13) represents the adjacency matrix of a graph, then the transition matrix R in (15) together with the stationary measure νRB in (14), define the Ruelle-Bowen path measure

$$\mathfrak{M}_{RB}(x_0, x_1, \ldots, x_N) := \nu_{RB}(x_0)\, r_{x_0 x_1} \cdots r_{x_{N-1} x_N}. \tag{17}$$

Proposition 4.1

The measure 𝔐RB (17) assigns probability λA^{−t} ui vj to any path of length t from node i to node j.

Proof

Starting from the stationary distribution (14), and in view of (15), the probability of a path ij is

$$u_i v_i \left(\frac{1}{\lambda_A} v_i^{-1} v_j\right) = \frac{1}{\lambda_A} u_i v_j,$$

assuming that node j is accessible from node i in one step. Likewise, the probability of the path ijk is

$$u_i v_i \left(\frac{1}{\lambda_A} v_i^{-1} v_j\right) \left(\frac{1}{\lambda_A} v_j^{-1} v_k\right) = \frac{1}{\lambda_A^2} u_i v_k$$

independent of the intermediate state j, and so on. Thus, the claim follows.

The striking property of 𝔐RB is that it induces a uniform probability measure on paths of equal length between any two given nodes. We quote from [7] “Since the number of paths of length t is of the order of λA^t (up to a factor) the distribution on paths of fixed length is uniform up to a factor (which does not depend on t). Hence the Shannon entropy of paths of length t grows as t log λA, up to an additive constant. The entropy rate of this distribution is thus log λA which is optimal” by the expression for H𝒢 in (12).
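The telescoping in this proof can be verified on a small graph. The sketch below builds R and νRB from (13)–(15), using power iteration as an illustrative way to approximate u and v, and evaluates (17) on two distinct paths of length 3 between the same pair of nodes. The three-node graph and all function names are assumptions made for the example, and nodes are numbered from 0.

```python
def perron(M, iters=500):
    """Power iteration: spectral radius and positive right eigenvector."""
    n = len(M)
    x = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        y = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = sum(y)              # x sums to one, so sum(M x) estimates lambda
        x = [yi / lam for yi in y]
    return lam, x


def ruelle_bowen(A):
    """Return (lambda_A, u, v, nu_RB, R) as in (13)-(15)."""
    n = len(A)
    At = [[A[j][i] for j in range(n)] for i in range(n)]
    lam, v = perron(A)            # right eigenvector
    _, u = perron(At)             # left eigenvector
    scale = sum(u[i] * v[i] for i in range(n))
    u = [ui / scale for ui in u]  # enforce <u, v> = 1
    nu_rb = [u[i] * v[i] for i in range(n)]
    R = [[A[i][j] * v[j] / (lam * v[i]) for j in range(n)] for i in range(n)]
    return lam, u, v, nu_rb, R


def path_probability(nu_rb, R, path):
    """Evaluate (17) on a path given as a tuple of node indices."""
    p = nu_rb[path[0]]
    for a, b in zip(path, path[1:]):
        p *= R[a][b]
    return p
```

On the graph with edges 0→0, 0→1, 1→2, 2→0 and 2→1, the paths (0, 0, 0, 1) and (0, 1, 2, 1) receive the same probability, equal to λA^{−3} u0 v1.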

The analysis also shows that the Ruelle-Bowen distribution is the solution of the particular SBP where the “prior” transition mechanism is given by the adjacency matrix! This observation is apparently new and beautifully links the topological entropy rate to a maximum entropy problem on path space. We state next this special case of Proposition 3.2.

Proposition 4.2

Let A be the adjacency matrix of a strongly connected aperiodic graph 𝒢. Let 𝔐 be the nonnegative measure on 𝒳^{N+1} given by (3) with M = A and μ0 satisfying (2). Then, the Ruelle-Bowen measure 𝔐RB (17) solves the SBP (10) with marginals ν0 = νN = νRB.

V. Robust transport over networks

Once again we consider a strongly connected, directed graph 𝒢 = (𝒱, ℰ) with n vertices. We identify node 1 as a source and node n as a sink and seek to transport a unit mass from 1 to n in at most N steps. The task is formalized by setting an initial marginal distribution ν0(x) = δ1x(x), where δ denotes Kronecker’s delta. Similarly, the final distribution is νN(x) = δnx(x). Generally, we seek a transportation plan which is robust and avoids congestion as much as the topology of the graph permits. This latter feature of the transportation plan will be achieved in this section indirectly, without explicitly bringing into the picture the capacity of each edge (this is done in Section VI). With these two key specifications in mind, we would like to control the flux so that the initial mass spreads as much as possible on the feasible paths joining vertices 1 and n in N steps before reconvening at time N in vertex n. We shall achieve this by constructing a suitable Markovian transition mechanism. As we want to allow for the possibility that all or part of the mass reaches node n at some time less than N, we always include a loop in node n so that our adjacency matrix A always has a_nn = 1. We observed in the previous section that the Ruelle-Bowen measure 𝔐RB on paths can be obtained as the solution of the maximum entropy problem when the “prior transition matrix” is the adjacency matrix. Since 𝔐RB gives equal probability to paths joining two specific vertices, it is natural to use it as a prior in a new maximum entropy problem with marginals δ1x, δnx so as to achieve the spreading of the probability mass on the feasible paths joining the source with the sink. Thus, we consider the following maximum entropy problem.

Problem 5.1

Determine

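$$\mathfrak{M}^*[\delta_{1x}, \delta_{nx}] = \operatorname{argmin}\{\mathbb{D}(P \| \mathfrak{M}_{RB}) \mid P \in \mathcal{P}(\delta_{1x}, \delta_{nx})\}.$$

By Theorem 2.3, the optimal, time varying transition matrix Π*(t) of the above problem is given, recalling the notations in (8), by

$$\Pi^*(t) = \operatorname{diag}(\varphi(t))^{-1} R \operatorname{diag}(\varphi(t+1)), \tag{18}$$

where φ(t) = Rφ(t + 1), φ̂(t + 1) = R^T φ̂(t), with the boundary conditions

$$\varphi(0, x)\, \hat{\varphi}(0, x) = \delta_{1x}(x), \quad \varphi(N, x)\, \hat{\varphi}(N, x) = \delta_{nx}(x) \tag{19}$$

for all x ∈ 𝒳. In view of (15), if we define

$$\varphi_v(t) := \lambda_A^{-t} \operatorname{diag}(v)\, \varphi(t), \quad \hat{\varphi}_v(t) := \lambda_A^{t} \operatorname{diag}(v)^{-1} \hat{\varphi}(t),$$

then we have

$$\varphi_v(t) = A \varphi_v(t+1), \quad \hat{\varphi}_v(t+1) = A^T \hat{\varphi}_v(t), \quad t = 0, \ldots, N-1.$$

Moreover,

$$\varphi_v(t, x)\, \hat{\varphi}_v(t, x) = \varphi(t, x)\, \hat{\varphi}(t, x), \quad t = 0, \ldots, N-1, \quad x \in \mathcal{X}.$$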

Here, again, A is the adjacency matrix of 𝒢 and v is the right eigenvector corresponding to the spectral radius λA.

The above analysis provides another interesting way to express 𝔐*[δ1x, δnx]: it also solves the Schrödinger bridge problem with the same marginals δ1x and δnx but with a different prior transition matrix, namely the adjacency matrix A. Thus, we can replace the two-step procedure by a single bridge problem. This is summarized in the following proposition.

Proposition 5.2

Let A be the adjacency matrix of a strongly connected aperiodic graph 𝒢, and let 𝔐 be the nonnegative measure on 𝒳^{N+1} given by (3) with M = A and μ0 satisfying (2). Then the solution 𝔐*[δ1x, δnx] of Problem 5.1 also solves the Schrödinger bridge problem

$$\min\{\mathbb{D}(P \| \mathfrak{M}) \mid P \in \mathcal{P}(\delta_{1x}, \delta_{nx})\}. \tag{20}$$

The iterative algorithm of [9, Section III] can now be based on (20) to efficiently compute the transition matrix of the optimal robust transport plan 𝔐*[δ1x, δnx].

Remark 5.3

Finally, observe that even if A^N has some zero elements, the robust transport described in this section may still be feasible provided there is at least one path of length N joining node 1 with node n, i.e., (A^N)_{1n} > 0.

As we discussed in the beginning of this section, the intuition behind using 𝔐RB as a prior is to achieve the spreading of the probability on all the feasible paths connecting the source and the sink. It turns out that this is indeed the case; the solution 𝔐*[δ1x, δnx] of Problem 5.1 assigns equal probability to all the feasible paths of length N joining the source 1 with the sink n. To see this, note that, by (18), the probability that the optimal transport plan 𝔐*[δ1x, δnx] assigns to a path x = (x0, x1, … , xN) is

$$\mathfrak{M}^*[\delta_{1x}, \delta_{nx}](x) = \delta_{1x}(x_0) \prod_{t=0}^{N-1} r_{x_t x_{t+1}} \frac{\varphi(t+1, x_{t+1})}{\varphi(t, x_t)} = \delta_{1x}(x_0)\, \frac{\varphi_v(N, x_N)}{\varphi_v(0, x_0)} \prod_{t=0}^{N-1} a_{x_t x_{t+1}}.$$

Observing that ∏_{t=0}^{N−1} a_{x_t x_{t+1}} = 1 for a feasible path and 0 otherwise, and that δ1x(x0) φv(N, xN)/φv(0, x0) depends only on the boundary points x0, xN, we conclude that 𝔐*[δ1x, δnx] assigns equal probability to all the feasible paths. Moreover, there are (A^N)_{1n} feasible paths of length N connecting nodes 1 and n. Thus we establish the following.

Proposition 5.4

𝔐*[δ1x, δnx] assigns probability 1/(A^N)_{1n} to each of the feasible paths of length N connecting 1 and n.
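Proposition 5.4 can be checked by brute force on a small graph. The sketch below works directly with the adjacency matrix as prior (Proposition 5.2) and, as a simplifying assumption, uses the closed form available for point-mass marginals: φ(N) is taken to be the indicator of the sink, which satisfies the boundary conditions up to the usual scaling, so no iteration is needed. Nodes are numbered from 0 and the function name is hypothetical.

```python
from itertools import product


def bridge_path_probabilities(A, source, sink, N):
    """Path probabilities of the bridge with prior A and point-mass marginals.

    Returns a dict mapping each path of positive probability to its
    probability, together with the number of feasible paths (A^N)[source][sink].
    """
    n = len(A)
    # phi(N) = indicator of the sink, then phi(t) = A phi(t+1) as in (8a).
    phi = [[1 if x == sink else 0 for x in range(n)]]
    for _ in range(N):
        phi.insert(0, [sum(A[i][j] * phi[0][j] for j in range(n))
                       for i in range(n)])
    if phi[0][source] == 0:
        raise ValueError("no path of length N joins source and sink")
    probs = {}
    for tail in product(range(n), repeat=N):
        path = (source,) + tail
        p = 1.0
        for t in range(N):
            i, j = path[t], path[t + 1]
            if A[i][j] == 0 or phi[t + 1][j] == 0:
                p = 0.0
                break
            p *= A[i][j] * phi[t + 1][j] / phi[t][i]   # transition (6)
        if p > 0.0:
            probs[path] = p
    return probs, phi[0][source]
```

On the graph with edges 0→0, 0→1, 1→2, 2→0 and 2→1, there are exactly two paths of length 4 from node 0 to node 2, and each receives probability 1/2.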

VI. Generalization: Not strongly connected and weighted graphs

Consider again a directed graph 𝒢 = (𝒱, ℰ) with n vertices. We associate to the edge ij an “energy” Uij ≥ 0. We study the following two specific cases (and their combination):

a) Graphs that are not strongly connected: We consider the same problem as in the previous section but the graph is not strongly connected. Following [7], we can give a large positive energy U0 to non-existing links (this kind of “teleportation” is employed in the random walk of the Google PageRank algorithm to avoid getting stuck in absorbing states) and energy Uij = 0 to existing links. Then the adjacency matrix A is replaced by the matrix

$$B = [b_{ij}] = [\exp(-U_{ij})].$$

The matrix B has all positive elements. Hence, we can apply the Perron-Frobenius theorem. Let u and v be left and right eigenvectors with positive components of the matrix B corresponding to the spectral radius λB of B, so that

$$B^T u = \lambda_B u, \quad B v = \lambda_B v.$$

Suppose that u and v are chosen so that 〈u, v〉 = Σi uivi = 1. Then μU given by

$$\mu_U(i) = u_i \cdot v_i \tag{21}$$

is a probability distribution which is invariant for the transition matrix

$$R_U = \lambda_B^{-1} \operatorname{diag}(v)^{-1} B \operatorname{diag}(v), \tag{22}$$

namely

$$R_U^T \mu_U = \mu_U. \tag{23}$$

The corresponding path space measure 𝔐U is no longer uniform on paths of equal length. Indeed, the probability of the path (i = x0, x1, … , x_{t−1}, j = xt) is

$$\lambda_B^{-t} \exp\left(-\sum_{\ell=0}^{t-1} U_{x_\ell x_{\ell+1}}\right) u_i v_j.$$

However, it is the minimum free energy rate (topological pressure in thermodynamics) distribution attaining the maximum value of −F = −Ū + S given by log λB and has therefore the form of a Boltzmann distribution, see [7, Section IV] for details. Notice that, as soon as there are virtual links, B ≠ A while aij ≤ bij for all i, j. By statement v) in Theorem 3.1, we then have log λA < log λB. Namely, the topological entropy has increased in accordance with our intuition. The expected total path energy of a path of length t is precisely t · Ū.

Again, as in Proposition 4.2, we have a special case of Proposition 3.2. Namely, the measure 𝔐U is the solution of a SBP where the prior 𝔐 is a Markovian measure on 𝒳^{N+1} as in (3) but with transition mechanism given by M = B. If U0 is very large, most of the transportation will occur on the real edges. We can then take 𝔐U as the prior distribution in a maximum entropy problem as in Section V obtaining again through the solution 𝔐*U[δ1x, δnx] a robust transportation plan from node 1 to node n.

b) Weighted graphs: The quantities Uij may represent the cost of transporting a unit of mass on that edge or may be inversely proportional to capacity of the link, etc. The measure 𝔐U in this case may be far from uniform since it takes into account costs/capacities of the links. Again we can set up a maximum entropy problem with 𝔐U as prior obtaining a transport 𝔐*U which compromises between the need to be robust and the cost/capacities of the different paths joining the source and the sink. For instance, if Uij = cij, the cost of transporting a unit of goods on the link ij, is large, the solution to the maximum entropy problem will send less mass through this link provided the topology of the graph allows for alternative routes. In this case, low cost and robustness of the transportation plan may be effectively combined. Indeed, we have the following striking result which generalizes Propositions 4.1 and 5.4.

Theorem 6.1

𝔐*[δ1x, δnx](x) assigns equal probability to paths x ∈ 𝒳^{N+1} of equal cost. In particular, it assigns maximum and equal probability to minimum cost paths.

Proof

For a path x = (x0, x1, …, xN), we have

$$\mathfrak{M}^*[\delta_{1x}, \delta_{nx}](x) = \delta_{1x}(x_0)\, \frac{\varphi_v(N, x_N)}{\varphi_v(0, x_0)} \prod_{t=0}^{N-1} b_{x_t x_{t+1}} = \delta_{1x}(x_0)\, \frac{\varphi_v(N, x_N)}{\varphi_v(0, x_0)} \exp\left[-\sum_{t=0}^{N-1} U_{x_t x_{t+1}}\right]. \tag{24}$$

Observe once more that δ1x(x0) φv(N, xN)/φv(0, x0) does not depend on the particular path joining x0 and xN. Since Σ_{t=0}^{N−1} U_{x_t x_{t+1}} is the total cost of the path, the conclusion now follows.
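The Boltzmann-like weighting in (24) can be illustrated on a small weighted graph. The sketch below takes B = [exp(−Uij)] as prior and, as in the point-mass setting of the theorem, starts the backward recursion from the indicator of the sink. The cost matrix, the use of `math.inf` to mark missing edges, and the function name are assumptions made for the example, with nodes numbered from 0.

```python
import math
from itertools import product


def weighted_bridge_paths(U, source, sink, N):
    """Map each feasible length-N path to its (cost, probability) pair.

    The prior is B = [exp(-U_ij)]; U[i][j] = math.inf marks a missing edge.
    """
    n = len(U)
    B = [[math.exp(-U[i][j]) for j in range(n)] for i in range(n)]
    phi = [[1.0 if x == sink else 0.0 for x in range(n)]]
    for _ in range(N):                      # phi(t) = B phi(t+1)
        phi.insert(0, [sum(B[i][j] * phi[0][j] for j in range(n))
                       for i in range(n)])
    result = {}
    for tail in product(range(n), repeat=N):
        path = (source,) + tail
        steps = list(zip(path, path[1:]))
        if path[-1] != sink or any(B[i][j] == 0.0 for i, j in steps):
            continue                        # infeasible path
        cost = sum(U[i][j] for i, j in steps)
        prob = 1.0
        for t, (i, j) in enumerate(steps):
            prob *= B[i][j] * phi[t + 1][j] / phi[t][i]
        result[path] = (cost, prob)
    return result
```

With the costs used in the test below, the two feasible four-step paths from node 0 to node 2 have costs 2.7 and 2.9, and the ratio of their probabilities is exp(0.2): the cheaper path is favored, while the more expensive one still carries mass.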

In the discrete optimal mass transport (OMT) problem, one usually (e.g., see [23]) seeks to first identify the least costly path(s) (x0, x1, …, xN−1, xN) from any starting node x0 ∈ 𝒳 to any ending node xN, along with the corresponding end-point cost for a unit mass³,

$$
C_{x_0 x_N} = \min_{x_1,\ldots,x_{N-1}} \left( U_{x_0 x_1} + \cdots + U_{x_{N-1} x_N} \right).
$$

This is a combinatorial problem but can also be cast as a linear program [24]. Having a solution to this first problem, the OMT problem can then be recast as the linear program

$$
\min_{q}\Big\{ \sum_{x_0,x_N} q_{x_0,x_N}\, C_{x_0 x_N} \;\Big|\; q_{x_0,x_N} \ge 0,\ \sum_{x_0} q_{x_0,x_N} = \nu_N(x_N),\ \sum_{x_N} q_{x_0,x_N} = \nu_0(x_0) \Big\}. \tag{25}
$$

The solution to (25) is the transport plan $q_{x_0,x_N}$ which dictates the portion of mass that is to be sent from x0 to xN along the corresponding least costly path (x0, x1, …, xN−1, xN). Alternatively, the OMT problem can be directly cast as a linear program in as many variables as there are edges [24].
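The first step described above, computing the end-point costs, can be sketched as a min-plus recursion over the number of steps. This is only an illustration under stated assumptions: the four-node graph and its costs are hypothetical and do not come from the paper, the function name `endpoint_costs` is introduced here, and zero-cost self loops are placed on the diagonal following footnote 3.

```python
import math

INF = math.inf

def endpoint_costs(U, N):
    """Return C with C[i][j] = least total cost of an N-step path from i to j.

    U[i][j] is the cost of edge (i, j), or math.inf when the edge is absent.
    Zero-cost self loops (footnote 3) are expected on the diagonal of U, so
    that paths with fewer than N edges are covered by waiting at a node.
    """
    n = len(U)
    # Zero steps: staying put costs nothing, everything else is unreachable.
    C = [[0.0 if i == j else INF for j in range(n)] for i in range(n)]
    for _ in range(N):
        # One min-plus "matrix product": extend every path by one edge.
        C = [[min(C[i][k] + U[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    return C

# Hypothetical 4-node example (not from the paper), nodes indexed 0..3.
U = [
    [0.0, 1.0, 4.0, INF],
    [INF, 0.0, 1.0, 5.0],
    [INF, INF, 0.0, 1.0],
    [INF, INF, INF, 0.0],
]

C = endpoint_costs(U, N=3)
print(C[0][3])  # 3.0, attained by the path 0-1-2-3
```

With only two steps allowed the cheapest route in this toy graph becomes 0-2-3 at cost 5, which shows how the horizon N enters the end-point cost.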

An apparent shortcoming of the OMT formalism is the “rigidity” of the transportation, which utilizes only paths of minimal cost from starting to ending node. The transport provided by Theorem 6.1, which readily generalizes to any two marginals ν0 and νN, provides an attractive alternative to the OMT approach: minimum cost paths all have maximum probability, but some of the mass is also transported on alternative paths, thereby ensuring a certain amount of robustness of the transportation plan. Also notice that the Schrödinger bridge measure MU[δ1x,δnx] determines, as a by-product, the minimum cost paths!

The argument provided at the end of the previous section (see Proposition 5.2) shows once more that MU[δ1x,δnx] can be obtained in both of the above cases in one step as solution to the Schrödinger bridge problem with the same marginals δ1x and δnx and prior transition matrix B.

All problems considered in this and in the previous section may be solved in the same way if the initial and/or the final mass is spread over several nodes.

VII. Examples

We present a simple academic example to illustrate our method. Consider the graph in Figure 1 with the following adjacency matrix

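$$
A = \begin{bmatrix}
0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}.
$$

Fig. 1. Network topology.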

We seek to transport a unit mass from node 1 to node 9 in N = 3 and 4 steps. We add a self loop at node 9, i.e., a99 = 1, to allow for transport paths with different numbers of steps.

The shortest paths from node 1 to node 9 have length 3, and there are three of them: 1–2–7–9, 1–3–8–9 and 1–4–8–9. If we want to transport the mass with the minimum number of steps, we may end up using only one of these three paths, which is not robust. On the other hand, if we apply the Schrödinger bridge framework with the RB measure 𝔐RB as the prior, then we get a transport plan that uses all three paths with equal probability. The evolution of the mass distribution is given by

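$$
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1/3 & 1/3 & 1/3 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1/3 & 2/3 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix},
$$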

where the four rows of the matrix show the mass distribution at time step t = 0, 1, 2, 3 respectively. As we can see, the mass spreads out first and then goes to node 9. When we allow for more steps N = 4, the mass spreads even more before reassembling at node 9, as shown below

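$$
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 4/7 & 2/7 & 1/7 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1/7 & 1/7 & 2/7 & 0 & 1/7 & 2/7 & 0 \\
0 & 0 & 0 & 0 & 0 & 1/7 & 1/7 & 2/7 & 3/7 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}.
$$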
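The two mass evolutions above can be checked with a short sketch that enumerates every path of the required length from node 1 to node 9 and gives each of them the same probability, which is the behaviour the text attributes to the bridge with the RB prior between point masses. Brute-force enumeration is an illustrative shortcut suited to this small graph, not the iterative algorithm of the paper; the names `EDGES`, `paths` and `mass_evolution` are introduced here.

```python
from fractions import Fraction

# Adjacency lists of the graph in Fig. 1 (nodes 1..9, self loop at node 9).
EDGES = {
    1: [2, 3, 4], 2: [3, 5, 7], 3: [4, 8], 4: [8], 5: [6, 7],
    6: [9], 7: [9], 8: [9], 9: [1, 9],
}

def paths(source, sink, steps):
    """Enumerate all paths with exactly `steps` edges from source to sink."""
    stack = [(source,)]
    found = []
    while stack:
        p = stack.pop()
        if len(p) == steps + 1:
            if p[-1] == sink:
                found.append(p)
            continue
        for nxt in EDGES[p[-1]]:
            stack.append(p + (nxt,))
    return found

def mass_evolution(source, sink, steps):
    """Node marginals at t = 0..steps when every path is equally likely."""
    all_paths = paths(source, sink, steps)
    share = Fraction(1, len(all_paths))
    flow = [[Fraction(0)] * 9 for _ in range(steps + 1)]
    for p in all_paths:
        for t, node in enumerate(p):
            flow[t][node - 1] += share
    return flow

for N in (3, 4):
    print(f"N = {N}: {len(paths(1, 9, N))} paths")
    for row in mass_evolution(1, 9, N):
        print("  ", [str(x) for x in row])
```

Exact rational arithmetic is used so that the output can be compared entry by entry with the fractions in the matrices.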

Now we change the graph by adding a cost on the edge (7, 9). In particular, we consider the weighted adjacency matrix

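$$
B = \begin{bmatrix}
0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}.
$$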

When N = 3 steps are allowed to transport a unit mass from node 1 to node 9, the evolution of the mass distribution for the optimal transport plan is given by

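$$
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1/5 & 2/5 & 2/5 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1/5 & 4/5 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}.
$$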

The mass travels through paths 1–2–7–9, 1–3–8–9 and 1–4–8–9 but, unlike the unweighted case, the transport plan doesn’t assign equal probability to these three paths. Since we added a cost on the edge (7, 9), the probability that the mass takes the path through this edge becomes smaller. The plan does, however, assign equal probability to the two minimum cost paths 1–3–8–9 and 1–4–8–9, in agreement with Theorem 6.1. Suppose now we allow for more steps, N = 4, and change the B matrix to

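$$
B = \begin{bmatrix}
0 & 0.7 & 0.7 & 0.7 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.7 & 0 & 0.7 & 0 & 0.7 & 0 & 0 \\
0 & 0 & 0 & 0.7 & 0 & 0 & 0 & 0.7 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.7 & 0 \\
0 & 0 & 0 & 0 & 0 & 0.7 & 0.7 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.7 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.7 \\
0.7 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.9
\end{bmatrix}.
$$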

Here, transporting on any edge is expensive. It is, however, more expensive to traverse link (7, 9) and less expensive to let the mass sit at the sink node 9. The evolution of the mass distribution is now

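$$
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.5042 & 0.3173 & 0.1785 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.1388 & 0.1388 & 0.2380 & 0 & 0.1275 & 0.3569 & 0 \\
0 & 0 & 0 & 0 & 0 & 0.1388 & 0.0992 & 0.2776 & 0.4844 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1.0000
\end{bmatrix}.
$$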

We observe that almost one half of the mass (0.4844) travels on the three shortest paths 1–2–7–9, 1–3–8–9 and 1–4–8–9, reaches node 9 in three steps, and then sits there. As before, more mass (0.1785 each) travels on the two minimum cost paths 1–3–8–9 and 1–4–8–9, in agreement with Theorem 6.1, whereas 0.1275 travels on the more expensive, minimum length path 1–2–7–9. There are now several other ways the mass can reach node 9 in 4 steps. Our robust transportation plan takes full advantage of them, transporting more than one half of the total mass along these alternative paths.
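The weighted evolutions above can be reproduced with a minimal sketch, assuming point-mass marginals so that, by (24), each path from node 1 to node 9 is weighted by the product of the entries of B along it. Under that assumption the node marginal at time t is (B^t)[1][j] · (B^(N−t))[j][9] / (B^N)[1][9]; for general marginals an iterative scheme would be needed instead. The helper names (`weighted_fig1`, `matpow`, `bridge_flow`) are introduced here for illustration.

```python
# Edges of Fig. 1, including the self loop at node 9.
EDGES = {1: [2, 3, 4], 2: [3, 5, 7], 3: [4, 8], 4: [8], 5: [6, 7],
         6: [9], 7: [9], 8: [9], 9: [1, 9]}

def weighted_fig1(w, w79, w99):
    """Weighted adjacency matrix: w on every edge, w79 on (7, 9), w99 on (9, 9)."""
    B = [[0.0] * 9 for _ in range(9)]
    for i, outs in EDGES.items():
        for j in outs:
            B[i - 1][j - 1] = w
    B[6][8] = w79  # edge (7, 9)
    B[8][8] = w99  # self loop at node 9
    return B

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(B, k):
    """k-th power of B by repeated multiplication (k is small here)."""
    n = len(B)
    R = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        R = matmul(R, B)
    return R

def bridge_flow(B, source, sink, N):
    """Node marginals at t = 0..N for unit mass sent from source to sink."""
    s, e = source - 1, sink - 1
    powers = [matpow(B, k) for k in range(N + 1)]
    total = powers[N][s][e]  # total weight of all N-step paths s -> e
    return [[powers[t][s][j] * powers[N - t][j][e] / total
             for j in range(len(B))] for t in range(N + 1)]

flow3 = bridge_flow(weighted_fig1(1.0, 0.5, 1.0), 1, 9, 3)
flow4 = bridge_flow(weighted_fig1(0.7, 0.5, 0.9), 1, 9, 4)
for row in flow4:
    print([round(x, 4) for x in row])
```

Rounded to four decimals, the rows of `flow4` agree with the N = 4 evolution reported above, and `flow3` gives the 1/5, 2/5, 2/5 split of the N = 3 case.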

Finally, we consider the case where the underlying graph is not strongly connected. In particular, we delete several links in Figure 1 to make it not strongly connected and consider the graph in Figure 2. Again we want to transport a unit mass from node 1 to node 9. In order to do this, we add an artificial energy U0 to each non existing link as discussed in Section VI. We display the results for N = 4 steps. When we take U0 = 2, the evolution of mass is

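$$
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.0415 & 0.4079 & 0.3416 & 0.0326 & 0.0462 & 0.0326 & 0.0326 & 0.0326 & 0.0326 \\
0.0270 & 0.0349 & 0.1740 & 0.1477 & 0.2330 & 0.0603 & 0.0603 & 0.1614 & 0.1014 \\
0.0116 & 0.0152 & 0.0199 & 0.0242 & 0.0163 & 0.1709 & 0.1709 & 0.2641 & 0.3069 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}.
$$

Fig. 2. Network topology.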

We can see that there is quite a portion of mass traveling along virtual (non existing) edges. If we increase the value to U0 = 8, then the mass evolution becomes

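$$
\begin{bmatrix}
1.0000 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.0001 & 0.5995 & 0.4000 & 0.0001 & 0.0001 & 0.0001 & 0.0001 & 0.0001 & 0.0001 \\
0.0000 & 0.0000 & 0.2000 & 0.1999 & 0.3994 & 0.0002 & 0.0002 & 0.1999 & 0.0004 \\
0.0000 & 0.0000 & 0.0000 & 0.0001 & 0.0000 & 0.1999 & 0.1999 & 0.3995 & 0.2007 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1.0000
\end{bmatrix}.
$$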

The portion of mass traveling along non existing edges is negligible. Eventually, all the mass would be transported along feasible paths and in the limit the mass evolution (flow) is given by the rows of

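$$
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 3/5 & 2/5 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1/5 & 1/5 & 2/5 & 0 & 0 & 1/5 & 0 \\
0 & 0 & 0 & 0 & 0 & 1/5 & 1/5 & 2/5 & 1/5 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}.
$$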
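The effect of the artificial energy U0 can be explored with the following sketch. Its edge set is an assumption, since Figure 2 is not recoverable from the text: it removes links 1→4, 2→7 and 9→1 from Fig. 1, a choice that is consistent with the limiting flow above. Real links receive weight 1 and every missing link, self loops included, receives weight exp(−U0). As before, point-mass marginals are assumed, so the marginals follow from a forward and a backward recursion on the prior weights; the function names are introduced here.

```python
import math

# ASSUMED edge set for Fig. 2: Fig. 1 without links 1->4, 2->7 and 9->1.
REAL_EDGES = {1: [2, 3], 2: [3, 5], 3: [4, 8], 4: [8], 5: [6, 7],
              6: [9], 7: [9], 8: [9], 9: [9]}

def prior_with_virtual_links(U0):
    """Weight 1 on real links, exp(-U0) on every non-existing link."""
    w = math.exp(-U0)
    B = [[w] * 9 for _ in range(9)]
    for i, outs in REAL_EDGES.items():
        for j in outs:
            B[i - 1][j - 1] = 1.0
    return B

def bridge_flow(B, source, sink, N):
    """Node marginals at t = 0..N via forward and backward recursions."""
    n = len(B)
    # fwd[t][j]: total weight of t-step paths from the source to j.
    fwd = [[float(j == source - 1) for j in range(n)]]
    for _ in range(N):
        prev = fwd[-1]
        fwd.append([sum(prev[i] * B[i][j] for i in range(n)) for j in range(n)])
    # bwd[t][i]: total weight of (N - t)-step paths from i to the sink.
    bwd = [[float(i == sink - 1) for i in range(n)]]
    for _ in range(N):
        nxt = bwd[0]
        bwd.insert(0, [sum(B[i][j] * nxt[j] for j in range(n)) for i in range(n)])
    total = fwd[N][sink - 1]
    return [[fwd[t][j] * bwd[t][j] / total for j in range(n)]
            for t in range(N + 1)]

for U0 in (2.0, 8.0, 30.0):
    flow = bridge_flow(prior_with_virtual_links(U0), 1, 9, 4)
    print(f"U0 = {U0}: mass at t = 1 ->", [round(x, 4) for x in flow[1]])
```

Increasing U0 in this sketch drives the mass on virtual links toward zero, and the marginals approach the limiting flow with entries 3/5 and 2/5 at t = 1.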

VIII. Conclusions

In this paper, we have proposed a novel approach to design a robust transportation plan on a given directed graph. It is based on a generalized maximum entropy problem (Schrödinger bridge) for measures on paths of the given network. Taking as prior measure the Ruelle-Bowen-Parry random walker, the solution naturally tends to spread the mass on all available routes joining the source and the sink. Hence, the resulting transport appears robust with respect to link/node failures. This approach can be adapted to graphs that are not strongly connected, as well as to weighted graphs. In the latter case, it can be used to effectively compromise between robustness and cost. Indeed, we exhibit a robust transportation plan which assigns maximum probability to minimum cost paths and therefore appears attractive when compared with Optimal Mass Transportation approaches. Since the transport plan is computed as a Schrödinger bridge, for which an efficient iterative algorithm is available, our procedure also appears to be computationally attractive.

In this paper, in order to avoid obscuring the fundamental ideas and to keep the paper at a reasonable length, we have chosen to present the essential features of our approach without touching on a number of related fascinating topics. For instance, in this paper robustness of a transport plan simply means that, in case of failure of certain links (e.g. due to congestion) or nodes, most of the mass will anyway reach the target nodes. There are, however, other notions of robustness in graph theory [1], [25], [26], [6], [27], some related to entropic principles [28], [29].

When weights represent costs, our approach of Section VI, which compromises between cost minimization and robustness, can be further compared to Optimal Mass Transport (OMT) over graphs [30], where only cost matters, and to entropically regularized OMT schemes [31], [32]. In discrete OMT, however, the cost function is supposed to be given, although computing it is typically an intractable problem for large networks.

Also, it is apparent that choosing the uniform distribution as prior in a maximum entropy problem such as that of Section V yields a spreading of the trajectories over which the transport occurs, similar to the one in Optimal Mass Transport (OMT) on manifolds with positive Ricci-Curbastro curvature [33]. On discrete spaces and graphs, similar notions of curvature have been defined by Ollivier [34], [35]. They capture robustness and connectedness, convexity of entropy, and are related to the spectral gap [36], [37]. Their relevance in applications is discussed in, e.g., [38], [26], [6], [27]. It is therefore natural to investigate the precise connection between the role of the prior in random evolutions such as those studied in this paper and deterministic evolution on discrete curved spaces. All of these fascinating topics deserve further investigation and will be addressed elsewhere.

Acknowledgments

Supported in part by the NSF under Grant ECCS-1509387, the AFOSR under Grants FA9550-12-1-0319 and FA9550-15-1-0045, the NIH under Grants P41-RR-013218, P41-EB-015902 and 1U24CA18092401A1. This work was also supported by the Vincentine Hermes-Luh Chair, and by the University of Padova Research Project CPDA 140897.

Footnotes

1

Here we use the convention that 0/0 = 0.

2

We are now conforming to notation in [7] for ease of comparison. Hence we use u and v rather than ϕ̂ and ϕ.

3

We assume a self loop for each node with zero cost, i.e., Uxx = 0 for each x ∈ 𝒳.

Contributor Information

Yongxin Chen, Department of Mechanical Engineering, University of Minnesota, Minneapolis, Minnesota MN 55455, USA.

Tryphon Georgiou, Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, Minnesota MN 55455, USA.

Michele Pavon, Dipartimento di Matematica “Tullio Levi Civita”, Università di Padova, via Trieste 63, 35121 Padova, Italy.

Allen Tannenbaum, Department of Computer Science, Stony Brook University, Stony Brook, NY 11794.

References

  • 1. Callaway DS, Newman ME, Strogatz SH, Watts DJ. Network robustness and fragility: Percolation on random graphs. Physical Review Letters. 2000;85(25):5468. doi: 10.1103/PhysRevLett.85.5468.
  • 2. Watts DJ, Strogatz SH. Collective dynamics of small-world networks. Nature. 1998;393(6684):440–442. doi: 10.1038/30918.
  • 3. Scott DM, Novak DC, Aultman-Hall L, Guo F. Network robustness index: a new method for identifying critical links and evaluating the performance of transportation networks. Journal of Transport Geography. 2006;14(3):215–227.
  • 4. Cabanes G, van Wilgenburg E, Beekman M, Latty T. Ants build transportation networks that optimize cost and efficiency at the expense of robustness. Behavioral Ecology. 2014;26(1):223–231.
  • 5. Brin S, Page L. Reprint of: The anatomy of a large-scale hypertextual web search engine. Computer Networks. 2012;56(18):3825–3833.
  • 6. Sandhu R, Georgiou T, Reznik E, Zhu L, Kolesov I, Senbabaoglu Y, Tannenbaum A. Graph curvature for differentiating cancer networks. Scientific Reports. 2015;5. doi: 10.1038/srep12323.
  • 7. Delvenne JC, Libert AS. Centrality measures and thermodynamic formalism for complex networks. Physical Review E. 2011;83(4):046117. doi: 10.1103/PhysRevE.83.046117.
  • 8. Kitano H. Towards a theory of biological robustness. Molecular Systems Biology. 2007;3(1):137. doi: 10.1038/msb4100179.
  • 9. Georgiou TT, Pavon M. Positive contraction mappings for classical and quantum Schrödinger systems. Journal of Mathematical Physics. 2015;56(3):033301.
  • 10. Pavon M, Ticozzi F. Discrete-time classical and quantum Markovian evolutions: Maximum entropy problems on path space. Journal of Mathematical Physics. 2010;51(4):042104.
  • 11. Parry W. Intrinsic Markov chains. Transactions of the American Mathematical Society. 1964;112(1):55–66.
  • 12. Cover TM, Thomas JA. Elements of information theory. John Wiley & Sons; 2012.
  • 13. Föllmer H. École d’Été de Probabilités de Saint-Flour XV–XVII, 1985–87. Springer; 1988. Random fields and diffusion processes; pp. 101–203.
  • 14. Lemmens B, Nussbaum R. Nonlinear Perron-Frobenius Theory. Vol. 189. Cambridge University Press; 2012.
  • 15. Birkhoff G. Extensions of Jentzsch’s theorem. Transactions of the American Mathematical Society. 1957;85(1):219–227.
  • 16. Bushell P. On the projective contraction ratio for positive linear mappings. Journal of the London Mathematical Society. 1973;2(2):256–258.
  • 17. Bushell PJ. Hilbert’s metric and positive contraction mappings in a Banach space. Archive for Rational Mechanics and Analysis. 1973;52(4):330–338.
  • 18. Fortet R. Résolution d’un système d’équations de M. Schrödinger. J Math Pures Appl. 83(9):1940.
  • 19. Beurling A. An automorphism of product measures. The Annals of Mathematics. 1960;72(1):189–200.
  • 20. Jamison B. Reciprocal processes. Z. Wahrscheinlichkeitstheorie verw. Gebiete. 1974;30:65–86.
  • 21. Horn RA, Johnson CR. Matrix analysis. Cambridge University Press; 2012.
  • 22. Ruelle D. Thermodynamic formalism: the mathematical structure of equilibrium statistical mechanics. Cambridge University Press; 2004.
  • 23. Rachev ST, Rüschendorf L. Mass Transportation Problems: Volume I: Theory. Vol. 1. Springer Science & Business Media; 1998.
  • 24. Bazaraa MS, Jarvis JJ, Sherali HD. Linear programming and network flows. John Wiley & Sons; 2011.
  • 25. Jamakovic A, Uhlig S. On the relationship between the algebraic connectivity and graph’s robustness to node and link failures. Next Generation Internet Networks, 3rd EuroNGI Conference on; IEEE; 2007. pp. 96–102.
  • 26. Wang C, Jonckheere E, Banirazi R. Wireless network capacity versus Ollivier-Ricci curvature under heat-diffusion (HD) protocol. American Control Conference (ACC); 2014; IEEE; 2014. pp. 3536–3541.
  • 27. Sandhu R, Georgiou T, Tannenbaum A. Market fragility, systemic risk, and Ricci curvature. 2015. doi: 10.1126/sciadv.1501495. arXiv preprint arXiv:1505.05182.
  • 28. Arnold L, Gundlach VM, Demetrius L. Evolutionary formalism for products of positive random matrices. The Annals of Applied Probability. 1994:859–901.
  • 29. Demetrius L, Manke T. Robustness and network evolution: an entropic principle. Physica A: Statistical Mechanics and its Applications. 2005;346(3):682–696.
  • 30. Léonard C. Lazy random walks and optimal transport on graphs. 2013. arXiv preprint arXiv:1308.0226.
  • 31. Cuturi M. Sinkhorn distances: Lightspeed computation of optimal transport. Advances in Neural Information Processing Systems. 2013:2292–2300.
  • 32. Benamou JD, Carlier G, Cuturi M, Nenna L, Peyré G. Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing. 2015;37(2):A1111–A1138.
  • 33. Villani C. Optimal transport: old and new. Vol. 338. Springer; 2008.
  • 34. Ollivier Y. Ricci curvature of Markov chains on metric spaces. Journal of Functional Analysis. 2009;256(3):810–864.
  • 35. Ollivier Y. A survey of Ricci curvature for metric spaces and Markov chains. Probabilistic approach to geometry. 2010;57:343–381.
  • 36. Bauer F, Jost J, Liu S. Ollivier-Ricci curvature and the spectrum of the normalized graph Laplace operator. 2011. arXiv preprint arXiv:1105.3803.
  • 37. Jost J, Liu S. Ollivier’s Ricci curvature, local clustering and curvature-dimension inequalities on graphs. Discrete & Computational Geometry. 2014;51(2):300–322.
  • 38. Banirazi R, Jonckheere E, Krishnamachari B. Heat diffusion algorithm for resource allocation and routing in multihop wireless networks. Global Communications Conference (GLOBECOM), 2012 IEEE; IEEE; 2012. pp. 5693–5698.
