J Stat Phys. 2022 Apr 8;187(2):19. doi: 10.1007/s10955-022-02911-9

A Dual Formula for the Noncommutative Transport Distance

Melchior Wirth
PMCID: PMC8993752  PMID: 35509951

Abstract

In this article we study the noncommutative transport distance introduced by Carlen and Maas and its entropic regularization defined by Becker and Li. We prove a duality formula that can be understood as a quantum version of the dual Benamou–Brenier formulation of the Wasserstein distance in terms of subsolutions of a Hamilton–Jacobi–Bellman equation.

Keywords: Quantum Markov semigroup, Duality, Quantum optimal transport

Introduction

The theory of optimal transport [28, 29] has experienced rapid growth in recent years with applications in diverse fields across pure and applied mathematics. Along with this growth came a lot of interest in extending the methods of optimal transport beyond the scope of its original formulation as an optimization problem for the transport cost between two probability measures.

One such extension deals with “quantum spaces”, where the probability measures are replaced by density matrices or density operators. Most of the work on quantum optimal transport in this sense can be grouped into one of the following two categories. The first approach (see e.g. [6–9, 11, 22, 27]) takes a quantum Markov semigroup (QMS) as input datum and relies on a noncommutative analog of the Benamou–Brenier formulation [4] of the Wasserstein distance for probability measures on Euclidean space:

$$W_2^2(\mu,\nu)=\inf\left\{\int_0^1\int_{\mathbb R^n}|v_t|^2\,d\rho_t\,dt : \rho_0=\mu,\ \rho_1=\nu,\ \dot\rho_t+\nabla\cdot(\rho_t v_t)=0\right\}.$$

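In the simple case when the generator $\mathcal L$ of the QMS is of the form

$$\mathcal LA=\sum_{j\in\mathcal J}[V_j,[V_j,A]]$$

with self-adjoint matrices $V_j$, the associated noncommutative transport distance $W$ on the set of density matrices is given by

$$W^2(\rho_0,\rho_1)=\inf\left\{\int_0^1\sum_{j\in\mathcal J}\tau\big(W_j(t)^*[\rho(t)]_0^{-1}(W_j(t))\big)\,dt : \dot\rho(t)=\sum_{j\in\mathcal J}[V_j,W_j(t)]\right\},$$

where the infimum is taken over curves $\rho$ that satisfy $\rho(0)=\rho_0$, $\rho(1)=\rho_1$, and where

$$[X]_0(A)=\int_0^1X^\alpha AX^{1-\alpha}\,d\alpha.$$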

For the definition of the metric W in the more general case of a QMS satisfying the detailed balance condition (DBC), we refer to the next section.

This approach has proven fruitful in applications to noncommutative functional inequalities, similar in spirit to the heuristics known as Otto calculus [8, 9, 12, 31].

The second approach (see e.g. [13, 14, 17, 23, 25, 26]) seeks to find a suitable noncommutative analog of the Monge–Kantorovich formulation [20] of the Wasserstein distance via couplings (or transport plans):

$$W_p^p(\mu,\nu)=\inf\left\{\int_{X\times X}d^p(x,y)\,d\pi(x,y) : (\mathrm{pr}_1)_\#\pi=\mu,\ (\mathrm{pr}_2)_\#\pi=\nu\right\}.$$

This approach also allows one to consider a quantum version of the Monge–Kantorovich problem for arbitrary cost functions. So far, possible connections between these two approaches in the quantum world remain elusive.

The focus of this article lies on the noncommutative transport distance $W$ introduced in the first approach. More precisely, we prove a dual formula that is a noncommutative analog of the expression of the classical $L^2$-Wasserstein distance in terms of subsolutions of the Hamilton–Jacobi equation [5, 24]:

$$\frac12W_2^2(\mu,\nu)=\sup\left\{\int_{\mathbb R^n}u_1\,d\nu-\int_{\mathbb R^n}u_0\,d\mu : \dot u_t+\frac12|\nabla u_t|^2\le0\right\}.$$

This result yields a noncommutative version of the dual formula obtained independently by Erbar et al. [15] and Gangbo et al. [16] for the Wasserstein-like transport distance on graphs. In fact, we prove a dual formula that is valid not only for the metric $W$, but also for the entropic regularization recently introduced by Becker–Li [3]. When the generator $\mathcal L$ is again of the simple form discussed above, the entropic regularization $W_\varepsilon$ is the metric obtained by replacing the constraint

$$\dot\rho(t)=\sum_{j\in\mathcal J}[V_j,W_j(t)]$$

in the definition of $W$ by

$$\dot\rho(t)=\sum_{j\in\mathcal J}[V_j,W_j(t)]+\varepsilon\mathcal L\rho(t).$$

With the notation introduced in the next section, the main result of this article reads as follows.

Theorem

Let $\sigma\in M_n(\mathbb C)$ be an invertible density matrix and $(P_t)$ an ergodic QMS on $M_n(\mathbb C)$ that satisfies the $\sigma$-DBC. The entropic regularization $W_\varepsilon$ of the noncommutative transport distance induced by $(P_t)$ satisfies the following dual formula:

$$\frac12W_\varepsilon^2(\rho_0,\rho_1)=\sup\{\tau(A(1)\rho_1-A(0)\rho_0)\mid A\in\mathrm{HJB}_\varepsilon^1\}.$$

Here a QMS $(P_t)$ is said to satisfy the $\sigma$-DBC if

$$\tau((P_tA)^*B\sigma)=\tau(A^*(P_tB)\sigma)$$

for all $A,B\in M_n(\mathbb C)$ and $t\ge0$. If $\sigma$ is the identity matrix, this is the case exactly when the generator is of the form $\mathcal LA=\sum_j[V_j,[V_j,A]]$ with self-adjoint matrices $V_j$.

Moreover, $\mathrm{HJB}_\varepsilon^1$ stands for the set of all Hamilton–Jacobi–Bellman subsolutions, a suitable noncommutative variant of solutions of the differential inequality

$$\dot u(t)+\frac12|\nabla u(t)|^2-\varepsilon\Delta u(t)\le0.$$

Other metrics similar to W also occur in the literature, most notably the one called the “anticommutator case” in [3, 10, 11]. In [9, 30], a class of such metrics was studied in a systematic way, and our main theorem applies in fact to this wider class of metrics. For the anticommutator case, this duality formula was obtained before in [10].

There are still some very natural questions left open. For one, we do not discuss the existence of optimizers. While for the primal problem this follows from a standard compactness argument, this question is more delicate for the dual problem, even when dealing with probability densities on discrete spaces instead of density matrices, and one has to relax the problem to obtain maximizers (see [16,  Sects. 6–7]).

Another interesting direction would be to extend the duality result from matrix algebras to infinite-dimensional systems. While a definition of the metric W for QMSs on semi-finite von Neumann algebras is available [19, 30], the problem of duality seems to be much harder to address. Even for abstract diffusion semigroups, the best known result only shows that the primal distance is the upper length distance associated with the dual distance and leaves the question of equality open [2,  Proposition 10.11].

Setting and Basic Definitions

In this section we introduce basic facts and definitions about QMSs that will be used later on. In particular, we review the definition of the noncommutative transport distance from [8] and its entropic regularization introduced in [3]. Our notation mostly follows [8, 9]. For a list of symbols we refer the reader to the end of this article.

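Let $M_n(\mathbb C)$ denote the complex $n\times n$ matrices and let $\mathcal A$ be a unital $*$-subalgebra of $M_n(\mathbb C)$. Let $\mathcal A_h$ denote the self-adjoint part of $\mathcal A$, $\mathcal A_+$ the cone of positive elements of $\mathcal A$ and $\mathcal A_{++}$ the subset of invertible positive elements. We write $\tau$ for the normalized trace on $M_n(\mathbb C)$, that is,

$$\tau(A)=\frac1n\sum_{k=1}^nA_{kk},$$

and $H_{\mathcal A}$ for the Hilbert space formed by equipping $\mathcal A$ with the GNS inner product

$$\langle\cdot,\cdot\rangle_{H_{\mathcal A}}:\mathcal A\times\mathcal A\to\mathbb C,\quad(A,B)\mapsto\tau(A^*B).$$

The adjoint of a linear operator $K:H_{\mathcal A}\to H_{\mathcal A}$ is denoted by $K^\dagger$.

We write $S(\mathcal A)$ for the set of all density matrices on $\mathcal A$, that is, all positive elements $\rho\in\mathcal A$ with $\tau(\rho)=1$. The subset of invertible density matrices is denoted by $S_+(\mathcal A)$.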

A QMS on $\mathcal A$ is a family $(P_t)_{t\ge0}$ of linear operators on $\mathcal A$ that satisfy the following conditions:

  • $P_t$ is unital and completely positive for every $t\ge0$,

  • $P_0=\mathrm{id}_{\mathcal A}$, $P_{s+t}=P_sP_t$ for all $s,t\ge0$,

  • $t\mapsto P_t$ is continuous.

We consider a QMS $(P_t)$ on $\mathcal A$ which extends to a QMS on $M_n(\mathbb C)$ satisfying the $\sigma$-detailed balance condition ($\sigma$-DBC) for some density matrix $\sigma\in S_+(\mathcal A)$, that is,

$$\tau((P_tA)^*B\sigma)=\tau(A^*(P_tB)\sigma)$$

for $A,B\in\mathcal A$ and $t\ge0$. For $\sigma=1_{\mathcal A}$, this reduces to the symmetry condition $P_t^\dagger=P_t$.

Let $\mathcal L$ denote the generator of $(P_t)$, that is, the linear operator on $\mathcal A$ given by

$$\mathcal L(A)=\lim_{t\searrow0}\frac{P_tA-A}{t}.$$

We further assume that $(P_t)$ is ergodic (or primitive), that is, the kernel of $\mathcal L$ is one-dimensional. This assumption is natural in this context as it ensures that the metric $W_{\Lambda,\varepsilon}$ defined below is the geodesic distance induced by a Riemannian metric on $S_+(\mathcal A)$ and in particular that it is finite.

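Generators of QMSs are often described by their Lindblad form, but here we will rely on the additional structure coming from the $\sigma$-DBC and use a presentation of $\mathcal L$ provided by Alicki’s theorem [1, Theorem 3], [8, Theorem 3.1] instead: There exists a finite set $\mathcal J$, real numbers $\omega_j$ for $j\in\mathcal J$ and $V_j\in M_n(\mathbb C)$ for $j\in\mathcal J$ with the following properties:

  • $\tau(V_j^*V_k)=\delta_{jk}$ for $j,k\in\mathcal J$,

  • $\tau(V_j)=0$ for $j\in\mathcal J$,

  • for every $j\in\mathcal J$ there exists a unique $j^*\in\mathcal J$ with $V_{j^*}=V_j^*$,

  • $\sigma V_j\sigma^{-1}=e^{-\omega_j}V_j$ for $j\in\mathcal J$

such that

$$\mathcal L(A)=\sum_{j\in\mathcal J}\left(e^{-\omega_j/2}V_j^*[A,V_j]-e^{\omega_j/2}[A,V_j]V_j^*\right)$$

for $A\in\mathcal A$.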
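The conditions above can be checked numerically in a small case. The following is a minimal sketch, assuming a qubit ($n=2$), $\sigma$ equal to the identity, all Bohr frequencies equal to zero and the three Pauli matrices as the $V_j$; these choices and all function names are illustrative assumptions, not taken from the article.

```python
import numpy as np

# Assumed example data: Pauli matrices are self-adjoint, traceless and
# orthonormal with respect to the normalized trace.
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def tau(a):
    """Normalized trace tau(A) = tr(A) / n."""
    return np.trace(a) / a.shape[0]

def comm(a, b):
    """Commutator [A, B]."""
    return a @ b - b @ a

def generator(a, vs=PAULI, omegas=(0.0, 0.0, 0.0)):
    """Sum over j of e^{-w_j/2} V_j^* [A, V_j] - e^{w_j/2} [A, V_j] V_j^*."""
    out = np.zeros(a.shape, dtype=complex)
    for v, w in zip(vs, omegas):
        vd = v.conj().T
        out += np.exp(-w / 2) * vd @ comm(a, v) - np.exp(w / 2) * comm(a, v) @ vd
    return out

def superoperator(n=2):
    """Matrix of the generator acting on row-major vectorized n x n matrices."""
    cols = []
    for k in range(n * n):
        e = np.zeros(n * n, dtype=complex)
        e[k] = 1.0
        cols.append(generator(e.reshape(n, n)).reshape(-1))
    return np.array(cols).T

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
B = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

unital_defect = np.max(np.abs(generator(np.eye(2))))        # generator kills the identity
symmetry_defect = abs(tau(generator(A).conj().T @ B) - tau(A.conj().T @ generator(B)))
kernel_dimension = 4 - np.linalg.matrix_rank(superoperator())
```

In this example the kernel of the generator is spanned by the identity, so the semigroup is ergodic in the sense used above, and the symmetry defect vanishes, which is the infinitesimal form of the detailed balance condition for $\sigma$ equal to the identity.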

The numbers $\omega_j$ are called Bohr frequencies of $\mathcal L$ and are uniquely determined by $(P_t)$. The matrices $V_j$ are not uniquely determined by $(P_t)$ and $\sigma$, but in the following we will fix a set $\{V_j\mid j\in\mathcal J\}$ that satisfies the preceding conditions.

Next we will discuss how the data from Alicki’s theorem give rise to a differential structure associated with $\mathcal L$.

Let

$$H_{\mathcal A,\mathcal J}=\bigoplus_{j\in\mathcal J}H_{\mathcal A}^{(j)},$$

where $H_{\mathcal A}^{(j)}$ is a copy of $H_{\mathcal A}$ for $j\in\mathcal J$. This is the quantum analog of the space of tangent vector fields in our setting.

We write $\partial_j$ for $[V_j,\cdot\,]$ and

$$\nabla:H_{\mathcal A}\to H_{\mathcal A,\mathcal J},\quad\nabla(A)=(\partial_j(A))_{j\in\mathcal J},$$

which provide analogs of the partial derivatives and the usual gradient operator, respectively. The commutator $\partial_j$ satisfies the product rule

$$\partial_j(AB)=A\,\partial_j(B)+\partial_j(A)\,B.\tag{1}$$

Note that in contrast to the usual partial derivatives, the order of the factors plays a role here. This is one central reason for many of the differences and intricacies of the quantum optimal transport distance compared to the classical Wasserstein distance.

Continuing with the analogy with calculus, we write $\operatorname{div}$ for the adjoint of $-\nabla$, that is,

$$\operatorname{div}=-\sum_{j\in\mathcal J}\partial_j^\dagger.$$

The crucial ingredient in the definition of $W$, which allows one to deal with the noncommutativity of the product rule, is the operator $[\rho]_{\vec\omega}$, whose definition we recall next. For $X\in\mathcal A_+$ and $\alpha\in\mathbb R$ define

$$[X]_\alpha:H_{\mathcal A}\to H_{\mathcal A},\quad[X]_\alpha(A)=\int_0^1e^{\alpha(s-1/2)}X^sAX^{1-s}\,ds.$$

The motivation for this definition is a chain rule identity [8, Eq. (5.7)], which can best be illustrated in the case $\alpha=0$:

$$[X]_0(\partial_j(\log X))=\partial_j(X).$$
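The chain rule identity for $\alpha=0$ can be verified numerically. The sketch below assumes a random positive definite $3\times3$ matrix $X$ and an arbitrary matrix in the role of $V_j$; it evaluates the integral defining $[X]_0$ in an eigenbasis of $X$, where it acts by entrywise multiplication with the logarithmic mean of the eigenvalues. The helper names are hypothetical.

```python
import numpy as np

def log_mean(a, b):
    """Logarithmic mean (a - b) / (log a - log b) of positive numbers (a when a == b)."""
    t = a / b - 1.0
    safe = np.where(t == 0.0, 1.0, t)
    return b * np.where(t == 0.0, 1.0, safe / np.log1p(safe))

def bracket0(x, a):
    """[X]_0(A) = int_0^1 X^s A X^(1-s) ds, computed in an eigenbasis of X."""
    lam, u = np.linalg.eigh(x)
    weights = log_mean(lam[:, None], lam[None, :])
    return u @ (weights * (u.conj().T @ a @ u)) @ u.conj().T

def log_pos(x):
    """Matrix logarithm of a positive definite Hermitian matrix."""
    lam, u = np.linalg.eigh(x)
    return (u * np.log(lam)) @ u.conj().T

rng = np.random.default_rng(0)
m = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
x = m @ m.conj().T + np.eye(3)                                # positive definite X
v = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))    # plays the role of V_j

lhs = bracket0(x, v @ log_pos(x) - log_pos(x) @ v)   # [X]_0(d_j log X)
rhs = v @ x - x @ v                                   # d_j X
chain_rule_error = np.max(np.abs(lhs - rhs))
```

The error is at the level of floating-point rounding, which reflects the cancellation of the factor $\log\lambda_k-\log\lambda_l$ in the eigenbasis.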

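Given $\vec\alpha=(\alpha_j)_{j\in\mathcal J}$, we define

$$[X]_{\vec\alpha}:H_{\mathcal A,\mathcal J}\to H_{\mathcal A,\mathcal J},\quad(V_j)_{j\in\mathcal J}\mapsto([X]_{\alpha_j}V_j)_{j\in\mathcal J}.$$

For $\varepsilon\ge0$ we write $\mathrm{CE}_\varepsilon(\rho_0,\rho_1)$ for the set of all pairs $(\rho,V)$ such that $\rho\in H^1([0,1];S_+(\mathcal A))$ with $\rho(0)=\rho_0$, $\rho(1)=\rho_1$, $V\in L^2([0,1];H_{\mathcal A,\mathcal J})$ and

$$\dot\rho(t)+\operatorname{div}V(t)=\varepsilon\mathcal L^\dagger\rho(t)\tag{2}$$

for a.e. $t\in[0,1]$.

Here and in the following we write $H^1([0,1];S_+(\mathcal A))$ for the space of all maps $\rho:[0,1]\to S_+(\mathcal A)$ such that $(t\mapsto\tau(A\rho(t)))\in H^1([0,1])$ for all $A\in\mathcal A$. The space $L^2([0,1];H_{\mathcal A,\mathcal J})$ and other vector-valued function spaces occurring later are defined similarly.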

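We define a metric $W_\varepsilon$ on $S_+(\mathcal A)$ by

$$W_\varepsilon(\rho_0,\rho_1)^2=\inf\left\{\int_0^1\langle V(t),[\rho(t)]_{\vec\omega}^{-1}V(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt : (\rho,V)\in\mathrm{CE}_\varepsilon(\rho_0,\rho_1)\right\},$$

where $\vec\omega=(\omega_j)_{j\in\mathcal J}$ with the Bohr frequencies $\omega_j$ of $\mathcal L$.

For $\varepsilon=0$, this is the noncommutative transport distance $W$ introduced in [8] (as distance function associated with a Riemannian metric on $S_+(\mathcal A)$), and for $\varepsilon>0$, this is the entropic regularization of $W$ introduced in [3].

A standard mollification argument shows that the infimum in the definition of $W_\varepsilon$ can equivalently be taken over $(\rho,V)\in\mathrm{CE}_\varepsilon(\rho_0,\rho_1)$ with $\rho\in C^\infty([0,1];S_+(\mathcal A))$. More precisely, if $(\rho,V)\in\mathrm{CE}_\varepsilon(\rho_0,\rho_1)$ and $(\eta_\delta)_{\delta>0}$ is a mollifying kernel, then $(\rho*\eta_\delta,V*\eta_\delta)$ satisfies (2). A suitable reparametrization of the time parameter gives a pair $(\rho^\delta,V^\delta)\in\mathrm{CE}_\varepsilon(\rho_0,\rho_1)$ such that $\rho^\delta$ is smooth and

$$\lim_{\delta\to0}\int_0^1\langle V^\delta(t),[\rho^\delta(t)]_{\vec\omega}^{-1}V^\delta(t)\rangle\,dt=\int_0^1\langle V(t),[\rho(t)]_{\vec\omega}^{-1}V(t)\rangle\,dt.$$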

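By a substitution one can reformulate the minimization problem for $W_\varepsilon$ in such a way that the constraint becomes independent of $\varepsilon$. For that purpose define the relative entropy of $\rho\in S_+(\mathcal A)$ with respect to $\sigma$ by

$$D(\rho\|\sigma)=\tau(\rho(\log\rho-\log\sigma))$$

and the Fisher information of $\rho\in S_+(\mathcal A)$ by

$$I(\rho)=\langle[\rho]_{\vec\omega}\nabla(\log\rho-\log\sigma),\nabla(\log\rho-\log\sigma)\rangle_{H_{\mathcal A,\mathcal J}}.$$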
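Both quantities are easy to evaluate in a small example. The sketch below assumes a qubit, $\sigma$ equal to the identity (so that $\log\sigma=0$ and all Bohr frequencies vanish) and the Pauli matrices as the $V_j$; none of these choices comes from the article. Under these assumptions the Fisher information should agree with the entropy dissipation $\sum_j\tau([V_j,[V_j,\rho]]\log\rho)$, which follows from the chain rule identity for $\alpha=0$.

```python
import numpy as np

# Assumed example data: Pauli matrices as V_j, sigma = identity.
PAULI = [np.array(p, dtype=complex) for p in (
    [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])]

def tau(a):
    return np.trace(a) / a.shape[0]

def comm(a, b):
    return a @ b - b @ a

def log_pos(x):
    lam, u = np.linalg.eigh(x)
    return (u * np.log(lam)) @ u.conj().T

def bracket0(x, a):
    """[X]_0(A): entrywise multiplication by the logarithmic mean in an eigenbasis of X."""
    lam, u = np.linalg.eigh(x)
    t = lam[:, None] / lam[None, :] - 1.0
    safe = np.where(t == 0.0, 1.0, t)
    weights = lam[None, :] * np.where(t == 0.0, 1.0, safe / np.log1p(safe))
    return u @ (weights * (u.conj().T @ a @ u)) @ u.conj().T

def relative_entropy(rho, sigma):
    return tau(rho @ (log_pos(rho) - log_pos(sigma))).real

def fisher_information(rho, sigma, vs=PAULI):
    h = log_pos(rho) - log_pos(sigma)
    grads = [comm(v, h) for v in vs]                       # components of the gradient of h
    return sum(tau(bracket0(rho, g).conj().T @ g).real for g in grads)

sigma = np.eye(2, dtype=complex)
rho = np.eye(2) + 0.3 * PAULI[0] + 0.4 * PAULI[2]          # tau(rho) = 1, eigenvalues 1.5 and 0.5

entropy = relative_entropy(rho, sigma)
fisher = fisher_information(rho, sigma)
dissipation = sum(tau(comm(v, comm(v, rho)) @ log_pos(rho)).real for v in PAULI)
```

Note that density matrices are normalized with respect to $\tau$ here, so the eigenvalues of a qubit state sum to 2 rather than 1.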

According to [3, Theorem 1], one has

$$W_\varepsilon(\rho_0,\rho_1)^2=\inf\left\{\int_0^1\Big(\langle V(t),[\rho(t)]_{\vec\omega}^{-1}V(t)\rangle_{H_{\mathcal A,\mathcal J}}+\varepsilon^2I(\rho(t))\Big)dt : (\rho,V)\in\mathrm{CE}_0(\rho_0,\rho_1)\right\}+2\varepsilon\big(D(\rho_1\|\sigma)-D(\rho_0\|\sigma)\big).$$

The metric $W$ is intimately connected to the relative entropy and therefore well-suited to study its decay properties along the QMS. For other applications, variants of the metric $W$ have also proven useful (e.g. [10, 11]), for which the operator $[\rho]_{\vec\omega}$ is replaced. A systematic framework of these metrics has been developed in [9, 30]. It can be conveniently phrased in terms of so-called operator connections.

Let $H$ be an infinite-dimensional Hilbert space. A map $\Lambda:B(H)_+\times B(H)_+\to B(H)_+$ is called an operator connection [21] if

  • $A\le C$ and $B\le D$ imply $\Lambda(A,B)\le\Lambda(C,D)$ for $A,B,C,D\in B(H)_+$,

  • $C\Lambda(A,B)C\le\Lambda(CAC,CBC)$ for $A,B,C\in B(H)_+$,

  • $A_n\searrow A$, $B_n\searrow B$ imply $\Lambda(A_n,B_n)\searrow\Lambda(A,B)$ for $A,A_n,B,B_n\in B(H)_+$.

For example, for every $\alpha\in\mathbb R$ the map

$$\Lambda_\alpha:(A,B)\mapsto\int_0^1e^{\alpha(s-1/2)}A^sB^{1-s}\,ds$$

is an operator connection.

It can be shown that every operator connection $\Lambda$ satisfies

$$U^*\Lambda(A,B)U=\Lambda(U^*AU,U^*BU)$$

for $A,B\in B(H)_+$ and unitary $U\in B(H)$ [21, Sect. 2]. Embedding $\mathbb C^n$ into $H$, one can view $A,B\in M_n(\mathbb C)$ as bounded linear operators on $H$, and the unitary invariance of $\Lambda$ ensures that $\Lambda(A,B)$ does not depend on the embedding of $\mathbb C^n$ into $H$.

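For $X\in\mathcal A$ define

$$L(X):H_{\mathcal A}\to H_{\mathcal A},\ A\mapsto XA,\qquad R(X):H_{\mathcal A}\to H_{\mathcal A},\ A\mapsto AX.$$

Note that if $X\in\mathcal A_+$, then

$$\langle A,L(X)A\rangle_{H_{\mathcal A}}=\tau(A^*XA)=\tau((X^{1/2}A)^*(X^{1/2}A))\ge0,$$

so that $L(X)$ is a positive operator, and the same holds for $R(X)$.

Thus we can define

$$[X]_\Lambda=\Lambda(L(X),R(X)).$$

If $\lambda,\mu\ge0$ and $1_n$ denotes the identity matrix, then $\Lambda(\lambda1_n,\mu1_n)$ is a scalar multiple of the identity as a consequence of the unitary invariance of $\Lambda$ discussed above. By a slight abuse of notation, this scalar will be denoted by $\Lambda(\lambda,\mu)$.

Since $L(X)$ and $R(X)$ commute, we have

$$\Lambda(L(X),R(X))A=\sum_{k,l=1}^n\Lambda(\lambda_k,\lambda_l)E_kAE_l\tag{3}$$

for $X\in\mathcal A_+$ and $A\in H_{\mathcal A}$, where $(\lambda_k)$ are the eigenvalues of $X$ and $E_k$ the corresponding spectral projections.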
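Formula (3) translates directly into code once the scalar function $(\lambda,\mu)\mapsto\Lambda(\lambda,\mu)$ is known. The following sketch assumes two such scalar functions: the arithmetic mean, and a closed form for $\Lambda_\alpha$ obtained by integrating $e^{\alpha(s-1/2)}\lambda^s\mu^{1-s}$ over $s\in[0,1]$. The function names are illustrative only.

```python
import numpy as np

def bracket(x, a, mean):
    """[X]_Lambda(A) = sum_{k,l} Lambda(lam_k, lam_l) E_k A E_l for a scalar mean Lambda."""
    lam, u = np.linalg.eigh(x)
    weights = mean(lam[:, None], lam[None, :])
    return u @ (weights * (u.conj().T @ a @ u)) @ u.conj().T

def arithmetic_mean(a, b):
    return 0.5 * (a + b)

def lambda_alpha(alpha):
    """Scalar form of Lambda_alpha for positive arguments, integrated in closed form."""
    def mean(a, b):
        c = alpha + np.log(a) - np.log(b)
        small = np.abs(c) < 1e-12
        safe = np.where(small, 1.0, c)
        value = (np.exp(alpha / 2) * a - np.exp(-alpha / 2) * b) / safe
        return np.where(small, np.exp(-alpha / 2) * b, value)
    return mean

rng = np.random.default_rng(1)
m = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
x = m @ m.conj().T + np.eye(3)                               # positive definite X
a = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

anticommutator_case = bracket(x, a, arithmetic_mean)         # equals (XA + AX) / 2
weighted_case = bracket(x, a, lambda_alpha(0.7))
quadratic_form = np.trace(a.conj().T @ weighted_case).real   # <A, [X]_Lambda A> up to 1/n
```

For the arithmetic mean the result is half the anticommutator of $X$ and $A$, which is why the corresponding metric is referred to as the anticommutator case.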

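More generally let $\Lambda=(\Lambda_j)_{j\in\mathcal J}$ be a family of operator connections and define

$$[\rho]_{\Lambda_j}=\Lambda_j(L(\rho),R(\rho)),\qquad[\rho]_\Lambda=\bigoplus_{j\in\mathcal J}[\rho]_{\Lambda_j}.$$

Clearly, $[\rho]_\alpha=[\rho]_{\Lambda_\alpha}$ with the operator connection $\Lambda_\alpha$ from above.

Then one can define a distance $W_{\Lambda,\varepsilon}$ by

$$W_{\Lambda,\varepsilon}(\rho_0,\rho_1)^2=\inf\left\{\int_0^1\langle V(t),[\rho(t)]_\Lambda^{-1}V(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt : (\rho,V)\in\mathrm{CE}_\varepsilon(\rho_0,\rho_1)\right\}.$$

If $\Lambda_j=\Lambda_{\omega_j}$ as above, then we recover the original metric $W_\varepsilon$, while for $\Lambda_j(A,B)=\frac12(A+B)$ (and $\varepsilon=0$) one obtains the distance studied in [10, 11].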

Later we will make the additional assumption that $\Lambda_{j^*}(A,B)=\Lambda_j(B,A)$, where $j^*\in\mathcal J$ is the unique index in the Alicki representation of $\mathcal L$ such that $V_{j^*}=V_j^*$. It follows from the representation theorem of operator means [21] that the class of metrics $W_{\Lambda,0}$ with $\Lambda$ subject to this symmetry condition is exactly the class of metrics satisfying Assumptions 7.2 and 9.5 in [9].

For technical reasons in the proof of Theorem 2, it will be necessary to allow for curves of density matrices that are not necessarily invertible. For this purpose, we make the following convention: If $K:H_{\mathcal A,\mathcal J}\to H_{\mathcal A,\mathcal J}$ is a positive operator and $V\in H_{\mathcal A,\mathcal J}$, we define

$$\langle V,K^{-1}V\rangle_{H_{\mathcal A,\mathcal J}}=\begin{cases}\langle KW,W\rangle_{H_{\mathcal A,\mathcal J}}&\text{if }V\in(\ker K)^\perp,\ KW=V,\\\infty&\text{otherwise.}\end{cases}$$

Since $(\ker K)^\perp=\operatorname{ran}K$ and $K$ is injective on $(\ker K)^\perp$, the element $W$ in this definition exists and is unique. Moreover, this convention is clearly consistent with the usual definition if $K$ is invertible.

Alternatively, as a direct consequence of the spectral theorem, this expression can equivalently be defined as

$$\langle V,K^{-1}V\rangle_{H_{\mathcal A,\mathcal J}}=\sum_{k=1}^m\frac1{\lambda_k}|\langle V,W_k\rangle_{H_{\mathcal A,\mathcal J}}|^2,$$

where $\lambda_1,\dots,\lambda_m$ are the eigenvalues of $K$ and $W_1,\dots,W_m$ an orthonormal basis of corresponding eigenvectors.
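The convention can be made concrete for a positive semidefinite matrix acting on a finite-dimensional space. The sketch below is an illustration under that assumption: it evaluates the spectral sum, returns infinity when the vector has a component in the kernel, and uses a hypothetical tolerance `tol` to decide which eigenvalues count as zero.

```python
import numpy as np

def inv_form(k, v, tol=1e-12):
    """<v, K^{-1} v> for positive semidefinite K; +inf if v is not orthogonal to ker K."""
    lam, w = np.linalg.eigh(k)
    coeff = np.abs(w.conj().T @ v) ** 2              # |<W_k, v>|^2 for each eigenvector
    kernel = lam <= tol * max(lam.max(), 1.0)        # eigenvalues treated as zero
    if np.any(coeff[kernel] > tol):
        return np.inf
    return float(np.sum(coeff[~kernel] / lam[~kernel]))

K = np.diag([2.0, 1.0, 0.0])                          # singular positive operator
v_in_range = np.array([2.0, 1.0, 0.0])
v_off_range = np.array([0.0, 0.0, 1.0])

value = inv_form(K, v_in_range)                       # 4/2 + 1/1
blow_up = inv_form(K, v_off_range)                    # infinite by convention

# Invertible operators K + 1/n decreasing to K give increasing quadratic forms.
approximations = [inv_form(K + np.eye(3) / n, v_in_range) for n in (1, 10, 100, 1000)]
```

The increasing sequence `approximations` illustrates the monotone convergence of the quadratic forms when invertible operators decrease to a singular limit.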

Lemma 1

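If $K_n:H_{\mathcal A,\mathcal J}\to H_{\mathcal A,\mathcal J}$, $n\in\mathbb N$, are positive invertible operators that converge monotonically decreasing to $K$, then

$$\langle V,K_n^{-1}V\rangle_{H_{\mathcal A,\mathcal J}}\nearrow\langle V,K^{-1}V\rangle_{H_{\mathcal A,\mathcal J}}$$

for all $V\in H_{\mathcal A,\mathcal J}$.

Proof

From the spectral expression it is easy to see that

$$\langle V,K_n^{-1}V\rangle_{H_{\mathcal A,\mathcal J}}=\sup_{\delta>0}\langle V,(K_n+\delta)^{-1}V\rangle_{H_{\mathcal A,\mathcal J}}$$

and the same for $K_n$ replaced by $K$. Moreover, since $K_n\searrow K$, we have $(K_n+\delta)^{-1}\nearrow(K+\delta)^{-1}$. Thus

$$\begin{aligned}\langle V,K^{-1}V\rangle_{H_{\mathcal A,\mathcal J}}&=\sup_{\delta>0}\langle V,(K+\delta)^{-1}V\rangle_{H_{\mathcal A,\mathcal J}}=\sup_{\delta>0}\sup_{n\in\mathbb N}\langle V,(K_n+\delta)^{-1}V\rangle_{H_{\mathcal A,\mathcal J}}\\&=\sup_{n\in\mathbb N}\sup_{\delta>0}\langle V,(K_n+\delta)^{-1}V\rangle_{H_{\mathcal A,\mathcal J}}=\sup_{n\in\mathbb N}\langle V,K_n^{-1}V\rangle_{H_{\mathcal A,\mathcal J}}.\end{aligned}$$

Since $(\langle V,K_n^{-1}V\rangle_{H_{\mathcal A,\mathcal J}})_{n\in\mathbb N}$ is monotonically increasing, this settles the claim.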

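Write $\mathrm{CE}'_\varepsilon(\rho_0,\rho_1)$ for the set of all pairs $(\rho,V)$ such that $\rho\in H^1([0,1];S(\mathcal A))$ with $\rho(0)=\rho_0$, $\rho(1)=\rho_1$, $V\in L^2([0,1];H_{\mathcal A,\mathcal J})$ and

$$\dot\rho(t)+\operatorname{div}V(t)=\varepsilon\mathcal L^\dagger\rho(t)$$

for a.e. $t\in[0,1]$. The only difference to the definition of $\mathrm{CE}_\varepsilon(\rho_0,\rho_1)$ is that $\rho(t)$ is not assumed to be invertible.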

Proposition 1

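For $\rho_0,\rho_1\in S_+(\mathcal A)$ we have

$$W_{\Lambda,\varepsilon}(\rho_0,\rho_1)^2=\inf\left\{\int_0^1\langle V(t),[\rho(t)]_\Lambda^{-1}V(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt : (\rho,V)\in\mathrm{CE}'_\varepsilon(\rho_0,\rho_1)\right\}.$$

Proof

It suffices to show that every curve $(\rho,V)\in\mathrm{CE}'_\varepsilon(\rho_0,\rho_1)$ can be approximated by curves in $\mathrm{CE}_\varepsilon(\rho_0,\rho_1)$ such that the action integrals converge.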

For that purpose let

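$$\rho^\delta:[0,1]\to S_+(\mathcal A),\quad t\mapsto\begin{cases}(1-t)\rho_0+t1_{\mathcal A}&\text{if }t\in[0,\delta],\\(1-\delta)\rho((1-2\delta)^{-1}(t-\delta))+\delta1_{\mathcal A}&\text{if }t\in(\delta,1-\delta),\\t\rho_1+(1-t)1_{\mathcal A}&\text{if }t\in[1-\delta,1].\end{cases}$$

Since $(P_t)$ is assumed to be ergodic, by [8, Theorem 5.4] there exists for every $t\in[0,1]$ a unique $X(t)\in\mathcal A_h$ with $\tau(X(t))=0$ such that

$$1_{\mathcal A}-\rho_0+\operatorname{div}(\nabla X(t))=\varepsilon(1-t)\mathcal L^\dagger\rho_0,$$

and $X(t)$ depends continuously on $t$. For $t\in[0,\delta]$ let $V^\delta(t)=\nabla X(t)$.

Moreover, if $\lambda$ is the smallest eigenvalue of $\rho_0$, which is strictly positive by assumption, then $\rho^\delta(t)\ge((1-t)\lambda+t)1_{\mathcal A}\ge\lambda1_{\mathcal A}$.

Thus

$$\int_0^\delta\langle V^\delta(t),[\rho^\delta(t)]_\Lambda^{-1}V^\delta(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt\le\int_0^\delta\langle V^\delta(t),[\lambda1_{\mathcal A}]_\Lambda^{-1}V^\delta(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt\le\|[\lambda1_{\mathcal A}]_\Lambda^{-1}\|\int_0^\delta\|\nabla X(t)\|_{H_{\mathcal A,\mathcal J}}^2\,dt\to0$$

as $\delta\to0$. Similarly one can show

$$\lim_{\delta\to0}\int_{1-\delta}^1\langle V^\delta(t),[\rho^\delta(t)]_\Lambda^{-1}V^\delta(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt=0.$$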

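By the same argument as above, for a.e. $t\in(\delta,1-\delta)$ there exists a unique gradient $W^\delta(t)$ such that

$$\operatorname{div}W^\delta(t)=-\frac{2\delta\varepsilon}{1-2\delta}\mathcal L^\dagger\rho((1-2\delta)^{-1}(t-\delta))$$

and

$$\|W^\delta(t)\|_{H_{\mathcal A,\mathcal J}}\lesssim\frac{2\delta\varepsilon}{1-2\delta}\|\mathcal L^\dagger\rho((1-2\delta)^{-1}(t-\delta))\|_{H_{\mathcal A}}.$$

Since $\rho\in H^1([0,1];S(\mathcal A))\subset C([0,1];S(\mathcal A))$, the norm on the right side is bounded independent of $\delta$, so that

$$\|W^\delta(t)\|_{H_{\mathcal A,\mathcal J}}\le\tilde C\delta$$

with a constant $\tilde C>0$ independent of $\delta$. As $\rho^\delta(t)\ge\delta1_{\mathcal A}$ for $t\in(\delta,1-\delta)$, this implies

$$\int_\delta^{1-\delta}\langle W^\delta(t),[\rho^\delta(t)]_\Lambda^{-1}W^\delta(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt\le\frac1\delta\int_\delta^{1-\delta}\langle W^\delta(t),[1_{\mathcal A}]_\Lambda^{-1}W^\delta(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt\le\tilde C\|[1_{\mathcal A}]_\Lambda^{-1}\|\delta\to0$$

as $\delta\to0$.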

With

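$$V^\delta(t)=\frac1{1-2\delta}V((1-2\delta)^{-1}(t-\delta))+W^\delta(t)$$

we have

$$\dot\rho^\delta(t)+\operatorname{div}V^\delta(t)=\varepsilon\mathcal L^\dagger\rho^\delta(t).$$

Furthermore,

$$\int_\delta^{1-\delta}\Big\langle V\Big(\frac{t-\delta}{1-2\delta}\Big),[\rho^\delta(t)]_\Lambda^{-1}V\Big(\frac{t-\delta}{1-2\delta}\Big)\Big\rangle_{H_{\mathcal A,\mathcal J}}dt=\frac{1-2\delta}{1-\delta}\int_0^1\Big\langle V(s),\Big[\rho(s)+\frac\delta{1-\delta}\Big]_\Lambda^{-1}V(s)\Big\rangle_{H_{\mathcal A,\mathcal J}}ds,$$

where we used the substitution $s=(1-2\delta)^{-1}(t-\delta)$.

By Lemma 1 and the monotone convergence theorem we obtain

$$\lim_{\delta\to0}\int_\delta^{1-\delta}\Big\langle V\Big(\frac{t-\delta}{1-2\delta}\Big),[\rho^\delta(t)]_\Lambda^{-1}V\Big(\frac{t-\delta}{1-2\delta}\Big)\Big\rangle_{H_{\mathcal A,\mathcal J}}dt=\int_0^1\langle V(s),[\rho(s)]_\Lambda^{-1}V(s)\rangle_{H_{\mathcal A,\mathcal J}}\,ds.$$

Together with the convergence result for $W^\delta$ from above, this implies

$$\int_\delta^{1-\delta}\langle V^\delta(t),[\rho^\delta(t)]_\Lambda^{-1}V^\delta(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt\to\int_0^1\langle V(t),[\rho(t)]_\Lambda^{-1}V(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt.$$

Altogether we have shown

$$\lim_{\delta\to0}\int_0^1\langle V^\delta(t),[\rho^\delta(t)]_\Lambda^{-1}V^\delta(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt=\int_0^1\langle V(t),[\rho(t)]_\Lambda^{-1}V(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt.$$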

Real subspaces

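Since the proof of the main result relies on convex analysis methods for real Banach spaces, we need to identify suitable real subspaces for our purposes. For $\mathcal A$ this is simply $\mathcal A_h$, but for $H_{\mathcal A,\mathcal J}$ this is less obvious and will be done in the following.

For $j\in\mathcal J$ denote by $j^*$ the unique index in $\mathcal J$ such that $V_{j^*}=V_j^*$. Let $\tilde H_{\mathcal A}^{(j)}$ be the linear span of $\{X\partial_jA\mid A,X\in\mathcal A\}$, and define an anti-linear map $J:\tilde H_{\mathcal A}^{(j)}\to\tilde H_{\mathcal A}^{(j^*)}$ by

$$J(X\partial_jA)=\partial_{j^*}(A^*)X^*.$$

By the product rule (1), $(\partial_jA)X$ also belongs to $\tilde H_{\mathcal A}^{(j)}$ and

$$J((\partial_jA)X)=X^*\partial_{j^*}(A^*).$$

Thus $J$ interchanges left and right multiplication, that is, $J(AVB)=B^*J(V)A^*$ for $A,B\in\mathcal A$ and $V\in\tilde H_{\mathcal A}^{(j)}$.

Lemma 2

The map $J$ is anti-unitary.

Proof

For $A,B,X,Y\in\mathcal A$ we have

$$\begin{aligned}\langle J(X\partial_jA),J(Y\partial_jB)\rangle_{H_{\mathcal A}}&=\tau(X(AV_j-V_jA)(V_j^*B^*-B^*V_j^*)Y^*)\\&=\tau((V_jB-BV_j)^*Y^*X(V_jA-AV_j))=\langle Y\partial_jB,X\partial_jA\rangle_{H_{\mathcal A}}.\end{aligned}$$

Let

$$H_{\mathcal A,\mathcal J}^{(h)}=\Big\{V\in\bigoplus_{j\in\mathcal J}\tilde H_{\mathcal A}^{(j)}\ \Big|\ J(V_j)=V_{j^*}\Big\}.$$

By the previous lemma, $H_{\mathcal A,\mathcal J}^{(h)}$ is a real Hilbert space.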

Lemma 3

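Let $(\Lambda_j)_{j\in\mathcal J}$ be a family of operator connections such that

$$\Lambda_{j^*}(B,A)=\Lambda_j(A,B)$$

for all $j\in\mathcal J$. If $A\in\mathcal A_h$ and $\rho\in S(\mathcal A)$, then $\nabla A,[\rho]_\Lambda\nabla A\in H_{\mathcal A,\mathcal J}^{(h)}$.

Proof

For $\nabla A$ the statement follows directly from the definitions. For $[\rho]_\Lambda\nabla A$ first note that

$$J\Lambda(L(\rho),R(\rho))=\Lambda(R(\rho),L(\rho))J$$

as a consequence of the spectral representation (3) and the fact that $J$ interchanges left and right multiplication.

Thus

$$J([\rho]_{\Lambda_j}\partial_jA)=J\Lambda_j(L(\rho),R(\rho))\partial_jA=\Lambda_j(R(\rho),L(\rho))J\partial_jA=\Lambda_{j^*}(L(\rho),R(\rho))\partial_{j^*}A.$$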

Duality

In this section we prove the duality theorem announced in the introduction. Our strategy follows the same lines as the proof in the commutative case in [15]. It crucially relies on the Rockafellar–Fenchel duality theorem quoted below. Throughout this section we fix an ergodic QMS with generator $\mathcal L$ satisfying the $\sigma$-DBC for some $\sigma\in S_+(\mathcal A)$ and a family $(\Lambda_j)_{j\in\mathcal J}$ of operator connections such that $\Lambda_j(A,B)=\Lambda_{j^*}(B,A)$ for all $j\in\mathcal J$.

We need the following definition for the constraint of the dual problem. Here and in the following we write

$$\langle V,W\rangle_\rho=\langle V,[\rho]_\Lambda W\rangle_{H_{\mathcal A,\mathcal J}}$$

for $V,W\in H_{\mathcal A,\mathcal J}$ and $\rho\in\mathcal A_+$.

Definition 1

A function $A\in H^1((0,T);\mathcal A_h)$ is said to be a Hamilton–Jacobi–Bellman subsolution if for a.e. $t\in(0,T)$ we have

$$\tau((\dot A(t)+\varepsilon\mathcal LA(t))\rho)+\frac12\|\nabla A(t)\|_\rho^2\le0\quad\text{for all }\rho\in S(\mathcal A).$$

The set of all Hamilton–Jacobi–Bellman subsolutions is denoted by $\mathrm{HJB}_{\Lambda,\varepsilon}$.

Our proof will establish equality between the primal and dual problem, but before we begin, let us show that one inequality is actually quite easy to obtain.

Proposition 2

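For all $\rho_0,\rho_1\in S_+(\mathcal A)$ we have

$$\frac12W_{\Lambda,\varepsilon}(\rho_0,\rho_1)^2\ge\sup\{\tau(A(1)\rho_1)-\tau(A(0)\rho_0) : A\in\mathrm{HJB}_{\Lambda,\varepsilon}\}.$$

Proof

For $A\in\mathrm{HJB}_{\Lambda,\varepsilon}$ and $(\rho,V)\in\mathrm{CE}_\varepsilon(\rho_0,\rho_1)$ we have

$$\begin{aligned}\tau(A(1)\rho_1-A(0)\rho_0)&=\int_0^1\tau(\dot A(t)\rho(t)+A(t)\dot\rho(t))\,dt\\&\le-\int_0^1\Big(\varepsilon\tau((\mathcal LA(t))\rho(t))+\frac12\|\nabla A(t)\|_{\rho(t)}^2\Big)dt+\int_0^1\big(\langle\nabla A(t),V(t)\rangle_{H_{\mathcal A,\mathcal J}}+\varepsilon\tau(A(t)\mathcal L^\dagger\rho(t))\big)dt\\&=\int_0^1\langle[\rho(t)]_\Lambda^{1/2}\nabla A(t),[\rho(t)]_\Lambda^{-1/2}V(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt-\frac12\int_0^1\langle[\rho(t)]_\Lambda^{1/2}\nabla A(t),[\rho(t)]_\Lambda^{1/2}\nabla A(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt\\&\le\frac12\int_0^1\langle[\rho(t)]_\Lambda^{1/2}\nabla A(t),[\rho(t)]_\Lambda^{1/2}\nabla A(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt+\frac12\int_0^1\langle[\rho(t)]_\Lambda^{-1/2}V(t),[\rho(t)]_\Lambda^{-1/2}V(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt\\&\quad-\frac12\int_0^1\langle[\rho(t)]_\Lambda^{1/2}\nabla A(t),[\rho(t)]_\Lambda^{1/2}\nabla A(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt\\&=\frac12\int_0^1\langle V(t),[\rho(t)]_\Lambda^{-1}V(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt,\end{aligned}$$

where we used $A\in\mathrm{HJB}_{\Lambda,\varepsilon}$ and $(\rho,V)\in\mathrm{CE}_\varepsilon(\rho_0,\rho_1)$ for the first inequality and Young’s inequality $2|\langle\xi,\eta\rangle|\le\langle\xi,\xi\rangle+\langle\eta,\eta\rangle$ for the second inequality.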

To prove actual equality, our crucial tool is the Rockafellar–Fenchel duality theorem (see e.g. [28, Theorem 1.9]), which we quote here for the convenience of the reader. Recall that if $E$ is a (real) normed space, the Legendre–Fenchel transform $F^*$ of a proper convex function $F:E\to\mathbb R\cup\{\infty\}$ is defined by

$$F^*:E^*\to\mathbb R\cup\{\infty\},\quad F^*(x^*)=\sup_{x\in E}(\langle x^*,x\rangle-F(x)).$$

Theorem 1

Let $E$ be a real normed space and $F,G:E\to\mathbb R\cup\{\infty\}$ proper convex functions with Legendre–Fenchel transforms $F^*,G^*$. If there exists $z_0\in E$ such that $G$ is continuous at $z_0$ and $F(z_0),G(z_0)<\infty$, then

$$\sup_{z\in E}(-F(z)-G(z))=\min_{z^*\in E^*}(F^*(z^*)+G^*(-z^*)).$$
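A one-dimensional toy case may help to fix the sign conventions in this statement. The sketch below assumes $E=\mathbb R$, $F(z)=\frac12z^2$ and $G(z)=\frac12(z-1)^2$, which are illustrative choices unrelated to the functionals used later, and approximates both sides on a grid with a discrete Legendre–Fenchel transform.

```python
import numpy as np

z = np.linspace(-5.0, 5.0, 1001)        # grid used for both E and its dual
F = 0.5 * z**2
G = 0.5 * (z - 1.0) ** 2

def legendre(values, grid, dual_grid):
    """Discrete Legendre-Fenchel transform: max over x of (x * y - f(x)) for each y."""
    return np.max(dual_grid[:, None] * grid[None, :] - values[None, :], axis=1)

F_star = legendre(F, z, z)              # F*(y) for y on the grid
G_star_reflected = legendre(G, z, -z)   # G*(-y) for y on the grid

primal = np.max(-F - G)                 # sup_z (-F(z) - G(z))
dual = np.min(F_star + G_star_reflected)  # min_y (F*(y) + G*(-y))
```

Here $F^*(y)=\frac12y^2$ and $G^*(y)=\frac12y^2+y$, so both sides equal $-\frac14$, attained at $z=\frac12$ and $y=\frac12$ respectively.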

Before we state the main result, we still need the following useful inequality.

Lemma 4

For any operator connection $\Lambda$ the map

$$f_\Lambda:\mathcal A_{++}\to B(H_{\mathcal A}),\quad A\mapsto[A]_\Lambda$$

is smooth and its Fréchet derivative satisfies

$$df_\Lambda(B)A\ge f_\Lambda(A)$$

for $A,B\in\mathcal A_{++}$ with equality if $A=B$.

Proof

Smoothness of $f_\Lambda$ is a consequence of the representation theorem of operator connections [21, Theorem 3.4]. For the claim about the Fréchet derivative first note that $f_\Lambda$ is concave [21, Theorem 3.5]. Therefore $d^2f_\Lambda(X)[Y,Y]\le0$ for all $X\in\mathcal A_{++}$ and $Y\in\mathcal A_h$ by [18, Proposition 2.2].

The fundamental theorem of calculus implies

$$(df_\Lambda(A)-df_\Lambda(B))(A-B)=\int_0^1d^2f_\Lambda(tA+(1-t)B)[A-B,A-B]\,dt\le0.$$

Since $f_\Lambda$ is 1-homogeneous by [21, Eq. (2.1)], its derivative is 0-homogeneous. Thus, if we replace $B$ by $\varepsilon B$ and let $\varepsilon\searrow0$, we obtain

$$df_\Lambda(A)A\le df_\Lambda(B)A.$$

Moreover, the 1-homogeneity of $f_\Lambda$ implies $df_\Lambda(A)A=f_\Lambda(A)$, which settles the claim.
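The inequality of Lemma 4 can be probed numerically. The sketch below assumes the logarithmic mean as the operator connection, represents $[X]_\Lambda$ as an $n^2\times n^2$ matrix acting on row-major vectorized matrices, and approximates the Fréchet derivative by a central finite difference with a hypothetical step size `h`; it is a sanity check under these assumptions, not a proof.

```python
import numpy as np

def log_mean(a, b):
    """Logarithmic mean of positive numbers, evaluated stably near a == b."""
    t = a / b - 1.0
    safe = np.where(t == 0.0, 1.0, t)
    return b * np.where(t == 0.0, 1.0, safe / np.log1p(safe))

def f_matrix(x):
    """Matrix of [X]_Lambda = Lambda(L(X), R(X)) on row-major vec(A), Lambda = log mean."""
    lam, u = np.linalg.eigh(x)
    weights = log_mean(lam[:, None], lam[None, :]).reshape(-1)
    t = np.kron(u, u.conj())                 # columns are vec(u_k u_l^*)
    return (t * weights) @ t.conj().T

def random_positive(rng, n):
    m = 0.5 * (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
    return m @ m.conj().T + np.eye(n)

def frechet(b, direction, h=1e-5):
    """Central finite-difference approximation of d f(B) applied to `direction`."""
    return (f_matrix(b + h * direction) - f_matrix(b - h * direction)) / (2 * h)

rng = np.random.default_rng(2)
a, b = random_positive(rng, 3), random_positive(rng, 3)

gap = frechet(b, a) - f_matrix(a)            # should be positive semidefinite
min_gap_eig = np.linalg.eigvalsh(0.5 * (gap + gap.conj().T)).min()
equality_error = np.max(np.abs(frechet(b, b) - f_matrix(b)))   # equality case A = B
```

The smallest eigenvalue of the gap is expected to be zero up to discretization error, because $[X]_\Lambda$ applied to the identity matrix is linear in $X$.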

Theorem 2

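(Duality formula) For $\rho_0,\rho_1\in S_+(\mathcal A)$ we have

$$\begin{aligned}\frac12W_{\Lambda,\varepsilon}(\rho_0,\rho_1)^2&=\sup\{\tau(A(1)\rho_1)-\tau(A(0)\rho_0) : A\in\mathrm{HJB}_{\Lambda,\varepsilon}\}\\&=\sup\{\tau(A(1)\rho_1)-\tau(A(0)\rho_0) : A\in\mathrm{HJB}_{\Lambda,\varepsilon}\cap C^\infty([0,1];\mathcal A)\}.\end{aligned}$$

Proof

The second equality follows easily by mollifying. We will show the duality formula for Hamilton–Jacobi subsolutions in $H^1$. For this purpose we use the Rockafellar–Fenchel duality formula from Theorem 1.

Let $E$ be the real Banach space

$$H^1([0,1];H_{\mathcal A}^{(h)})\times L^2([0,1];H_{\mathcal A,\mathcal J}^{(h)}).$$

By the theory of linear ordinary differential equations, the map

$$H^1([0,1];H_{\mathcal A}^{(h)})\to H_{\mathcal A}^{(h)}\times L^2([0,1];H_{\mathcal A}^{(h)}),\quad A\mapsto(A(0),\dot A+\varepsilon\mathcal LA)$$

is a linear isomorphism.

Thus the dual space $E^*$ can be isomorphically identified with

$$H_{\mathcal A}^{(h)}\times L^2([0,1];H_{\mathcal A}^{(h)})\times L^2([0,1];H_{\mathcal A,\mathcal J}^{(h)})$$

via the dual pairing

$$\langle(A,V),(B,C,W)\rangle=\tau(A(0)B)+\int_0^1\tau((\dot A(t)+\varepsilon\mathcal LA(t))C(t))\,dt+\int_0^1\langle V(t),W(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt.$$

Define functionals $F,G:E\to\mathbb R\cup\{\infty\}$ by

$$F(A,V)=\begin{cases}-\tau(A(1)\rho_1)+\tau(A(0)\rho_0)&\text{if }V=\nabla A,\\\infty&\text{otherwise,}\end{cases}\qquad G(A,V)=\begin{cases}0&\text{if }(A,V)\in D,\\\infty&\text{otherwise.}\end{cases}$$

Here $D$ denotes the set of all pairs $(A,V)$ such that

$$\tau((\dot A(t)+\varepsilon\mathcal LA(t))\rho)+\frac12\|V(t)\|_\rho^2\le0$$

for all $t\in[0,1]$, $\rho\in S(\mathcal A)$.

It is easy to see that $F$ and $G$ are convex. Moreover, for $A_0(t)=-t1_{\mathcal A}$ and $V_0=0$ we have $V_0=\nabla A_0$, hence $F(A_0,V_0)=\tau(\rho_1)=1<\infty$, and

$$\tau((\dot A_0(t)+\varepsilon\mathcal LA_0(t))\rho)+\frac12\|V_0(t)\|_\rho^2=-1$$

for all $t\in[0,1]$, $\rho\in S(\mathcal A)$, hence $G(A_0,V_0)=0$. Furthermore, $G$ is clearly continuous at $(A_0,V_0)$.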

Moreover,

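$$\sup_{(A,V)\in E}(-F(A,V)-G(A,V))=\sup_{A\in\mathrm{HJB}_{\Lambda,\varepsilon}}(\tau(A(1)\rho_1)-\tau(A(0)\rho_0)).$$

Let us calculate the Legendre transforms of $F$ and $G$, keeping in mind the identification of $E^*$. For $F$ we obtain

$$\begin{aligned}F^*(B,C,W)&=\sup_{(A,V)\in E}\{\langle(A,V),(B,C,W)\rangle-F(A,V)\}\\&=\sup_A\Big\{\tau(A(0)B)+\int_0^1\tau((\dot A(t)+\varepsilon\mathcal LA(t))C(t))\,dt+\int_0^1\langle\nabla A(t),W(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt+\tau(A(1)\rho_1)-\tau(A(0)\rho_0)\Big\}.\end{aligned}$$

Since the last expression is homogeneous in $A$, we have $F^*(B,C,W)=\infty$ unless

$$-\tau(A(1)\rho_1)+\tau(A(0)(\rho_0-B))=\int_0^1\tau((\dot A(t)+\varepsilon\mathcal LA(t))C(t))\,dt+\int_0^1\langle\nabla A(t),W(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt$$

for all $A\in H^1([0,1];H_{\mathcal A}^{(h)})$.

This implies $C(0)=-(\rho_0-B)$ and $C(1)=-\rho_1$ and

$$\dot C(t)+\operatorname{div}W(t)=\varepsilon\mathcal L^\dagger C(t).$$

Thus

$$F^*(B,C,W)=\begin{cases}0&\text{if }(-C,-W)\in\mathrm{CE}''_\varepsilon(\rho_0-B,\rho_1),\\\infty&\text{otherwise.}\end{cases}$$

Here $\mathrm{CE}''_\varepsilon(\rho_0-B,\rho_1)$ denotes the set of all pairs $(X,U)\in H^1((0,1);H_{\mathcal A}^{(h)})\times L^2((0,1);H_{\mathcal A,\mathcal J}^{(h)})$ satisfying $X(0)=\rho_0-B$, $X(1)=\rho_1$ and

$$\dot X(t)+\operatorname{div}U(t)=\varepsilon\mathcal L^\dagger X(t).$$

The difference between the definitions of $\mathrm{CE}_\varepsilon$ (or $\mathrm{CE}'_\varepsilon$) and $\mathrm{CE}''_\varepsilon$ is that in the latter we do not impose any positivity or normalization constraints. Note however that if $(X,U)\in\mathrm{CE}''_\varepsilon(\rho_0-B,\rho_1)$, then

$$\frac d{dt}\tau(X(t))=\tau(\varepsilon\mathcal L^\dagger X(t)-\operatorname{div}U(t))=0$$

so that $\tau(X(t))=\tau(\rho_1)=1$ (and $\tau(B)=0$).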

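Now let us turn to the Legendre transform of $G$. We have

$$\begin{aligned}G^*(B,C,W)&=\sup_{(A,V)\in E}\{\langle(A,V),(B,C,W)\rangle-G(A,V)\}\\&=\sup_{(A,V)\in D}\Big\{\tau(A(0)B)+\int_0^1\tau((\dot A(t)+\varepsilon\mathcal LA(t))C(t))\,dt+\int_0^1\langle V(t),W(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt\Big\}.\end{aligned}$$

Since $(A,V)\in D$ implies $(A+\lambda1_{\mathcal A},V)\in D$ for all $\lambda\in\mathbb R$, we have $G^*(B,C,W)=\infty$ unless $B=0$. Furthermore, it follows from the definition of $D$ that $G^*(0,C,W)=\infty$ unless $C(t)\ge0$ for a.e. $t\in[0,1]$.

For $B=0$ we have

$$\begin{aligned}G^*(0,C,W)&=\sup_{(A,V)\in D}\Big\{\int_0^1\tau((\dot A(t)+\varepsilon\mathcal LA(t))C(t))\,dt+\int_0^1\langle V(t),W(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt\Big\}\\&\le\sup_{(A,V)\in D}\Big\{-\int_0^1\frac12\|V(t)\|_{C(t)}^2\,dt+\int_0^1\langle[C(t)]_\Lambda^{1/2}V(t),[C(t)]_\Lambda^{-1/2}W(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt\Big\}\\&\le\frac12\int_0^1\langle[C(t)]_\Lambda^{-1}W(t),W(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt.\end{aligned}$$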

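We will show next that the inequalities are in fact equalities. Let $C^\delta=C+\delta$ and $V^\delta(t)=[C^\delta(t)]_\Lambda^{-1}W(t)$. Moreover, let $f_j=f_{\Lambda_j}$ with the notation from Lemma 4. Since

$$H_{\mathcal A}^{(h)}\to\mathbb R,\quad B\mapsto\sum_{j\in\mathcal J}\langle(df_j(C^\delta(t))B)V_j^\delta(t),V_j^\delta(t)\rangle_{H_{\mathcal A}}$$

is a bounded linear map that depends continuously on $t$, there exists a unique continuous map $X^\delta:[0,1]\to H_{\mathcal A}^{(h)}$ such that

$$\tau(BX^\delta(t))=\sum_{j\in\mathcal J}\langle(df_j(C^\delta(t))B)V_j^\delta(t),V_j^\delta(t)\rangle_{H_{\mathcal A}}$$

for every $B\in H_{\mathcal A}^{(h)}$ and $t\in[0,1]$.

Let

$$A^\delta:[0,1]\to\mathcal A_h,\quad A^\delta(t)=-\frac12\int_0^tX^\delta(s)\,ds.$$

We claim that $(A^\delta,V^\delta)\in D$. Indeed,

$$\tau(\dot A^\delta(t)\rho)=-\frac12\sum_{j\in\mathcal J}\langle(df_j(C^\delta(t))\rho)V_j^\delta(t),V_j^\delta(t)\rangle_{H_{\mathcal A}}\le-\frac12\sum_{j\in\mathcal J}\langle[\rho]_{\Lambda_j}V_j^\delta(t),V_j^\delta(t)\rangle_{H_{\mathcal A}}=-\frac12\|V^\delta(t)\|_\rho^2,$$

where the inequality follows from Lemma 4. Note that we have equality for $\rho=C^\delta(t)$.

In particular, for $\rho=C(t)$ we obtain

$$\tau(\dot A^\delta(t)C(t))+\langle V^\delta(t),W(t)\rangle_{H_{\mathcal A,\mathcal J}}\le\frac12\langle[C(t)]_\Lambda^{-1}W(t),W(t)\rangle_{H_{\mathcal A,\mathcal J}}.$$

On the other hand,

$$\begin{aligned}\tau(\dot A^\delta(t)C(t))&=-\frac12\sum_{j\in\mathcal J}\langle(df_j(C^\delta(t))(C^\delta(t)-\delta))V_j^\delta(t),V_j^\delta(t)\rangle_{H_{\mathcal A}}\\&\ge-\frac12\langle[C^\delta(t)]_\Lambda V^\delta(t),V^\delta(t)\rangle_{H_{\mathcal A,\mathcal J}}+\frac12\langle[\delta]_\Lambda V^\delta(t),V^\delta(t)\rangle_{H_{\mathcal A,\mathcal J}}\\&\ge-\frac12\langle V^\delta(t),W(t)\rangle_{H_{\mathcal A,\mathcal J}},\end{aligned}$$

where we again used Lemma 4 for the first inequality.

Put together, we have

$$\frac12\langle[C^\delta(t)]_\Lambda^{-1}W(t),W(t)\rangle_{H_{\mathcal A,\mathcal J}}\le\tau(\dot A^\delta(t)C(t))+\langle V^\delta(t),W(t)\rangle_{H_{\mathcal A,\mathcal J}}\le\frac12\langle[C(t)]_\Lambda^{-1}W(t),W(t)\rangle_{H_{\mathcal A,\mathcal J}},$$

and

$$\lim_{\delta\to0}\int_0^1\big(\tau(\dot A^\delta(t)C(t))+\langle V^\delta(t),W(t)\rangle_{H_{\mathcal A,\mathcal J}}\big)dt=\frac12\int_0^1\langle[C(t)]_\Lambda^{-1}W(t),W(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt$$

follows from the monotone convergence theorem.

Hence

$$G^*(0,C,W)=\frac12\int_0^1\langle[C(t)]_\Lambda^{-1}W(t),W(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt$$

if $C(t)\ge0$ for a.e. $t\in[0,1]$. Together with the formula for $F^*$, we obtain

$$\begin{aligned}\min_{(B,C,W)\in E^*}\big(F^*(B,C,W)+G^*(-B,-C,-W)\big)&=\inf\Big\{\frac12\int_0^1\langle[X(t)]_\Lambda^{-1}U(t),U(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt : (X,U)\in\mathrm{CE}''_\varepsilon(\rho_0,\rho_1),\ X(t)\ge0\text{ for a.e. }t\Big\}\\&=\inf\Big\{\frac12\int_0^1\langle V(t),[\rho(t)]_\Lambda^{-1}V(t)\rangle_{H_{\mathcal A,\mathcal J}}\,dt : (\rho,V)\in\mathrm{CE}'_\varepsilon(\rho_0,\rho_1)\Big\}\\&=\frac12W_{\Lambda,\varepsilon}(\rho_0,\rho_1)^2,\end{aligned}$$

where the last equality follows from Proposition 1.

An application of the Rockafellar–Fenchel theorem yields the desired conclusion.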

Acknowledgements

The author wants to thank Jan Maas for helpful comments. He also acknowledges financial support from the Austrian Science Fund (FWF) through Grant Number F65 and from the European Research Council (ERC) under the European Union’s Horizon 2020 Research and Innovation Programme (Grant Agreement No. 716117).

Funding

Open access funding provided by Institute of Science and Technology (IST Austria).

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1.Alicki R. On the detailed balance condition for non-Hamiltonian systems. Rep. Math. Phys. 1976;10(2):249–258. doi: 10.1016/0034-4877(76)90046-X. [DOI] [Google Scholar]
  • 2.Ambrosio L, Erbar M, Savaré G. Optimal transport, Cheeger energies and contractivity of dynamic transport distances in extended spaces. Nonlinear Anal. 2016;137:77–134. doi: 10.1016/j.na.2015.12.006. [DOI] [Google Scholar]
  • 3.Becker S, Li W. Quantum statistical learning via quantum Wasserstein natural gradient. J. Stat. Phys. 2021 doi: 10.1007/s10955-020-02682-1. [DOI] [Google Scholar]
  • 4.Benamou JD, Brenier Y. A computational fluid mechanics solution to the Monge–Kantorovich mass transfer problem. Numer. Math. 2000;84(3):375–393. doi: 10.1007/s002110050002. [DOI] [Google Scholar]
  • 5.Bobkov SG, Gentil I, Ledoux M. Hypercontractivity of Hamilton–Jacobi equations. J. Math. Pures Appl. (9) 2001;80(7):669–696. doi: 10.1016/S0021-7824(01)01208-9. [DOI] [Google Scholar]
  • 6.Brenier Y, Vorotnikov D. On optimal transport of matrix-valued measures. SIAM J. Math. Anal. 2020;52(3):2849–2873. doi: 10.1137/19M1274857. [DOI] [Google Scholar]
  • 7.Carlen EA, Maas J. An analog of the 2-Wasserstein metric in non-commutative probability under which the fermionic Fokker–Planck equation is gradient flow for the entropy. Commun. Math. Phys. 2014;331(3):887–926. doi: 10.1007/s00220-014-2124-8. [DOI] [Google Scholar]
  • 8.Carlen EA, Maas J. Gradient flow and entropy inequalities for quantum Markov semigroups with detailed balance. J. Funct. Anal. 2017;273(5):1810–1869. doi: 10.1016/j.jfa.2017.05.003. [DOI] [Google Scholar]
  • 9.Carlen EA, Maas J. Non-commutative calculus, optimal transport and functional inequalities in dissipative quantum systems. J. Stat. Phys. 2020;178(2):319–378. doi: 10.1007/s10955-019-02434-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Chen Y, Gangbo W, Georgiou TT, Tannenbaum A. On the matrix Monge–Kantorovich problem. Eur. J. Appl. Math. 2020;31(4):574–600. doi: 10.1017/s0956792519000172. [DOI] [Google Scholar]
  • 11.Chen Y, Georgiou TT, Tannenbaum A. Matrix optimal mass transport: a quantum mechanical approach. IEEE Trans. Autom. Control. 2018;63(8):2612–2619. doi: 10.1109/tac.2017.2767707. [DOI] [Google Scholar]
  • 12.Datta N, Rouzé C. Relating relative entropy, optimal transport and Fisher information: a quantum HWI inequality. Ann. Henri Poincaré. 2020 doi: 10.1007/s00023-020-00891-8. [DOI] [Google Scholar]
  • 13.De Palma G, Trevisan D. Quantum optimal transport with quantum channels. Ann. Henri Poincaré. 2021 doi: 10.1007/s00023-021-01042-3. [DOI] [Google Scholar]
  • 14.Duvenhage, R.: Quadratic Wasserstein metrics for von Neumann algebras via transport plans (2020). arXiv:2012.03564
  • 15.Erbar M, Maas J, Wirth M. On the geometry of geodesics in discrete optimal transport. Calc. Var. Partial Differ. Equ. 2019;58(1):19. doi: 10.1007/s00526-018-1456-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Gangbo W, Li W, Mou C. Geodesics of minimal length in the set of probability measures on graphs. ESAIM Control Optim. Calc. Var. 2019;25:36. doi: 10.1051/cocv/2018052. [DOI] [Google Scholar]
  • 17.Golse F, Mouhot C, Paul T. On the mean field and classical limits of quantum mechanics. Commun. Math. Phys. 2016;343(1):165–205. doi: 10.1007/s00220-015-2485-7. [DOI] [Google Scholar]
  • 18.Hansen F. Operator convex functions of several variables. Publ. Res. Inst. Math. Sci. 1997;33(3):443–463. doi: 10.2977/prims/1195145324. [DOI] [Google Scholar]
  • 19.Hornshaw, D.F.: L2-Wasserstein distances of tracial W*-algebras and their disintegration problem (2018). arXiv:1806.01073
  • 20.Kantorovitch L. On the translocation of masses. C. R. (Dokl.) Acad. Sci. USSR (N.S.) 1942;37:199–201. [Google Scholar]
  • 21.Kubo F, Ando T. Means of positive linear operators. Math. Ann. 1980;246(3):205–224. doi: 10.1007/BF01371042. [DOI] [Google Scholar]
  • 22.Mittnenzweig M, Mielke A. An entropic gradient structure for Lindblad equations and couplings of quantum systems to macroscopic models. J. Stat. Phys. 2017;167(2):205–233. doi: 10.1007/s10955-017-1756-4. [DOI] [Google Scholar]
  • 23.Ning L, Georgiou TT, Tannenbaum A. On matrix-valued Monge–Kantorovich optimal mass transport. IEEE Trans. Autom. Control. 2015;60(2):373–382. doi: 10.1109/TAC.2014.2350171. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Otto F, Villani C. Generalization of an inequality by Talagrand and links with the logarithmic Sobolev inequality. J. Funct. Anal. 2000;173(2):361–400. doi: 10.1006/jfan.1999.3557. [DOI] [Google Scholar]
  • 25.Palma, G.D., Marvian, M., Trevisan, D., Lloyd, S.: The quantum Wasserstein distance of order 1 (2020). arXiv:2009.04469
  • 26.Peyré G, Chizat L, Vialard FX, Solomon J. Quantum entropic regularization of matrix-valued optimal transport. Eur. J. Appl. Math. 2019;30(6):1079–1102. doi: 10.1017/s0956792517000274. [DOI] [Google Scholar]
  • 27.Rouzé, C., Datta, N.: Concentration of quantum states from quantum functional and transportation cost inequalities. J. Math. Phys. 60(1), 012202, 22 (2019). 10.1063/1.5023210
  • 28.Villani C. Topics in Optimal Transportation. Providence: American Mathematical Society; 2003. [Google Scholar]
  • 29.Villani, C.: Optimal transport. Old and new. In: Grundlehren der Mathematischen Wissenschaften (Fundamental Principles of Mathematical Sciences), vol. 338. Springer, Berlin (2009). 10.1007/978-3-540-71050-9
  • 30.Wirth, M.: A Noncommutative Transport Metric and Symmetric Quantum Markov Semigroups as Gradient Flows of the Entropy (2018). arXiv:1808.05419
  • 31.Wirth, M., Zhang, H.: Complete gradient estimates of quantum Markov semigroups (2020). arXiv:2007.13506 [DOI] [PMC free article] [PubMed]

Articles from Journal of Statistical Physics are provided here courtesy of Springer
