Lie Group Cohomology and (Multi)Symplectic Integrators: New Geometric Tools for Lie Group Machine Learning Based on Souriau Geometric Statistical Mechanics

Frédéric Barbaresco; François Gay-Balmaz

doi:10.3390/e22050498

2020 Apr 25;22(5):498. doi: 10.3390/e22050498

PMCID: PMC7516986 PMID: 33286271

Abstract

In this paper, we describe and exploit a geometric framework for Gibbs probability densities and the associated concepts in statistical mechanics, which unifies several earlier works on the subject, including Souriau’s symplectic model of statistical mechanics, its polysymplectic extension, the Koszul model, and approaches developed in quantum information geometry. We emphasize the role of equivariance with respect to Lie group actions and the role of several concepts from geometric mechanics, such as momentum maps, Casimir functions, coadjoint orbits, and Lie-Poisson brackets with cocycles, as unifying structures appearing in various applications of this framework to information geometry and machine learning. For instance, we discuss the expression of the Fisher metric in the presence of equivariance and we exploit the property of the entropy of the Souriau model as a Casimir function to apply a geometric model for energy preserving entropy production. We illustrate this framework with several examples including multivariate Gaussian probability densities, and the Bogoliubov-Kubo-Mori metric as a quantum version of the Fisher metric for quantum information on coadjoint orbits. We exploit this geometric setting and Lie group equivariance to present symplectic and multisymplectic variational Lie group integration schemes for some of the equations associated with Souriau symplectic and polysymplectic models, such as the Lie-Poisson equation with cocycle.

Keywords: momentum maps, cocycles, Lie group actions, coadjoint orbits, variational integrators, (multi)symplectic integrators, Fisher metric, Gibbs probability density, entropy, Lie group machine learning, Casimir functions

1. Introduction

A geometric theory of statistical mechanics was developed by Souriau [1], motivated by the observation that Gibbs equilibrium states do not satisfy the usual physical covariance assumptions. This geometric theory, called by him Lie Groups Thermodynamics, is based on a Hamiltonian action of a Lie group on a symplectic manifold, to which are associated generalized Gibbs states, indexed by a Lie algebra parameter $β$ playing the role of a geometric (Planck) temperature. Usual Gibbs states defined from a Hamiltonian appear as special cases in which the Lie group is a one-parameter group. The generalized Gibbs states become compatible with Galileo relativity in classical mechanics and with Poincaré relativity in relativistic mechanics, and the maximum entropy principle is preserved. See [2] for an exposition of Souriau’s approach.

A natural equilibrium state is characterized by an element $β$ of the Lie algebra of the Lie group, determining the equilibrium temperature. In this geometric setting, the logarithm of the partition function, identified (up to sign) with the Massieu potential $Φ (β)$, is defined on this Lie algebra. Its derivative, called the thermodynamic heat $Q (β)$, gives the mean value of the energy and is an element of the dual of the Lie algebra. From this, two important quantities are defined: first, the geometric heat capacity, given by minus the derivative of Q, which gives the Fisher metric of the generalized Gibbs probability densities; second, the entropy, defined on the dual of the Lie algebra as the Legendre transform of the Massieu potential.

This geometric setting of Souriau was exploited and developed in [3,4,5,6,7,8,9,10] towards applications in information geometry and Lie group machine learning. Tools based on Souriau’s Lie groups thermodynamics are being explored in artificial intelligence for both “Supervised Machine Learning” and “Non-Supervised Machine Learning” approaches. For “Supervised Machine Learning”, the natural gradient of information geometry used for neural networks could be extended to Lie algebras, based on the extension of the Fisher metric associated with Souriau’s covariant maximum entropy Gibbs density on coadjoint orbits. For “Non-Supervised Machine Learning”, the Souriau-Fisher metric transforms problems of learning on Lie groups into more classical problems of learning on metric spaces: the mean/median barycenter is extended to Lie groups through Fréchet’s definition of the geodesic barycenter, computed by the Hermann Karcher flow and by the exponential map (based on the Souriau algorithm for computing the characteristic polynomial of a matrix). Also for “Non-Supervised Machine Learning”, the “mean-shift” algorithm can be extended to homogeneous symplectic manifolds and Souriau-Fisher metric spaces. We also refer to the GEOMSTATS library [11], which provides code for machine learning on Riemannian manifolds and Lie groups.

This paper introduces basic tools to extend supervised classification, using tools from geometric statistical mechanics and information geometry, which are applied to an extension of statistical learning theory for data as elements of Lie Groups. Classically, statistical machine learning is based on convex analysis on the set of posterior probability measures with respect to Gibbs posterior measures. Lie Groups thermodynamics introduces a generalized fully covariant Gibbs density on symplectic manifolds endowed with a symplectic Lie group action admitting a momentum map. An important example is the case of coadjoint orbits endowed with the Kirillov-Kostant-Souriau symplectic forms, and its extension with non-zero cohomology. We illustrate these statistical and geometrical tools for general Lie groups, which include affine groups such as the Galileo group in mechanics, and the special Euclidean groups $S E (2)$ and $S E (3)$ in robotics. Classically we can associate to any posterior distribution an effective generalized geometric temperature, given by an element of the dual space of the Lie algebra, relating it to the Gibbs prior distribution. Classification rules could be introduced by Gibbs measures defined on parameter sets and depending on the observed sample value. A Gibbs measure is a special kind of probability measure used in statistical mechanics to describe the state of a particle system driven by a given energy function at some given temperature. Gibbs measures will be realized as minimizers of the average loss value under entropy constraints. In this extension for Lie Groups, an important tool is the log-Laplace transform related to the Massieu Characteristic Function in Thermodynamics (a re-parameterization of the free energy by Planck temperature preserving Legendre transform with respect to Entropy). As we want to deal with Lie group data for machine learning, we will consider tools very similar to those used in statistical mechanics to describe particle systems with many degrees of freedom. Classification rules could be described by Gibbs measures defined on parameter sets and depending on the observed sample value. Comparing any posterior distribution with a Gibbs prior distribution makes it possible to build an estimator that can be proved to adaptively reach the best possible asymptotic error rate (by temperature selection of a Gibbs posterior distribution built within a single parametric model). Estimators derived from Gibbs posteriors show excellent performance in diverse tasks, such as classification, regression and ranking. The usual recommendation is to sample from a Gibbs posterior using MCMC (Markov chain Monte Carlo). With the covariant Souriau Gibbs density, it is possible to extend the MCMC and Gibbs sampler approaches to Lie Groups Machine Learning.

More recently, the use of perturbation techniques was proposed as an alternative to MCMC techniques for sampling. These results were extended to the conditional random field loss, proving that the maximum in expectation with low-rank perturbations provides an upper bound on the log-partition function (what we call the Massieu characteristic function). New lower bounds on the partition function and a new unbiased sequential sampler for the Gibbs distribution based on low-rank perturbations were introduced. All these methods are based on sampling from the Gibbs distribution and on upper bounding the log-partition function. All these results are synthesized in [12], where a new general method is also proposed, with connections to the recently proposed Fenchel-Young losses [13], using a doubly stochastic scheme for the minimization of these losses, for unsupervised and supervised learning. This is a generalization to the Gibbs distribution.

Methods for learning the parameters of a Gibbs distribution on data ${(y_{i})}_{i = 1, \dots, n}$ are based on maximization of the likelihood

{\hat{ℓ}}_{n} (θ) = \frac{1}{n} \sum_{i = 1}^{n} log p_{Gibbs, θ} (y_{i}) = \frac{1}{n} \sum_{i = 1}^{n} 〈y_{i}, θ〉 - log ψ (θ)

which is optimized by gradient methods using the gradient of the empirical log-likelihood, given by

\nabla_{θ} {\hat{ℓ}}_{n} = {\hat{y}}_{n} - E_{Gibbs, θ} (y) .
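A minimal sketch of this moment-matching iteration, assuming a finite state space on which the Gibbs expectation can be computed by direct summation. The state set, the toy samples, the step size, and the function names `gibbs_expectation` and `fit_by_moment_matching` are illustrative assumptions and are not taken from [12] or [13].

```python
import math

def gibbs_expectation(theta, states):
    """Mean of y under p_theta(y) proportional to exp(<y, theta>) on a finite set."""
    scores = [sum(yk * tk for yk, tk in zip(y, theta)) for y in states]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(theta)
    return [sum(w * y[k] for w, y in zip(weights, states)) / total for k in range(dim)]

def fit_by_moment_matching(samples, states, step=0.5, iters=2000):
    """Gradient ascent on the average log-likelihood.

    The gradient is (empirical mean of y) - (model expectation of y).
    """
    dim = len(states[0])
    n = len(samples)
    y_bar = [sum(y[k] for y in samples) / n for k in range(dim)]
    theta = [0.0] * dim
    for _ in range(iters):
        model_mean = gibbs_expectation(theta, states)
        theta = [t + step * (yb - mm) for t, yb, mm in zip(theta, y_bar, model_mean)]
    return theta, y_bar

# Hypothetical toy data: the states are the vertices of the unit square.
states = [(0, 0), (0, 1), (1, 0), (1, 1)]
samples = [(0, 0), (1, 0), (1, 1), (1, 0), (0, 1), (1, 1), (1, 0), (1, 1)]
theta_hat, y_bar = fit_by_moment_matching(samples, states)
fitted_mean = gibbs_expectation(theta_hat, states)
```

At convergence the gradient vanishes, so `fitted_mean` agrees with `y_bar`. The direct summation in `gibbs_expectation` is exactly the step that becomes intractable for large state spaces.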

For this method of moment-matching, computing the expectation of the Gibbs distribution is a challenge in some cases. This approach was replaced by a method called “perturb-and-MAP”, which learns the parameters of the model through a proxy for the log-likelihood. The corresponding minimization is equivalent to maximizing the likelihood ${\hat{ℓ}}_{n}$ above after substituting the log-partition $log ψ (θ)$ with

F_{ϵ} (θ) = E [F (θ + ϵ V)] = E [max_{y \in C} 〈y, θ + ϵ V〉]

with a random noise vector $ϵ V$ , $ϵ > 0$ . This approach could be linked with the use of Fenchel-Young losses [13]. In the perturbed model, the Fenchel-Young loss is given by:

L_{ϵ} (θ; y) = F_{ϵ} (θ) + ϵ Ω (y) - 〈θ, y〉 = D_{ϵ Ω} (y, {\hat{y}}_{ϵ}^{*} (θ))

with loss gradient $\nabla_{θ} L_{ϵ} (θ; y) = \nabla_{θ} F_{ϵ} (θ) - y = y_{ϵ}^{*} (θ) - y$ , where

y_{ϵ}^{*} (θ) = E_{p_{θ} (y)} [y] = E [arg max_{y \in C} 〈y, θ + ϵ V〉]

and $D_{ϵ Ω} (y, {\hat{y}}_{ϵ}^{*} (θ))$ is the Bregman divergence associated with $ϵ Ω$ . As $F_{ϵ}$ generalizes the log-sum-exp function on the simplex, its dual $Ω$ is a generalization of the negative entropy (which is the Fenchel dual of log-sum-exp). These connections were studied in [14].
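A minimal Monte Carlo sketch of the perturbed maximizer $y_{ϵ}^{*} (θ)$, under two assumptions that are not made in the text: the set C consists of the one-hot vectors (the vertices of the simplex), and the noise V has independent standard Gumbel entries. For this particular choice and $ϵ = 1$, the expected argmax is known to equal the softmax of $θ$ (the Gumbel-max trick), which gives a reference value to compare against. The function names are hypothetical.

```python
import math
import random

def softmax(theta):
    top = max(theta)
    weights = [math.exp(t - top) for t in theta]
    total = sum(weights)
    return [w / total for w in weights]

def perturbed_argmax_mean(theta, eps=1.0, n_samples=20000, seed=0):
    """Monte Carlo estimate of E[argmax_{y in C} <y, theta + eps*V>].

    Assumptions: C is the set of one-hot vectors and V has independent
    standard Gumbel entries, sampled as -log(-log(u)) with u uniform on (0, 1).
    """
    rng = random.Random(seed)
    counts = [0] * len(theta)
    for _ in range(n_samples):
        noisy = []
        for t in theta:
            u = max(rng.random(), 1e-300)  # guard against u == 0
            noisy.append(t + eps * (-math.log(-math.log(u))))
        counts[noisy.index(max(noisy))] += 1
    return [c / n_samples for c in counts]

theta = [1.0, 0.0, -1.0]
estimate = perturbed_argmax_mean(theta)
reference = softmax(theta)
# Gradient of the Fenchel-Young loss at a hypothetical observation y.
y_obs = [0.0, 1.0, 0.0]
loss_gradient = [e - y for e, y in zip(estimate, y_obs)]
```

Only maximizations over C are needed here, never the partition function itself, which is the practical appeal of the perturbed formulation.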

In this paper, we describe a geometric framework for the study of Gibbs probability densities in statistical mechanics and information geometry, as well as the associated concepts of thermodynamic heat, entropy, and Fisher metric, inspired by Souriau’s symplectic model of statistical mechanics. This geometric framework unifies several earlier works on the subject, including Souriau’s symplectic model of statistical mechanics, its polysymplectic extension, the Koszul model, and approaches developed in quantum information geometry. This approach helps to identify the common geometric structures appearing in various examples and provides a body of geometric tools for information geometry and Lie group machine learning. The emphasis is put on the role of the equivariance with respect to Lie group actions. For instance, we discuss the expression of the Fisher metric in the presence of equivariance, we consider the associated Lie-Poisson equations with cocycle (also called affine Lie-Poisson equations) as well as their field theoretic versions, and we exploit the property of the entropy of the Souriau model as a Casimir function, to apply a geometric model for energy preserving entropy production on Lie algebras. In our developments, we make heavy use of several concepts from geometric mechanics, such as momentum maps, Casimir functions, coadjoint orbits, and Lie-Poisson brackets, as unifying concepts appearing in various applications of this framework to information geometry and machine learning. We consider in detail the Koszul model, the polysymplectic extension of the Souriau model, the case of the multivariate Gaussian probability densities, and models of information geometry for quantum systems. We exploit the geometric framework to build geometric numerical integrator schemes for some of the equations associated with Souriau’s model and its polysymplectic extension. This is achieved by identifying the variational principles underlying these equations and by discretizing these principles, following the techniques of variational discretization, which result in schemes that preserve coadjoint orbits, (multi)symplectic structures, and discrete versions of Noether theorems.

The content of the paper is as follows. In Section 2.1 we present the general geometric framework for Gibbs probability densities that will be used in the paper. In particular, we review the definition of the Massieu potential, the thermodynamic heat, the entropy, the identification of the Fisher metric with the Hessian of the Massieu potential, and the maximum entropy principle. These results are independent of the existence of Lie group symmetries of the theory. The implications of such symmetries are studied in detail in Section 2.2, where we present a Lie group equivariant setting that includes as special cases the Souriau model, its polysymplectic extension, and the case of multivariate Gaussian probability densities. The Souriau model is reviewed in Section 2.3 where we show that the associated entropy is a Casimir function for the Lie-Poisson bracket with cocycle and, motivated by an approach developed in quantum information geometry, we take advantage of this property to formulate a geometric model for entropy production. We also present the stochastic Hamiltonian equations associated with the Lie-Poisson bracket with cocycle. The polysymplectic model is reviewed in Section 2.4, where we show that the entropy also satisfies a natural extension of the Casimir property and we formulate a polysymplectic extension of the Lie-Poisson equations with cocycle. Finally, in Section 2.5 we give a general expression of the Fisher metric on orbits when equivariance is assumed. In Section 3 we apply the framework considered in Section 2 to various examples and identify common underlying geometric structures. We start in Section 3.1 with the case of multivariate Gaussian probability densities as an illustration of the general framework for which a cocycle is needed and which does not fall into the setting of the Souriau model. We apply the Noether theorem to derive invariant quantities for geodesics of the Fisher metric. We then highlight in Section 3.2 the strong analogies with quantum information geometry by considering Lie algebras with unitary representation and show that the Fisher metric as defined from the generalized heat capacity in Section 2.1 coincides with the Bogoliubov-Kubo-Mori metric. In this particular case the equation with Casimir dissipation/production reproduces a dissipative model used in quantum information geometry. Finally, in Section 3.3 we consider in detail the case of the Euclidean group of the plane $S E (2)$, the associated Fisher metric, Lie-Poisson equations with cocycle and entropy production equations. In Section 4, we make use of this geometric setting to propose geometric integrators for some of the equations associated with the Souriau model and its polysymplectic extension. We first review some facts on variational integrators on Lie groups in Section 4.1 and about central extensions of Lie groups and the associated Euler-Poincaré equations in Section 4.2. This allows obtaining a variational formulation for the Lie-Poisson equations with cocycle. Based on this, we present a symplectic integrator for the Lie-Poisson equation with cocycle in Section 4.3 and a multisymplectic integrator for the Lie-Poisson field equations with cocycle in Section 4.4.

2. A General Framework for Lie Group Statistical Mechanics and Symmetries

2.1. A Class of Generalized Gibbs Probability Densities, Its Associated Entropy and Fisher Metric

In this section, we present a general framework for Gibbs probability densities in statistical mechanics and information geometry that includes the classes considered for instance in the Koszul and Souriau models, as well as multivariate exponential families. In particular, we review the importance of the logarithm of the characteristic function, identified as the Massieu potential, from which the entropy arises as its Legendre transform and the Fisher information metric as its Hessian. We also discuss the relation of these Gibbs states with the maximum entropy principle. While the concepts manipulated here are standard, our aim is to organize them in a general setting that is appropriate for the developments made in this paper.

The results described in this paragraph are independent of possible Lie group symmetries of the theory whose implications will be discussed in Section 2.2.

Let E be a vector space, whose elements will be denoted $β$ since they are generalisations of the inverse temperature. The duality pairing between elements $ν$ of the dual space $E^{*}$ and elements $β \in E$ is denoted as $〈ν, β〉$ . Besides the vector space E, the setting also involves a manifold M, endowed with a volume form $d μ$ .

Let $U : M \to E^{*}$ be a smooth function defined on M with values in $E^{*}$ . Denote by $Ω \subset E$ the largest open set such that for all $β \in Ω$ the two integrals

\int_{M} e^{- 〈U (m), β〉} d μ \in R and \int_{M} U (m) e^{- 〈U (m), β〉} d μ \in E^{*}

(1)

converge. We denote by $ψ : Ω \to R$ the partition function (or characteristic function), given by

ψ (β) = \int_{M} e^{- 〈U (m), β〉} d μ .

(2)

For all $β \in Ω$ , we consider the generalized Gibbs probability densities

p_{β} (m) = \frac{1}{ψ (β)} e^{- 〈U (m), β〉} .

(3)

For application in information geometry it is required that $β \mapsto p_{β}$ is injective. It is important to note that the Gibbs densities are not defined on the whole vector space E but only on the open subset $Ω$ . An element $β \in Ω$ is called a geometric temperature. From now on we assume that $Ω$ is not empty.

The Massieu potential is the function $Φ : Ω \to R$ defined by

Φ (β) = - log (ψ (β))

(4)

from which we can write the generalized Gibbs probability densities as

p_{β} (m) = e^{Φ (β) - 〈U (m), β〉}, \forall β \in Ω .

The thermodynamic heat $Q : Ω \to E^{*}$ is the first derivative of the Massieu potential, i.e.,

Q (β) : = D Φ (β) = \int_{M} U (m) p_{β} (m) d μ = E_{β} [U] \in E^{*},

(5)

where $E_{β}$ denotes the expectation with respect to $p_{β}$ .
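As a worked illustration of Equations (2)–(5), consider the following elementary case, chosen here for concreteness and not part of the original development: $M = [0, \infty)$ with the Lebesgue measure, $E = R$, and $U (m) = m$. The integrals in (1) converge exactly for $β > 0$, and the Gibbs density is the exponential distribution.

```latex
% Assumed example: M = [0,\infty), d\mu = dm, E = \mathbb{R}, U(m) = m.
\begin{aligned}
\psi(\beta) &= \int_0^\infty e^{-\beta m}\,dm = \frac{1}{\beta},
  \qquad \beta \in \Omega = (0,\infty),\\
p_\beta(m) &= \beta\, e^{-\beta m},\\
\Phi(\beta) &= -\log\psi(\beta) = \log\beta,\\
Q(\beta) &= D\Phi(\beta) = \frac{1}{\beta}
  = \int_0^\infty m\,\beta e^{-\beta m}\,dm = E_\beta[U].
\end{aligned}
```

The last line confirms on this example that the derivative of the Massieu potential is the mean value of U.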

We denote by $Ω^{*}$ the image of $Ω$ under Q and assume that $Q = D Φ : Ω \to Ω^{*}$ is a diffeomorphism. In this case, we can define the entropy $s : Ω^{*} \to R$ as the Legendre transform of the Massieu potential $Φ : Ω \to R$ , namely

s (ν) : = 〈ν, β〉 - Φ (β),

(6)

where $β = Q^{- 1} (ν)$ . In other words, $β \in Ω$ in (6) is such that

D Φ (β) = ν .

The name entropy for this Legendre transform is justified by the following result.

Lemma 1.

For every $β \in Ω$ , we have the equality

$s (Q (β)) = S (p_{β}),$

where $Q (β)$ is the thermodynamic heat and

$S (p) = - \int_{M} p log p d μ$

is the entropy of the probability density p.

Proof.

On one hand, using the definition of s in Equation (6) and $Φ$ in Equation (4), we have

$s (Q (β)) = 〈Q (β), β〉 - Φ (β) = \int_{M} 〈U (m), β〉 p_{β} (m) d μ + log (ψ (β)) .$

On the other hand, we compute

$\begin{matrix} S (p_{β}) & = - \int_{M} p_{β} (m) log (p_{β} (m)) d μ = - \int_{M} p_{β} (m) (- log (ψ (β)) - 〈U (m), β〉) d μ \\ = log (ψ (β)) \int_{M} p_{β} (m) d μ + \int_{M} 〈U (m), β〉 p_{β} (m) d μ \\ = log (ψ (β)) + \int_{M} 〈U (m), β〉 p_{β} (m) d μ . \end{matrix}$

These expressions are equal. ☐
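Lemma 1 can be checked numerically with a minimal sketch, assuming a finite set M with the counting measure in place of a manifold with a volume form, and $E = R$. The values of U and of $β$ below are arbitrary illustrative choices.

```python
import math

def gibbs_density(U, beta):
    """p_beta(m) = exp(-U(m)*beta) / psi(beta) on a finite set (counting measure)."""
    weights = [math.exp(-u * beta) for u in U]
    psi = sum(weights)
    return [w / psi for w in weights], psi

def legendre_entropy(U, beta):
    """s(Q(beta)) = <Q(beta), beta> - Phi(beta), with Phi = -log(psi)."""
    p, psi = gibbs_density(U, beta)
    heat = sum(u * pm for u, pm in zip(U, p))  # Q(beta) = E_beta[U]
    massieu = -math.log(psi)                   # Phi(beta)
    return heat * beta - massieu

def shannon_entropy(p):
    """S(p) = -sum_m p(m) log p(m)."""
    return -sum(pm * math.log(pm) for pm in p if pm > 0.0)

U = [0.0, 1.0, 2.5, 4.0]  # hypothetical values of U on a four-point set
beta = 0.7
p_beta, _ = gibbs_density(U, beta)
lhs = legendre_entropy(U, beta)  # s(Q(beta))
rhs = shannon_entropy(p_beta)    # S(p_beta)
```

The two numbers `lhs` and `rhs` agree up to rounding, as the lemma states.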

Equation (6) is referred to as the Clairaut equation, see [15].

The generalized heat capacity is the symmetric tensor field $K : Ω \to sym (E)$ , defined as minus the Hessian matrix of the Massieu potential, i.e.,

K (β) : = - D^{2} Φ (β) = D^{2} log ψ (β) : E \times E \to R .

A direct computation gives, for all vectors $δ β_{1}, δ β_{2} \in E$ ,

\begin{matrix} K (β) (δ β_{1}, δ β_{2}) & = - D^{2} Φ (β) (δ β_{1}, δ β_{2}) = - {\frac{d}{d ε}|}_{ε = 0} D Φ (β + ε δ β_{1}) \cdot δ β_{2} \\ = - {\frac{d}{d ε}|}_{ε = 0} \int_{M} \frac{〈U (m), δ β_{2}〉}{ψ (β + ε δ β_{1})} e^{- 〈U (m), β + ε δ β_{1}〉} d μ \\ = E_{β} [〈U, δ β_{1}〉 〈U, δ β_{2}〉] - E_{β} [〈U, δ β_{1}〉] E_{β} [〈U, δ β_{2}〉] \end{matrix}

hence the generalized heat capacity is

\begin{matrix} K (β) & = E_{β} [(U - E_{β} (U)) \otimes (U - E_{β} (U))] \\ = E_{β} [(U - Q (β)) \otimes (U - Q (β))] . \end{matrix}

As a consequence, $K (β)$ is positive semidefinite for all $β \in Ω$ . Since $K (β) = - D Q (β)$ , it is positive definite if $Q : Ω \to Ω^{*}$ is a diffeomorphism.

Recall that in information geometry, the Fisher metric associated with the family $p_{β}$ , $β \in Ω$ , of probability densities is the symmetric tensor field $I : Ω \to sym (E)$ defined by

I (β) = - E_{β} [D^{2} log p_{β}] .

In our setting we have the following identification.

Proposition 2.

The generalized heat capacity of $p_{β}$ coincides with the Fisher metric of $p_{β}$ :

$I (β) = K (β) .$

In other words, the Fisher metric is the Hessian of the characteristic function logarithm $K (β) = D^{2} log ψ (β)$ .

Proof.

From Equation (3), we have

$log (p_{β} (m)) = - log ψ (β) - 〈U (m), β〉$

hence, taking the second derivative with respect to $β$ we get

$- D^{2} log p_{β} = D^{2} log ψ (β) = - D^{2} Φ (β) .$

Please note that this equality does not depend on m, which proves that

$I (β) = - E_{β} [D^{2} log p_{β}] = - E_{β} (D^{2} Φ (β)) = - D^{2} Φ (β) = K (β) .$

Hence the result is proved. ☐
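The identification $K (β) = D^{2} log ψ (β)$ with the covariance of U can be illustrated by a minimal sketch, assuming a finite set M with the counting measure and $E = R$, so that the Hessian is a single second derivative approximated by a central finite difference. The data and the step size h are arbitrary illustrative choices.

```python
import math

def log_partition(U, beta):
    """log psi(beta) for a finite set with the counting measure."""
    return math.log(sum(math.exp(-u * beta) for u in U))

def variance_of_U(U, beta):
    """E_beta[(U - Q(beta))^2], the heat capacity written as a covariance."""
    weights = [math.exp(-u * beta) for u in U]
    psi = sum(weights)
    mean = sum(u * w for u, w in zip(U, weights)) / psi
    return sum((u - mean) ** 2 * w for u, w in zip(U, weights)) / psi

def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

U = [0.0, 1.0, 2.5, 4.0]  # hypothetical values of U
beta = 0.7
K_cov = variance_of_U(U, beta)
K_hess = second_derivative(lambda b: log_partition(U, b), beta)
```

Both values agree up to the finite-difference error, and `K_cov` is positive because U is not constant on M.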

Proposition 3.

Let us assume that $Q : Ω \to Ω^{*}$ is a diffeomorphism. The inverse of the Fisher metric, i.e., the cometric on $Ω^{*}$ induced from the Fisher metric on Ω, is given by minus the Hessian of the entropy:

$- D^{2} s (ν) : E^{*} \times E^{*} \to R, \forall ν \in Ω^{*} .$

Proof.

From the definition of the thermodynamic heat, we have $D Φ (Q^{- 1} (ν)) \cdot δ β = 〈ν, δ β〉$ , for every $ν \in Ω^{*}$ and $δ β \in E$ . Taking the derivative with respect to $ν$ , we get

$D^{2} Φ (Q^{- 1} (ν)) (D Q^{- 1} (ν) \cdot δ ν, δ β) = 〈δ ν, δ β〉 .$ (7)

Taking now the derivative of Equation (6), we get $D s (ν) = Q^{- 1} (ν)$ hence $D^{2} s (ν) = D Q^{- 1} (ν)$ . This can be used in Equation (7) and shows that $D^{2} Φ (β) \cdot D^{2} s (ν) = i d_{E^{*}}$ , where $ν = Q (β)$ . The result follows then from Proposition 2. ☐
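Proposition 3 can be verified by hand on an elementary case, assumed here for illustration only: $M = [0, \infty)$ with the Lebesgue measure, $E = R$, and $U (m) = m$, for which $ψ (β) = 1 / β$ and $Φ (β) = log β$ on $Ω = (0, \infty)$.

```latex
% Assumed example: M = [0,\infty), d\mu = dm, U(m) = m, \Phi(\beta) = \log\beta.
\begin{aligned}
Q(\beta) &= \frac{1}{\beta}, \qquad \Omega^{*} = (0,\infty),
  \qquad Q^{-1}(\nu) = \frac{1}{\nu},\\
s(\nu) &= \nu\cdot\frac{1}{\nu} - \log\frac{1}{\nu} = 1 + \log\nu,\\
K(\beta) &= -D^{2}\Phi(\beta) = \frac{1}{\beta^{2}},\\
-D^{2}s(\nu) &= \frac{1}{\nu^{2}} = \beta^{2} = K(\beta)^{-1}
  \quad\text{for } \nu = Q(\beta).
\end{aligned}
```

In this example $D^{2} Φ (β) \cdot D^{2} s (ν) = (- 1 / β^{2}) (- 1 / ν^{2}) = 1$ when $ν = 1 / β$, in agreement with the proof.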

The following result shows that the generalized Gibbs probability densities satisfy the maximum entropy principle [16].

Proposition 4

(Maximum entropy principle). Let $U : M \to E^{*}$ be a smooth function and $ν \in Ω^{*} \subset E^{*}$ a given element. The generalized Gibbs probability density $p_{β}$ in Equation (3) with $β = Q^{- 1} (ν)$ is a solution of the maximum entropy principle:

$max_{q} [- \int_{M} q log q d μ] \quad \text{such that} \quad \{\begin{matrix} \int_{M} q d μ = 1 \\ \int_{M} U q d μ = ν . \end{matrix}$

Proof.

Given a probability density q, we have

$\begin{matrix} - \int_{M} (q log q - q log p_{β}) d μ & = - \int_{M} q log \frac{q}{p_{β}} d μ \leq - \int_{M} q (1 - \frac{p_{β}}{q}) d μ \\ = - \int_{M} (q - p_{β}) d μ = 0 . \end{matrix}$

Hence, if q satisfies the constraints we get

$\begin{matrix} - \int_{M} q log q d μ & \leq - \int_{M} q log p_{β} d μ = \int_{M} q (log ψ (β) + 〈U (m), β〉) d μ \\ = log ψ (β) + 〈\int_{M} U q d μ, β〉 = - Φ (β) + 〈ν, β〉 \\ = 〈ν, Q^{- 1} (ν)〉 - Φ (Q^{- 1} (ν)) = s (ν) = S (p_{β}) . \end{matrix}$

In the fourth equality we used $β = Q^{- 1} (ν)$ , in the fifth equality we used definition Equation (6), and in the last equality we used Lemma 1. ☐
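The maximum entropy property can be observed on a minimal sketch, assuming the three-point set $M = \{0, 1, 2\}$ with the counting measure and $U (m) = m$. On this set, the densities satisfying both constraints form a one-parameter family, so the Gibbs density can be compared with all of them. The value of $ν$, the bisection bracket, and the function names are illustrative assumptions.

```python
import math

U = [0.0, 1.0, 2.0]  # U(m) = m on the assumed set M = {0, 1, 2}

def gibbs(beta):
    weights = [math.exp(-u * beta) for u in U]
    psi = sum(weights)
    return [w / psi for w in weights]

def heat(beta):
    """Q(beta) = E_beta[U]."""
    return sum(u * p for u, p in zip(U, gibbs(beta)))

def entropy(q):
    return -sum(x * math.log(x) for x in q if x > 0.0)

def solve_beta(nu, lo=-50.0, hi=50.0):
    """Bisection for Q(beta) = nu; Q is decreasing since DQ = -K <= 0."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if heat(mid) > nu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

nu = 0.8
beta = solve_beta(nu)  # beta = Q^{-1}(nu)
p = gibbs(beta)
# Every density on {0, 1, 2} with total mass 1 and mean nu has this form,
# with t the mass placed on the point m = 2.
competitors = [[1.0 - nu + t, nu - 2.0 * t, t] for t in (0.0, 0.1, 0.2, 0.3, 0.4)]
gaps = [entropy(p) - entropy(q) for q in competitors]
```

Every entry of `gaps` is nonnegative: no admissible competitor has larger entropy than the Gibbs density.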

Koszul-Vinberg characteristic function. We now quickly describe a particular case of the above setting, which is related to Hessian geometry and in which the characteristic function Equation (2) recovers the Koszul-Vinberg characteristic function, see [17,18,19,20,21] and the references in [3]. In this case, the Fisher information metric of information geometry coincides with the canonical Koszul Hessian metric given by Koszul forms. Analogies between the Koszul-Vinberg model and the Souriau symplectic model of statistical mechanics were highlighted in [3]. Here we will show how these two models precisely arise as special cases of the general setting presented in Section 2.1.

Let E be a vector space and $Ω \subset E$ an open convex cone in E. The cone $Ω$ is assumed to be regular, i.e., $Ω$ contains no straight line, which is equivalent to the condition $\bar{Ω} \cap (- \bar{Ω}) = \{0\}$ . We choose $M = Ω^{*} \subset E^{*}$ as the dual cone defined by

Ω^{*} : = \{ξ \in E^{*} ∣ 〈ξ, β〉 > 0, \forall β \in \bar{Ω} \setminus \{0\}\},

and we choose the function $U : M = Ω^{*} \to E^{*}$ as the identity function on $Ω^{*}$ . We take the volume form as the Lebesgue measure $d ξ$ . The generalized Gibbs probability densities defined in Equation (3) are

p_{β} (ξ) = \frac{1}{ψ (β)} e^{- 〈ξ, β〉},

(8)

with characteristic function Equation (2) given by

ψ (β) = \int_{Ω^{*}} e^{- 〈ξ, β〉} d ξ .

(9)

This expression recovers the Koszul-Vinberg characteristic function of the cone $Ω$ , see [17,18,19,20,21,22,23]. We call Equation (8) the Koszul density of the cone Ω.

The Koszul 1-form, [18], defined as the differential of $- log ψ (β)$ coincides with the thermodynamic heat $Q : Ω \to Ω^{*}$ of the general setting above. It reads

Q (β) = \int_{Ω^{*}} ξ p_{β} (ξ) d ξ = E_{β} (ξ) .

The Koszul metric defined as the second derivative of $log ψ (β)$ coincides with the Fisher metric of information geometry from Proposition 2. It reads

\begin{matrix} I (β) (δ β_{1}, δ β_{2}) & = \int_{Ω^{*}} 〈ξ, δ β_{1}〉 〈ξ, δ β_{2}〉 p_{β} (ξ) d ξ - \int_{Ω^{*}} 〈ξ, δ β_{1}〉 p_{β} (ξ) d ξ \int_{Ω^{*}} 〈ξ, δ β_{2}〉 p_{β} (ξ) d ξ . \end{matrix}

From Proposition 4, given $ν \in Ω^{*}$ , the Koszul density of the cone $Ω$ with $β = Q^{- 1} (ν)$ , satisfies the maximum entropy principle

max_{q} [- \int_{Ω^{*}} q log q d ξ] such that \{\begin{matrix} \int_{Ω^{*}} q d ξ = 1 \\ \int_{Ω^{*}} ξ q d ξ = ν, \end{matrix}

see [3] for a direct proof.

An important example is $Ω : = {sym}^{+} (n) \subset E = sym (n)$ , the cone of symmetric positive definite $n \times n$ matrices. The dual space is chosen as $E^{*} = sym (n)$ with duality pairing $〈ν, β〉 = Tr (ν^{T} β)$ . In this case, it is well-known that $Ω^{*} = Ω$ . The generalized Gibbs probability densities are

p_{β} : {sym}^{+} (n) \to R, p_{β} (ξ) = \frac{1}{ψ (β)} e^{- 〈ξ, β〉},

where the Koszul-Vinberg characteristic function can be explicitly computed as

ψ (β) = \int_{{sym}^{+} (n)} e^{- 〈ξ, β〉} d ξ = det {(β)}^{- \frac{n + 1}{2}} ψ (I_{n}) .

The Massieu potential is deduced as

Φ (β) = - log (ψ (β)) = \frac{n + 1}{2} log (det (β)) - log (ψ (I_{n}))

(10)

and the thermodynamic heat and entropy are

Q (β) = D Φ (β) = \frac{n + 1}{2} β^{- 1} = E_{β} (ξ)

s (ν) = \frac{n + 1}{2} log (det (ν)) + \frac{n (n + 1)}{2} (1 - log \frac{n + 1}{2}) + log (ψ (I_{n})) .

We can thus write the generalized Gibbs probability densities as

\begin{matrix} p_{β} (ξ) & = \frac{1}{ψ (β)} e^{- 〈ξ, β〉} = det {(β)}^{\frac{n + 1}{2}} \frac{1}{ψ (I_{n})} e^{- 〈ξ, β〉} . \end{matrix}

Finally, the expression of the Fisher metric on $Ω$ is found by using Equation (10) as

I (β) (δ β_{1}, δ β_{2}) = - D^{2} Φ (β) (δ β_{1}, δ β_{2}) = \frac{n + 1}{2} Tr (β^{- 1} δ β_{1} β^{- 1} δ β_{2}),

(11)

for every $δ β_{1}, δ β_{2} \in E$ .
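Equation (11) can be checked numerically for $n = 2$ with a minimal sketch that compares the closed form with a central mixed second difference of the Massieu potential (10). The additive constant $- log (ψ (I_{n}))$ is dropped since it does not contribute to the Hessian. The test matrices, the step size, and the helper names are illustrative assumptions.

```python
import math

N = 2  # matrix size n

def det2(b):
    return b[0][0] * b[1][1] - b[0][1] * b[1][0]

def inv2(b):
    d = det2(b)
    return [[b[1][1] / d, -b[0][1] / d], [-b[1][0] / d, b[0][0] / d]]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def massieu(b):
    """Phi(beta) of Equation (10) without the constant -log psi(I_n)."""
    return 0.5 * (N + 1) * math.log(det2(b))

def fisher_closed_form(b, d1, d2):
    """Equation (11): (n+1)/2 * Tr(beta^-1 d1 beta^-1 d2)."""
    bi = inv2(b)
    m = matmul2(matmul2(bi, d1), matmul2(bi, d2))
    return 0.5 * (N + 1) * (m[0][0] + m[1][1])

def fisher_finite_diff(b, d1, d2, h=1e-3):
    """-D^2 Phi(beta)(d1, d2) by a central mixed second difference."""
    def f(e1, e2):
        shifted = [[b[i][j] + e1 * d1[i][j] + e2 * d2[i][j] for j in range(2)]
                   for i in range(2)]
        return massieu(shifted)
    mixed = (f(h, h) - f(h, -h) - f(-h, h) + f(-h, -h)) / (4.0 * h * h)
    return -mixed

beta = [[2.0, 0.5], [0.5, 1.0]]   # symmetric positive definite
d1 = [[1.0, 0.2], [0.2, 0.0]]     # symmetric directions
d2 = [[0.0, -0.3], [-0.3, 1.0]]
exact = fisher_closed_form(beta, d1, d2)
approx = fisher_finite_diff(beta, d1, d2)
norm_sq = fisher_closed_form(beta, d1, d1)  # positive, as for any metric
```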

2.2. Equivariance with Respect to Lie Group Actions

In this section, we study the consequences of the equivariance of the function U appearing in the generalized Gibbs probability densities. More precisely, given a Lie group G, we assume that $U : M \to E^{*}$ is G-equivariant with respect to an action of the Lie group on M and an affine action of the Lie group on $E^{*}$ . This setting includes as special cases the Souriau symplectic model of statistical mechanics [24], its polysymplectic extension [5], the case of multivariate Gaussian densities, as treated for instance in [4], and approaches developed in quantum information geometry [25], for which the Fisher metric will be shown to coincide with the Bogoliubov-Kubo-Mori metric in Section 3.2.

Let G be a Lie group, and let

ϕ : G \times M \to M, (g, m) \mapsto ϕ_{g} (m)

be a left action of G on M, i.e., $ϕ$ satisfies

ϕ_{e} = i d_{M} and ϕ_{g} \circ ϕ_{h} = ϕ_{g h},

for every $g, h \in G$ , with $i d_{M}$ the identity on M. We denote by $\mathfrak{g}$ the Lie algebra of G. The infinitesimal generator of the action corresponding to $ξ \in \mathfrak{g}$ is the vector field $ξ_{M}$ on M defined by

ξ_{M} (m) = {\frac{d}{d ε}|}_{ε = 0} ϕ_{exp (ε ξ)} (m),

(12)

for every $m \in M$ , where $exp : \mathfrak{g} \to G$ is the Lie group exponential map.

We also consider a left linear action

ρ : G \times E \to E, (g, β) \mapsto ρ_{g} (β)

of G on the vector space E, $ρ_{g} \in L (E, E)$ . We denote by $ρ^{*} : G \times E^{*} \to E^{*}$ the linear right action of G induced on the dual space $E^{*}$

〈ρ_{g}^{*} (ν), β〉 = 〈ν, ρ_{g} (β)〉, \forall β \in E, ν \in E^{*}, g \in G .

(13)

We recall that a group one-cocycle with respect to $ρ^{*}$ is a map $θ \in C^{\infty} (G, E^{*})$ such that

θ (g h) = θ (g) + ρ_{g^{- 1}}^{*} (θ (h)),

(14)

for every $g, h \in G$ . Equivalently, a group one-cocycle $θ \in C^{\infty} (G, E^{*})$ with respect to $ρ^{*}$ is such that $A : G \times E^{*} \to E^{*}$ defined by

A_{g} (ν) = ρ_{g^{- 1}}^{*} (ν) + θ (g)

(15)

is an affine left action of G on $E^{*}$ .
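The equivalence between the cocycle identity (14) and the action property of (15) follows from a short computation, spelled out here for convenience. It uses the fact that $ρ^{*}$ is a right action, so that $ρ_{g^{- 1}}^{*} \circ ρ_{h^{- 1}}^{*} = ρ_{h^{- 1} g^{- 1}}^{*} = ρ_{(g h)^{- 1}}^{*}$.

```latex
\begin{aligned}
A_g\bigl(A_h(\nu)\bigr)
  &= \rho^{*}_{g^{-1}}\bigl(\rho^{*}_{h^{-1}}(\nu) + \theta(h)\bigr) + \theta(g)\\
  &= \rho^{*}_{(gh)^{-1}}(\nu) + \rho^{*}_{g^{-1}}\bigl(\theta(h)\bigr) + \theta(g),\\
A_{gh}(\nu) &= \rho^{*}_{(gh)^{-1}}(\nu) + \theta(gh).
\end{aligned}
```

Hence $A_{g} \circ A_{h} = A_{g h}$ for all $ν$ if and only if $θ (g h) = θ (g) + ρ_{g^{- 1}}^{*} (θ (h))$. Taking $g = h = e$ in (14) gives $θ (e) = 0$, so that $A_{e}$ is the identity on $E^{*}$.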

Finally, we recall that the Jacobian of the action $ϕ_{g} : M \to M$ relative to the volume form $d μ$ is the function $J ϕ_{g} : M \to R$ defined by $ϕ_{g}^{*} d μ = J ϕ_{g} d μ$ , where $ϕ_{g}^{*}$ denotes the pull-back of the n-form $d μ$ by the diffeomorphism $ϕ_{g}$ . We will be interested in actions for which $J ϕ_{g} = c (g)$ is constant on M. In that case $c (g h) = c (g) c (h)$ , for every $g, h \in G$ . The particular case $c (g) = 1$ corresponds to volume preserving diffeomorphisms.

Proposition 5.

Assume that the action ϕ of G on M satisfies $ϕ_{g}^{*} d μ = c (g) d μ$ and the function U is G-equivariant:

$U (ϕ_{g} (m)) = ρ_{g^{- 1}}^{*} (U (m)) + θ (g),$ (16)

for all $g \in G$ and $m \in M$ , where $θ \in C^{\infty} (G, E^{*})$ is a group one-cocycle. Then the open subset $Ω \subset E$ is invariant under the action of G on E, the partition function ψ satisfies

$ψ (ρ_{g} (β)) = ψ (β) c (g) e^{〈θ (g^{- 1}), β〉}$

for every $g \in G$ , and the probability density $p_{β}$ satisfies

$ϕ_{g}^{*} p_{β} = p_{ρ_{g}^{- 1} (β)},$

for every $g \in G$ , where $ϕ_{g}^{*} p_{β} = (p_{β} \circ ϕ_{g}) J ϕ_{g} = (p_{β} \circ ϕ_{g}) c (g)$ is the pull-back of a density.

As a consequence, the Massieu potential $Φ (β)$ , the thermodynamic heat $Q (β)$ , the entropy $s (ν)$ , and the heat capacity $K (β)$ satisfy the following equivariance properties

$\begin{matrix} Φ (ρ_{g} (β)) = Φ (β) - log (c (g)) - 〈θ (g^{- 1}), β〉 \end{matrix}$ (17)

$\begin{matrix} Q (ρ_{g} (β)) = ρ_{g^{- 1}}^{*} (Q (β)) + θ (g) \end{matrix}$ (18)

$\begin{matrix} s (ρ_{g^{- 1}}^{*} (ν) + θ (g)) = s (ν) + log (c (g)) \end{matrix}$ (19)

$\begin{matrix} K (ρ_{g} (β)) (ρ_{g} (δ β_{1}), ρ_{g} (δ β_{2})) = K (β) (δ β_{1}, δ β_{2}), \end{matrix}$ (20)

for every $g \in G$ .

Proof.

Using Equation (16) and a change of variables, we have

$\begin{matrix} ψ (ρ_{g} (β)) & = \int_{M} e^{- 〈U (m), ρ_{g} (β)〉} d μ = \int_{M} e^{- 〈ρ_{g}^{*} (U (m)), β〉} d μ \\ = \int_{M} e^{- 〈U (ϕ_{g^{- 1}} (m)) - θ (g^{- 1}), β〉} d μ = \int_{M} e^{- 〈U (ϕ_{g^{- 1}} (m)), β〉} d μ e^{〈θ (g^{- 1}), β〉} \\ = \int_{M} e^{- 〈U (m), β〉} J ϕ_{g} d μ e^{〈θ (g^{- 1}), β〉} = ψ (β) c (g) e^{〈θ (g^{- 1}), β〉} . \end{matrix}$

The other statements are checked in a similar way, by using Equations (13)–(16). ☐

This proposition unifies in a single statement several Lie group equivariance properties observed in various models for information geometry and Lie group machine learning, see, e.g., [3,4,5,7,8]. Before discussing the symplectic and polysymplectic models, we illustrate these equivariance properties for the Koszul model recalled above.

Equivariance in the Koszul model. For the Koszul model recalled above, see [3] and references therein, $G = Aut (Ω)$ is the group of linear isomorphism that preserves $Ω \subset E$ . Given $g \in Aut (Ω)$ , we have $ρ_{g} : Ω \to Ω$ and it is clear that the dual action $ρ_{g}^{*}$ preserves the dual cone $Ω^{*}$ . In this very special case, $M = Ω^{*}$ and the G action on M is chosen as $ϕ_{g} : = ρ_{g^{- 1}}^{*}$ . Since $U : Ω^{*} \to E^{*}$ is the identity, there is no cocycle. However, we have $c (g) = J ϕ_{g}$ which is not equal to one in general and, for instance, the transformation Equation (17) of the Massieu potential reads

Φ (ρ_{g} (β)) = Φ (β) - log (c (g)) .

Let us consider as a special case the cone of symmetric positive definite matrices $Ω = {sym}^{+} (n) \subset E = sym (n)$ . The dual space is chosen as $E^{*} = sym (n)$ with duality pairing $〈ν, β〉 = Tr (ν^{T} β)$ and we have $Ω^{*} = Ω$ .

We consider the left action of $G L (n)$ on $E = sym (n)$ given by

ρ_{A} (β) = A^{- T} β A^{- 1} .

(21)

Therefore, we have

ρ_{A}^{*} (ν) = A^{- 1} ν A^{- T} and ϕ_{A} (ξ) = ρ_{A^{- 1}}^{*} (ξ) = A ξ A^{T} .

(22)

Proposition 5 directly yields the following equivariance properties

\begin{matrix} ψ (A^{- T} β A^{- 1}) = ψ (β) c (A) \\ p_{β} (A ξ A^{T}) c (A) = p_{A^{T} β A} (ξ) \\ Φ (A^{- T} β A^{- 1}) = Φ (β) - log c (A) \\ Q (A^{- T} β A^{- 1}) = A Q (β) A^{T} \\ s (A ν A^{T}) = s (ν) + log c (A), \end{matrix}

for all $A \in G L (n)$ , where $c (A) = {(det A)}^{n + 1}$ .
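The value of $c (A)$ can be checked numerically. The following minimal sketch, assuming NumPy and a basis of $sym (n)$ chosen here only for illustration, builds the matrix of the linear map $ξ \mapsto A ξ A^{T}$ on $sym (n)$ and compares its determinant with ${(det A)}^{n + 1}$. The helper names `sym_basis`, `coords`, and `congruence_matrix` are hypothetical.

```python
import numpy as np

def sym_basis(n):
    """Basis of sym(n): E_ii, and E_ij + E_ji for i < j (an illustrative choice)."""
    basis = []
    for i in range(n):
        for j in range(i, n):
            B = np.zeros((n, n))
            B[i, j] = B[j, i] = 1.0
            basis.append(B)
    return basis

def coords(S):
    """Coordinates of a symmetric matrix in sym_basis: its upper-triangular entries."""
    n = S.shape[0]
    return np.array([S[i, j] for i in range(n) for j in range(i, n)])

def congruence_matrix(A):
    """Matrix of xi -> A xi A^T acting on sym(n), expressed in sym_basis."""
    n = A.shape[0]
    return np.column_stack([coords(A @ B @ A.T) for B in sym_basis(n)])

rng = np.random.default_rng(0)
n = 3
A = rng.normal(size=(n, n))            # a generic element of GL(n)
jac = np.linalg.det(congruence_matrix(A))
expected = np.linalg.det(A) ** (n + 1)
print(jac, expected)                   # the two numbers should agree up to round-off
```

The determinant does not depend on the basis chosen for $sym (n)$, so any other basis would give the same comparison.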

2.3. Souriau Symplectic Model of Statistical Mechanics

In this section, we show that the Souriau symplectic model of statistical mechanics [24] arises as a special case of the preceding setting, by considering $(M, ω)$ a symplectic manifold and $d μ$ the Liouville form associated with $ω$ .

We then exploit this setting to show that the entropy in the Souriau model is a Casimir function of the Lie-Poisson bracket with Lie algebra cocycle associated with the nonequivariance cocycle of the momentum map, i.e., it Poisson commutes with every function. Based on this, we formulate a dynamical geometric model for dissipation/production of this Casimir, following the Lie algebraic setting proposed in [26,27]. This allows us to clarify the link between the geometry underlying Souriau symplectic models and that underlying models proposed in [25] in the framework of quantum physics by information geometry for some Lie algebras, see also [28]. Details will be given in Section 3.2. Finally, we present a stochastic perturbation of the Lie-Poisson equations with cocycle within the setting of stochastic Hamiltonian dynamics.

To present the Souriau model, we first quickly recall below the notion of momentum map and nonequivariance cocycle for symplectic manifolds, see, e.g., [29,30,31]. Consider a symplectic manifold $(M, ω)$ , i.e., a manifold M endowed with a closed non-degenerate two-form $ω$ . The associated Liouville form is $d μ = \frac{{(- 1)}^{n (n - 1) / 2}}{n!} ω \land \dots \land ω$ (n times), where $2 n = dim M$ . Given a function $h : M \to R$ , the Hamiltonian vector field associated with h is the vector field $X_{h}$ defined by

i_{X_{h}} ω = d h .

(23)

Recall that the symplectic form $ω$ defines the Poisson bracket (see Remark 7)

{f, g} = ω (X_{f}, X_{g})

(24)

on functions $f, g \in C^{\infty} (M)$ .

A Lie group action $ϕ : G \times M \to M$ of G on M is symplectic, if it preserves the symplectic form, i.e., $ϕ_{g}^{*} ω = ω$ , for every $g \in G$ . Taking the derivative of this identity with respect to g at $g = e$ , we get $£_{ξ_{M}} ω = 0$ , for every $ξ \in g$ , where $ξ_{M}$ is the infinitesimal generator associated with the Lie algebra element $ξ \in g$ , see Equation (12), and £ is the Lie derivative. Equivalently, we have

d (i_{ξ_{M}} ω) = 0,

for every $ξ \in g$ , i.e., the one-form $i_{ξ_{M}} ω$ is locally exact. If it is globally exact, i.e., if $ξ_{M}$ is a Hamiltonian vector field for every $ξ \in g$ , then the action is called Hamiltonian and admits a momentum map $J : M \to g^{*}$ , which satisfies

i_{ξ_{M}} ω = d J_{ξ},

where $J_{ξ} : M \to R$ is defined by $J_{ξ} (m) : = 〈J (m), ξ〉$ , for every $ξ \in g$ .

When M is connected, there is a well-defined group one-cocycle $θ : G \to g^{*}$ , called the nonequivariance cocycle, given by

θ (g) = J (ϕ_{g} (m)) - {Ad}_{g^{- 1}}^{*} (J (m)),

where $m \in M$ can be arbitrarily chosen. It characterizes the nonequivariance of the momentum map with respect to the action of G on M and the coadjoint action of G on $g^{*}$ . The group one-cocycle property is

θ (g h) = θ (g) + {Ad}_{g^{- 1}}^{*} (θ (h)),

for every $g, h \in G$ . We consider its differential $Θ : = T_{e} θ$ seen as a map $Θ : g \times g \to R$ , i.e.,

Θ (ξ, η) = 〈T_{e} θ (ξ), η〉 = {\frac{d}{d ε}|}_{ε = 0} 〈θ (exp (ε ξ)), η〉 .

(25)

Taking the derivative of the relation above, we get

Θ (ξ, η) = J_{[ξ, η]} - {J_{ξ}, J_{η}},

(26)

where the last term uses the Poisson bracket Equation (24). The map $Θ : g \times g \to R$ is bilinear, skew-symmetric, and, as can be readily verified, satisfies the Lie algebra two-cocycle identity

Θ ([ξ, η], ζ) + Θ ([η, ζ], ξ) + Θ ([ζ, ξ], η) = 0 .

(27)

We refer to [29,30,31] for detailed introductions to these concepts.

Remark 6

(Lie group and Lie algebra cohomology). A group one-cocycle $θ \in C^{\infty} (G, g^{*})$ is called a group one-coboundary if there is a $λ \in g^{*}$ such that

$θ (g) = λ - {Ad}_{g^{- 1}}^{*} λ$

for every $g \in G$ . The quotient space of one-cocycles modulo one-coboundaries is called the first group cohomology of G and is denoted by $H^{1} (G, g^{*})$ . These definitions extend to an arbitrary representation of G on a vector space, as in Equation (14).

A Lie algebra two-cocycle $Θ$ is called a Lie algebra two-coboundary if there is $λ \in g^{*}$ such that

$Θ (ξ, η) = 〈λ, [ξ, η]〉,$

for all $ξ, η \in g$ . The quotient space of Lie algebra two-cocycles by Lie algebra two-coboundaries is called the second Lie algebra cohomology of $g$ and is denoted by $H^{2} (g, R)$ .
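As a quick sanity check of these definitions, the sketch below verifies numerically that a two-coboundary $Θ (ξ, η) = 〈λ, [ξ, η]〉$ is skew-symmetric and satisfies the two-cocycle identity Equation (27). It assumes the illustrative choice $g = so (3)$, identified with $R^{3}$ so that the Lie bracket is the cross product; the function names are hypothetical.

```python
import numpy as np

def bracket(x, y):
    # Lie bracket of so(3) under the identification with (R^3, cross product)
    return np.cross(x, y)

def coboundary(lam):
    # Two-coboundary Theta(xi, eta) = <lam, [xi, eta]> associated with lam in g^*
    return lambda xi, eta: float(np.dot(lam, bracket(xi, eta)))

def cocycle_defect(Theta, xi, eta, zeta):
    # Left-hand side of the two-cocycle identity (27); zero for a two-cocycle
    return (Theta(bracket(xi, eta), zeta)
            + Theta(bracket(eta, zeta), xi)
            + Theta(bracket(zeta, xi), eta))

rng = np.random.default_rng(1)
lam, xi, eta, zeta = rng.normal(size=(4, 3))
Theta = coboundary(lam)

defect = cocycle_defect(Theta, xi, eta, zeta)   # vanishes by the Jacobi identity
skew = Theta(xi, eta) + Theta(eta, xi)          # vanishes by skew-symmetry of the bracket
print(defect, skew)
```

For a coboundary the identity reduces to the Jacobi identity paired with $λ$, which is what the check reflects.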

2.3.1. Souriau Symplectic Model of Statistical Mechanics

The Souriau symplectic model of statistical mechanics is obtained by considering the following specific situation in the setting described in Section 2.2:

\begin{matrix} M : & a symplectic manifold \\ d μ : & the Liouville volume \\ ϕ_{g} : & a Hamiltonian action \\ E = g : & the Lie algebra of G \\ ρ_{g} = {Ad}_{g} : & the adjoint action of G on g \\ U = J : M \to g^{*} : & a momentum map . \end{matrix}

In particular, the thermodynamic heat becomes $Q (β) = E_{β} (J)$ and the Fisher metric on $Ω \subset g$ is

I (β) = E_{β} ((J - E_{β} (J)) \otimes (J - E_{β} (J))) \in sym (g) .

Proposition 5 directly yields the following equivariance properties

{Ad}_{g} Ω = Ω, ψ ({Ad}_{g} β) = ψ (β) e^{〈θ (g^{- 1}), β〉}, p_{β} \circ ϕ_{g} = p_{{Ad}_{g^{- 1}} β}

and

\begin{matrix} Φ ({Ad}_{g} β) = Φ (β) - 〈θ (g^{- 1}), β〉 \end{matrix}

(28)

\begin{matrix} Q ({Ad}_{g} β) = {Ad}_{g^{- 1}}^{*} (Q (β)) + θ (g) \end{matrix}

(29)

\begin{matrix} s ({Ad}_{g^{- 1}}^{*} ν + θ (g)) = s (ν) \end{matrix}

(30)

\begin{matrix} K ({Ad}_{g} β) ({Ad}_{g} δ β_{1}, {Ad}_{g} δ β_{2}) = K (β) (δ β_{1}, δ β_{2}), \end{matrix}

(31)

for every $g \in G$ . Note also that $Ω^{*}$ is invariant under the affine action $ν \mapsto {Ad}_{g^{- 1}}^{*} ν + θ (g)$ .

From Proposition 4, given $ν \in Ω^{*} \subset g^{*}$ , the generalized Gibbs probability density

p_{β} (m) = \frac{1}{ψ (β)} e^{- 〈J (m), β〉},

with $β = Q^{- 1} (ν)$ , satisfies the maximum entropy principle

max_{q} [- \int_{M} q log q d μ] such that \{\begin{matrix} \int_{M} q d μ = 1 \\ \int_{M} J q d μ = ν . \end{matrix}

We refer to [2] for a detailed presentation of Souriau’s model. We refer to [32] for recent developments exploiting Souriau’s model. As mentioned earlier in the general case, it is important to note that the generalized Gibbs densities are not defined on the whole Lie algebra $g$ but only on the open subset $Ω \subset g$ of geometric temperatures. As already observed by Souriau the set $Ω$ can be empty in some examples, such as the case of the action of the Galilean group. In this case, Souriau’s method considers Gibbs densities associated with one-parameter subgroups of the acting Lie group.

2.3.2. Lie-Poisson Equations with Cocycle and Property of the Entropy in Souriau’s Model

From Equation (30), we note that the entropy s is constant on the affine coadjoint orbits defined by

O = {{Ad}_{g^{- 1}}^{*} μ_{0} + θ (g) ∣ g \in G},

(32)

for $μ_{0} \in g^{*}$ . It is well-known that affine coadjoint orbits are symplectic manifolds, with symplectic form given by

ω_{O} (μ) ({ad}_{ξ}^{*} μ - Θ (ξ, \cdot), {ad}_{η}^{*} μ - Θ (η, \cdot)) = 〈μ, [ξ, η]〉 - Θ (ξ, η),

(33)

for $μ \in O$ , $ξ, η \in g$ . This is an extension to the affine case of the well-known Kirillov-Kostant-Souriau symplectic form on coadjoint orbits. The connected components of the affine coadjoint orbits Equation (32) are the symplectic leaves in the Poisson manifold $(g^{*}, {\cdot, \cdot}_{Θ})$ , where ${\cdot, \cdot}_{Θ}$ is the Lie-Poisson bracket with cocycle (or affine Lie-Poisson bracket)

{f, g}_{Θ} (μ) = 〈μ, [\frac{δ f}{δ μ}, \frac{δ g}{δ μ}]〉 - Θ (\frac{δ f}{δ μ}, \frac{δ g}{δ μ}), f, g : g^{*} \to R,

(34)

see, e.g., [29].

The Hamiltonian system (see Remark 7) associated with the Lie-Poisson bracket with cocycle Equation (34) and with a given Hamiltonian function $h : g^{*} \to R$ is given by the Lie-Poisson equations with cocycle (or affine Lie-Poisson equations)

\frac{d}{d t} f = {f, h}_{Θ}, \forall f : g^{*} \to R,

(35)

which yield the dynamical system

\frac{d}{d t} μ + {ad}_{\frac{δ h}{δ μ}}^{*} μ = Θ (\frac{δ h}{δ μ}, \cdot),

(36)

for a curve $μ (t) \in g^{*}$ . This dynamical system preserves each affine coadjoint orbit Equation (32) and defines on each of them a Hamiltonian system with respect to the symplectic form Equation (33). The Lie-Poisson equations with cocycle have important applications: in particular, they appear in the geometric formulation of complex fluids, see [33,34,35], and of geometrically exact (Cosserat) rods, see [36,37]. See [38] for another point of view on Lie-Poisson equations with cocycle. These equations are also referred to as affine Lie-Poisson equations or Lie-Poisson equations with non-zero cohomology.
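To make Equation (36) concrete, here is a minimal numerical sketch under several assumptions that are not part of the text: $g = so (3)$ identified with $R^{3}$ (bracket given by the cross product, so that ${ad}_{ξ}^{*} μ = μ \times ξ$), a coboundary cocycle $Θ (ξ, η) = 〈λ, [ξ, η]〉$ with a hypothetical $λ$, a free rigid body Hamiltonian with hypothetical moments of inertia, and a generic fourth-order Runge-Kutta step rather than a structure-preserving integrator. For this choice the function $\frac{1}{2} {| μ - λ |}^{2}$ is a Casimir of the bracket Equation (34), so the affine coadjoint orbits are spheres centered at $λ$.

```python
import numpy as np

INERTIA = np.array([1.0, 2.0, 3.0])   # hypothetical principal moments of inertia
LAM = np.array([0.3, -0.2, 0.5])      # hypothetical lambda defining the coboundary

def h(mu):
    # Free rigid body Hamiltonian h(mu) = 1/2 sum_i mu_i^2 / I_i
    return 0.5 * np.sum(mu**2 / INERTIA)

def grad_h(mu):
    return mu / INERTIA

def casimir(mu):
    # Casimir of the affine bracket for this coboundary: 1/2 |mu - lam|^2
    return 0.5 * np.sum((mu - LAM)**2)

def rhs(mu):
    # Equation (36): dmu/dt = -ad*_xi mu + Theta(xi, .), with xi = dh/dmu,
    # ad*_xi mu = mu x xi and Theta(xi, .) = lam x xi on so(3)* ~ R^3.
    xi = grad_h(mu)
    return -np.cross(mu, xi) + np.cross(LAM, xi)

def rk4_step(f, y, dt):
    # Classical Runge-Kutta step (not symplectic; used only for illustration)
    k1 = f(y)
    k2 = f(y + 0.5 * dt * k1)
    k3 = f(y + 0.5 * dt * k2)
    k4 = f(y + dt * k3)
    return y + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def integrate(mu0, dt=1e-3, steps=5000):
    mu = np.array(mu0, dtype=float)
    for _ in range(steps):
        mu = rk4_step(rhs, mu, dt)
    return mu

mu0 = np.array([1.0, 0.5, -0.7])
mu1 = integrate(mu0)
print(h(mu1) - h(mu0), casimir(mu1) - casimir(mu0))   # both drifts should be tiny
```

The energy and the Casimir are only preserved up to the truncation error of the Runge-Kutta step here; exact preservation of the orbit is one motivation for the Lie group integrators discussed later in the paper.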

Remark 7

(Poisson brackets and reduction, see [31]). Recall that a Poisson bracket on a manifold M is a Lie algebra structure ${\cdot, \cdot}$ on $C^{\infty} (M)$ which is a derivation in each factor: ${f g, h} = f {g, h} + {f, h} g$ . For instance, a symplectic structure $ω$ on M defines the Poisson bracket ${f, g} = ω (X_{f}, X_{g})$ for $f, g \in C^{\infty} (M)$ . Another example is the Lie-Poisson bracket

${f, g} (μ) = \pm 〈μ, [\frac{δ f}{δ μ}, \frac{δ g}{δ μ}]〉, \quad f, g \in C^{\infty} (g^{*})$ (37)

on the dual of any Lie algebra $g$ , as well as its affine modified version Equation (34) by a two-cocycle $Θ$ . The Hamiltonian system associated with a Poisson bracket and a given Hamiltonian $h \in C^{\infty} (M)$ is the dynamical system characterized by the condition

$\frac{d}{d t} f = {f, h}$

for every function $f \in C^{\infty} (M)$ , see for instance Equations (35) and (36).

An important point for applications in mechanics is the understanding of such Poisson structures as being induced from a canonical symplectic form (or, equivalently, from the associated canonical Poisson bracket) on a cotangent bundle, via reduction by symmetry relative to a Lie group action. This is the case for the Lie-Poisson bracket Equation (37) which is induced by the canonical symplectic form on $T^{*} G$ and the action of G on $T^{*} G$ given by the cotangent lifted action of right or left translation. The Lie-Poisson bracket with cocycle Equation (34) is induced by the canonical symplectic form on $T^{*} G$ and an affine modified cotangent lifted action of right or left translation ([33]).

Corollary 8.

The entropy s of the Souriau model is a Casimir function for the Lie-Poisson bracket with cocycle Equation (34), i.e., it satisfies

${s, f}_{Θ} = 0,$

for every smooth function $f : g^{*} \to R$ .

Proof.

From Equation (30), we have

$〈\frac{δ s}{δ μ}, - {ad}_{ξ}^{*} μ + Θ (ξ, \cdot)〉 = 0$

for all $ξ \in g$ . This is equivalent to ${ad}_{\frac{δ s}{δ μ}}^{*} μ - Θ (\frac{δ s}{δ μ}, \cdot) = 0$ , which shows that ${s, f}_{Θ} = 0$ , for all f. ☐

As a consequence of the above, the information manifold foliates into level sets of the entropy, each containing a family of coadjoint orbits, which can be interpreted thermodynamically: motion remaining on these level sets is non-dissipative, whereas motion transversal to these level sets is dissipative. The affine Kirillov-Kostant-Souriau form makes each orbit into a homogeneous symplectic manifold. Hamiltonian motion on these affine coadjoint orbits is given by the solutions of the Lie-Poisson equations with cocycle Equation (36). We shall present below a geometric way to introduce dissipation and hence motion through affine coadjoint orbits.

Elementary examples. A particularly simple case of Souriau symplectic model is when the symplectic manifold is a cotangent bundle $M = T^{*} Q$ endowed with the canonical symplectic form. Let G be a Lie group acting on the left on Q. Then its cotangent lifted action on $T^{*} Q$ is symplectic and admits the momentum map $J : T^{*} Q \to g^{*}$ given by

〈J (α_{q}), ξ〉 = 〈α_{q}, ξ_{Q} (q)〉 .

(38)

In this case, there is no cocycle, which yields obvious simplifications in the properties Equations (28)–(30).
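A concrete instance of Equation (38) is sketched below, assuming the illustrative choice $Q = R^{3}$ with $G = S O (3)$ acting by rotations and $so (3)$ identified with $R^{3}$. Then $ξ_{Q} (q) = ξ \times q$, the momentum map is the angular momentum $J (q, p) = q \times p$, the cotangent lifted action is $(q, p) \mapsto (R q, R p)$, and ${Ad}_{R^{- 1}}^{*}$ acts on $so {(3)}^{*} \simeq R^{3}$ as multiplication by R. The code evaluates the nonequivariance cocycle at a random group element; all names are hypothetical.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix such that hat(w) @ v equals np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def rotation(w):
    """Rotation matrix exp(hat(w)), computed with the Rodrigues formula."""
    angle = np.linalg.norm(w)
    if angle == 0.0:
        return np.eye(3)
    K = hat(w / angle)
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def momentum_map(q, p):
    """J(q, p) = q x p, obtained from <J(q, p), xi> = <p, xi x q>."""
    return np.cross(q, p)

rng = np.random.default_rng(2)
q, p, w, xi = rng.normal(size=(4, 3))
R = rotation(w)

# Defining relation (38): <J(alpha_q), xi> = <alpha_q, xi_Q(q)>
pairing_gap = np.dot(momentum_map(q, p), xi) - np.dot(p, np.cross(xi, q))

# Nonequivariance cocycle theta(R) = J(R q, R p) - Ad*_{R^{-1}} J(q, p)
theta = momentum_map(R @ q, R @ p) - R @ momentum_map(q, p)
print(pairing_gap, np.linalg.norm(theta))   # both should vanish up to round-off
```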

Another case without cocycle is when M is an affine coadjoint orbit $M = O = {{Ad}_{g^{- 1}}^{*} μ + θ (g) ∣ g \in G}$ endowed with the symplectic form Equation (33). In this case, the momentum map is simply the inclusion $J : O \to g^{*}$ of the affine coadjoint orbit in the dual of the Lie algebra $g^{*}$ , [29]. While this example is simple, it plays an important role in the applications, e.g., [8,39]. An example with nonequivariance cocycle will be treated in detail in Section 3.3 for the special Euclidean group of the plane.

2.3.3. Dynamics with Casimir Dissipation/Production

We take advantage of the Casimir function s associated with the Souriau model, to formulate a dynamical geometric model for dissipation/production of this Casimir. This allows us to clarify the link between Souriau symplectic models and models proposed in [25] in the framework of quantum physics by information geometry for some Lie algebras, see also [28].

We follow the general Lie algebraic approach developed in [26,27] for Casimir dissipation, slightly extended here to take into account a cocycle and a wider class of dissipation.

Given a symmetric positive bilinear form $γ : g \times g \to R$ , a Hamiltonian $h : g^{*} \to R$ , a parameter $Λ \neq 0$ , and a function $k : g^{*} \to R$ such that

[\frac{δ h}{δ μ}, \frac{δ k}{δ μ}] = 0,

(39)

we consider the modification of the Lie-Poisson equations with cocycle Equation (35) given by

\frac{d}{d t} f = {f, h}_{Θ} - Λ γ ([\frac{δ f}{δ μ}, \frac{δ k}{δ μ}], [\frac{δ s}{δ μ}, \frac{δ k}{δ μ}])

(40)

for every f. We denote by $♭ : g \to g^{*}$ the flat operator associated with $γ$ . That is, the linear form $ξ^{♭} \in g^{*}$ is given by $ξ^{♭} (η) = γ (ξ, η)$ , for all $ξ, η \in g$ . Please note that the flat operator need not be either injective or surjective. Using the equality

\begin{matrix} - γ ([\frac{δ f}{δ μ}, \frac{δ k}{δ μ}], [\frac{δ s}{δ μ}, \frac{δ k}{δ μ}]) & = - 〈{[\frac{δ s}{δ μ}, \frac{δ k}{δ μ}]}^{♭}, [\frac{δ f}{δ μ}, \frac{δ k}{δ μ}]〉 = 〈{ad}_{\frac{δ k}{δ μ}}^{*} {[\frac{δ s}{δ μ}, \frac{δ k}{δ μ}]}^{♭}, \frac{δ f}{δ μ}〉, \end{matrix}

Equation (40) yields the dynamical system

\frac{d}{d t} μ + {ad}_{\frac{δ h}{δ μ}}^{*} μ = Θ (\frac{δ h}{δ μ}, \cdot) + Λ {ad}_{\frac{δ k}{δ μ}}^{*} {[\frac{δ s}{δ μ}, \frac{δ k}{δ μ}]}^{♭} .

(41)

For $Θ = 0$ and $h = k$ , this is the model proposed in [26,27] and applied there in the infinite dimensional setting, with applications to geophysical fluids and magnetohydrodynamics.

The main properties of system Equation (41) are the following.

(i)
Energy conservation: taking $f = h$ in Equation (40), we obtain
$\frac{d}{d t} h = {h, h}_{Θ} - Λ γ ([\frac{δ h}{δ μ}, \frac{δ k}{δ μ}], [\frac{δ s}{δ μ}, \frac{δ k}{δ μ}]) = 0$
because of Equation (39) and since ${h, h}_{Θ} = 0$ . Hence the total energy h is preserved.
(ii)
Casimir dissipation ( $Λ > 0$ ) or production ( $Λ < 0$ ): taking $f = s$ in Equation (40), and using ${s, f}_{Θ} = 0$ , we obtain
$\frac{d}{d t} s = {s, h}_{Θ} - Λ γ ([\frac{δ s}{δ μ}, \frac{δ k}{δ μ}], [\frac{δ s}{δ μ}, \frac{δ k}{δ μ}]) = - Λ {∥[\frac{δ s}{δ μ}, \frac{δ k}{δ μ}]∥}^{2} \leq 0 / \geq 0,$
where ${∥ ξ ∥}^{2} = γ (ξ, ξ)$ .
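Properties (i) and (ii) can be observed on a small example. The sketch below makes the following assumptions, none of which come from the text: $g = so (3) \simeq R^{3}$ with ${ad}_{ξ}^{*} ν = ν \times ξ$, no cocycle ( $Θ = 0$ ), $k = h$ a free rigid body Hamiltonian with hypothetical moments of inertia, $γ$ the Euclidean inner product (so the flat operator is the identity), and the Casimir $\frac{1}{2} {| μ |}^{2}$ used as a stand-in for s. Time stepping uses a generic fourth-order Runge-Kutta step.

```python
import numpy as np

INERTIA = np.array([1.0, 2.0, 3.0])   # hypothetical principal moments of inertia
LAMBDA = 0.5                          # Lambda > 0: Casimir dissipation

def h(mu):
    return 0.5 * np.sum(mu**2 / INERTIA)

def s(mu):
    # Stand-in Casimir of so(3)*: s(mu) = 1/2 |mu|^2
    return 0.5 * np.dot(mu, mu)

def rhs(mu):
    # Equation (41) with Theta = 0 and k = h:
    # dmu/dt = -ad*_xi mu + Lambda * ad*_xi ([ds/dmu, xi]^flat), xi = dh/dmu
    xi = mu / INERTIA
    comm = np.cross(mu, xi)            # [ds/dmu, dk/dmu], since ds/dmu = mu
    return -np.cross(mu, xi) + LAMBDA * np.cross(comm, xi)

def rk4_step(f, y, dt):
    k1 = f(y)
    k2 = f(y + 0.5 * dt * k1)
    k3 = f(y + 0.5 * dt * k2)
    k4 = f(y + dt * k3)
    return y + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def run(mu0, dt=1e-3, steps=2000):
    mu = np.array(mu0, dtype=float)
    energies, casimirs = [h(mu)], [s(mu)]
    for _ in range(steps):
        mu = rk4_step(rhs, mu, dt)
        energies.append(h(mu))
        casimirs.append(s(mu))
    return np.array(energies), np.array(casimirs)

energies, casimirs = run([1.0, 0.5, -0.7])
print(np.ptp(energies), casimirs[0] - casimirs[-1])
```

In this setting the energy stays constant up to integration error while the Casimir decreases monotonically; changing the sign of `LAMBDA` turns dissipation into production.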

We will explain in Section 3.2 how system Equation (41) recovers the model proposed in [25] in the context of information geometry for quantum systems for Lie algebras with unitary representation.

2.3.4. Stochastic Hamiltonian Dynamics

We shall briefly discuss here a stochastic perturbation of the Lie-Poisson equation with cocycle Equation (35) within the setting of stochastic Hamiltonian dynamics, see [40,41,42], which preserves the affine coadjoint orbits. This theory was recently extended for stochastic geometric modeling in fluid dynamics via variational principles in [43], see also [44,45].

In the context of this paper, this stochastic extension is motivated in geometric statistical mechanics by the need to model the Gibbs density of a centrifuge with random vibration along its axis (an open problem for industrial centrifuges, since for large equipment it is difficult to reduce the vibration of this axis). In statistical machine learning, it is motivated by small data analytics, where the Gibbs density, as a maximum entropy density of first order, is only an approximation. In this case, there are some fluctuations in the estimation of the mean momentum map because the true Gibbs density is a density of higher order. This approximation could be modeled by an additional noise on the momentum map.

In the setting of the Lie-Poisson equations with cocycle given in Equation (35), we consider the stochastic Hamiltonian dynamics given by

d f = {f, h}_{Θ} d t + \sum_{i = 1}^{N} {f, h_{i}}_{Θ} \circ d W_{i} (t),

(42)

where $h_{i} : g^{*} \to R$ , $i = 1, \dots, N$ are given Hamiltonians and $W_{i} (t)$ , $i = 1, \dots, N$ are independent Brownian motions introduced in the Stratonovich sense, as indicated by the symbol ∘. Please note that the contribution of each Hamiltonian is inserted via the Lie-Poisson bracket with cocycle ${\cdot, \cdot}_{Θ}$ given in Equation (34). This results in the following Stratonovich differential equation for the stochastic process $μ (t) \in g^{*}$

d μ + [{ad}_{\frac{δ h}{δ μ}}^{*} μ - Θ (\frac{δ h}{δ μ}, \cdot)] d t + \sum_{i = 1}^{N} [{ad}_{\frac{δ h_{i}}{δ μ}}^{*} μ - Θ (\frac{δ h_{i}}{δ μ}, \cdot)] \circ d W_{i} (t) = 0 .

(43)

The Itô form of Equation (43) can be obtained by the usual conversion formula. It can be expressed in a concise and general way as

d f = ({f, h}_{Θ} + \frac{1}{2} \sum_{i = 1}^{N} {h_{i}, {h_{i}, f}_{Θ}}_{Θ}) d t + \sum_{i = 1}^{N} {f, h_{i}}_{Θ} d W_{i} (t) .

As for its deterministic version Equation (36), the system Equation (43) restricts to a stochastic Hamiltonian system on each affine coadjoint orbit Equation (32) with respect to the Kirillov-Kostant-Souriau symplectic structure with cocycle Equation (33). From Corollary 8 it follows that the entropy s of the Souriau model is preserved by the stochastic dynamics Equation (43).
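The preservation of Casimirs along sample paths of Equation (43) can be illustrated with a simple splitting sketch. The assumptions are ours: $g = so (3) \simeq R^{3}$ with $Θ = 0$, a free rigid body Hamiltonian h with hypothetical moments of inertia, and a single linear noise Hamiltonian $h_{1} (μ) = 〈μ, a〉$ with a hypothetical vector a. The Stratonovich flow of the noise part is then an exact rotation about a by the angle given by the Brownian increment, and each step composes a Runge-Kutta step for the drift with this rotation. This is a first-order splitting chosen for readability, not one of the variational integrators of the paper.

```python
import numpy as np

INERTIA = np.array([1.0, 2.0, 3.0])   # hypothetical principal moments of inertia
A_NOISE = np.array([0.0, 0.0, 0.4])   # hypothetical vector a, with h_1(mu) = <mu, a>

def hat(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def rotation(w):
    # exp(hat(w)) by the Rodrigues formula
    angle = np.linalg.norm(w)
    if angle == 0.0:
        return np.eye(3)
    K = hat(w / angle)
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def drift(mu):
    # Deterministic part of (43) with Theta = 0: dmu/dt = -ad*_xi mu = xi x mu
    return np.cross(mu / INERTIA, mu)

def rk4_step(f, y, dt):
    k1 = f(y)
    k2 = f(y + 0.5 * dt * k1)
    k3 = f(y + 0.5 * dt * k2)
    k4 = f(y + dt * k3)
    return y + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def simulate(mu0, dt=1e-3, steps=2000, seed=0):
    rng = np.random.default_rng(seed)
    mu = np.array(mu0, dtype=float)
    for _ in range(steps):
        mu = rk4_step(drift, mu, dt)              # drift substep
        dW = np.sqrt(dt) * rng.normal()           # Brownian increment
        mu = rotation(A_NOISE * dW) @ mu          # exact flow of dmu = (a x mu) o dW
    return mu

mu0 = np.array([1.0, 0.5, -0.7])
mu1 = simulate(mu0)
casimir_drift = 0.5 * np.dot(mu1, mu1) - 0.5 * np.dot(mu0, mu0)
print(casimir_drift)   # should be zero up to the error of the drift substep
```

The energy h is not expected to be conserved along such paths, since h and $h_{1}$ do not Poisson commute in general; only the Casimir, and hence the coadjoint orbit, is preserved.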

In absence of the cocycle, Equation (43) can be formally obtained from the variational principle

δ \int_{0}^{T} [〈μ, d g g^{- 1}〉 - h (μ) d t - \sum_{i = 1}^{N} h_{i} (μ) \circ d W_{i} (t)] = 0,

(44)

for variations $δ g$ and $δ μ$ of $(g, μ) \in G \times g^{*}$ . More precisely, the variations $δ g$ and $δ μ$ give the two conditions

d μ + {ad}_{d g g^{- 1}}^{*} μ = 0 and d g g^{- 1} = \frac{δ h}{δ μ} d t + \sum_{i = 1}^{N} \frac{δ h_{i}}{δ μ} \circ d W_{i} (t)

which yield Equation (43) in the special case $Θ = 0$ . Such variational principles play an essential role in stochastic geometric modelling [43,44,45], where the emphasis is made on the Lagrangian side. For instance in [44] the Lagrangian version of Equation (44) given by

δ \int_{0}^{T} [ℓ (ξ) d t + 〈μ, d g g^{- 1} - ξ d t〉 - \sum_{i = 1}^{N} h_{i} (μ) \circ d W_{i} (t)] = 0,

(45)

is used, for variations $δ g$ , $δ ξ$ , $δ μ$ of $(g, ξ, μ) \in G \times g \times g^{*}$ , where $ℓ : g \to R$ is the Lagrangian associated with $h : g^{*} \to R$ in the hyperregular case. Variations $δ g$ , $δ ξ$ , and $δ μ$ give the three conditions

d μ + {ad}_{d g g^{- 1}}^{*} μ = 0, \frac{δ ℓ}{δ ξ} = μ, and d g g^{- 1} = ξ d t + \sum_{i = 1}^{N} \frac{δ h_{i}}{δ μ} \circ d W_{i} (t)

which yield equivalent equations to Equation (43) with $Θ = 0$ .

To extend Equation (44) to cover the case of a Lie algebra two-cocycle $Θ \neq 0$ , we shall formulate the variational principle on the central extension $\hat{G} = G \times R$ of the Lie group G with respect to a group two-cocycle $B : G \times G \to R$ that integrates $Θ : g \times g \to R$ . We refer to Section 4.2 below for a quick review of the main formulas for central extensions and their use in connection with the Lie-Poisson equations with cocycle. For the application to Equation (43) here, we just need to recall the expression of the group multiplication $(g, α) (h, β) = (g h, α + β + B (g, h))$ for $(g, α), (h, β) \in \hat{G}$ where $B : G \times G \to R$ is the group two-cocycle, and the relation

Θ (ξ, η) = {\frac{d}{d s}|}_{s = 0} {\frac{d}{d t}|}_{t = 0} (B (exp {(t ξ)}^{- 1}, exp (s η)) - B (exp (s η), exp {(t ξ)}^{- 1}))

(46)

between $Θ$ and B. Based on this, we consider the variational principle

δ \int_{0}^{T} [〈(μ, a), (d g, d α) {(g, α)}^{- 1}〉 - h (μ) d t - \sum_{i = 1}^{N} h_{i} (μ) \circ d W_{i} (t)] = 0,

for variations $δ g$ , $δ α$ , $δ μ$ , and $δ a$ of $(g, α) \in \hat{G} = G \times R$ and $(μ, a) \in {\hat{g}}^{*} = g^{*} \times R$ . In the first term, the operations are associated with the tangent lift of the multiplication on the central extension $\hat{G}$ and the pairing is $〈(μ, a), (ξ, u)〉 = 〈μ, ξ〉 + a u$ , so that we have $〈(μ, a), (d g, d α) {(g, α)}^{- 1}〉 = 〈μ, d g g^{- 1}〉 + a (d α + D_{1} B (g, g^{- 1}) \cdot d g)$ . A computation, using the fact that B integrates $Θ$ , i.e., Equation (46), shows that the variations $δ g$ , $δ α$ , $δ μ$ , and $δ a$ give the four conditions

\begin{matrix} d μ + {ad}_{d g g^{- 1}}^{*} μ - a Θ (d g g^{- 1}, \cdot) = 0, & d a = 0, \\ d g g^{- 1} = \frac{δ h}{δ μ} d t + \sum_{i = 1}^{N} \frac{δ h_{i}}{δ μ} \circ d W_{i} (t), & d α + D_{1} B (g, g^{- 1}) \cdot d g = 0 . \end{matrix}

One notes that taking the initial condition $a = 1$ , we get the stochastic Lie-Poisson system with cocycle Equation (43) as desired. This approach using central extension can be easily used on the Lagrangian side too and yields the appropriate extension of Equation (45) to handle a cocycle $Θ \neq 0$ .

We refer to [43,44,45,46], for various finite and infinite dimensional applications of the stochastic Lie-Poisson Hamiltonian system Equation (42) in the case $Θ = 0$ .

2.4. Polysymplectic Model of Statistical Mechanics

Polysymplectic geometry, as developed in [47], arises as a special case of multisymplectic geometry which is the natural geometric setting of classical field theories, see, e.g., [48]. When used in conjunction with the general setting developed in Section 2.1 and Section 2.2, the polysymplectic setting furnishes a natural generalisation of the Souriau symplectic model, to which many properties extend. This extension was proposed in [5]. Here we emphasize this model as a specific case of the general framework described in Section 2.1 and Section 2.2. This allows transposing immediately all the properties of this framework to the polysymplectic model. In particular, we will see that the entropy of the polysymplectic model enjoys a natural extension of the Casimir property observed in Section 2.3.2. The relevant equation is here a Lie-Poisson field equation with cocycle that we will describe in detail below.

This model is motivated by higher-order models of statistical physics. For instance, for small data analytics (rarefied gases, sparse statistical surveys, …), the maximum entropy density should take into account higher-order moment constraints, so that the Gibbs density is not only defined by the first moment: fluctuations require second- and higher-order moments, as introduced in [49,50,51,52,53,54].

Polysymplectic manifolds. We only need a restricted amount of notions from polysymplectic geometry which are straightforward extensions of those recalled above in the symplectic context. We refer to [47] for more information. A polysymplectic manifold $(M, ω)$ is a manifold M endowed with a closed nondegenerate $R^{n}$ -valued 2-form. We can identify $ω$ with a collection $(ω^{1}, \dots, ω^{n})$ , of closed 2-forms with $⋂_{i = 1}^{n} ker ω^{i} = {0}$ .

A Lie group action $ϕ : G \times M \to M$ of G on M is polysymplectic, if $ϕ_{g}^{*} ω^{i} = ω^{i}$ , for every $g \in G$ and $i = 1, \dots, n$ . As before, this implies that $i_{ξ_{M}} ω$ is a closed $R^{n}$ -valued one-form on M. If this form is exact, then the action is called Hamiltonian and admits a polysymplectic momentum map $J : M \to L (g, R^{n})$ , which satisfies

i_{ξ_{M}} ω = d J_{ξ},

where $J_{ξ} : M \to R^{n}$ is defined by $J_{ξ} (m) = J (m) \cdot ξ$ . As in the symplectic case, if M is connected, there is a group one-cocycle $θ \in C^{\infty} (G, L (g, R^{n}))$ , $θ = (θ^{1}, \dots, θ^{n})$ , defined by

θ^{i} (g) = J^{i} (ϕ_{g} (m)) - {Ad}_{g^{- 1}}^{*} (J^{i} (m)),

and one defines the map $Θ : g \times g \to R^{n}$ by

Θ (ξ, η) : = {\frac{d}{d ε}|}_{ε = 0} θ (exp (ε ξ)) (η) \in R^{n} .

(47)

Taking the derivative of the relation above, we get

Θ {(ξ, η)}^{i} = J_{[ξ, η]}^{i} - ω^{i} (ξ_{M}, η_{M}) .

(48)

As a consequence $Θ$ is skew-symmetric, and satisfies the two-cocycle identity

Θ ([ξ, η], ζ) + Θ ([η, ζ], ξ) + Θ ([ζ, ξ], η) = 0,

(49)

see [47].

Polysymplectic model. The polysymplectic model of statistical mechanics is obtained by considering the following specific situation in the equivariant setting described in Section 2.2:

\begin{matrix} M & : a polysymplectic manifold \\ d μ & : a volume form \\ ϕ_{g} & : a volume preserving Hamiltonian action \\ E = L (R^{n}, g) & : the linear maps from R^{n} to the Lie algebra of G \\ ρ_{g} = {({Ad}_{g})}^{n} & : the action induced on L (R^{n}, g) by the adjoint action of G on g \\ U = J : M \to E^{*} = L (g, R^{n}) & : a polysymplectic momentum map . \end{matrix}

Here the space $E = L (R^{n}, g)$ of linear maps is identified with the Cartesian product $g^{n} = g \times \dots \times g$ and ${({Ad}_{g})}^{n}$ acts on $β \in E$ as

{({Ad}_{g})}^{n} (β_{1}, \dots, β_{n}) = ({Ad}_{g} β_{1}, \dots, {Ad}_{g} β_{n}) .

The thermodynamic heat becomes a map $Q : Ω \subset L (R^{n}, g) \to Ω^{*} \subset L (g, R^{n})$ with $Q (β) = E_{β} (J) \in Ω^{*} \subset L (g, R^{n})$ and the Fisher metric on $Ω \subset L (R^{n}, g)$ is

I (β) = E_{β} ((J - E_{β} (J)) \otimes (J - E_{β} (J))) \in sym (L (R^{n}, g)) .

Proposition 5 directly yields the following equivariance properties

{({Ad}_{g})}^{n} Ω = Ω, ψ ({({Ad}_{g})}^{n} β) = ψ (β) e^{〈θ (g^{- 1}), β〉}, p_{β} \circ ϕ_{g} = p_{{({Ad}_{g^{- 1}})}^{n} β}

and

\begin{matrix} Φ ({({Ad}_{g})}^{n} β) = Φ (β) - 〈θ (g^{- 1}), β〉 \end{matrix}

(50)

\begin{matrix} Q ({({Ad}_{g})}^{n} β) = {({Ad}_{g^{- 1}}^{*})}^{n} (Q (β)) + θ (g) \end{matrix}

(51)

\begin{matrix} s ({({Ad}_{g^{- 1}}^{*})}^{n} ν + θ (g)) = s (ν) \end{matrix}

(52)

\begin{matrix} K ({({Ad}_{g})}^{n} β) ({({Ad}_{g})}^{n} δ β_{1}, {({Ad}_{g})}^{n} δ β_{2}) = K (β) (δ β_{1}, δ β_{2}), \end{matrix}

(53)

for every $g \in G$ . Note also that $Ω^{*}$ is invariant under the affine action $ν \in L (g, R^{n}) \mapsto {({Ad}_{g^{- 1}}^{*})}^{n} ν + θ (g) \in L (g, R^{n})$ .

From Proposition 4, given $ν \in Ω^{*} \subset L (g, R^{n})$ , the generalized Gibbs probability density

p_{β} (m) = \frac{1}{ψ (β)} e^{- 〈J (m), β〉},

with $β = Q^{- 1} (ν)$ , satisfies the maximum entropy principle

max_{q} [- \int_{M} q log q d μ] such that \{\begin{matrix} \int_{M} q d μ = 1 \\ \int_{M} J q d μ = ν . \end{matrix}

Particular cases. A particularly simple case of polysymplectic Souriau model is given by the manifold $M = T^{*} Q \oplus \dots \oplus T^{*} Q$ (Whitney sum with n factors) endowed with the polysymplectic form $Ω = (Ω^{1}, \dots, Ω^{n})$ , with $Ω^{k} = {(π^{k})}^{*} Ω_{can}$ . Here $π^{k} : T^{*} Q \oplus \dots \oplus T^{*} Q \to T^{*} Q$ is the projection onto the $k$ -th factor of the sum and $Ω_{can}$ is the canonical symplectic form on $T^{*} Q$ . Let G be a Lie group acting on the left on Q. Then its naturally induced action on $T^{*} Q \oplus \dots \oplus T^{*} Q$ is polysymplectic and admits the polysymplectic momentum map $J : T^{*} Q \oplus \dots \oplus T^{*} Q \to L (g, R^{n})$ given by

J (α_{q}^{1}, \dots, α_{q}^{n}) = (J (α_{q}^{1}), \dots, J (α_{q}^{n})),

where $J : T^{*} Q \to g^{*}$ is the momentum map associated with the cotangent lifted action of G on $T^{*} Q$ given in Equation (38). In this case, there is no cocycle.

Another case without cocycle in the polysymplectic momentum map is when M is chosen as an orbit $M = O = {{({Ad}_{g^{- 1}}^{*})}^{n} μ + θ (g) ∣ g \in G}$ , $μ \in L (g, R^{n})$ , of the affine left action of G on $L (g, R^{n})$ given by $μ \mapsto {({Ad}_{g^{- 1}}^{*})}^{n} μ + θ (g)$ , with $θ \in C^{\infty} (G, L (g, R^{n}))$ a group one-cocycle. This orbit M is endowed with a natural polysymplectic form $ω = (ω^{1}, \dots, ω^{n})$ with $ω^{i}$ defined by

ω^{i} (μ) ({ad}_{ξ}^{*} μ - Θ (ξ, \cdot), {ad}_{η}^{*} μ - Θ (η, \cdot)) = 〈μ^{i}, [ξ, η]〉 - Θ^{i} (ξ, η)

where $Θ$ is given in Equation (47), which is the polysymplectic version of Equation (33). In this case, the polysymplectic momentum map is simply the inclusion $J : O \to L (g, R^{n})$ of the orbit in $L (g, R^{n})$ .

Property of the entropy and polysymplectic Lie-Poisson equations with cocycle. In the context of the polysymplectic model, a natural generalisation of the Lie-Poisson equations with cocycle Equation (36) is given by

\sum_{k = 1}^{n} \frac{\partial}{\partial x^{k}} μ^{k} + \sum_{k = 1}^{n} {ad}_{\frac{δ h}{δ μ^{k}}}^{*} μ^{k} = \sum_{k = 1}^{n} Θ^{k} (\frac{δ h}{δ μ^{k}}, \cdot),

(54)

for a map $μ : x = (x^{1}, \dots, x^{n}) \in U \subset R^{n} \mapsto μ (x) = (μ^{1} (x), \dots, μ^{n} (x)) \in L (g, R^{n})$ , with $h : L (g, R^{n}) \to R$ a given Hamiltonian. In absence of the cocycle, such a field theoretic version of the Lie-Poisson equation appears, for instance, for the spacetime Lagrangian and Hamiltonian theoretic description of Cosserat rods and molecular strands, see [36,37].

From the invariance property Equation (52), we have

\sum_{k = 1}^{n} 〈\frac{δ s}{δ μ^{k}}, - {ad}_{ξ}^{*} μ^{k} + Θ^{k} (ξ, \cdot)〉 = 0

for all $ξ \in g$ . This is equivalent to

\sum_{k = 1}^{n} {ad}_{\frac{δ s}{δ μ^{k}}}^{*} μ^{k} - \sum_{k = 1}^{n} Θ^{k} (\frac{δ s}{δ μ^{k}}, \cdot) = 0 .

For $h = s$ , Equation (54) thus reduces to $\sum_{k = 1}^{n} \frac{\partial}{\partial x^{k}} μ^{k} = 0$ . This is the natural extension of the Casimir property of s observed in the Souriau model in Section 2.3.2, given there by the condition ${ad}_{\frac{δ s}{δ μ}}^{*} μ - Θ (\frac{δ s}{δ μ}, \cdot) = 0$ , giving $\frac{d}{d t} μ = 0$ .

2.5. The Fisher Metric on Orbits and Equivariance

We give here a general expression of the Fisher metric on orbits of the action $ρ : G \times E \to E$ , in the general setting described in Section 2.1 and Section 2.2. This clarifies the link between the Fisher metric and the metric on adjoint orbits considered by Souriau, as highlighted in [4].

As in Section 2.1 we consider a manifold M, a vector space E, a function $U : M \to E^{*}$ , and the class of generalized Gibbs probability densities

p_{β} (m) = \frac{1}{ψ (β)} e^{- 〈U (m), β〉}, β \in Ω .

As in Section 2.2 given a Lie group G we consider an action $ϕ : G \times M \to M$ and a representation $ρ : G \times E \to E$ . We denote by

ξ_{E} (β) : = {\frac{d}{d ε}|}_{ε = 0} ρ_{exp (ε ξ)} (β) and ξ_{E^{*}} (ν) : = {\frac{d}{d ε}|}_{ε = 0} ρ_{exp (ε ξ)}^{*} (ν),

$β \in E$ , $ν \in E^{*}$ the infinitesimal generators of the representations $ρ_{g}$ and $ρ_{g}^{*}$ associated with $ξ \in g$ . We will use the equality $〈ξ_{E^{*}} (ν), β〉 = 〈ν, ξ_{E} (β)〉$ . Given the group one-cocycle $θ \in C^{\infty} (G, E^{*})$ associated with the function U, see Equation (16), we define $Θ \in C^{\infty} (g, E^{*})$ by

〈Θ (ξ), β〉 = {\frac{d}{d ε}|}_{ε = 0} 〈θ (exp (ε ξ)), β〉,

(55)

for $ξ \in g$ and $β \in E$ . Recall that the Fisher metric is $I (β) = - E_{β} [D^{2} log p_{β}]$ and coincides with the generalized heat capacity, see Proposition 2.

Proposition 9.

On the G-orbit through $β \in Ω$ , the Fisher metric is written in terms of Θ and Q as follows

$I (β) (ξ_{E} (β), ζ_{E} (β)) = - 〈Θ (ξ), ζ_{E} (β)〉 + 〈ξ_{E^{*}} (Q (β)), ζ_{E} (β)〉 .$ (56)

Proof.

Taking the derivative with respect to g at e of the equality Equation (18) given by

$〈Q (ρ_{g} (β)), γ〉 = 〈ρ_{g^{- 1}}^{*} (Q (β)), γ〉 + 〈θ (g), γ〉$

for every $γ \in E$ , we get

$〈D Q (β) \cdot ξ_{E} (β), γ〉 = 〈Θ (ξ), γ〉 - 〈ξ_{E^{*}} (Q (β)), γ〉,$

for every $ξ \in g$ . For $γ = ζ_{E} (β) \in T_{β} O$ , we get

$〈D Q (β) \cdot ξ_{E} (β), ζ_{E} (β)〉 = 〈Θ (ξ), ζ_{E} (β)〉 - 〈ξ_{E^{*}} (Q (β)), ζ_{E} (β)〉 .$

Therefore, from Proposition 2, we can write

$I (β) (ξ_{E} (β), ζ_{E} (β)) = - D^{2} Φ (β) (ξ_{E} (β), ζ_{E} (β)) = - 〈Θ (ξ), ζ_{E} (β)〉 + 〈ξ_{E^{*}} (Q (β)), ζ_{E} (β)〉,$

which proves the result. ☐

We illustrate this result for the Souriau model, its polysymplectic extension, and the Koszul model.

Souriau Lie group statistical model. In this case, M is endowed with a symplectic structure $ω$ , we take $E = g$ and $U = J : M \to g^{*}$ with nonequivariance cocycle $θ \in C^{\infty} (G, g^{*})$ , i.e., $θ (g) = J (ϕ_{g} (m)) - {Ad}_{g^{- 1}}^{*} J (m) \in g^{*}$ . The map $Θ$ defined in Equation (55) becomes here a two cocycle $Θ : g \times g \to R$ , see Equations (25)–(27), via the relation $〈Θ (ξ), η〉 = Θ (ξ, η)$ . Proposition 9 immediately yields the following result as a corollary, which is obtained by noting that $ξ_{E} (β) = {ad}_{ξ} β$ and $ξ_{E^{*}} (ν) = {ad}_{ξ}^{*} ν$ and is a consequence of Equation (29).

Corollary 10.

On the adjoint orbit through β in $g$ , the Fisher metric is written as follows

$I (β) ({ad}_{ξ} β, {ad}_{ζ} β) = - Θ (ξ, {ad}_{ζ} β) + 〈{ad}_{ξ}^{*} Q (β), {ad}_{ζ} β〉 .$ (57)

Please note that Equation (57) can be written as

I (β) ({ad}_{ξ} β, {ad}_{ζ} β) = - Θ_{β} (ξ, {ad}_{ζ} β),

where $Θ_{β} (ξ, η) : = Θ (ξ, η) - 〈{ad}_{ξ}^{*} Q (β), η〉 = Θ (ξ, η) - 〈Q (β), [ξ, η]〉$ is a two-cocycle. In particular, the last term is a coboundary. We refer to [2] for more information.

Polysymplectic Lie group statistical model. In this case, M is endowed with a polysymplectic structure $ω = (ω^{1}, \dots, ω^{n})$ , we take $E = L (R^{n}, g)$ and $U = J : M \to L (g, R^{n})$ with nonequivariance cocycle $θ \in C^{\infty} (G, L (g, R^{n}))$ :

θ (g) = J (ϕ_{g} (m)) - {({Ad}_{g^{- 1}}^{*})}^{n} J (m) \in L (g, R^{n}) .

The map $Θ \in C^{\infty} (g, L (g, R^{n}))$ defined in Equation (55) is identified here with the map $Θ : g \times g \to R^{n}$ defined in Equation (47), with the properties Equations (48) and (49). The identification is $Θ (ξ) (η) = Θ (ξ, η)$ , where $Θ (ξ) \in L (g, R^{n})$ is applied to $η \in g$ . We note the equality $〈Θ (ξ), (η_{1}, \dots, η_{n})〉 = \sum_{i = 1}^{n} Θ^{i} (ξ, η_{i})$ , for $(η_{1}, \dots, η_{n}) \in g^{n}$ identified with $L (R^{n}, g)$ , where $〈,〉$ on the left-hand side is the duality pairing between $L (R^{n}, g)$ and $L (g, R^{n})$ .

We now apply Proposition 9, which follows here from Equation (51). We have the infinitesimal generators $ξ_{E} (β) = ({ad}_{ξ} β_{1}, \dots, {ad}_{ξ} β_{n})$ and $ξ_{E^{*}} (ν) = ({ad}_{ξ}^{*} ν^{1}, \dots, {ad}_{ξ}^{*} ν^{n})$ , and we get

〈Θ (ξ), ζ_{E} (β_{1}, \dots, β_{n})〉 = 〈Θ (ξ), ({ad}_{ζ} β_{1}, \dots, {ad}_{ζ} β_{n})〉 = \sum_{i = 1}^{n} Θ^{i} (ξ, {ad}_{ζ} β_{i})

and

〈ξ_{E^{*}} Q (β), ζ_{E} (β)〉 = \sum_{i} 〈{ad}_{ξ}^{*} Q {(β)}^{i}, {ad}_{ζ} β_{i}〉 .

Therefore, the following result is obtained.

Corollary 11.

On the adjoint orbit through β in $L (R^{n}, g)$ , the Fisher metric is written as follows

$\begin{matrix} I (β) (({ad}_{ξ} β_{1}, \dots, {ad}_{ξ} β_{n}), ({ad}_{ζ} β_{1}, \dots, {ad}_{ζ} β_{n})) = - \sum_{i} Θ^{i} (ξ, {ad}_{ζ} β_{i}) + \sum_{i} 〈{ad}_{ξ}^{*} Q {(β)}^{i}, {ad}_{ζ} β_{i}〉 . \end{matrix}$ (58)

Koszul model. For the Koszul model with $Ω = sym {(n)}^{+}$ the cone of positive definite matrices and the Lie group $G = G L (n)$ , the actions Equations (21) and (22) have the associated infinitesimal generators

ξ_{E} (β) = - ξ^{T} β - β ξ and ξ_{E^{*}} (ν) = - ξ ν - ν ξ^{T} .

In this case, $Θ = 0$ and Proposition 9 can be verified directly by noting the equalities

I (β) (ξ_{E} (β), ζ_{E} (β)) = (n + 1) Tr (ξ ζ + β^{- 1} ξ^{T} β ζ) = 〈ξ_{E^{*}} (Q (β)), ζ_{E} (β)〉 .

3. Applications

In this section, we show how the framework considered in Section 2 applies to various examples and helps identify common underlying geometric structures. We start with the case of multivariate Gaussian probability densities as an illustration of the general framework, for which a cocycle is needed and which does not fall into the setting of the Souriau model. We then highlight the strong analogies with quantum information geometry by considering Lie algebras with unitary representations, and show that the Fisher metric, as defined from the generalized heat capacity in Section 2.1, coincides with the Bogoliubov-Kubo-Mori metric. In this particular case, the equation with Casimir dissipation/production considered in Section 2.3.3 reproduces a dissipative model of [25]. Finally, we consider in detail the case of the Euclidean group of the plane $S E (2)$ , since it allows explicit and relatively easy computations while exhibiting the interesting feature of having a cocycle. This example fits into the setting of the Souriau symplectic model.

3.1. Multivariate Gaussian Probability Densities

In this paragraph we study in detail the case of multivariate Gaussian densities, following the approach developed in Section 2.1 and Section 2.2. A first treatment in this spirit was given in [4], Section 8. Here we clarify several steps in this approach by following systematically the general setting presented in Section 2.1 and Section 2.2, while we note that this example is not a particular case of the Souriau model. We present explicitly the cocycle, which is here defined on the general affine group, with values in the Cartesian product of symmetric matrices and the Euclidean space.

Gaussian probability densities in generalized Gibbs form. Consider a multivariate Gaussian density with symmetric and positive definite covariance matrix $R \in {sym}^{+} (n)$ and mean $m \in R^{n}$ . The Gaussian probability density is written in the generalized Gibbs form $p_{β}$ discussed above in Section 2.1 as follows:

\begin{matrix} p_{(R, m)} (z) & = \frac{1}{{(2 π)}^{n / 2} det {(R)}^{1 / 2}} e^{- \frac{1}{2} {(z - m)}^{T} R^{- 1} (z - m)} \\ = \frac{1}{{(2 π)}^{n / 2} det {(R)}^{1 / 2} e^{\frac{1}{2} m^{T} R^{- 1} m}} e^{- (\frac{1}{2} z^{T} R^{- 1} z - m^{T} R^{- 1} z)} \\ = \frac{1}{{(2 π)}^{n / 2} det {(R)}^{1 / 2} e^{\frac{1}{2} m^{T} R^{- 1} m}} e^{- 〈(z z^{T}, z), (\frac{1}{2} R^{- 1}, - R^{- 1} m)〉} \\ = : \frac{1}{ψ (β)} e^{- 〈U (z), β〉} = : p_{β} (z), \end{matrix}

for every $z \in R^{n}$ . In the last equality above, we have defined the energy function

U : R^{n} \to sym (n) \times R^{n}, U (z) = (z z^{T}, z),

the vector $β \in {sym}^{+} (n) \times R^{n}$ in terms of $(R, m)$ as

β = (β_{1}, β_{2}) : = (\frac{1}{2} R^{- 1}, - R^{- 1} m) \Leftrightarrow R = \frac{1}{2} β_{1}^{- 1}, m = - \frac{1}{2} β_{1}^{- 1} β_{2},

(59)

and the partition function

\begin{matrix} ψ (β) & = {(2 π)}^{n / 2} det {(R)}^{1 / 2} e^{\frac{1}{2} m^{T} R^{- 1} m} = π^{n / 2} det {(β_{1})}^{- 1 / 2} e^{\frac{1}{4} Tr (β_{2}^{T} β_{1}^{- 1} β_{2})} . \end{matrix}

The general theory of Section 2.1 will be applied here with the manifold $M = R^{n}$ , the vector space $E = sym (n) \times R^{n}$ , and the open subset $Ω = {sym}^{+} (n) \times R^{n}$ . It is important to note that the element $β$ of the general theory is not given by the couple $(R, m)$ , but related to $(R, m)$ via Equation (59). This plays a main role in the understanding of the equivariance properties below.
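As a quick sanity check of the reparametrization in Equation (59), the following minimal Python sketch treats the univariate case $n = 1$ , where $β_{1}$ and $β_{2}$ are scalars. It compares the closed-form partition function with a midpoint-rule quadrature of $e^{- 〈U (z), β〉}$ . The function names, the sample values of $(R, m)$ , and the quadrature parameters are illustrative assumptions and are not part of the original text.

```python
import math

def to_beta(R, m):
    """Equation (59) for n = 1: (R, m) -> (beta1, beta2)."""
    return 0.5 / R, -m / R

def from_beta(beta1, beta2):
    """Inverse map of Equation (59) for n = 1."""
    return 0.5 / beta1, -0.5 * beta2 / beta1

def psi(beta1, beta2):
    """Closed-form partition function for n = 1."""
    return math.sqrt(math.pi / beta1) * math.exp(0.25 * beta2 * beta2 / beta1)

def psi_quadrature(beta1, beta2, half_width=30.0, steps=60000):
    """Midpoint rule for the integral of exp(-<U(z), beta>) with U(z) = (z^2, z)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        z = -half_width + (k + 0.5) * h
        total += math.exp(-(beta1 * z * z + beta2 * z))
    return total * h

# Hypothetical sample values for the variance R and the mean m.
R, m = 1.7, -0.6
beta1, beta2 = to_beta(R, m)
R_back, m_back = from_beta(beta1, beta2)
psi_closed = psi(beta1, beta2)
psi_numeric = psi_quadrature(beta1, beta2)
print(psi_closed, psi_numeric)
```

The two printed values should agree up to quadrature and rounding error, which illustrates that the natural parameter of the Gibbs form is $β$ and not the couple $(R, m)$ .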

Characteristic function, thermodynamic heat, and entropy. The Massieu potential is computed in terms of $β \in Ω$ as

\begin{matrix} Φ (β) & = - log (ψ (β)) = - \frac{n}{2} log (2 π) - \frac{1}{2} log (det (R)) - \frac{1}{2} m^{T} R^{- 1} m \\ = K + \frac{1}{2} log (det (β_{1})) - \frac{1}{4} β_{2}^{T} β_{1}^{- 1} β_{2}, \end{matrix}

where we defined the constant $K = - \frac{n}{2} log (π)$ . To compute the derivative, we consider the dual space $E^{*} = sym (n) \times R^{n}$ , with duality pairing

〈(ν_{1}, ν_{2}), (β_{1}, β_{2})〉 = Tr (ν_{1} β_{1}) + ν_{2} \cdot β_{2},

$(ν_{1}, ν_{2}) \in E^{*}$ , $(β_{1}, β_{2}) \in E$ . With respect to this duality pairing we have

\frac{δ Φ}{δ β_{1}} = \frac{1}{2} β_{1}^{- 1} + \frac{1}{4} β_{1}^{- 1} β_{2} {(β_{1}^{- 1} β_{2})}^{T} = R + m m^{T}, \frac{δ Φ}{δ β_{2}} = - \frac{1}{2} β_{1}^{- 1} β_{2} = m,

so we get the thermodynamic heat $Q : Ω \subset E \to Ω^{*} \subset E^{*}$ as

β = (β_{1}, β_{2}) \mapsto Q (β_{1}, β_{2}) = (\frac{1}{2} β_{1}^{- 1} + \frac{1}{4} β_{1}^{- 1} β_{2} {(β_{1}^{- 1} β_{2})}^{T}, - \frac{1}{2} β_{1}^{- 1} β_{2}) = (ν_{1}, ν_{2}) .

In terms of the covariance matrix R and the mean m, this is written as

(\frac{1}{2} R^{- 1}, - R^{- 1} m) \in Ω \subset E \mapsto (R + m m^{T}, m) \in Ω^{*} \subset E^{*} .

The entropy in terms of $β = (β_{1}, β_{2})$ and $(R, m)$ is computed by taking the Legendre transform of $Φ$ as

\begin{matrix} s (β_{1}, β_{2}) & = \frac{n}{2} (1 + log π) - \frac{1}{2} log (det (β_{1})) \\ s (R, m) & = \frac{n}{2} (1 + log (2 π)) + \frac{1}{2} log (det (R)) . \end{matrix}

Its expression $s : Ω^{*} \to R$ in terms of $(ν_{1}, ν_{2})$ is found by using

Q^{- 1} (ν_{1}, ν_{2}) = (\frac{1}{2} {(ν_{1} - ν_{2} ν_{2}^{T})}^{- 1}, - {(ν_{1} - ν_{2} ν_{2}^{T})}^{- 1} ν_{2}) .
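The formulas for $Φ$ , $Q$ , $s$ and $Q^{- 1}$ can be checked numerically in the scalar case $n = 1$ . The sketch below, written under that assumption, compares a central finite-difference gradient of $Φ$ with the closed form of $Q$ , and the Legendre transform $〈Q (β), β〉 - Φ (β)$ with the closed-form entropy $s (R, m)$ . All function names and sample values are hypothetical.

```python
import math

def massieu(b1, b2):
    """Phi(beta) = K + (1/2) log(beta1) - (1/4) beta2^2 / beta1, with K = -(1/2) log(pi)."""
    return -0.5 * math.log(math.pi) + 0.5 * math.log(b1) - 0.25 * b2 * b2 / b1

def heat(b1, b2):
    """Thermodynamic heat Q(beta) = (nu1, nu2) for n = 1."""
    return 0.5 / b1 + 0.25 * (b2 / b1) ** 2, -0.5 * b2 / b1

def heat_inverse(nu1, nu2):
    """Inverse of Q: recovers beta from (nu1, nu2)."""
    var = nu1 - nu2 * nu2
    return 0.5 / var, -nu2 / var

def entropy_legendre(b1, b2):
    """Entropy as the Legendre transform <Q(beta), beta> - Phi(beta)."""
    nu1, nu2 = heat(b1, b2)
    return nu1 * b1 + nu2 * b2 - massieu(b1, b2)

# Hypothetical variance and mean.
R, m = 0.8, 1.3
b1, b2 = 0.5 / R, -m / R

eps = 1e-6
fd_grad = ((massieu(b1 + eps, b2) - massieu(b1 - eps, b2)) / (2 * eps),
           (massieu(b1, b2 + eps) - massieu(b1, b2 - eps)) / (2 * eps))
nu1, nu2 = heat(b1, b2)
s_closed = 0.5 * (1.0 + math.log(2.0 * math.pi)) + 0.5 * math.log(R)
s_legendre = entropy_legendre(b1, b2)
print(fd_grad, (nu1, nu2), s_closed, s_legendre)
```

Here `(nu1, nu2)` should coincide with $(R + m^{2}, m)$ , in agreement with the expression of $Q$ in terms of the covariance and the mean.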

Fisher information metric. We compute the generalized heat capacity $K (β) : = - D^{2} Φ (β)$ as follows, see Section 2.1:

\begin{matrix} K (β) & = - D^{2} Φ (β_{1}, β_{2}) ((δ β_{1}, δ β_{2}), (Δ β_{1}, Δ β_{2})) \\ = - {\frac{d}{d ε}|}_{ε = 0} D Φ (β_{1} + ε Δ β_{1}, β_{2} + ε Δ β_{2}) (δ β_{1}, δ β_{2}) \\ = - {\frac{d}{d ε}|}_{ε = 0} \frac{1}{2} Tr (β_{1}^{- 1} δ β_{1}) - \frac{1}{4} Tr (β_{1}^{- 1} β_{2} β_{2}^{T} β_{1}^{- 1} δ β_{1}) + \frac{1}{2} Tr (β_{2}^{T} β_{1}^{- 1} δ β_{2}) \\ = \frac{1}{2} Tr (β_{1}^{- 1} Δ β_{1} β_{1}^{- 1} δ β_{1}) + \frac{1}{2} Tr (β_{1}^{- 1} Δ β_{1} β_{1}^{- 1} β_{2} β_{2}^{T} β_{1}^{- 1} δ β_{1}) \\ - \frac{1}{2} Tr (β_{1}^{- 1} Δ β_{2} β_{2}^{T} β_{1}^{- 1} δ β_{1}) - \frac{1}{2} Tr (β_{1}^{- 1} δ β_{2} β_{2}^{T} β_{1}^{- 1} Δ β_{1}) \\ + \frac{1}{2} Tr (Δ β_{2}^{T} β_{1}^{- 1} δ β_{2}) . \end{matrix}

From Proposition 2 this coincides with the Fisher metric. Let us verify that this is the case by rewriting these five terms in terms of the mean and covariance matrix $(m, R)$ . The above expression equals

\begin{matrix} = \frac{1}{2} Tr (Δ R R^{- 1} δ R R^{- 1}) \\ + Tr (R^{- 1} δ R R^{- 1} m m^{T} R^{- 1} Δ R) \\ - Tr (R^{- 1} δ R R^{- 1} m m^{T} R^{- 1} Δ R) + Tr (Δ m m^{T} R^{- 1} δ R R^{- 1}) \\ - Tr (R^{- 1} Δ R R^{- 1} m m^{T} R^{- 1} δ R) + Tr (δ m m^{T} R^{- 1} Δ R R^{- 1}) \\ + Tr (R^{- 1} Δ R R^{- 1} m m^{T} R^{- 1} δ R) - Tr (δ m m^{T} R^{- 1} Δ R R^{- 1}) \\ - Tr (Δ m^{T} R^{- 1} δ R R^{- 1} m) + Tr (Δ m^{T} R^{- 1} δ m) \\ = \frac{1}{2} Tr (Δ R R^{- 1} δ R R^{- 1}) + Tr (Δ m^{T} R^{- 1} δ m), \end{matrix}

which gives the Fisher metric $I (R, m)$ for multivariate Gaussian densities.
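The cancellation of terms above can also be checked numerically. The following NumPy sketch evaluates the five-term expression of $K (β)$ on tangent vectors obtained by differentiating Equation (59), and compares it with the final expression in terms of $(R, m)$ . The helper names and the sample matrices are assumptions made for illustration only.

```python
import numpy as np

def tangent_to_beta(R, m, dR, dm):
    """Derivative of Equation (59): (dR, dm) -> (d beta1, d beta2)."""
    Rinv = np.linalg.inv(R)
    return -0.5 * Rinv @ dR @ Rinv, Rinv @ dR @ Rinv @ m - Rinv @ dm

def heat_capacity(b1, b2, d1, d2, D1, D2):
    """Five-term expression of K(beta)((d1, d2), (D1, D2))."""
    P = np.linalg.inv(b1)
    return (0.5 * np.trace(P @ D1 @ P @ d1)
            + 0.5 * np.trace(P @ D1 @ P @ np.outer(b2, b2) @ P @ d1)
            - 0.5 * np.trace(P @ np.outer(D2, b2) @ P @ d1)
            - 0.5 * np.trace(P @ np.outer(d2, b2) @ P @ D1)
            + 0.5 * D2 @ P @ d2)

def fisher_gaussian(R, dR, dm, DR, Dm):
    """Fisher metric I(R, m) in covariance and mean coordinates."""
    Rinv = np.linalg.inv(R)
    return 0.5 * np.trace(DR @ Rinv @ dR @ Rinv) + Dm @ Rinv @ dm

# Hypothetical base point and symmetric perturbations (n = 2).
R = np.array([[2.0, 0.3], [0.3, 1.0]])
m = np.array([0.5, -1.0])
dR = np.array([[0.4, -0.1], [-0.1, 0.2]])
DR = np.array([[-0.3, 0.5], [0.5, 0.1]])
dm = np.array([0.7, 0.2])
Dm = np.array([-0.6, 0.9])

Rinv = np.linalg.inv(R)
b1, b2 = 0.5 * Rinv, -Rinv @ m
d1, d2 = tangent_to_beta(R, m, dR, dm)
D1, D2 = tangent_to_beta(R, m, DR, Dm)
lhs = heat_capacity(b1, b2, d1, d2, D1, D2)
rhs = fisher_gaussian(R, dR, dm, DR, Dm)
print(lhs, rhs)
```

Both numbers should agree to rounding error, since the change of variables is applied exactly rather than by finite differences.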

Equivariance with respect to the general affine group. We consider the general affine group

G A (n) = G L (n) Ⓢ R^{n}

defined as the semidirect product of the general linear group and $R^{n}$ . The group multiplication is

(A, a) (B, b) = (A B, A b + a)

and the inverse of an element is ${(A, a)}^{- 1} = (A^{- 1}, - A^{- 1} a)$ . The Lie algebra is the semidirect product Lie algebra $ga (n) = gl (n) Ⓢ R^{n}$ with Lie brackets $[(U, u), (V, v)] = (U V - V U, U v - V u)$ .

The group $G A (n)$ acts on the left on the covariance matrix and the mean $(R, m) \in {sym}^{+} (n) \times R^{n}$ as follows:

Ψ_{(A, a)} (R, m) = (A R A^{T}, A m + a) .

(60)

We consider the left action of $G A (n)$ on $R^{n}$ given by

ϕ_{(A, a)} (z) = A z + a .

We note that $J ϕ_{(A, a)} = det (A)$ , a constant function on $R^{n}$ , hence $ϕ$ satisfies the hypothesis of Proposition 5.

It is instructive to observe that the expression

Φ (Ψ_{(A, a)} (R, m)) - Φ (R, m)

is not linear in $(R, m)$ , compare with Equation (17). However, such a statement is true when it is expressed in terms of the variables $(β_{1}, β_{2})$ . We first need the expression of the action of $G A (n)$ on $(β_{1}, β_{2})$ . This is done in the next lemma.

Lemma 12.

The left action of $G A (n)$ induced on $(β_{1}, β_{2}) \in sym (n) \times R^{n}$ by the action Ψ in Equation (60) is given by

$ρ_{(A, a)} (β_{1}, β_{2}) = (A^{- T} β_{1} A^{- 1}, A^{- T} β_{2} - 2 A^{- T} β_{1} A^{- 1} a) .$

Its dual left action is

$ρ_{{(A, a)}^{- 1}}^{*} (ν_{1}, ν_{2}) = (A ν_{1} A^{T} + {[2 A ν_{2} a^{T}]}^{sym}, A ν_{2})$

Proof.

This is a direct computation using Equation (59). ☐

The situation is illustrated in the following commuting diagram.

\begin{matrix} (R, m) \in sym (n) \times R^{n} & \overset{Ψ_{(A, a)}}{⟶} & sym (n) \times R^{n} \\ ↓ & & ↓ \\ (β_{1}, β_{2}) \in sym (n) \times R^{n} & \overset{ρ_{(A, a)}}{⟶} & sym (n) \times R^{n} \end{matrix}
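The commuting diagram and the dual action of Lemma 12 can be tested numerically. The NumPy sketch below checks that applying $Ψ_{(A, a)}$ and then Equation (59) gives the same result as applying Equation (59) and then $ρ_{(A, a)}$ , and that $ρ_{{(A, a)}^{- 1}}^{*}$ is dual to $ρ_{{(A, a)}^{- 1}}$ for the pairing $Tr (ν_{1} β_{1}) + ν_{2} \cdot β_{2}$ . Function names and sample values are hypothetical, and ${[X]}^{sym}$ is taken to mean $(X + X^{T}) / 2$ .

```python
import numpy as np

def to_beta(R, m):
    """Equation (59): (R, m) -> (beta1, beta2)."""
    Rinv = np.linalg.inv(R)
    return 0.5 * Rinv, -Rinv @ m

def Psi(A, a, R, m):
    """Action (60) of GA(n) on covariance and mean."""
    return A @ R @ A.T, A @ m + a

def rho(A, a, b1, b2):
    """Induced action of Lemma 12 on (beta1, beta2)."""
    Ainv = np.linalg.inv(A)
    b1_new = Ainv.T @ b1 @ Ainv
    return b1_new, Ainv.T @ b2 - 2.0 * b1_new @ a

def rho_dual_inv(A, a, nu1, nu2):
    """Dual left action rho^*_{(A,a)^{-1}} of Lemma 12."""
    X = 2.0 * np.outer(A @ nu2, a)
    return A @ nu1 @ A.T + 0.5 * (X + X.T), A @ nu2

def pairing(nu, beta):
    """Duality pairing Tr(nu1 beta1) + nu2 . beta2."""
    return np.trace(nu[0] @ beta[0]) + nu[1] @ beta[1]

# Hypothetical sample data (n = 2); A is invertible with det(A) = 2.5.
R = np.array([[2.0, 0.3], [0.3, 1.0]])
m = np.array([0.5, -1.0])
A = np.array([[2.0, 1.0], [0.5, 1.5]])
a = np.array([1.0, 0.2])
nu = (np.array([[1.2, -0.4], [-0.4, 0.9]]), np.array([0.3, 0.8]))

# Commuting diagram: (59) after Psi equals rho after (59).
path_top = to_beta(*Psi(A, a, R, m))
path_bottom = rho(A, a, *to_beta(R, m))

# Duality: <rho^*_{g^{-1}} nu, beta> = <nu, rho_{g^{-1}} beta>, g^{-1} = (A^{-1}, -A^{-1} a).
beta = to_beta(R, m)
Ainv = np.linalg.inv(A)
lhs = pairing(rho_dual_inv(A, a, *nu), beta)
rhs = pairing(nu, rho(Ainv, -Ainv @ a, *beta))
print(np.allclose(path_top[0], path_bottom[0]), np.allclose(path_top[1], path_bottom[1]), lhs, rhs)
```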

The following result shows that the equivariant setting developed in Section 2.2 applies here with the action $ϕ_{(A, a)}$ and the representation $ρ_{(A, a)}$ (not $Ψ_{(A, a)}$ ).

Lemma 13.

The energy function $U (z) = (z z^{T}, z)$ satisfies the relation

$U (ϕ_{(A, a)} (z)) = ρ_{{(A, a)}^{- 1}}^{*} (U (z)) + θ (A, a)$ (61)

for the group one-cocycle $θ : G A (n) \to sym (n) \times R^{n}$ given by

$θ (A, a) = (a a^{T}, a) .$

The Massieu potential, the thermodynamic heat, and the entropy satisfy the equivariance properties

$\begin{matrix} Φ (ρ_{(A, a)} (β_{1}, β_{2})) - Φ (β_{1}, β_{2}) = - log (det (A)) - 〈θ ({(A, a)}^{- 1}), (β_{1}, β_{2})〉 \\ Q (ρ_{(A, a)} (β_{1}, β_{2})) = ρ_{{(A, a)}^{- 1}}^{*} (Q (β_{1}, β_{2})) + θ (A, a) \\ s (ρ_{{(A, a)}^{- 1}}^{*} (ν_{1}, ν_{2}) + θ (A, a)) = s (ν_{1}, ν_{2}) + log (det (A)) . \end{matrix}$

Proof.

To prove Equation (61) we note that

$\begin{matrix} U (ϕ_{(A, a)} (z)) - ρ_{{(A, a)}^{- 1}}^{*} (U (z)) = U (A z + a) - ρ_{{(A, a)}^{- 1}}^{*} (z z^{T}, z) \\ = ((A z + a) {(A z + a)}^{T}, A z + a) - (A z z^{T} A^{T} + {[2 A z a^{T}]}^{sym}, A z) = (a a^{T}, a) . \end{matrix}$

The other results follow from Proposition 5 and from $J ϕ_{(A, a)} = det (A)$ . Alternatively, we can compute explicitly

$\begin{matrix} Φ (ρ_{(A, a)} (β_{1}, β_{2})) = Φ (A^{- T} β_{1} A^{- 1}, A^{- T} β_{2} - 2 A^{- T} β_{1} A^{- 1} a) \\ = K + \frac{1}{2} log (det (A^{- T} β_{1} A^{- 1})) \\ - \frac{1}{4} Tr ({(A^{- T} β_{1} A^{- 1})}^{- 1} (A^{- T} β_{2} - 2 A^{- T} β_{1} A^{- 1} a) {(A^{- T} β_{2} - 2 A^{- T} β_{1} A^{- 1} a)}^{T}) \\ = K + \frac{1}{2} log (det {(A)}^{- 2} det (β_{1})) \\ - \frac{1}{4} Tr (β_{1}^{- 1} (β_{2} - 2 β_{1} A^{- 1} a) (β_{2}^{T} - 2 a^{T} A^{- T} β_{1})) \\ = K - log (det (A)) + \frac{1}{2} log (det (β_{1})) \\ - \frac{1}{4} Tr (β_{1}^{- 1} β_{2} β_{2}^{T}) + \frac{1}{2} Tr (β_{2} a^{T} A^{- T}) + \frac{1}{2} Tr (A^{- 1} a β_{2}^{T}) - Tr (A^{- 1} a a^{T} A^{- T} β_{1}) \\ = Φ (β_{1}, β_{2}) - log (det (A)) + 〈(- A^{- 1} a a^{T} A^{- T}, A^{- 1} a), (β_{1}, β_{2})〉 \end{matrix}$

which shows the result since ${(A, a)}^{- 1} = (A^{- 1}, - A^{- 1} a)$ gives $θ ({(A, a)}^{- 1}) = (A^{- 1} a a^{T} A^{- T}, - A^{- 1} a)$ , so that $(- A^{- 1} a a^{T} A^{- T}, A^{- 1} a) = - θ ({(A, a)}^{- 1})$ . ☐
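The relation (61) and the equivariance of the thermodynamic heat and of the entropy stated in Lemma 13 can be verified numerically. The NumPy sketch below does so for hypothetical sample data with $det (A) > 0$ . It also checks the one-cocycle identity $θ (g h) = θ (g) + ρ_{g^{- 1}}^{*} θ (h)$ , which is assumed here to be the form of the cocycle property that follows from applying Equation (61) to a product of group elements. All names are illustrative.

```python
import numpy as np

def U(z):
    """Energy function U(z) = (z z^T, z)."""
    return np.outer(z, z), z

def theta(A, a):
    """Group one-cocycle theta(A, a) = (a a^T, a)."""
    return np.outer(a, a), a

def rho_dual_inv(A, a, nu1, nu2):
    """Dual left action rho^*_{(A,a)^{-1}}, with [X]^sym = (X + X^T) / 2."""
    X = 2.0 * np.outer(A @ nu2, a)
    return A @ nu1 @ A.T + 0.5 * (X + X.T), A @ nu2

def heat_of(R, m):
    """Thermodynamic heat written in terms of covariance and mean: (R + m m^T, m)."""
    return R + np.outer(m, m), m

def entropy_of(R):
    """Entropy s(R, m) of the Gaussian density (independent of m)."""
    n = R.shape[0]
    return 0.5 * n * (1.0 + np.log(2.0 * np.pi)) + 0.5 * np.log(np.linalg.det(R))

def add(p, q):
    return p[0] + q[0], p[1] + q[1]

def close(p, q):
    return bool(np.allclose(p[0], q[0]) and np.allclose(p[1], q[1]))

# Hypothetical sample data (n = 2), both matrices with positive determinant.
A, a = np.array([[2.0, 1.0], [0.5, 1.5]]), np.array([1.0, 0.2])
B, b = np.array([[1.0, -0.3], [0.4, 2.0]]), np.array([-0.5, 0.7])
z = np.array([0.3, -1.1])
R, m = np.array([[2.0, 0.3], [0.3, 1.0]]), np.array([0.5, -1.0])

# Equation (61): U(phi_g(z)) = rho^*_{g^{-1}}(U(z)) + theta(g).
eq61_ok = close(U(A @ z + a), add(rho_dual_inv(A, a, *U(z)), theta(A, a)))
# Cocycle identity for the product (A, a)(B, b) = (A B, A b + a).
cocycle_ok = close(theta(A @ B, A @ b + a),
                   add(theta(A, a), rho_dual_inv(A, a, *theta(B, b))))
# Equivariance of Q, using that rho corresponds to Psi in (R, m) coordinates.
heat_ok = close(heat_of(A @ R @ A.T, A @ m + a),
                add(rho_dual_inv(A, a, *heat_of(R, m)), theta(A, a)))
# Equivariance of the entropy.
entropy_ok = bool(np.isclose(entropy_of(A @ R @ A.T),
                             entropy_of(R) + np.log(np.linalg.det(A))))
print(eq61_ok, cocycle_ok, heat_ok, entropy_ok)
```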

The identity relating the Fisher information metric, the cocycle, and the thermodynamic heat follows from the general Equation (56) as

I (β) (ξ_{E} (β), ζ_{E} (β)) = - 〈Θ (ξ), ζ_{E} (β)〉 + 〈ξ_{E^{*}} (Q (β)), ζ_{E} (β)〉,

where $(β_{1}, β_{2}) \in {sym}^{+} (n) \times R^{n}$ , $ξ = (ξ_{1}, ξ_{2}), ζ = (ζ_{1}, ζ_{2}) \in ga (n)$ , $Θ (ξ_{1}, ξ_{2}) = (0, ξ_{2})$ and the infinitesimal generators are

\begin{matrix} ξ_{E} (β) & = (- ξ_{1}^{T} β_{1} - β_{1} ξ_{1}, - ξ_{1}^{T} β_{2} - 2 β_{1} ξ_{2}) \\ ξ_{E^{*}} (ν) & = (- ξ_{1} ν_{1} - ν_{1} ξ_{1}^{T} - 2 {[ν_{2} ξ_{2}^{T}]}^{sym}, - ξ_{1} ν_{2}) . \end{matrix}
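Identity (56) can be checked numerically for the Gaussian family with the generators above. In the following NumPy sketch, the Fisher metric is evaluated as $- 〈D Q (β) \cdot ξ_{E} (β), ζ_{E} (β)〉$ , with $D Q$ approximated by central differences, and compared with the right-hand side built from $Θ (ξ) = (0, ξ_{2})$ and $ξ_{E^{*}} (Q (β))$ . The function names, the step size, and the sample values are assumptions.

```python
import numpy as np

def heat(b1, b2):
    """Thermodynamic heat Q(beta1, beta2)."""
    w = np.linalg.inv(b1) @ b2
    return 0.5 * np.linalg.inv(b1) + 0.25 * np.outer(w, w), -0.5 * w

def gen_E(xi, b1, b2):
    """Infinitesimal generator xi_E(beta) of the representation rho."""
    x1, x2 = xi
    return -x1.T @ b1 - b1 @ x1, -x1.T @ b2 - 2.0 * b1 @ x2

def gen_Estar(xi, nu1, nu2):
    """Infinitesimal generator xi_{E^*}(nu); 2 [nu2 xi2^T]^sym = S + S^T."""
    x1, x2 = xi
    S = np.outer(nu2, x2)
    return -x1 @ nu1 - nu1 @ x1.T - (S + S.T), -x1 @ nu2

def pairing(nu, beta):
    """Duality pairing Tr(nu1 beta1) + nu2 . beta2."""
    return np.trace(nu[0] @ beta[0]) + nu[1] @ beta[1]

def fisher(b1, b2, x, y, eps=1e-6):
    """I(beta)(x, y) = -<DQ(beta) . x, y>, with DQ by central differences."""
    qp = heat(b1 + eps * x[0], b2 + eps * x[1])
    qm = heat(b1 - eps * x[0], b2 - eps * x[1])
    dQ = ((qp[0] - qm[0]) / (2.0 * eps), (qp[1] - qm[1]) / (2.0 * eps))
    return -pairing(dQ, y)

# Hypothetical base point in sym^+(2) x R^2 and Lie algebra elements in ga(2).
b1 = np.array([[1.0, 0.2], [0.2, 0.7]])
b2 = np.array([0.3, -0.5])
xi = (np.array([[0.1, -0.4], [0.3, 0.2]]), np.array([0.5, 0.1]))
zeta = (np.array([[-0.2, 0.6], [0.1, 0.4]]), np.array([-0.3, 0.8]))

xE, zE = gen_E(xi, b1, b2), gen_E(zeta, b1, b2)
Theta_xi = (np.zeros((2, 2)), xi[1])
lhs = fisher(b1, b2, xE, zE)
rhs = -pairing(Theta_xi, zE) + pairing(gen_Estar(xi, *heat(b1, b2)), zE)
print(lhs, rhs)
```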

Geodesics on multivariate Gaussian densities and Noether theorem. Let us consider the Lagrangian $L : T Ω = Ω \times E \to R$ given by the kinetic energy of the Fisher metric

L (R, \dot{R}, m, \dot{m}) = \frac{1}{2} Tr ({(R^{- 1} \dot{R})}^{2}) + {\dot{m}}^{T} R^{- 1} \dot{m} .

(62)

The associated Euler-Lagrange equations are

\{\begin{matrix} \ddot{R} + \dot{m} {\dot{m}}^{T} - \dot{R} R^{- 1} \dot{R} = 0 \\ \ddot{m} - \dot{R} R^{- 1} \dot{m} = 0 . \end{matrix}

(63)

In accordance with Proposition 5, see Equation (20), the Fisher metric is invariant with respect to the action of $G A (n)$ on $(R, m) \in Ω$ given in Equation (60). As a consequence, the Lagrangian is invariant under the tangent lifted action of $G A (n)$ on $T Ω$ given by

Ψ_{(A, a)}^{T} (R, \dot{R}, m, \dot{m}) = (A R A^{T}, A \dot{R} A^{T}, A m + a, A \dot{m}) .

From Noether theorem, the corresponding momentum map is conserved. The momentum map $J^{L} : T Ω \to ga {(n)}^{*}$ associated with this Lagrangian and this action is given by

J^{L} (R, \dot{R}, m, \dot{m}) = J (R, \frac{\partial L}{\partial \dot{R}}, m, \frac{\partial L}{\partial \dot{m}}) = J (R, R^{- 1} \dot{R} R^{- 1}, m, 2 R^{- 1} \dot{m})

with $J : T^{*} Ω \to ga {(n)}^{*}$ the momentum map of the cotangent lifted action of $G A (n)$ relative to the canonical symplectic form, see Equation (38). Using the expression of the infinitesimal generator of $Ψ$ given by

{(U, u)}_{Ω} (R, m) = (R, U R + R U^{T}, m, U m + u),

for $(U, u) \in ga (n)$ , we get $J (R, m, p_{R}, p_{m}) = (2 p_{R} R + p_{m} m^{T}, p_{m})$ , so that

J^{L} (R, \dot{R}, m, \dot{m}) = (2 R^{- 1} \dot{R} + 2 R^{- 1} \dot{m} m^{T}, 2 R^{- 1} \dot{m}) .

From Noether theorem, we have the conservation laws

\{\begin{matrix} R^{- 1} \dot{R} + R^{- 1} \dot{m} m^{T} = const \\ R^{- 1} \dot{m} = const . \end{matrix}

We also refer to [55,56].
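The conservation laws can be observed numerically. The sketch below integrates the geodesic Equations (63) in the scalar case $n = 1$ with a classical fourth-order Runge-Kutta scheme, which is a generic integrator chosen here for simplicity and not a structure-preserving method, and monitors the two Noether quantities. Step size, initial data, and function names are hypothetical.

```python
def rhs(state):
    """First-order form of Equations (63) for n = 1; state = (R, m, Rdot, mdot)."""
    R, m, Rd, md = state
    return (Rd, md, Rd * Rd / R - md * md, Rd * md / R)

def rk4_step(state, dt):
    """One classical Runge-Kutta step of size dt."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, 0.5 * dt))
    k3 = rhs(shifted(k2, 0.5 * dt))
    k4 = rhs(shifted(k3, dt))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def noether(state):
    """Conserved momenta for n = 1: (Rdot + mdot * m) / R and mdot / R."""
    R, m, Rd, md = state
    return ((Rd + md * m) / R, md / R)

# Hypothetical initial data: R = 1, m = 0, Rdot = 0.3, mdot = 0.5.
state = (1.0, 0.0, 0.3, 0.5)
c_start = noether(state)
for _ in range(2000):
    state = rk4_step(state, 1e-3)
c_end = noether(state)
print(c_start, c_end)
```

Over this short time interval, both momenta should remain constant up to the truncation error of the scheme.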

3.2. Unitary Representations and Quantum Fisher Metric

In this paragraph, we highlight the strong analogies between the equivariant setting considered in this paper, and techniques in quantum information geometry, as developed in [25], see also [28]. In particular, when this setting is considered in the quantum context, the Fisher metric, as defined from the derivative of the generalized heat capacity, coincides with the Bogoliubov-Kubo-Mori metric. We also illustrate how the general equations with Casimir dissipation/production considered above reproduce the dissipative model proposed in [25].

In [25], information geometry was studied for some Lie algebras: for certain unitary representations, the statistical manifold of states was defined as the convex cone on which the partition function is finite, with reference to the Bogoliubov-Kubo-Mori metric. Please note that only the case with zero cohomology, for the Lie algebras $g = so (3)$ and $g = sl (2, R)$ , was studied.

Let G be a Lie group, acting on a complex Hilbert space by a unitary left representation, $U_{g} : H \to H$ . We denote by $β_{H}$ the associated infinitesimal generator, giving the Lie algebra representation, and consider the self-adjoint operator $i β_{H}$ . We assume $dim H < \infty$ . The following class of density matrices is considered

ρ_{β} = \frac{1}{ψ (β)} exp (- i β_{H}),

(64)

for $β \in g$ , with partition function $ψ (β) = Tr (exp (- i β_{H}))$ . We adopted in Equation (64) a general form for the class of density matrices, which includes the class considered in [25] and references therein. Please note that Equation (64) is a model of faithful quantum states. Note also that the map $β \mapsto ρ_{β}$ is not necessarily injective in general. We do not assume injectivity for the development below, but it is required in quantum information geometry.

As in Section 2.1, we adopt the following definitions

Φ (β) = - log (ψ (β)), Q (β) = D Φ (β), K (β) = - D^{2} Φ (β)

corresponding to the Massieu potential, the thermodynamic heat, and the generalized heat capacity. We note that

〈Q (β), δ β〉 = Tr (ρ_{β} i δ β_{H}) = {〈i δ β_{H}〉}_{ρ_{β}},

for all $δ β \in g$ , which gives the expectation value of the observable $i δ β_{H}$ in the quantum state $ρ_{β}$ . This result is analogous to Equation (5) in the classical case. The generalized heat capacity is computed as

\begin{matrix} K (β) (δ β_{1}, δ β_{2}) & = - D^{2} Φ (β) (δ β_{1}, δ β_{2}) \\ = Tr (ρ_{β} i {(δ β_{1})}_{H} i {(δ β_{2})}_{H}) - Tr (ρ_{β} i {(δ β_{1})}_{H}) Tr (ρ_{β} i {(δ β_{2})}_{H}), \end{matrix}

thereby giving the covariance of the observables $i {(δ β_{1})}_{H}$ and $i {(δ β_{2})}_{H}$ in the quantum state $ρ_{β}$ . In [25], K is called the Bogoliubov-Kubo-Mori metric and chosen as the quantum version to the Fisher metric. Such a choice is geometrically natural in view of the result of Proposition 2 which identifies K with the Fisher metric in the classical case.

The von Neumann entropy of the density matrix can be expressed in terms of $Φ$ and Q as $- Tr (ρ_{β} log ρ_{β}) = Tr (ρ_{β} i β_{H}) + log ψ (β) = 〈Q (β), β〉 - Φ (β) = s (ν)$ , for $ν = Q (β) = {〈i {(\cdot)}_{H}〉}_{β} \in g^{*}$ . This is analogous to the result of Lemma 1 giving the entropy as the Legendre transform of $Φ (β)$ , thus giving a quantum version of the Clairaut equation. See also [57] for a link with the Fisher metric.

Using ${({Ad}_{g} β)}_{H} = U_{g} β_{H} U_{g^{- 1}}$ , we have the following equivariance properties, which are obtained as in Proposition 5,

\begin{matrix} ψ ({Ad}_{g} β) & = ψ (β) \\ Φ ({Ad}_{g} β) & = Φ (β) \\ ρ_{{Ad}_{g} β} & = U_{g} \circ ρ_{β} \circ U_{g^{- 1}} \\ Q ({Ad}_{g} β) & = {Ad}_{g^{- 1}}^{*} (Q (β)) \\ s ({Ad}_{g^{- 1}}^{*} ν) & = s (ν) \\ K ({Ad}_{g} β) ({Ad}_{g} δ β_{1}, {Ad}_{g} δ β_{2}) & = K (β) (δ β_{1}, δ β_{2}), \end{matrix}

for every $g \in G$ . In particular, $ψ$ and $Φ$ are constant on adjoint orbits and s is a Casimir for the Lie-Poisson bracket on $g^{*}$

\{f, g\} (μ) = 〈μ, [\frac{δ f}{δ μ}, \frac{δ g}{δ μ}]〉,

(65)

where we identify $g^{*}$ with $g$ using the duality pairing $〈ν, β〉 = Tr (ν^{*} β)$ and view $g$ as a Lie subalgebra of $u (H)$ . Please note that with this pairing, we have ${Ad}_{g^{- 1}}^{*} = {Ad}_{g}$ and ${ad}_{β}^{*} μ = [μ, β]$ , so adjoint and coadjoint orbits are identified, and the Kirillov-Kostant-Souriau symplectic form on coadjoint orbits becomes

ω_{O} (μ) ({ad}_{ξ} μ, {ad}_{η} μ) = 〈μ, [ξ, η]〉 .

Equation (56) and ${ad}_{ξ}^{*} μ = [μ, ξ]$ give here the following expression of the Bogoliubov-Kubo-Mori metric on (co)adjoint orbits

K (μ) ({ad}_{ξ} μ, {ad}_{η} μ) = 〈{ad}_{ξ}^{*} Q (μ), {ad}_{η} μ〉 = 〈Q (μ), [ξ, [η, μ]]〉 = 〈μ, [[Q (μ), ξ], η]〉 .

Casimir dissipation/production. The general equations for Casimir dissipation/production Equation (40) applied here with $g \subset u (H)$ , $g^{*} = g$ , and $γ (ν, β) = 〈ν, β〉 = Tr (ν^{*} β)$ , become

\frac{d}{d t} f = \{f, h\} - Λ 〈[\frac{δ f}{δ μ}, \frac{δ k}{δ μ}], [\frac{δ s}{δ μ}, \frac{δ k}{δ μ}]〉

(66)

for every f, with $\{f, g\}$ the Lie-Poisson bracket Equation (65). Since ${ad}_{ξ}^{*} μ = [μ, ξ]$ , Equation (41) yields

\frac{d}{d t} μ + [μ, \frac{δ h}{δ μ}] = Λ [[\frac{δ s}{δ μ}, \frac{δ k}{δ μ}], \frac{δ k}{δ μ}] .

(67)

Such equations were proposed in [25] with $h (μ) = 〈i H, μ〉$ , $k (μ) = 〈i T, μ〉$ , $s (μ) = \frac{1}{2} 〈μ, μ〉$ , with T and H two commuting self-adjoint operators, $[T, H] = 0$ , thereby yielding the system

\frac{d}{d t} μ + i [μ, H] = - Λ [[μ, T], T],

(68)

with energy conservation and entropy production ( $Λ < 0$ )

\frac{d}{d t} h = 0 and \frac{d}{d t} s = - Λ ∥ [μ, T] ∥^{2} .

3.3. Souriau Symplectic Model for $S E (2)$ , Lie-Poisson Equations with Cocycle, and Casimir Dissipation

In this paragraph, we illustrate many aspects of the geometric setting by considering the special Euclidean group of the plane, as it allows explicit and relatively easy computations while having a nonequivariant momentum map. We present the Lie-Poisson equations with cocycle (affine Lie-Poisson equations) with Casimir dissipation/production associated with the entropy of the Souriau symplectic model.

Momentum map and cocycle. Consider the special Euclidean group of the plane $S E (2) = S O (2) Ⓢ R^{2}$ with semidirect product group multiplication

(R_{φ}, a) (R_{ψ}, b) = (R_{φ} R_{ψ}, R_{φ} b + a),

where $R_{φ}$ is a rotation of angle $φ$ . It acts on the plane $R^{2}$ as

ϕ_{(φ, a)} (x) = R_{φ} x + a

(69)

with infinitesimal generator

{(λ, u)}_{R^{2}} (x) = - λ J x + u

for $(λ, u) \in se (2) = so (2) Ⓢ R^{2}$ , where we identify $so (2)$ with $R$ and with

J = [\begin{matrix} 0 & 1 \\ - 1 & 0 \end{matrix}] .

We consider on $R^{2}$ the symplectic form $ω (x, y) = x \cdot J y$ . It is easy to see that the action (69) is symplectic and admits the momentum map

J (x) = (- \frac{1}{2} {| x |}^{2}, J x) .

This momentum map is not equivariant, with nonequivariance cocycle given by

θ (R_{φ}, a) = (- \frac{1}{2} {| a |}^{2}, J a) .

(70)

Gibbs densities, entropy, and Fisher metric. The generalized Gibbs probability densities are here given on $M = R^{2}$ by

p_{β} (x) = \frac{1}{ψ (β)} e^{- 〈J (x), β〉} = \frac{1}{ψ (β)} e^{\frac{1}{2} λ {| x |}^{2} - u \cdot J x},

where $β = (λ, u) \in Ω \subset se (2)$ , with $Ω = (- \infty, 0) \times R^{2}$ and the partition function and Massieu potential are computed to be

ψ (β) = - \frac{2 π}{λ} e^{- \frac{1}{2 λ} {| u |}^{2}}, Φ (β) = - log (2 π) + log (- λ) + \frac{1}{2 λ} {| u |}^{2}, β = (λ, u) \in Ω .

From this, we get the thermodynamic heat $Q : Ω \subset se (2) \to Ω^{*} \subset se {(2)}^{*}$ as

Q (λ, u) = D Φ (λ, u) = (\frac{1}{λ} - \frac{{| u |}^{2}}{2 λ^{2}}, \frac{1}{λ} u)

and we note that $Ω^{*} = \{(μ, m) \in se {(2)}^{*} ∣ μ + \frac{{| m |}^{2}}{2} < 0\}$ . The entropy $s : Ω^{*} \to R$ is obtained as the Legendre transform of $Φ : Ω \to R$ as

s (μ, m) = 1 + log (2 π) + log (- μ - \frac{{| m |}^{2}}{2}) .

(71)

We note the relation

\frac{δ s}{δ m} = \frac{δ s}{δ μ} m, \frac{δ s}{δ μ} = {(μ + \frac{{| m |}^{2}}{2})}^{- 1},

between the partial derivatives of s. From Proposition 2, the Fisher metric is found as

\begin{matrix} I (β) (δ β_{1}, δ β_{2}) & = - D^{2} Φ (β) (δ β_{1}, δ β_{2}) \\ = \frac{1}{λ^{2}} (1 - \frac{{| u |}^{2}}{λ}) δ λ_{1} δ λ_{2} + \frac{1}{λ^{2}} (u \cdot δ u_{1} δ λ_{2} + u \cdot δ u_{2} δ λ_{1}) - \frac{1}{λ} δ u_{1} \cdot δ u_{2}, \end{matrix}

for every $β = (λ, u) \in Ω \subset se (2)$ , i.e.,

I (β) = \frac{1}{λ^{2}} [\begin{matrix} 1 - \frac{{| u |}^{2}}{λ} & u^{T} \\ u & - λ I_{2} \end{matrix}] .
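The expressions of $Q$ , $s$ and of the Fisher metric for $S E (2)$ can be cross-checked with finite differences of the Massieu potential. The plain Python sketch below uses the coefficients of the bilinear form $I (β) (δ β_{1}, δ β_{2})$ given above, so the $δ u_{1} \cdot δ u_{2}$ block equals $- \frac{1}{λ}$ times the identity. Function names, step size, and the sample point $β = (λ, u)$ are assumptions.

```python
import math

def massieu(lam, u1, u2):
    """Phi(beta) = -log(2 pi) + log(-lam) + |u|^2 / (2 lam), for lam < 0."""
    return -math.log(2.0 * math.pi) + math.log(-lam) + (u1 * u1 + u2 * u2) / (2.0 * lam)

def heat(lam, u1, u2):
    """Thermodynamic heat Q(lam, u) = (mu, m)."""
    n2 = u1 * u1 + u2 * u2
    return (1.0 / lam - n2 / (2.0 * lam * lam), u1 / lam, u2 / lam)

def entropy(mu, m1, m2):
    """Entropy (71) on Omega^*."""
    return 1.0 + math.log(2.0 * math.pi) + math.log(-mu - 0.5 * (m1 * m1 + m2 * m2))

def fisher(lam, u1, u2):
    """Matrix of the Fisher metric in the coordinates (lam, u1, u2)."""
    n2 = u1 * u1 + u2 * u2
    return [[(1.0 - n2 / lam) / lam ** 2, u1 / lam ** 2, u2 / lam ** 2],
            [u1 / lam ** 2, -1.0 / lam, 0.0],
            [u2 / lam ** 2, 0.0, -1.0 / lam]]

def hessian_fd(f, x, eps=1e-4):
    """Hessian of f at x by central differences."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def at(si, sj):
                y = list(x)
                y[i] += si * eps
                y[j] += sj * eps
                return f(*y)
            H[i][j] = (at(1, 1) - at(1, -1) - at(-1, 1) + at(-1, -1)) / (4.0 * eps * eps)
    return H

# Hypothetical point of Omega = (-inf, 0) x R^2.
beta = (-1.5, 0.4, -0.7)
q = heat(*beta)
legendre = sum(qi * bi for qi, bi in zip(q, beta)) - massieu(*beta)
H = hessian_fd(massieu, beta)
I = fisher(*beta)
max_err = max(abs(I[i][j] + H[i][j]) for i in range(3) for j in range(3))
print(legendre, entropy(*q), max_err)
```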

Affine Lie-Poisson equations and Casimir dissipation. The affine coadjoint action associated with Equation (70) is found as

{Ad}_{{(φ, a)}^{- 1}}^{*} (μ, m) + θ (φ, a) = (μ - R_{φ} m \cdot J a - \frac{1}{2} {| a |}^{2}, R_{φ} m + J a),

from which we directly observe that the entropy Equation (71) is constant on affine coadjoint orbits $O^{Θ} \subset se {(2)}^{*}$ and hence is a Casimir of the Lie-Poisson bracket with cocycle on $se {(2)}^{*}$ .
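The invariance of the entropy under the affine coadjoint action can be illustrated directly. The following plain Python sketch applies the action for hypothetical group elements and a hypothetical point $(μ, m) \in Ω^{*}$ , checks that the entropy (71) is unchanged, and also checks compatibility with the group multiplication of $S E (2)$ . All names and values are illustrative.

```python
import math

def J(v):
    """Matrix J = [[0, 1], [-1, 0]] applied to a vector of R^2."""
    return (v[1], -v[0])

def rot(phi, v):
    """Rotation of angle phi applied to v."""
    c, s = math.cos(phi), math.sin(phi)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def affine_coadjoint(phi, a, mu, m):
    """Affine coadjoint action Ad^*_{(phi,a)^{-1}}(mu, m) + theta(phi, a)."""
    Rm, Ja = rot(phi, m), J(a)
    mu_new = mu - (Rm[0] * Ja[0] + Rm[1] * Ja[1]) - 0.5 * (a[0] ** 2 + a[1] ** 2)
    return mu_new, (Rm[0] + Ja[0], Rm[1] + Ja[1])

def entropy(mu, m):
    """Entropy (71)."""
    return 1.0 + math.log(2.0 * math.pi) + math.log(-mu - 0.5 * (m[0] ** 2 + m[1] ** 2))

# Hypothetical point of Omega^* and group elements g = (phi, a), h = (psi, b).
mu, m = -1.0, (0.3, -0.4)
phi, a = 0.7, (1.2, -0.5)
psi_angle, b = -1.1, (0.4, 0.9)

s_before = entropy(mu, m)
s_after = entropy(*affine_coadjoint(phi, a, mu, m))

# Action property: acting with h and then g equals acting with g h = (phi + psi, R_phi b + a).
step = affine_coadjoint(phi, a, *affine_coadjoint(psi_angle, b, mu, m))
Rb = rot(phi, b)
direct = affine_coadjoint(phi + psi_angle, (Rb[0] + a[0], Rb[1] + a[1]), mu, m)
print(s_before, s_after, step, direct)
```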

Using the expression $Θ ((λ, u), (γ, v)) = - ω (u, v)$ of the associated two cocycle, we get the Lie-Poisson bracket with cocycle

\begin{matrix} \{f, g\}_{Θ} (μ, m) & = 〈(μ, m), [(\frac{δ f}{δ μ}, \frac{δ f}{δ m}), (\frac{δ g}{δ μ}, \frac{δ g}{δ m})]〉 - Θ ((\frac{δ f}{δ μ}, \frac{δ f}{δ m}), (\frac{δ g}{δ μ}, \frac{δ g}{δ m})) \\ = \frac{δ g}{δ μ} m \cdot J \frac{δ f}{δ m} - \frac{δ f}{δ μ} m \cdot J \frac{δ g}{δ m} + \frac{δ f}{δ m} \cdot J \frac{δ g}{δ m} \\ = \frac{δ g}{δ μ} ω (m, \frac{δ f}{δ m}) - \frac{δ f}{δ μ} ω (m, \frac{δ g}{δ m}) + ω (\frac{δ f}{δ m}, \frac{δ g}{δ m}) . \end{matrix}

Given a Hamiltonian $h : se {(2)}^{*} \to R$ , one gets the Lie-Poisson equations with cocycle as the following system of ODEs

\dot{f} = \{f, h\}_{Θ} ⟺ \{\begin{matrix} \frac{d}{d t} μ + m \cdot J \frac{δ h}{δ m} = 0 \\ \frac{d}{d t} m + \frac{δ h}{δ μ} J m = J \frac{δ h}{δ m} . \end{matrix}

(72)

These equations determine Hamiltonian dynamics on affine coadjoint orbits that are the level sets of the entropy. From the point of view of thermodynamics, motion remaining on these surfaces is non-dissipative, whereas motion transversal to these surfaces is dissipative. We apply below the geometric approach to include dissipation and hence, motion through affine coadjoint orbits, as considered in general in Section 2.3.3.

Given the entropy Equation (71) and a function $k : se {(2)}^{*} \to R$ which commutes with the Hamiltonian, i.e.,

\frac{δ k}{δ μ} J \frac{δ h}{δ m} = \frac{δ h}{δ μ} J \frac{δ k}{δ m} ⟺ \frac{δ k}{δ μ} \frac{δ h}{δ m} = \frac{δ h}{δ μ} \frac{δ k}{δ m}

(for instance $k = h$ ), the Casimir dissipation/production Equation (40) gives here

\begin{matrix} \frac{d}{d t} f & = \{f, h\}_{Θ} - Λ (\frac{δ k}{δ μ} J \frac{δ f}{δ m} - \frac{δ f}{δ μ} J \frac{δ k}{δ m}) \cdot (\frac{δ k}{δ μ} J \frac{δ s}{δ m} - \frac{δ s}{δ μ} J \frac{δ k}{δ m}) \\ = \{f, h\}_{Θ} - Λ (\frac{δ k}{δ μ} \frac{δ f}{δ m} - \frac{δ f}{δ μ} \frac{δ k}{δ m}) \cdot (\frac{δ k}{δ μ} \frac{δ s}{δ m} - \frac{δ s}{δ μ} \frac{δ k}{δ m}) \end{matrix}

for every f. Therefore, we obtain the equations

\{\begin{matrix} \frac{d}{d t} μ + m \cdot J \frac{δ h}{δ m} = Λ \frac{δ s}{δ μ} (\frac{δ k}{δ μ} m - \frac{δ k}{δ m}) \cdot \frac{δ k}{δ m} \\ \frac{d}{d t} m + \frac{δ h}{δ μ} J m = J \frac{δ h}{δ m} - Λ \frac{δ s}{δ μ} (\frac{δ k}{δ μ} m - \frac{δ k}{δ m}) \frac{δ k}{δ μ}, \end{matrix}

(73)

which have the property of preserving the Hamiltonian while dissipating/producing entropy as

\begin{matrix} \frac{d}{d t} s & = - Λ | \frac{δ k}{δ μ} \frac{δ s}{δ m} - \frac{δ s}{δ μ} \frac{δ k}{δ m} |^{2} = - Λ {\frac{δ s}{δ μ}}^{2} | \frac{δ k}{δ μ} m - \frac{δ k}{δ m} |^{2} \\ = - \frac{Λ}{{(μ + \frac{{| m |}^{2}}{2})}^{2}} {| \frac{δ k}{δ μ} m - \frac{δ k}{δ m} |}^{2} . \end{matrix}

They are the $S E (2)$ version of Equation (67) proposed in the quantum context.
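The behaviour of Equations (72) and (73) can be explored numerically. The sketch below integrates Equation (73) with $k = h$ for a hypothetical quadratic Hamiltonian, using a classical Runge-Kutta scheme, a generic integrator and not a structure-preserving one. For $Λ = 0$ the system reduces to Equation (72) and the entropy (71) should stay constant; for $Λ < 0$ the energy should still be conserved while the entropy increases. The Hamiltonian, the initial data, and all names are assumptions made for illustration.

```python
import math

def J(v):
    """Matrix J = [[0, 1], [-1, 0]] applied to a vector of R^2."""
    return (v[1], -v[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def hamiltonian(mu, m):
    """Hypothetical Hamiltonian h = mu^2 / 2 + (m1^2 + 2 m2^2) / 2."""
    return 0.5 * mu * mu + 0.5 * (m[0] ** 2 + 2.0 * m[1] ** 2)

def grad_h(mu, m):
    """Partial derivatives (dh/dmu, dh/dm) of the Hamiltonian above."""
    return mu, (m[0], 2.0 * m[1])

def entropy(mu, m):
    """Entropy (71)."""
    return 1.0 + math.log(2.0 * math.pi) + math.log(-mu - 0.5 * dot(m, m))

def rhs(state, lam):
    """Right-hand side of Equation (73) with k = h; lam = 0 gives Equation (72)."""
    mu, m = state[0], (state[1], state[2])
    h_mu, h_m = grad_h(mu, m)
    k_mu, k_m = h_mu, h_m
    s_mu = 1.0 / (mu + 0.5 * dot(m, m))
    w = (k_mu * m[0] - k_m[0], k_mu * m[1] - k_m[1])
    Jm, Jhm = J(m), J(h_m)
    dmu = -dot(m, Jhm) + lam * s_mu * dot(w, k_m)
    dm = [-h_mu * Jm[i] + Jhm[i] - lam * s_mu * w[i] * k_mu for i in range(2)]
    return (dmu, dm[0], dm[1])

def rk4_step(state, dt, lam):
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = rhs(state, lam)
    k2 = rhs(shifted(k1, 0.5 * dt), lam)
    k3 = rhs(shifted(k2, 0.5 * dt), lam)
    k4 = rhs(shifted(k3, dt), lam)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def run(lam, steps=2000, dt=1e-3):
    """Return the energy and entropy histories along the numerical solution."""
    state = (-2.0, 0.5, 0.3)  # hypothetical initial condition in Omega^*
    energies, entropies = [], []
    for _ in range(steps + 1):
        m = (state[1], state[2])
        energies.append(hamiltonian(state[0], m))
        entropies.append(entropy(state[0], m))
        state = rk4_step(state, dt, lam)
    return energies, entropies

E0, S0 = run(0.0)      # Lie-Poisson equations with cocycle, Equation (72)
E1, S1 = run(-0.2)     # Casimir production, Equation (73) with Lambda < 0
print(E0[-1] - E0[0], S0[-1] - S0[0], E1[-1] - E1[0], S1[-1] - S1[0])
```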

4. Variational Principles and (Multi)Symplectic Integrators

In this section, we make use of the geometric setting presented above to propose geometric integrators for some of the equations described earlier. Geometric integrators are numerical schemes designed to preserve as much as possible of the geometric structures underlying the equations they discretize [58]. It turns out that the preservation of geometric structures not only produces an improved qualitative behaviour, but also allows for a more accurate long-time integration. One efficient way to derive geometric integrators is to exploit the variational formulation of the continuous equations and to mimic this formulation at the spatial and/or temporal discrete level. For instance, for the ODEs of classical mechanics, a time discretization of the Lagrangian variational formulation permits the derivation of numerical schemes, called variational integrators, that are symplectic, exhibit good energy behavior, and inherit a discrete version of Noether’s theorem which guarantees the exact preservation of momenta arising from symmetries, see [59]. These methods are especially well-suited for systems on Lie groups [60].

Variational integrators were extended to PDEs in various ways, one way being given by multisymplectic variational integrators ([61,62,63,64]) in which the starting point is a spacetime discretization of the Hamilton principle. Here also, a discrete version of Noether’s theorem for field theories is available in presence of symmetries. We refer to [64,65,66] for recent applications of multisymplectic variational discretizations.

In this section, we will present a geometric discretization of the Lie-Poisson equations with cocycle, see Section 2.3, that is symplectic and preserves the affine coadjoint orbits. We will then extend this approach to treat the case of the polysymplectic version of these Lie-Poisson equations with cocycle, see Section 2.4, by constructing a multisymplectic integrator. To achieve these goals, we will first present the variational principles attached to these equations, by looking at them from the Lagrangian side. Then these variational principles will be discretized in time or in space and time.

4.1. Preliminaries on Variational Lie Group Integrators

We very briefly recall the broad idea of variational integrators and refer to [59] for the detailed description. They are based on a discrete version of the Hamilton principle given, for a Lagrangian $L : T Q \to R$ , as

δ \int_{0}^{T} L (q (t), \dot{q} (t)) d t = 0,

(74)

for arbitrary variations of the curve $q (t)$ with fixed extremities at $t = 0, T$ .

Euler-Poincaré and Lie-Poisson equations. We will be especially interested in the case where the configuration manifold is a Lie group, $Q = G$ , and the Lagrangian $L : T G \to R$ is right G-invariant. In this case, L induces a reduced Lagrangian ℓ on the quotient space $(T G) / G$ identified with the Lie algebra $g$ , i.e., we get $ℓ : g \to R$ defined by the relation $L (g, \dot{g}) = ℓ (\dot{g} g^{- 1})$ . The Euler-Lagrange equations for L are equivalent to equations on $g$ written in terms of the reduced Lagrangian $ℓ : g \to R$ , called the Euler-Poincaré equations. They are obtained by computing the variational principle for ℓ induced by the Hamilton principle Equation (74). It is given by

δ \int_{0}^{T} ℓ (ξ (t)) d t = 0, for δ ξ = \partial_{t} η + [η, ξ]

(75)

and yields the Euler-Poincaré equations

\frac{d}{d t} \frac{δ ℓ}{δ ξ} + {ad}_{ξ}^{*} \frac{δ ℓ}{δ ξ} = 0

(76)

for the curve $ξ (t) \in g$ . In Equation (75), $η (t)$ is an arbitrary curve in $g$ vanishing at the extremities. If the Lagrangian is hyperregular, one can rewrite the Euler-Lagrange equations and the Euler-Poincaré equations in terms of the Hamiltonian associated with L or ℓ. In terms of the Hamiltonian $h : g^{*} \to R$ obtained by the Legendre transform of ℓ, Equation (76) become the Lie-Poisson equations

\frac{d}{d t} μ + {ad}_{\frac{δ h}{δ μ}}^{*} μ = 0,

(77)

see [31].

Variational integrators. Let Q be a configuration manifold and let $L : T Q \to R$ be a Lagrangian. Suppose that a time step $Δ t$ has been fixed, and denote by ${t_{k} = k Δ t ∣ k = 0, \dots, N}$ the sequence of times and by $q_{d} : {t_{k}}_{k = 0}^{N} \to Q$ , $q_{d} (t_{k}) = q_{k}$ a discrete curve. A discrete Lagrangian is a map $L_{d} : Q \times Q \to R$ , $L_{d} = L_{d} (q_{k}, q_{k + 1})$ that approximates the action integral of L along the curve segment between $q_{k}$ and $q_{k + 1}$ , that is, we have

L_{d} (q_{k}, q_{k + 1}) \approx \int_{t_{k}}^{t_{k + 1}} L (q (t), \dot{q} (t)) d t,

where $q (t_{k}) = q_{k}$ and $q (t_{k + 1}) = q_{k + 1}$ . Usually this approximation is related to some numerical quadrature rule of the integral above. The discrete analogue of Hamilton’s principle Equation (74) reads

δ \sum_{k = 0}^{N - 1} L_{d} (q_{k}, q_{k + 1}) = 0

(78)

for all variations $δ q_{d}$ of $q_{d}$ with vanishing endpoints. After taking variations and applying a discrete integration by parts formula (change of indices), we obtain the discrete Euler-Lagrange equations:

D_{2} L_{d} (q_{k - 1}, q_{k}) + D_{1} L_{d} (q_{k}, q_{k + 1}) = 0, \forall k \in {1, \dots, N - 1} .

(79)

These equations define, under appropriate conditions, an algorithm which solves for $q_{k + 1}$ knowing the two previous configuration variables $q_{k}$ and $q_{k - 1}$ .
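As a minimal sketch of how Equation (79) becomes an algorithm, consider the hypothetical one-dimensional Lagrangian $L (q, \dot{q}) = \frac{1}{2} \dot{q}^{2} - V (q)$ with a trapezoidal-rule discrete Lagrangian. Neither this choice nor the function names below come from the text; they are assumptions made for illustration. Solving Equation (79) for $q_{k + 1}$ then gives the familiar Störmer-Verlet update, and the code checks the discrete Euler-Lagrange residual by finite differences.

```python
import math

def make_discrete_lagrangian(h, potential):
    """Trapezoidal-rule discrete Lagrangian for L(q, v) = v**2 / 2 - V(q)."""
    def L_d(q0, q1):
        v = (q1 - q0) / h
        return h * (0.5 * v * v - 0.5 * (potential(q0) + potential(q1)))
    return L_d

def D1(L_d, q0, q1, eps=1e-6):
    """Central-difference approximation of the slot derivative D_1 L_d."""
    return (L_d(q0 + eps, q1) - L_d(q0 - eps, q1)) / (2.0 * eps)

def D2(L_d, q0, q1, eps=1e-6):
    """Central-difference approximation of the slot derivative D_2 L_d."""
    return (L_d(q0, q1 + eps) - L_d(q0, q1 - eps)) / (2.0 * eps)

def del_step(q_prev, q_curr, h, grad_potential):
    """Solve D_2 L_d(q_prev, q_curr) + D_1 L_d(q_curr, q_next) = 0 for q_next."""
    return 2.0 * q_curr - q_prev - h * h * grad_potential(q_curr)

# Assumed test case: harmonic oscillator V(q) = q**2 / 2.
h = 0.1
potential = lambda q: 0.5 * q * q
grad_potential = lambda q: q
L_d = make_discrete_lagrangian(h, potential)

trajectory = [1.0, math.cos(h)]
for _ in range(10000):
    trajectory.append(del_step(trajectory[-2], trajectory[-1], h, grad_potential))

# Residual of the discrete Euler-Lagrange equations along the computed curve.
residuals = [
    abs(D2(L_d, trajectory[k - 1], trajectory[k]) + D1(L_d, trajectory[k], trajectory[k + 1]))
    for k in range(1, 200)
]
max_residual = max(residuals)
max_amplitude = max(abs(q) for q in trajectory)
```

The amplitude stays bounded over many periods, which is the kind of long-time behaviour expected from a symplectic scheme.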

To define the discrete momentum maps, one first needs to consider the discrete Legendre transforms defined by

\begin{matrix} F^{+} L_{d} (q_{k}, q_{k + 1}) & : = D_{2} L_{d} (q_{k}, q_{k + 1}) \in T_{q_{k + 1}}^{*} Q \\ F^{-} L_{d} (q_{k}, q_{k + 1}) & : = - D_{1} L_{d} (q_{k}, q_{k + 1}) \in T_{q_{k}}^{*} Q . \end{matrix}

(80)

Then, given a Lie group action $Φ : G \times Q \to Q$ , the discrete Lagrangian momentum maps $J_{L_{d}}^{+}, J_{L_{d}}^{-} : Q \times Q \to g^{*}$ are defined by

\begin{matrix} 〈J_{L_{d}}^{+} (q_{k}, q_{k + 1}), ξ〉 & = 〈D_{2} L_{d} (q_{k}, q_{k + 1}), ξ_{Q} (q_{k + 1})〉 \\ 〈J_{L_{d}}^{-} (q_{k}, q_{k + 1}), ξ〉 & = 〈- D_{1} L_{d} (q_{k}, q_{k + 1}), ξ_{Q} (q_{k})〉 . \end{matrix}

(81)

If the discrete curve ${q_{k}}_{k = 0}^{N}$ satisfies the discrete Euler-Lagrange equations then we have the equality

J_{L_{d}}^{+} (q_{k - 1}, q_{k}) = J_{L_{d}}^{-} (q_{k}, q_{k + 1}), for all k = 1, \dots, N - 1 .

(82)

If the discrete Lagrangian $L_{d}$ is G-invariant under the diagonal action of G induced by $Φ$ on $Q \times Q$ , then the two discrete momentum maps coincide, $J_{L_{d}}^{-} = J_{L_{d}}^{+} = : J_{L_{d}}$ , therefore from Equation (82), we obtain that $J_{L_{d}}$ is a conserved quantity along the discrete curve solution of Equation (79), that is,

J_{L_{d}} (q_{k}, q_{k + 1}) = J_{L_{d}} (q_{k - 1}, q_{k}), for all k = 1, \dots, N - 1 .

(83)

This result is referred to as the discrete Noether’s theorem.
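The discrete Noether theorem can be observed numerically. The sketch below is an illustration under assumptions not made in the text: a planar particle in the central potential $V (q) = - 1 / | q |$ , the trapezoidal discrete Lagrangian, and the rotation group $S O (2)$ acting on $Q = R^{2}$ with infinitesimal generator $ξ_{Q} (q) = (- q_{2}, q_{1})$ . Since this discrete Lagrangian is invariant under the diagonal action, the discrete momentum map of Equation (81) should be constant up to rounding.

```python
def grad_V(q):
    """Gradient of the central potential V(q) = -1/|q| (assumed test case)."""
    r3 = (q[0] * q[0] + q[1] * q[1]) ** 1.5
    return (q[0] / r3, q[1] / r3)

def del_step(q_prev, q_curr, h):
    """Discrete Euler-Lagrange update for the trapezoidal discrete Lagrangian."""
    g = grad_V(q_curr)
    return (2.0 * q_curr[0] - q_prev[0] - h * h * g[0],
            2.0 * q_curr[1] - q_prev[1] - h * h * g[1])

def discrete_momentum(q0, q1, h):
    """J^+(q0, q1) = <D_2 L_d(q0, q1), xi_Q(q1)> for xi = 1, xi_Q(q) = (-q_2, q_1)."""
    g = grad_V(q1)
    p = ((q1[0] - q0[0]) / h - 0.5 * h * g[0],
         (q1[1] - q0[1]) / h - 0.5 * h * g[1])
    return -p[0] * q1[1] + p[1] * q1[0]

h = 0.01
q0 = (1.0, 0.0)
v0 = (0.0, 0.9)  # assumed initial velocity, giving a bounded orbit
g0 = grad_V(q0)
q1 = (q0[0] + h * v0[0] - 0.5 * h * h * g0[0],
      q0[1] + h * v0[1] - 0.5 * h * h * g0[1])

points = [q0, q1]
for _ in range(2000):
    points.append(del_step(points[-2], points[-1], h))

momenta = [discrete_momentum(points[k], points[k + 1], h) for k in range(len(points) - 1)]
momentum_drift = max(abs(J - momenta[0]) for J in momenta)
```

Here `momentum_drift` measures the largest deviation of the discrete angular momentum from its initial value; it is expected to be at the level of floating-point rounding, independently of the step size.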

The symplectic character of the integrator is obtained by showing that the scheme $(q_{k - 1}, q_{k}) \mapsto (q_{k}, q_{k + 1})$ preserves the discrete symplectic two-forms $Ω_{L_{d}}^{\pm} : = {(F^{\pm} L_{d})}^{*} Ω_{can}$ on $Q \times Q$ , where $Ω_{can}$ is the canonical symplectic two-form on $T^{*} Q$ , see [59].

Discrete Euler-Poincaré equations. For Lie groups, variational discretization and the associated discrete Lagrangian reduction were initiated in [60,67]; the resulting schemes are referred to as Lie group variational integrators. The essential idea behind such integrators is to discretize Hamilton’s principle and to update group elements using group operations. For the case of invariant systems on Lie groups, one chooses a discrete Lagrangian that inherits the invariance of the continuous Lagrangian, i.e., $L_{d} : G \times G \to R$ satisfies $L_{d} (g_{k} h, g_{k + 1} h) = L_{d} (g_{k}, g_{k + 1})$ , for all $h \in G$ .

From this invariance, one defines the reduced discrete Lagrangian $L_{d}$ on the associated quotient space $(G \times G) / G$ identified with G with quotient map $(g_{k}, g_{k + 1}) \in G \times G \mapsto g_{k + 1} g_{k}^{- 1} \in G$ , i.e., the two discrete Lagrangians are related as $L_{d} (g_{k}, g_{k + 1}) = L_{d} (g_{k + 1} g_{k}^{- 1})$ . This is the point of view developed in [60]. The discrete Hamilton principle Equation (78) for $L_{d}$ induces a discrete Euler-Poincaré variational principle for the reduced $L_{d}$ that yields the discrete Euler-Poincaré equations on G. Numerically speaking, it is desirable to obtain the algorithm on a vector space rather than on a Lie group. To this end, a local diffeomorphism $τ : g \to G$ with $τ (0) = e$ is introduced to express small discrete changes in the group configuration through unique Lie algebra elements. Such a map is referred to as a retraction map ([68,69]). The discrete reduced Lagrangian is transported into a discrete Lagrangian $ℓ_{d}$ defined on a neighborhood of 0 in $g$ via the relation

ℓ_{d} (ξ_{k}) = L_{d} (g_{k + 1} g_{k}^{- 1}), with τ (Δ t ξ_{k}) = g_{k + 1} g_{k}^{- 1} .

(84)

The relation on the right in Equation (84) is thought of as a discrete version of $ξ = \dot{g} g^{- 1}$ .

The discrete Euler-Poincaré equations for $ℓ_{d}$ are obtained by computing the discrete variational principle induced on the discrete action $\sum_{k = 0}^{N - 1} ℓ_{d} (ξ_{k})$ from the discrete Hamilton principle $δ \sum_{k = 0}^{N - 1} L_{d} (g_{k}, g_{k + 1}) = 0$ recalled above in Equation (78). The main step in this process is to compute the variations $δ ξ_{k}$ of $ξ_{k} = \frac{1}{Δ t} τ^{- 1} (g_{k + 1} g_{k}^{- 1})$ induced by arbitrary variations $δ g_{k}$ . One finds the expression

δ ξ_{k} = \frac{1}{Δ t} d^{L} τ^{- 1} (Δ t ξ_{k}) \cdot ({Ad}_{τ {(Δ t ξ_{k})}^{- 1}} η_{k + 1} - η_{k}),

(85)

where $η_{k} = δ g_{k} g_{k}^{- 1}$ and $d^{L} τ^{- 1} (ξ) : g \to g$ is the inverse to the left trivialized derivative of $τ$ , $d^{L} τ (ξ) : g \to g$ defined by

d^{L} τ (ξ) \cdot η = τ {(ξ)}^{- 1} D τ (ξ) \cdot η .

(86)

The discrete Euler-Poincaré variational principle thus reads

δ \sum_{k = 0}^{N - 1} ℓ_{d} (ξ_{k}) = 0,

(87)

with respect to variations $δ ξ_{k}$ of the form Equation (85) with $η_{k}$ vanishing at the endpoints. It yields the discrete Euler-Poincaré equations.

{Ad}_{τ {(Δ t ξ_{k - 1})}^{- 1}}^{*} μ_{k - 1} - μ_{k} = 0, μ_{k} : = d^{L} τ^{- 1} {(Δ t ξ_{k})}^{*} \frac{δ ℓ}{δ ξ_{k}} .

(88)

Here $d^{L} τ^{- 1} {(ξ)}^{*} : g^{*} \to g^{*}$ denotes the dual map to $d^{L} τ^{- 1} (ξ) : g \to g$ . We refer to [60,67,69,70] for the discrete Euler-Poincaré equations.

Being equivalent to the discrete Euler-Lagrange equations on the Lie group, this scheme is equivalent to a symplectic scheme $(g_{k - 1}, g_{k}) \mapsto (g_{k}, g_{k + 1})$ on $G \times G$ . From the discrete Noether theorem, the scheme also preserves the discrete momentum map and the coadjoint orbits $O \subset g^{*}$ . Moreover, the scheme $μ_{k - 1} \in O \mapsto μ_{k} \in O$ is symplectic on coadjoint orbits with respect to the Kirillov-Kostant-Souriau symplectic form, see [60]. Please note that the discrete momentum map is computed as

J_{L_{d}} (g_{k}, g_{k + 1}) = \frac{1}{Δ t} {Ad}_{g_{k}}^{*} (d^{L} τ^{- 1} {(Δ t ξ_{k})}^{*} \frac{δ ℓ}{δ ξ_{k}}) = \frac{1}{Δ t} {Ad}_{g_{k}}^{*} μ_{k},

which is readily seen to be preserved, $J_{L_{d}} (g_{k - 1}, g_{k}) = J_{L_{d}} (g_{k}, g_{k + 1})$ , along the solutions of Equation (88).
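To make Equation (88) concrete, the following sketch applies it to a free rigid body on $G = S O (3)$ . Everything specific here is an assumption made for illustration: the identification $so (3) \simeq R^{3}$ , the quadratic Lagrangian $ℓ (ξ) = \frac{1}{2} ξ \cdot I ξ$ with hypothetical moments of inertia, the Cayley map as retraction, and the convention that ${Ad}_{R}^{*}$ is the dual of ${Ad}_{R}$ (so that ${Ad}_{R^{- 1}}^{*} μ = R μ$ ). Under these assumptions the update is a rotation of $μ_{k - 1}$ , so the coadjoint orbit (a sphere) is preserved up to rounding.

```python
import math

INERTIA = (1.0, 2.0, 3.0)  # hypothetical principal moments of inertia

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cay_apply(w, v):
    """Apply cay(w) = (I - w^/2)^{-1} (I + w^/2) to v, using cay(w) = I + c (w^ + w^ w^ / 2)."""
    c = 4.0 / (4.0 + dot(w, w))
    wv = cross(w, v)
    wwv = cross(w, wv)
    return tuple(v[i] + c * (wv[i] + 0.5 * wwv[i]) for i in range(3))

def momentum_from_velocity(xi, h):
    """mu = (d^L cay^{-1}(h xi))^* (I xi), with (d^L cay^{-1}(x))^* p = p - x x p / 2 + (x . p) x / 4."""
    p = tuple(INERTIA[i] * xi[i] for i in range(3))
    xp = cross(xi, p)
    s = dot(xi, p)
    return tuple(p[i] - 0.5 * h * xp[i] + 0.25 * h * h * s * xi[i] for i in range(3))

def velocity_from_momentum(mu, h, iterations=50):
    """Invert momentum_from_velocity by fixed-point iteration (adequate for small h)."""
    xi = tuple(mu[i] / INERTIA[i] for i in range(3))
    for _ in range(iterations):
        p = tuple(INERTIA[i] * xi[i] for i in range(3))
        xp = cross(xi, p)
        s = dot(xi, p)
        xi = tuple((mu[i] + 0.5 * h * xp[i] - 0.25 * h * h * s * xi[i]) / INERTIA[i] for i in range(3))
    return xi

def step(mu, h):
    """One step mu_{k-1} -> mu_k = Ad*_{cay(h xi_{k-1})^{-1}} mu_{k-1} = cay(h xi_{k-1}) mu_{k-1}."""
    xi = velocity_from_momentum(mu, h)
    return cay_apply(tuple(h * x for x in xi), mu)

h = 0.05
mu = (1.0, 0.5, -0.3)
norm0 = math.sqrt(dot(mu, mu))
norm_drift = 0.0
for _ in range(1000):
    mu = step(mu, h)
    norm_drift = max(norm_drift, abs(math.sqrt(dot(mu, mu)) - norm0))

xi = velocity_from_momentum(mu, h)
solve_residual = max(abs(a - b) for a, b in zip(momentum_from_velocity(xi, h), mu))
```

The fixed-point solver is only one possible way to recover $ξ_{k}$ from $μ_{k}$ ; a Newton iteration would serve equally well.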

4.2. Central Extensions and Variational Principle for the Lie-Poisson Equations with Cocycle

We considered in Equation (36) the Lie-Poisson equations with cocycle given by

\frac{d}{d t} μ + {ad}_{\frac{δ h}{δ μ}}^{*} μ = Θ (\frac{δ h}{δ μ}, \cdot),

(89)

associated with the Souriau symplectic model. Our aim is to derive a geometric integrator for this system that is symplectic and preserves the affine coadjoint orbits for a general Hamiltonian. A systematic way to proceed is to look at Equation (89) from the Lagrangian side, as was done for the ordinary Lie-Poisson equations above. Assuming that h is hyperregular, we can take the associated Lagrangian $ℓ : g \to R$ and rewrite the equations as

\frac{d}{d t} \frac{δ ℓ}{δ ξ} + {ad}_{ξ}^{*} \frac{δ ℓ}{δ ξ} = Θ (ξ, \cdot),

(90)

for a curve $ξ (t) \in g$ . However, in general (i.e., for arbitrary ℓ, arbitrary $g$ , and arbitrary $Θ$ ) there is no natural variational principle for these equations, in the sense of a variational principle induced from the ordinary Hamilton principle for a Lagrangian $L : T G \to R$ .

Nevertheless, there is a way to interpret the system Equation (90) as being induced by ordinary Euler-Poincaré equations on a central extension of the Lie group G, integrating the Lie algebra cocycle $Θ$ . This is related to the well-known fact that affine coadjoint orbits can be seen as ordinary coadjoint orbits of a central extension. We recall this fact below.

Lie group operations on central extensions. We shall focus on topologically trivial central extensions of finite dimensional Lie groups by $R$ . The central extended group is thus of the form $\hat{G} = G \times R$ with group multiplication

(g, α) (h, β) = (g h, α + β + B (g, h))

where $B : G \times G \to R$ is a group two-cocycle, i.e., it satisfies

B (f, g) + B (f g, h) = B (f, g h) + B (g, h)

for all $f, g, h \in G$ . It can always be chosen such that $B (e, g) = B (g, e) = 0$ , in which case we have $B (g, g^{- 1}) = B (g^{- 1}, g)$ and ${(g, α)}^{- 1} = (g^{- 1}, - α - B (g^{- 1}, g))$ . One obtains from this the expression of the adjoint and coadjoint actions as

\begin{matrix} {Ad}_{(g, α)} (η, v) & = ({Ad}_{g} η, v + 〈θ (g^{- 1}), η〉) \end{matrix}

(91)

\begin{matrix} {Ad}_{(g, α)}^{*} (μ, a) & = ({Ad}_{g}^{*} μ + a θ (g^{- 1}), a) \end{matrix}

(92)

where the group one-cocycle $θ \in C^{\infty} (G, g^{*})$ is defined by

〈θ (g), η〉 = D_{2} B (g^{- 1}, g) \cdot η g - D_{1} B (g, g^{- 1}) \cdot η g .

(93)

Equation (92) shows that the ordinary coadjoint orbits of $\hat{G}$ through $(μ, 1)$ are affine coadjoint orbits of G. We have the corresponding formulas

\begin{matrix} {ad}_{(ξ, u)} (η, v) & = ([ξ, η], - Θ (ξ, η)) \end{matrix}

(94)

\begin{matrix} {ad}_{(ξ, u)}^{*} (μ, a) & = ({ad}_{ξ}^{*} μ - a Θ (ξ, \cdot), 0) . \end{matrix}

(95)
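The group operations on a central extension can be checked numerically. The sketch below uses the $S E (2)$ two-cocycle $B ((φ, a), (ψ, b)) = \frac{1}{2} a \cdot J R_{φ} b$ that appears in the example of Section 4.3. The parametrization of $S E (2)$ by an angle and a translation, with product $(φ, a) (ψ, b) = (φ + ψ, a + R_{φ} b)$ , and all function names are assumptions made here for illustration. The code evaluates the two-cocycle identity, the associativity of the extended product, and the inverse formula.

```python
import math

def rot(phi, v):
    """Rotate the planar vector v by the angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def J(v):
    """Rotation by a quarter turn: J v = (-v_2, v_1)."""
    return (-v[1], v[0])

def se2_mul(g, k):
    (phi, a), (psi, b) = g, k
    rb = rot(phi, b)
    return (phi + psi, (a[0] + rb[0], a[1] + rb[1]))

def se2_inv(g):
    phi, a = g
    ra = rot(-phi, a)
    return (-phi, (-ra[0], -ra[1]))

def B(g, k):
    """Group two-cocycle B((phi, a), (psi, b)) = (1/2) a . J R_phi b."""
    (phi, a), (psi, b) = g, k
    jrb = J(rot(phi, b))
    return 0.5 * (a[0] * jrb[0] + a[1] * jrb[1])

def ext_mul(x, y):
    """(g, alpha)(k, beta) = (g k, alpha + beta + B(g, k))."""
    (g, alpha), (k, beta) = x, y
    return (se2_mul(g, k), alpha + beta + B(g, k))

def ext_inv(x):
    """(g, alpha)^{-1} = (g^{-1}, -alpha - B(g^{-1}, g))."""
    g, alpha = x
    gi = se2_inv(g)
    return (gi, -alpha - B(gi, g))

def flat(x):
    (phi, a), alpha = x
    return (phi, a[0], a[1], alpha)

f, g, k = (0.3, (1.0, -2.0)), (-1.1, (0.5, 0.7)), (2.0, (-0.4, 1.5))
cocycle_residual = B(f, g) + B(se2_mul(f, g), k) - B(f, se2_mul(g, k)) - B(g, k)

x, y, z = (f, 0.2), (g, -1.3), (k, 0.9)
lhs = flat(ext_mul(ext_mul(x, y), z))
rhs = flat(ext_mul(x, ext_mul(y, z)))
associativity_error = max(abs(p - q) for p, q in zip(lhs, rhs))
inverse_error = max(abs(v) for v in flat(ext_mul(x, ext_inv(x))))
```

Associativity of the extended product is equivalent to the two-cocycle identity, so the two residuals vanish together.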

Euler-Poincaré and Lie-Poisson equations on central extensions. From Equation (95), the Euler-Poincaré equations for a reduced Lagrangian $\hat{ℓ} : \hat{g} = g \times R \to R$ take the form

\{\begin{matrix} \frac{d}{d t} \frac{δ \hat{ℓ}}{δ ξ} + {ad}_{ξ}^{*} \frac{δ \hat{ℓ}}{δ ξ} = \frac{δ \hat{ℓ}}{δ u} Θ (ξ, \cdot) \\ \frac{d}{d t} \frac{δ \hat{ℓ}}{δ u} = 0 . \end{matrix}

(96)

They are the critical conditions for the Euler-Poincaré variational principle

δ \int_{0}^{T} \hat{ℓ} (ξ (t), u (t)) d t = 0, for δ ξ = \partial_{t} η + [η, ξ], δ u = \partial_{t} v - Θ (η, ξ),

(97)

which is just a special instance of Equation (75) applied to central extensions. In Equation (97) $η (t) \in g$ and $v (t) \in R$ are arbitrary curves vanishing at the extremities.

Given a Lagrangian $ℓ : g \to R$ , one can then define the Lagrangian

\hat{ℓ} (ξ, u) = ℓ (ξ) + \frac{1}{2} u^{2}

(98)

on $\hat{g}$ for which Equation (96) does reduce to Equation (90) if the initial condition for the curve $u (t)$ is $u (0) = 1$ . This means that Equation (90) have a natural Euler-Poincaré variational formulation, if one interprets them as an invariant subsystem of an Euler-Poincaré equation on a central extension of G via a group two-cocycle B that integrates the one-cocycle $θ$ as in Equation (93).

The same reasoning also directly applies on the Hamiltonian side, in which case the Lie-Poisson equation with cocycle Equation (89) is an invariant subsystem of an ordinary Lie-Poisson equation associated with a central extension of G.

All these considerations are standard, see, e.g., [29,31].

4.3. Variational Symplectic Integrators for the Lie-Poisson Equations with Cocycle

Here we shall present a geometric symplectic Lie group integrator for Lie-Poisson equations with cocycle Equation (36) that preserves the affine coadjoint orbits for a general Hamiltonian. In particular, the scheme preserves the affine Kirillov-Kostant-Souriau symplectic form on these affine coadjoint orbits. We shall use the Euler-Poincaré variational formulation on central extensions presented in Section 4.2.

Some useful identities. Given a central extension $\hat{G} = G \times R$ , we shall consider the retraction map $τ : \hat{g} \to \hat{G}$ defined by

τ (ξ, u) = (\bar{τ} (ξ), u)

(99)

where $\bar{τ} : g \to G$ is a retraction map for G. To derive the discrete Euler-Poincaré equations we shall need several identities involving $d^{L} τ$ and $d^{L} \bar{τ}$ , see Equation (86), that are shown in the next Lemma.

Lemma 14.

For a local diffeomorphism of the form Equation (99) on central extension, we have the following identities

(a)
$d^{L} τ (ξ, u) \cdot (η, v) = (d^{L} \bar{τ} (ξ) \cdot η, v - D_{2} B (\bar{τ} (ξ), e) \cdot (d^{L} \bar{τ} (ξ) \cdot η))$

(b)
$d^{L} τ {(ξ, u)}^{*} \cdot (μ, a) = (d^{L} \bar{τ} {(ξ)}^{*} (μ - a D_{2} B (\bar{τ} (ξ), e)), a)$

(c)
$d^{L} τ^{- 1} (ξ, u) \cdot (ζ, w) = (d^{L} {\bar{τ}}^{- 1} (ξ) \cdot ζ, w + D_{2} B (\bar{τ} (ξ), e) \cdot ζ)$

(d)
$d^{L} τ^{- 1} {(ξ, u)}^{*} \cdot (μ, a) = (d^{L} {\bar{τ}}^{- 1} {(ξ)}^{*} \cdot μ + a D_{2} B (\bar{τ} (ξ), e), a)$ ,

where $B : G \times G \to R$ is the group two-cocycle.

Proof.

These identities are proven as follows.

(a)
Using the definition of $d^{L} τ$ , we compute
$\begin{matrix} d^{L} τ (ξ, u) \cdot (η, v) & = τ {(ξ, u)}^{- 1} (D τ (ξ, u) \cdot (η, v)) \\ = {(\bar{τ} (ξ), u)}^{- 1} (\bar{τ} (ξ), D \bar{τ} (ξ) \cdot η, u, v) \\ = (\bar{τ} {(ξ)}^{- 1} D \bar{τ} (ξ) \cdot η, v + D_{2} B (\bar{τ} {(ξ)}^{- 1}, \bar{τ} (ξ)) \cdot (D \bar{τ} (ξ) \cdot η)) \\ = (d^{L} \bar{τ} (ξ) \cdot η, v + D_{2} B (\bar{τ} {(ξ)}^{- 1}, \bar{τ} (ξ)) \cdot (\bar{τ} (ξ) d^{L} \bar{τ} (ξ) \cdot η)) \end{matrix}$
where in the third equality, we used the formula for the tangent lift of left translation on $\hat{G}$ . Using the properties of the group two-cocycle B, we get the identity
$D_{2} B (g^{- 1}, g) \cdot (g η) = - D_{2} B (g, e) \cdot η,$
for all $g \in G$ and $η \in g$ . Hence we get the result.

(b)
Taking the dual map and using (a), we get
$\begin{matrix} 〈d^{L} τ {(ξ, u)}^{*} \cdot (μ, a), (η, v)〉 \\ = 〈(μ, a), d^{L} τ (ξ, u) \cdot (η, v)〉 \\ = 〈μ, d^{L} \bar{τ} (ξ) \cdot η〉 + a (v - D_{2} B (\bar{τ} (ξ), e) \cdot (d^{L} \bar{τ} (ξ) \cdot η)) \\ = 〈d^{L} \bar{τ} {(ξ)}^{*} \cdot μ - a d^{L} \bar{τ} {(ξ)}^{*} D_{2} B (\bar{τ} (ξ), e), η〉 + a v, \end{matrix}$
which proves the result.

(c)
This follows from (a) by inverting the relation $(ζ, w) = d^{L} τ (ξ, u) \cdot (η, v)$ . Indeed, the relation
$(ζ, w) = d^{L} τ (ξ, u) \cdot (η, v) = (d^{L} \bar{τ} (ξ) \cdot η, v + D_{2} B (\bar{τ} {(ξ)}^{- 1}, \bar{τ} (ξ)) \cdot (\bar{τ} (ξ) d^{L} \bar{τ} (ξ) \cdot η))$
is equivalent to
$(η, v) = (d^{L} {\bar{τ}}^{- 1} (ξ) \cdot ζ, w - D_{2} B (\bar{τ} {(ξ)}^{- 1}, \bar{τ} (ξ)) \cdot (\bar{τ} (ξ) ζ))$ ,
and the identity $D_{2} B (g^{- 1}, g) \cdot (g η) = - D_{2} B (g, e) \cdot η$ used in (a) gives the result.

(d)
This follows by taking the dual map and using (c) as earlier. ☐

Variational discretization of the Lie-Poisson equations with cocycle. With the previous result, we first give below a symplectic integrator for the Euler-Poincaré equations on central extensions. Then we will show how this provides a symplectic integrator for the Lie-Poisson equations with cocycle.

Proposition 15

(Discrete Euler-Poincaré equations on central extensions). The following are equivalent:

(a)
The discrete curve $(ξ_{k}, u_{k})$ is critical for the discrete Euler-Poincaré variational principle
$δ \sum_{k} \hat{ℓ} (ξ_{k}, u_{k}) = 0,$
with respect to variations
$\{\begin{matrix} δ ξ_{k} = \frac{1}{Δ t} d^{L} {\bar{τ}}^{- 1} (Δ t ξ_{k}) \cdot ({Ad}_{τ {(Δ t ξ_{k})}^{- 1}} η_{k + 1} - η_{k}) \\ δ u_{k} = \frac{1}{Δ t} (v_{k + 1} - v_{k} - D_{2} B (\bar{τ} (Δ t ξ_{k}), e) \cdot η_{k} + D_{1} B (e, \bar{τ} (Δ t ξ_{k})) \cdot η_{k + 1}) \end{matrix}$
where $η_{k} \in g$ and $v_{k} \in R$ are arbitrary discrete curves vanishing at the endpoints.

(b)
The discrete curve $(ξ_{k}, u_{k})$ is a solution of the discrete Euler-Poincaré equations
$\{\begin{matrix} {Ad}_{τ {(Δ t ξ_{k - 1})}^{- 1}}^{*} μ_{k - 1} + a_{k - 1} θ (\bar{τ} (Δ t ξ_{k - 1})) - μ_{k} = 0 \\ a_{k - 1} - a_{k} = 0 \end{matrix}$ (100)
with
$\{\begin{matrix} μ_{k} = d^{L} {\bar{τ}}^{- 1} {(Δ t ξ_{k})}^{*} \frac{δ \hat{ℓ}}{δ ξ_{k}} + \frac{δ \hat{ℓ}}{δ u_{k}} D_{2} B (\bar{τ} (Δ t ξ_{k}), e) \\ a_{k} = \frac{δ \hat{ℓ}}{δ u_{k}} . \end{matrix}$ (101)

Proof.

We use the discrete Euler-Poincaré formulation Equations (87)–(88). For (a), we use Equation (85) and Lemma 14, and we compute

$\begin{matrix} δ (ξ_{k}, u_{k}) \\ = \frac{1}{Δ t} d^{L} τ^{- 1} (Δ t ξ_{k}, Δ t u_{k}) \cdot ({Ad}_{τ {(Δ t ξ_{k}, Δ t u_{k})}^{- 1}} (η_{k + 1}, v_{k + 1}) - (η_{k}, v_{k})) \\ = \frac{1}{Δ t} d^{L} τ^{- 1} (Δ t ξ_{k}, Δ t u_{k}) \cdot ({Ad}_{\bar{τ} {(Δ t ξ_{k})}^{- 1}} η_{k + 1} - η_{k}, v_{k + 1} - v_{k} + 〈θ (\bar{τ} (Δ t ξ_{k})), η_{k + 1}〉) \\ = \frac{1}{Δ t} (d^{L} {\bar{τ}}^{- 1} (Δ t ξ_{k}) \cdot ({Ad}_{τ {(Δ t ξ_{k})}^{- 1}} η_{k + 1} - η_{k}), \\ v_{k + 1} - v_{k} + 〈θ (\bar{τ} (Δ t ξ_{k})), η_{k + 1}〉 + D_{2} B (\bar{τ} (Δ t ξ_{k}), e) \cdot ({Ad}_{τ {(Δ t ξ_{k})}^{- 1}} η_{k + 1} - η_{k})) . \end{matrix}$

Using the identity $〈θ (g), η〉 + D_{2} B (g, e) \cdot {Ad}_{g^{- 1}} η = D_{1} B (e, g) \cdot η$ , we get the desired result.

For (b), we use the formula for the coadjoint action on central extension to get

$\begin{matrix} {Ad}_{τ {(Δ t ξ_{k - 1}, Δ t u_{k - 1})}^{- 1}}^{*} (μ_{k - 1}, a_{k - 1}) - (μ_{k}, a_{k}) \\ = ({Ad}_{τ {(Δ t ξ_{k - 1})}^{- 1}}^{*} μ_{k - 1} + a_{k - 1} θ (\bar{τ} (Δ t ξ_{k - 1})), a_{k - 1}) - (μ_{k}, a_{k}) \end{matrix}$

which proves Equation (100). Then, to get Equation (101), we note that

$\begin{matrix} (μ_{k}, a_{k}) : & = d^{L} τ^{- 1} {(Δ t ξ_{k}, Δ t u_{k})}^{*} (\frac{δ \hat{ℓ}}{δ ξ_{k}}, \frac{δ \hat{ℓ}}{δ u_{k}}) \\ = (d^{L} {\bar{τ}}^{- 1} {(Δ t ξ_{k})}^{*} \frac{δ \hat{ℓ}}{δ ξ_{k}} + \frac{δ \hat{ℓ}}{δ u_{k}} D_{2} B (\bar{τ} (Δ t ξ_{k}), e), \frac{δ \hat{ℓ}}{δ u_{k}}) \end{matrix}$

by Lemma 14. ☐

We note that the relation with the solution $(g_{k}, α_{k})$ of the discrete Euler-Lagrange on the Lie group $\hat{G}$ is given as

(ξ_{k}, u_{k}) = \frac{1}{Δ t} τ^{- 1} ((g_{k + 1}, α_{k + 1}) {(g_{k}, α_{k})}^{- 1})

which is explicitly given by the relations

\begin{matrix} ξ_{k} & = \frac{1}{Δ t} {\bar{τ}}^{- 1} (g_{k + 1} g_{k}^{- 1}) \\ u_{k} & = \frac{1}{Δ t} (α_{k + 1} - α_{k} - B (g_{k}, g_{k}^{- 1}) + B (g_{k + 1}, g_{k}^{- 1})) . \end{matrix}

(102)

Similarly, the variations $η_{k}, v_{k}$ used in discrete Euler-Poincaré variational principle are related to the variations $δ g_{k}$ , $δ α_{k}$ used in the discrete Hamilton principle via the equality $(η_{k}, v_{k}) = (δ g_{k}, δ α_{k}) {(g_{k}, α_{k})}^{- 1} = (δ g_{k} g_{k}^{- 1}, δ α_{k} + D_{1} B (g_{k}, g_{k}^{- 1}) \cdot δ g_{k})$ .

The discrete momentum map $J_{L_{d}} : \hat{G} \times \hat{G} \to {\hat{g}}^{*}$ is computed as

J_{L_{d}} ((g_{k}, α_{k}), (g_{k + 1}, α_{k + 1})) = \frac{1}{Δ t} ({Ad}_{g_{k}}^{*} μ_{k} + a_{k} θ (g_{k}^{- 1}), a_{k})

where $(μ_{k}, a_{k})$ are given in Equation (101) and the relations Equation (102) are assumed. It is readily seen that $J_{L_{d}}$ is preserved along the solutions of Equation (100).

The symplectic integrator for the Lie-Poisson equations with cocycle is deduced as follows.

Proposition 16.

(Symplectic integrator for Lie-Poisson equations with cocycle) Let $h : g^{*} \to R$ be a Hamiltonian assumed to be hyperregular, with associated Lagrangian $ℓ : g \to R$ . Then the numerical scheme

${Ad}_{τ {(Δ t ξ_{k - 1})}^{- 1}}^{*} μ_{k - 1} + θ (\bar{τ} (Δ t ξ_{k - 1})) - μ_{k} = 0$ (103)

with

$μ_{k} = d^{L} {\bar{τ}}^{- 1} {(Δ t ξ_{k})}^{*} \frac{δ ℓ}{δ ξ_{k}} + D_{2} B (\bar{τ} (Δ t ξ_{k}), e)$ (104)

is a symplectic scheme for the Lie-Poisson equations with cocycle

$\frac{d}{d t} μ + {ad}_{\frac{δ h}{δ μ}}^{*} μ = Θ (\frac{δ h}{δ μ}, \cdot) .$ (105)

It preserves the affine coadjoint orbits

$O = {{Ad}_{g^{- 1}}^{*} μ + θ (g) ∣ g \in G}$

and $μ_{k - 1} \mapsto μ_{k}$ is symplectic relative to the affine Kirillov-Kostant-Souriau symplectic form

$ω_{O} (μ) ({ad}_{ξ}^{*} μ - Θ (ξ, \cdot), {ad}_{η}^{*} μ - Θ (η, \cdot)) = 〈μ, [ξ, η]〉 - Θ (ξ, η) .$

Proof.

It is a direct consequence of Proposition 15, by choosing the reduced Lagrangian Equation (98), taking the initial condition $a_{0} = 1$ and noting that $a_{k + 1} = a_{k} = 1$ . ☐

It is possible to rewrite the scheme in a way that is more advantageous from the point of view of implementation. By inserting Equation (104) in Equation (103) and using the identity

{Ad}_{g^{- 1}}^{*} D_{2} B (g, e) + θ (g) = D_{1} B (e, g)

we get the scheme in terms of $ξ_{k}$ as

\begin{matrix} {Ad}_{τ {(Δ t ξ_{k - 1})}^{- 1}}^{*} d^{L} {\bar{τ}}^{- 1} {(Δ t ξ_{k - 1})}^{*} \frac{δ ℓ}{δ ξ_{k - 1}} - d^{L} {\bar{τ}}^{- 1} {(Δ t ξ_{k})}^{*} \frac{δ ℓ}{δ ξ_{k}} \\ + D_{1} B (e, \bar{τ} (Δ t ξ_{k - 1})) - D_{2} B (\bar{τ} (Δ t ξ_{k}), e) = 0 . \end{matrix}

(106)

It is also often assumed that the retraction map $τ$ satisfies $τ (- ξ) τ (ξ) = e$ . In this case, we have the identity ${Ad}_{τ (ξ)}^{*} d^{L} τ^{- 1} {(- ξ)}^{*} = d^{L} τ^{- 1} {(ξ)}^{*}$ , see [69], and the scheme Equation (106) takes the form

\begin{matrix} d^{L} {\bar{τ}}^{- 1} {(- Δ t ξ_{k - 1})}^{*} \frac{δ ℓ}{δ ξ_{k - 1}} - d^{L} {\bar{τ}}^{- 1} {(Δ t ξ_{k})}^{*} \frac{δ ℓ}{δ ξ_{k}} + D_{1} B (e, \bar{τ} (Δ t ξ_{k - 1})) - D_{2} B (\bar{τ} (Δ t ξ_{k}), e) = 0 . \end{matrix}

(107)

In the absence of the last two terms, we recover the most commonly used form of the ordinary discrete Euler-Poincaré equations, e.g., [69]. The last two terms correspond to a discretization of the cocycle which ensures that the resulting scheme is symplectic on each affine coadjoint orbit. Such a form is unlikely to be guessed from the continuous equations without the discrete variational principle at hand.

Remark 17

(Choice of retraction map). For an exposition of retraction maps, such as canonical coordinates of the first and second kind, and their applications to Lie group methods, the reader is referred to [68]. A possible choice is the exponential map $exp : g \to G$ . In this case, $d^{L} exp (ξ) \cdot η$ and $d^{L} {exp}^{- 1} (ξ) \cdot η$ are given as series which are truncated in order to achieve a desired order of accuracy [58]. A standard choice is the Cayley map $cay : g \to G$ defined by $cay (ξ) = {(e - ξ / 2)}^{- 1} (e + ξ / 2)$ which is valid for a general class of quadratic matrix groups (which include the groups $S O (3)$ , $S E (2)$ , and $S E (3)$ ). Based on this simple form, the derivative maps become

$\begin{matrix} d^{L} cay (ξ) \cdot η & = {(e + ξ / 2)}^{- 1} η {(e - ξ / 2)}^{- 1} \\ d^{L} {cay}^{- 1} (ξ) \cdot η & = (e + ξ / 2) η (e - ξ / 2), \end{matrix}$

for each $ξ, η \in g$ .
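The Cayley formulas above can be checked numerically on $S E (2)$ , represented by $3 \times 3$ homogeneous matrices. The closed-form expression of $cay$ used in the code (rotation block $((4 - λ^{2}) I + 4 λ J) / (4 + λ^{2})$ , which equals the identity at $λ = 0$ ) is what the definition $cay (ξ) = {(e - ξ / 2)}^{- 1} (e + ξ / 2)$ gives for $ξ = (λ, u_{1}, u_{2}) \in se (2)$ . The helper names and test values are assumptions. The first check confirms $(e - ξ / 2) cay (ξ) = e + ξ / 2$ . The second approximates $d^{L} cay (ξ) \cdot η$ by finite differences, using $cay {(ξ)}^{- 1} = cay (- ξ)$ , and compares it with the stated formula in the equivalent form $(e + ξ / 2) (d^{L} cay (ξ) \cdot η) (e - ξ / 2) = η$ .

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def lincomb(a, A, b, B):
    """Entrywise a * A + b * B."""
    return [[a * A[i][j] + b * B[i][j] for j in range(3)] for i in range(3)]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def hat(xi):
    """Matrix of xi = (lam, u1, u2) in se(2)."""
    lam, u1, u2 = xi
    return [[0.0, -lam, u1], [lam, 0.0, u2], [0.0, 0.0, 0.0]]

def cay(xi):
    """Closed-form Cayley map on se(2), as a homogeneous 3x3 matrix."""
    lam, u1, u2 = xi
    d = 4.0 + lam * lam
    c, s = (4.0 - lam * lam) / d, 4.0 * lam / d
    return [[c, -s, 2.0 * (2.0 * u1 - lam * u2) / d],
            [s, c, 2.0 * (lam * u1 + 2.0 * u2) / d],
            [0.0, 0.0, 1.0]]

def left_trivialized_derivative(xi, eta, eps=1e-5):
    """Finite-difference value of cay(xi)^{-1} D cay(xi) . eta."""
    plus = cay(tuple(x + eps * e for x, e in zip(xi, eta)))
    minus = cay(tuple(x - eps * e for x, e in zip(xi, eta)))
    dcay = lincomb(0.5 / eps, plus, -0.5 / eps, minus)
    return matmul(cay(tuple(-x for x in xi)), dcay)

xi = (0.7, -1.2, 0.4)     # assumed test values
eta = (0.3, 0.8, -0.5)
plus_half = lincomb(1.0, IDENTITY, 0.5, hat(xi))    # e + xi/2
minus_half = lincomb(1.0, IDENTITY, -0.5, hat(xi))  # e - xi/2

definition_error = max_abs_diff(matmul(minus_half, cay(xi)), plus_half)
X = left_trivialized_derivative(xi, eta)
formula_error = max_abs_diff(matmul(matmul(plus_half, X), minus_half), hat(eta))
```

Both checks avoid matrix inversion, so the only approximation is the finite-difference step `eps`.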

Example 17.

Consider the Lie-Poisson equations with cocycle for $S E (2)$ derived in Section 3.3. The central extension integrating the group one-cocycle Equation (70) is $\hat{S E (2)} = S E (2) \times R$ with group two-cocycle $B : S E (2) \times S E (2) \to R$ given by

$B ((φ, a), (ψ, b)) = \frac{1}{2} a \cdot J R_{φ} b .$

This group is referred to as the oscillator group. To apply the scheme Equation (107) to this case, we use the identities

$D_{1} B ((I, 0), (φ, a)) = (0, \frac{1}{2} J a) and D_{2} B ((φ, a), (I, 0)) = (0, - \frac{1}{2} J R_{φ^{- 1}} a)$

as well as the Cayley map for $S E (2)$ given by

$cay (λ, u_{1}, u_{2}) = (R (λ), \frac{2}{4 + λ^{2}} (- λ u_{2} + 2 u_{1}, λ u_{1} + 2 u_{2})), R (λ) = \frac{1}{4 + λ^{2}} [\begin{matrix} 4 - λ^{2} & - 4 λ \\ 4 λ & 4 - λ^{2} \end{matrix}]$

and the expression $d^{L} {cay}^{- 1} (λ, u_{1}, u_{2}) : se (2) \to se (2)$ given in matrix representation as

$I_{3} + \frac{1}{2} [\begin{matrix} 0 & 0 & 0 \\ u_{2} & 0 & - λ \\ - u_{1} & λ & 0 \end{matrix}] + \frac{1}{4} [\begin{matrix} λ^{2} & 0 & 0 \\ λ u_{1} & 0 & 0 \\ λ u_{2} & 0 & 0 \end{matrix}]$

see [70].
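A possible implementation of the scheme for $S E (2)$ is sketched below, in the momentum form Equations (103) and (104). Several ingredients are assumptions made for illustration and are not taken from the text: the quadratic Lagrangian with hypothetical weights `INERTIA`, the coordinates $μ = (m, p_{1}, p_{2})$ on $se {(2)}^{*}$ , the convention that ${Ad}^{*}$ is the dual of $Ad$ , the Cayley rotation block $((4 - λ^{2}) I + 4 λ J) / (4 + λ^{2})$ obtained from the definition of $cay$ , and the one-cocycle $θ (φ, a) = (\frac{1}{2} {| a |}^{2}, J a)$ obtained from the identity ${Ad}_{g^{- 1}}^{*} D_{2} B (g, e) + θ (g) = D_{1} B (e, g)$ . With these conventions the affine coadjoint action preserves $m - \frac{1}{2} {| p |}^{2}$ , which the code monitors.

```python
INERTIA = (2.0, 1.0, 1.0)  # hypothetical weights: l(xi) = (1/2) sum_i INERTIA[i] * xi_i**2

def cay(x):
    """Cayley map on se(2): returns (cos, sin, translation) of the group element."""
    lam, u1, u2 = x
    d = 4.0 + lam * lam
    return ((4.0 - lam * lam) / d, 4.0 * lam / d,
            (2.0 * (2.0 * u1 - lam * u2) / d, 2.0 * (lam * u1 + 2.0 * u2) / d))

def affine_coadjoint(g, mu):
    """mu -> Ad*_{g^{-1}} mu + theta(g), with theta(phi, a) = (|a|^2 / 2, J a)."""
    c, s, a = g
    m, p1, p2 = mu
    b = (c * a[0] + s * a[1], -s * a[0] + c * a[1])  # R^T a
    m_new = m + (-p1 * b[1] + p2 * b[0]) + 0.5 * (a[0] * a[0] + a[1] * a[1])
    return (m_new, c * p1 - s * p2 - a[1], s * p1 + c * p2 + a[0])

def dcay_inv_dual(x, p):
    """Transpose of the matrix of d^L cay^{-1}(x), applied to the covector p."""
    lam, u1, u2 = x
    p0, p1, p2 = p
    return (p0 + 0.5 * (u2 * p1 - u1 * p2) + 0.25 * lam * (lam * p0 + u1 * p1 + u2 * p2),
            p1 + 0.5 * lam * p2,
            p2 - 0.5 * lam * p1)

def D2B(g):
    """D_2 B(g, e) = (0, -(1/2) J R^T a)."""
    c, s, a = g
    b = (c * a[0] + s * a[1], -s * a[0] + c * a[1])
    return (0.0, 0.5 * b[1], -0.5 * b[0])

def momentum(xi, h):
    """Equation (104): mu = (d^L cay^{-1}(h xi))^* dl/dxi + D_2 B(cay(h xi), e)."""
    x = tuple(h * v for v in xi)
    p = tuple(INERTIA[i] * xi[i] for i in range(3))
    q = dcay_inv_dual(x, p)
    e = D2B(cay(x))
    return tuple(q[i] + e[i] for i in range(3))

def velocity(mu, h, iterations=60):
    """Solve momentum(xi, h) = mu for xi by fixed-point iteration (small h assumed)."""
    xi = tuple(mu[i] / INERTIA[i] for i in range(3))
    for _ in range(iterations):
        guess = momentum(xi, h)
        xi = tuple(xi[i] + (mu[i] - guess[i]) / INERTIA[i] for i in range(3))
    return xi

def step(mu, h):
    """Equation (103): mu_k = Ad*_{cay(h xi)^{-1}} mu_{k-1} + theta(cay(h xi))."""
    xi = velocity(mu, h)
    return affine_coadjoint(cay(tuple(h * v for v in xi)), mu)

def casimir(mu):
    return mu[0] - 0.5 * (mu[1] ** 2 + mu[2] ** 2)

h = 0.01
mu = (1.0, 0.5, -0.3)
c0 = casimir(mu)
casimir_drift = 0.0
for _ in range(500):
    mu = step(mu, h)
    casimir_drift = max(casimir_drift, abs(casimir(mu) - c0))
```

The drift of the Casimir is expected to be at rounding level because each step is an affine coadjoint action, whatever the accuracy of the inner solve.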

4.4. Multisymplectic Lie Group Variational Integrators

In this paragraph, we briefly indicate how the discrete variational setting of the previous section can be extended to variational discretization in several independent variables, i.e., when the unknown is a field rather than a curve. In the continuous setting, the underlying geometric variational setting is the multisymplectic framework of field theories, see, e.g., [48]. Discrete multisymplectic variational versions of this setting were developed and applied in [61,62]. Multisymplectic variational discretization on Lie groups and the discrete Euler-Poincaré field equations were carried out in [63,64].

We will focus on the special case of fields defined on an open subset U of $R^{n}$ with smooth boundary, with values in a configuration manifold Q. We also assume that the Lagrangian only depends on the values of the fields and their first derivatives, not on the parameter $x \in R^{n}$ , so it is a map $L : T Q \oplus \dots \oplus T Q \to R$ . Hamilton’s principle for a field $q : U \subset R^{n} \to Q$ is

δ \int_{U} L (q (x), \partial_{1} q (x), \dots, \partial_{n} q (x)) d x = 0,

for arbitrary variations of the field q that vanish on the boundary of U, from which the Euler-Lagrange equations for the field $q (x)$ are obtained.

We shall focus on the case $Q = G$ a Lie group and for right-invariant Lagrangians, i.e.,

L (g h, v_{1} h, \dots, v_{n} h) = L (g, v_{1}, \dots, v_{n}),

for every $v_{1}, \dots, v_{n} \in T_{g} G$ and every $h \in G$ . In this case, L induces the reduced Lagrangian $ℓ : g \oplus \dots \oplus g \to R$ defined by $ℓ (v_{1} g^{- 1}, \dots, v_{n} g^{- 1}) = L (g, v_{1}, \dots, v_{n})$ . As in the ordinary Euler-Poincaré case recalled above, Hamilton’s principle yields the reduced variational principle

δ \int_{U} ℓ (ξ_{1}, \dots, ξ_{n}) d x = 0, δ ξ_{k} = \partial_{k} η + [η, ξ_{k}],

(108)

for an arbitrary field $η : U \to g$ vanishing on the boundary, which results in the Euler-Poincaré field equations

\sum_{k = 1}^{n} \partial_{k} \frac{δ ℓ}{δ ξ_{k}} + \sum_{k = 1}^{n} {ad}_{ξ_{k}}^{*} \frac{δ ℓ}{δ ξ_{k}} = 0 .

(109)

To guarantee the existence of a field $g : U \to G$ such that $ξ_{k} = \partial_{k} g g^{- 1}$ , $k = 1, \dots, n$ , the fields $ξ_{i}$ in Equation (109) must satisfy the relation $\partial_{k} ξ_{i} - \partial_{i} ξ_{k} = [ξ_{k}, ξ_{i}]$ . In terms of the associated Hamiltonian $h : g^{*} \oplus \dots \oplus g^{*} \to R$ , these equations give Equation (54) without cocycle, i.e., with $Θ^{k} = 0$ .

To include the case with cocycle in a variational setting, we shall proceed exactly as in Section 4.2, by passing to a central extension of G. This is here done in the context of the Euler-Poincaré field equations, rather than for the ordinary Euler-Poincaré equations. This is the content of the next paragraph.

Variational principle for the Lie-Poisson field equations with cocycle. The goal of this paragraph is to obtain a variational principle for the Lie-Poisson field equations with cocycle Equation (54) associated with Souriau’s polysymplectic model. By considering the Euler-Poincaré field equations Equation (109) on a central extension, we get the system

\{\begin{matrix} \sum_{k} \partial_{k} \frac{δ \hat{ℓ}}{δ ξ_{k}} + \sum_{k} {ad}_{ξ_{k}}^{*} \frac{δ \hat{ℓ}}{δ ξ_{k}} = \sum_{k} \frac{δ \hat{ℓ}}{δ u_{k}} Θ (ξ_{k}, \cdot) \\ \sum_{k} \partial_{k} \frac{δ \hat{ℓ}}{δ u_{k}} = 0 . \end{matrix}

(110)

They are the critical conditions for the variational principle

\begin{matrix} δ \int_{U} \hat{ℓ} ((ξ_{1}, u_{1}), \dots, (ξ_{n}, u_{n})) d x = 0, \\ for variations δ ξ_{k} = \partial_{k} η + [η, ξ_{k}], δ u_{k} = \partial_{k} v - Θ (η, ξ_{k}), \end{matrix}

(111)

which is just a special instance of Equation (108) applied to central extensions. The existence of a field $(g, α) : U \to \hat{G}$ imposes the conditions $\partial_{k} ξ_{i} - \partial_{i} ξ_{k} = [ξ_{k}, ξ_{i}]$ and $\partial_{k} u_{i} - \partial_{i} u_{k} = - Θ (ξ_{k}, ξ_{i})$ .

Given a Lagrangian $ℓ : g \oplus \dots \oplus g \to R$ , one can define the Lagrangian

\hat{ℓ} ((ξ_{1}, u_{1}), \dots, (ξ_{n}, u_{n})) = ℓ (ξ_{1}, \dots, ξ_{n}) + \sum_{k} u_{k}

(112)

on $\hat{g} \oplus \dots \oplus \hat{g}$ for which Equation (110) does reduce to a Lagrangian version of the Lie-Poisson field equation with cocycle Equation (54), as desired, where it is assumed that $Θ^{k} = Θ$ , for all k.

The same reasoning also directly applies on the Hamiltonian side, in which case the Lie-Poisson field equation with cocycle Equation (54) is an invariant subsystem of an ordinary Lie-Poisson field equation associated with a central extension of G.

Remark 18

(Polysymplectic vs multisymplectic setting). The field theories that are used in this paper can be described within the restricted setting of polysymplectic geometry. For such particular field theories the configuration bundle of the theory is trivial, the base is Euclidean, and the Lagrangian does not depend on the variables in the base. General classical field theories cannot be described by polysymplectic geometry and fit into the more general setting of multisymplectic geometry. This mainly comes from the fact that the field theoretic analogue to the cotangent bundle (endowed with the canonical symplectic form) of mechanics is the dual jet bundle of the configuration bundle (endowed with a canonical multisymplectic form). We shall apply below multisymplectic variational integrators to field equations belonging to the setting of polysymplectic geometry. In this case, they could logically be called polysymplectic variational integrators. Please note that we kept the original naming multisymplectic variational integrators, since the theory applies to general multisymplectic field theories, not just polysymplectic ones. See, e.g., [62,64] for applications of multisymplectic integrators to situations that are not covered by the polysymplectic formalism.

Multisymplectic Lie group integrators. To present multisymplectic integrators, we shall focus on the two dimensional case and assume that the fields are defined on a rectangle $U = [0, A] \times [0, B] \subset R^{2}$ . We shall write $(x_{1}, x_{2}) = (x, y)$ . Let Q be a configuration manifold and let $L : T Q \oplus T Q \to R$ be a Lagrangian. We shall consider the very special case of a discrete grid determined by $\{(x_{k}, y_{a}) = (k Δ x, a Δ y) ∣ k = 0, \dots, N_{1}, a = 0, \dots, N_{2}\}$ with given $Δ x$ and $Δ y$ . We shall denote by $q_{d} : \{(x_{k}, y_{a})\} \to Q$ , $q_{d} (x_{k}, y_{a}) = q_{k}^{a}$ a discrete field. A discrete Lagrangian is a map $L_{d} : Q \times Q \times Q \to R$ , $L_{d} = L_{d} (q_{k}^{a}, q_{k + 1}^{a}, q_{k}^{a + 1})$ that approximates the action integral of L on the rectangle $[x_{k}, x_{k + 1}] \times [y_{a}, y_{a + 1}]$ for a field interpolating the values $q_{k}^{a}, q_{k + 1}^{a}, q_{k}^{a + 1}$ . The discrete Hamilton principle reads

δ \sum_{k = 0}^{N_{1} - 1} \sum_{a = 0}^{N_{2} - 1} L_{d} (q_{k}^{a}, q_{k + 1}^{a}, q_{k}^{a + 1}) = 0,

(113)

for all variations $δ q_{d}$ of $q_{d}$ with vanishing boundary values. The discrete Euler-Lagrange equations are obtained as the critical point condition for a discrete field $q_{d}$ .
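The critical point condition can be checked numerically on a toy case. The following is a minimal sketch, assuming $Q = R$ and a forward-difference discrete Lagrangian for the linear wave Lagrangian $L = \frac{1}{2} (q_{y}^{2} - q_{x}^{2})$ ; this choice of $L_{d}$ and all function names are illustrative assumptions, not taken from the references. The residual is the derivative of the discrete action with respect to one interior node value, so it vanishes at every interior node exactly when the discrete Euler-Lagrange equations hold.

```python
def discrete_lagrangian(q, q_right, q_up, dx, dy):
    """Assumed L_d(q_k^a, q_{k+1}^a, q_k^{a+1}) for L = (q_y^2 - q_x^2) / 2."""
    return dx * dy * (0.5 * ((q_up - q) / dy) ** 2 - 0.5 * ((q_right - q) / dx) ** 2)


def discrete_action(field, dx, dy):
    """Double sum over k = 0..N1-1 and a = 0..N2-1; field[k][a] stores q_k^a."""
    n1, n2 = len(field) - 1, len(field[0]) - 1
    return sum(
        discrete_lagrangian(field[k][a], field[k + 1][a], field[k][a + 1], dx, dy)
        for k in range(n1)
        for a in range(n2)
    )


def euler_lagrange_residual(field, k, a, dx, dy, h=1e-3):
    """Derivative of the discrete action with respect to q_k^a (central difference).

    The action is quadratic in the node values, so the central difference
    equals the exact derivative up to rounding.
    """
    plus = [row[:] for row in field]
    minus = [row[:] for row in field]
    plus[k][a] += h
    minus[k][a] -= h
    return (discrete_action(plus, dx, dy) - discrete_action(minus, dx, dy)) / (2 * h)


def sample_field(func, n1, n2, dx, dy):
    """Evaluate a function on the grid (k dx, a dy)."""
    return [[func(k * dx, a * dy) for a in range(n2 + 1)] for k in range(n1 + 1)]


if __name__ == "__main__":
    dx = dy = 0.25
    # q = x^2 + y^2 solves q_xx - q_yy = 0, so it should be a discrete critical point.
    field = sample_field(lambda x, y: x * x + y * y, 4, 4, dx, dy)
    worst = max(
        abs(euler_lagrange_residual(field, k, a, dx, dy))
        for k in range(1, 4)
        for a in range(1, 4)
    )
    print("largest interior residual:", worst)
```

For this particular $L_{d}$ the residual reduces to the familiar five-point stencil for the wave equation, which is why a quadratic solution is reproduced without discretization error.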

Given a Lie group action $Φ : G \times Q \to Q$ , the discrete Lagrangian field momentum maps $J_{L_{d}}^{i} : Q \times Q \times Q \to g^{*}$ , $i = 1, 2, 3$ , are defined by

\begin{matrix} 〈J_{L_{d}}^{1} (q_{k}^{a}, q_{k + 1}^{a}, q_{k}^{a + 1}), ξ〉 & = 〈D_{1} L_{d} (q_{k}^{a}, q_{k + 1}^{a}, q_{k}^{a + 1}), ξ_{Q} (q_{k}^{a})〉 \\ 〈J_{L_{d}}^{2} (q_{k}^{a}, q_{k + 1}^{a}, q_{k}^{a + 1}), ξ〉 & = 〈D_{2} L_{d} (q_{k}^{a}, q_{k + 1}^{a}, q_{k}^{a + 1}), ξ_{Q} (q_{k + 1}^{a})〉 \\ 〈J_{L_{d}}^{3} (q_{k}^{a}, q_{k + 1}^{a}, q_{k}^{a + 1}), ξ〉 & = 〈D_{3} L_{d} (q_{k}^{a}, q_{k + 1}^{a}, q_{k}^{a + 1}), ξ_{Q} (q_{k}^{a + 1})〉 \end{matrix}

(114)

which satisfy $J_{L_{d}}^{1} + J_{L_{d}}^{2} + J_{L_{d}}^{3} = 0$ .
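The vanishing of the sum of the three momentum maps can be illustrated on a small example. The sketch below assumes $Q = R^{2}$ , $G = SO (2)$ acting by rotations, and a rotation-invariant discrete Lagrangian chosen only for illustration; the function names are hypothetical. For a Lie algebra element ξ (a real number here), the infinitesimal generator is $ξ_{Q} (q) = ξ (- q_{2}, q_{1})$ , and each $J_{L_{d}}^{i}$ pairs the i-th partial derivative of $L_{d}$ with the generator at the corresponding node.

```python
def discrete_lagrangian(q1, q2, q3, dx=0.1, dy=0.1):
    """Assumed rotation-invariant L_d on Q = R^2 (illustrative choice)."""
    def squared_distance(u, v):
        return (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2

    potential = 0.5 * (q1[0] ** 2 + q1[1] ** 2)
    return dx * dy * (
        0.5 * squared_distance(q3, q1) / dy ** 2
        - 0.5 * squared_distance(q2, q1) / dx ** 2
        - potential
    )


def gradient(slot, points, h=1e-5):
    """Partial derivative D_{slot+1} L_d by central differences (exact here: L_d is quadratic)."""
    grad = []
    for component in range(2):
        plus = [list(p) for p in points]
        minus = [list(p) for p in points]
        plus[slot][component] += h
        minus[slot][component] -= h
        grad.append((discrete_lagrangian(*plus) - discrete_lagrangian(*minus)) / (2 * h))
    return grad


def generator(q, xi=1.0):
    """Infinitesimal generator of the SO(2) action at q."""
    return (-xi * q[1], xi * q[0])


def momentum_maps(points, xi=1.0):
    """Pairings <J^i, xi> = <D_i L_d, xi_Q(q_i)> for i = 1, 2, 3."""
    values = []
    for slot in range(3):
        grad = gradient(slot, points)
        vec = generator(points[slot], xi)
        values.append(grad[0] * vec[0] + grad[1] * vec[1])
    return values


if __name__ == "__main__":
    nodes = [(0.3, -0.2), (0.5, 0.1), (0.1, 0.4)]  # q_k^a, q_{k+1}^a, q_k^{a+1}
    j1, j2, j3 = momentum_maps(nodes)
    print("J1, J2, J3:", j1, j2, j3, "sum:", j1 + j2 + j3)
```

The sum vanishes because it is the derivative of $L_{d}$ along a simultaneous rotation of the three nodes, under which this $L_{d}$ is invariant by construction.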

We refer to [61,62] for an introduction to multisymplectic variational integrators, including the notion of discrete multisymplecticity, discrete Cartan forms, and discrete field momentum maps; see also [71]. These integrators also satisfy a discrete Noether theorem in the presence of symmetries, as we shall see below in the special case of Lie groups.

Multisymplectic variational integrators on Lie groups were developed in [63,71] for application to geometrically exact (Cosserat) rods. As above, we shall focus on the two dimensional case and $U = [0, A] \times [0, B] \subset R^{2}$ . For $Q = G$ a Lie group, the discrete Lagrangian is a map $L_{d} : G \times G \times G \to R$ . We assume that the continuous Lagrangian is G-invariant and that the discrete Lagrangian $L_{d}$ inherits this invariance, i.e.,

L_{d} (g_{k}^{a} h, g_{k + 1}^{a} h, g_{k}^{a + 1} h) = L_{d} (g_{k}^{a}, g_{k + 1}^{a}, g_{k}^{a + 1}),

for every $h \in G$ . Hence, by passing to the quotient associated with this action we get a reduced discrete Lagrangian, still denoted $L_{d} : G \times G \to R$ , defined by $L_{d} (g_{k + 1}^{a} {(g_{k}^{a})}^{- 1}, g_{k}^{a + 1} {(g_{k}^{a})}^{- 1}) = L_{d} (g_{k}^{a}, g_{k + 1}^{a}, g_{k}^{a + 1})$ . As mentioned earlier, it is advantageous to introduce a retraction map $τ : g \to G$ , $τ (0) = e$ , from which the discrete reduced Lagrangian can be defined on a neighborhood of $(0, 0)$ in $g \times g$ via the relation

\begin{matrix} ℓ_{d} (ξ_{k}^{a}, ζ_{k}^{a}) = L_{d} (g_{k + 1}^{a} {(g_{k}^{a})}^{- 1}, g_{k}^{a + 1} {(g_{k}^{a})}^{- 1}), \\ \text{with } τ (Δ x ξ_{k}^{a}) = g_{k + 1}^{a} {(g_{k}^{a})}^{- 1} \text{ and } τ (Δ y ζ_{k}^{a}) = g_{k}^{a + 1} {(g_{k}^{a})}^{- 1} . \end{matrix}

(115)

The last two relations are thought of as discrete versions of $ξ = \partial_{1} g g^{- 1}$ , $ζ = \partial_{2} g g^{- 1}$ .
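The two defining relations for $ξ_{k}^{a}$ and $ζ_{k}^{a}$ can be sketched in code. The example below assumes $G = SU (2)$ represented by unit quaternions, with the Lie algebra identified with $R^{3}$ and the retraction τ taken to be the exponential map; the sample field $g (x, y) = \exp (y e_{3}) \exp (x e_{1})$ and all function names are illustrative assumptions. For this field $\partial_{2} g g^{- 1} = e_{3}$ , and $\partial_{1} g g^{- 1}$ is $e_{1}$ conjugated by $\exp (y e_{3})$ .

```python
import math


def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def qinv(q):
    """Inverse of a unit quaternion (its conjugate)."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def tau(v):
    """Retraction tau = exp: a vector in R^3 maps to a unit quaternion."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm < 1e-14:
        return (1.0, v[0], v[1], v[2])
    s = math.sin(norm) / norm
    return (math.cos(norm), s * v[0], s * v[1], s * v[2])


def tau_inv(q):
    """Local inverse of tau (quaternion logarithm), valid near the identity."""
    w, x, y, z = q
    s = math.sqrt(x * x + y * y + z * z)
    if s < 1e-14:
        return (x, y, z)
    angle = math.atan2(s, w)
    return (angle * x / s, angle * y / s, angle * z / s)


def sample_field(k, a, dx, dy):
    """Assumed discrete field g_k^a = exp(a dy e3) exp(k dx e1)."""
    return qmul(tau((0.0, 0.0, a * dy)), tau((k * dx, 0.0, 0.0)))


def discrete_velocities(k, a, dx, dy):
    """Solve tau(dx xi) = g_{k+1}^a (g_k^a)^{-1} and tau(dy zeta) = g_k^{a+1} (g_k^a)^{-1}."""
    g_inv = qinv(sample_field(k, a, dx, dy))
    xi = tuple(c / dx for c in tau_inv(qmul(sample_field(k + 1, a, dx, dy), g_inv)))
    zeta = tuple(c / dy for c in tau_inv(qmul(sample_field(k, a + 1, dx, dy), g_inv)))
    return xi, zeta


if __name__ == "__main__":
    print(discrete_velocities(2, 3, 0.1, 0.05))
```

With the exponential map as retraction and this particular field, the discrete quantities coincide with the continuous ones at the node; for a general field or another retraction (for instance the Cayley map) they are only approximations.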

The discrete Euler-Poincaré field equations for $ℓ_{d}$ are obtained by computing the discrete variational principle induced on the discrete action $\sum_{k = 0}^{N_{1} - 1} \sum_{a = 0}^{N_{2} - 1} ℓ_{d} (ξ_{k}^{a}, ζ_{k}^{a})$ from the discrete Hamilton principle $δ \sum_{k = 0}^{N_{1} - 1} \sum_{a = 0}^{N_{2} - 1} L_{d} (g_{k}^{a}, g_{k + 1}^{a}, g_{k}^{a + 1}) = 0$ recalled above in Equation (113). The main step in this process is to compute the variations $δ ξ_{k}^{a}$ and $δ ζ_{k}^{a}$ induced by arbitrary variations $δ g_{k}^{a}$ . One finds the expression

\begin{matrix} δ ξ_{k}^{a} & = \frac{1}{Δ x} d^{L} τ^{- 1} (Δ x ξ_{k}^{a}) \cdot ({Ad}_{τ {(Δ x ξ_{k}^{a})}^{- 1}} η_{k + 1}^{a} - η_{k}^{a}) \\ δ ζ_{k}^{a} & = \frac{1}{Δ y} d^{L} τ^{- 1} (Δ y ζ_{k}^{a}) \cdot ({Ad}_{τ {(Δ y ζ_{k}^{a})}^{- 1}} η_{k}^{a + 1} - η_{k}^{a}) . \end{matrix}

(116)

The discrete field Euler-Poincaré variational principle thus reads

δ \sum_{k = 0}^{N_{1} - 1} \sum_{a = 0}^{N_{2} - 1} ℓ_{d} (ξ_{k}^{a}, ζ_{k}^{a}) = 0,

(117)

with respect to variations $δ ξ_{k}^{a}$ , $δ ζ_{k}^{a}$ of the form Equation (116) with $η_{k}^{a}$ vanishing at the boundary. It yields the discrete Euler-Poincaré field equations.

\begin{matrix} \frac{1}{Δ x} ({Ad}_{τ {(Δ x ξ_{k - 1}^{a})}^{- 1}}^{*} μ_{k - 1}^{a} - μ_{k}^{a}) + \frac{1}{Δ y} ({Ad}_{τ {(Δ y ζ_{k}^{a - 1})}^{- 1}}^{*} ν_{k}^{a - 1} - ν_{k}^{a}) = 0, \\ μ_{k}^{a} : = d^{L} τ^{- 1} {(Δ x ξ_{k}^{a})}^{*} \frac{δ ℓ}{δ ξ_{k}^{a}}, ν_{k}^{a} : = d^{L} τ^{- 1} {(Δ y ζ_{k}^{a})}^{*} \frac{δ ℓ}{δ ζ_{k}^{a}} . \end{matrix}

(118)

We refer to [63,71] for details, including the treatment of boundary conditions, the description of the associated discrete Cartan forms, the discrete field momentum maps, as well as the symplectic and multisymplectic characters of the scheme.
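As a consistency check on Equation (118), it may help to write out the simplest special case. The following worked equations assume an abelian group $G = (R^{n}, +)$ with the identity map as retraction, $τ (ξ) = ξ$ ; this case and the sample Lagrangian are illustrative assumptions and are not treated in the text. Then ${Ad} = {id}$ and $d^{L} τ^{- 1} = {id}$ , so the scheme becomes a discrete conservation law.

```latex
\begin{aligned}
&\xi_k^a = \frac{g_{k+1}^a - g_k^a}{\Delta x}, \qquad
 \zeta_k^a = \frac{g_k^{a+1} - g_k^a}{\Delta y}, \qquad
 \mu_k^a = \frac{\delta \ell_d}{\delta \xi_k^a}, \qquad
 \nu_k^a = \frac{\delta \ell_d}{\delta \zeta_k^a}, \\[4pt]
&\frac{1}{\Delta x}\left(\mu_{k-1}^a - \mu_k^a\right)
 + \frac{1}{\Delta y}\left(\nu_k^{a-1} - \nu_k^a\right) = 0 . \\[4pt]
&\text{For } \ell_d(\xi,\zeta) = \tfrac12 |\zeta|^2 - \tfrac12 |\xi|^2 :
 \quad \mu_k^a = -\xi_k^a, \quad \nu_k^a = \zeta_k^a, \\[4pt]
&\frac{g_{k+1}^a - 2 g_k^a + g_{k-1}^a}{\Delta x^2}
 - \frac{g_k^{a+1} - 2 g_k^a + g_k^{a-1}}{\Delta y^2} = 0 .
\end{aligned}
```

The last line is the standard five-point stencil for the wave equation, so in the abelian case the discrete Euler-Poincaré field equations reduce to a familiar finite-difference scheme.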

We just recall below the expression of the field momentum maps Equation (114) which take the following form:

\begin{matrix} J_{L_{d}}^{1} (g_{k}^{a}, g_{k + 1}^{a}, g_{k}^{a + 1}) & = - \frac{1}{Δ x} {Ad}_{g_{k}^{a}}^{*} μ_{k}^{a} - \frac{1}{Δ y} {Ad}_{g_{k}^{a}}^{*} ν_{k}^{a} \\ J_{L_{d}}^{2} (g_{k}^{a}, g_{k + 1}^{a}, g_{k}^{a + 1}) & = \frac{1}{Δ x} {Ad}_{g_{k}^{a}}^{*} μ_{k}^{a} \\ J_{L_{d}}^{3} (g_{k}^{a}, g_{k + 1}^{a}, g_{k}^{a + 1}) & = \frac{1}{Δ y} {Ad}_{g_{k}^{a}}^{*} ν_{k}^{a} . \end{matrix}

The discrete Noether theorem then asserts that a certain $g^{*}$ -valued discrete integral of $J_{L_{d}}^{i}$ along the boundary of any subgrid domain is zero, see [71].

Multisymplectic variational discretization for Lie-Poisson field equations with cocycle. Based on the previous result, we first give below a multisymplectic integrator for the Euler-Poincaré field equations on central extensions. Then we deduce a multisymplectic integrator for the Lie-Poisson field equations with cocycle appearing in the polysymplectic Souriau model. The next proposition is the multisymplectic version of Proposition 15.

Proposition 19.

(Discrete Euler-Poincaré field equations on central extensions) The following are equivalent:

(a)
The discrete field $(ξ_{k}^{a}, u_{k}^{a}, ζ_{k}^{a}, w_{k}^{a})$ is critical for the discrete Euler-Poincaré field variational principle
$δ \sum_{a, k} \hat{ℓ} ((ξ_{k}^{a}, u_{k}^{a}), (ζ_{k}^{a}, w_{k}^{a})) = 0,$
with respect to variations
$\{\begin{matrix} δ ξ_{k}^{a} = \frac{1}{Δ x} d^{L} {\bar{τ}}^{- 1} (Δ x ξ_{k}^{a}) \cdot ({Ad}_{τ {(Δ x ξ_{k}^{a})}^{- 1}} η_{k + 1}^{a} - η_{k}^{a}) \\ δ ζ_{k}^{a} = \frac{1}{Δ y} d^{L} {\bar{τ}}^{- 1} (Δ y ζ_{k}^{a}) \cdot ({Ad}_{τ {(Δ y ζ_{k}^{a})}^{- 1}} η_{k}^{a + 1} - η_{k}^{a}) \\ δ u_{k}^{a} = \frac{1}{Δ x} (v_{k + 1}^{a} - v_{k}^{a} - D_{2} B (\bar{τ} (Δ x ξ_{k}^{a}), e) \cdot η_{k}^{a} + D_{1} B (e, \bar{τ} (Δ x ξ_{k}^{a})) \cdot η_{k + 1}^{a}) \\ δ w_{k}^{a} = \frac{1}{Δ y} (v_{k}^{a + 1} - v_{k}^{a} - D_{2} B (\bar{τ} (Δ y ζ_{k}^{a}), e) \cdot η_{k}^{a} + D_{1} B (e, \bar{τ} (Δ y ζ_{k}^{a})) \cdot η_{k}^{a + 1}) \end{matrix}$
where $η_{k}^{a} \in g$ and $v_{k}^{a} \in R$ are arbitrary discrete fields vanishing at the boundary.

(b)
The discrete field $(ξ_{k}^{a}, u_{k}^{a}, ζ_{k}^{a}, w_{k}^{a})$ is a solution of the discrete Euler-Poincaré field equations
$\{\begin{matrix} \frac{1}{Δ x} ({Ad}_{τ {(Δ x ξ_{k - 1}^{a})}^{- 1}}^{*} μ_{k - 1}^{a} + a_{k - 1}^{a} θ (\bar{τ} (Δ x ξ_{k - 1}^{a})) - μ_{k}^{a}) \\ + \frac{1}{Δ y} ({Ad}_{τ {(Δ y ζ_{k}^{a - 1})}^{- 1}}^{*} ν_{k}^{a - 1} + b_{k}^{a - 1} θ (\bar{τ} (Δ y ζ_{k}^{a - 1})) - ν_{k}^{a}) = 0 \\ \frac{1}{Δ x} (a_{k - 1}^{a} - a_{k}^{a}) + \frac{1}{Δ y} (b_{k}^{a - 1} - b_{k}^{a}) = 0 \end{matrix}$ (119)
with
$\{\begin{matrix} μ_{k}^{a} = d^{L} {\bar{τ}}^{- 1} {(Δ x ξ_{k}^{a})}^{*} \frac{δ \hat{ℓ}}{δ ξ_{k}^{a}} + \frac{δ \hat{ℓ}}{δ u_{k}^{a}} D_{2} B (\bar{τ} (Δ x ξ_{k}^{a}), e) \\ ν_{k}^{a} = d^{L} {\bar{τ}}^{- 1} {(Δ y ζ_{k}^{a})}^{*} \frac{δ \hat{ℓ}}{δ ζ_{k}^{a}} + \frac{δ \hat{ℓ}}{δ w_{k}^{a}} D_{2} B (\bar{τ} (Δ y ζ_{k}^{a}), e) \\ a_{k}^{a} = \frac{δ \hat{ℓ}}{δ u_{k}^{a}}, b_{k}^{a} = \frac{δ \hat{ℓ}}{δ w_{k}^{a}} . \end{matrix}$ (120)

Proof.

The proof can be obtained by appropriate extension of the proof of Proposition 15, by using the multisymplectic variational setting recalled in the previous paragraph. ☐

We note that the relation between the solution of the discrete Euler-Poincaré equations and the solution $(g_{k}^{a}, α_{k}^{a})$ of the discrete Euler-Lagrange field equations on the Lie group $\hat{G}$ is given as

\begin{matrix} (ξ_{k}^{a}, u_{k}^{a}) & = \frac{1}{Δ x} τ^{- 1} ((g_{k + 1}^{a}, α_{k + 1}^{a}) {(g_{k}^{a}, α_{k}^{a})}^{- 1}) \\ (ζ_{k}^{a}, w_{k}^{a}) & = \frac{1}{Δ y} τ^{- 1} ((g_{k}^{a + 1}, α_{k}^{a + 1}) {(g_{k}^{a}, α_{k}^{a})}^{- 1}) \end{matrix}

which is explicitly given by the relations

\begin{matrix} ξ_{k}^{a} & = \frac{1}{Δ x} τ^{- 1} (g_{k + 1}^{a} {(g_{k}^{a})}^{- 1}), ζ_{k}^{a} = \frac{1}{Δ y} τ^{- 1} (g_{k}^{a + 1} {(g_{k}^{a})}^{- 1}) \\ u_{k}^{a} & = \frac{1}{Δ x} (α_{k + 1}^{a} - α_{k}^{a} - B (g_{k}^{a}, {(g_{k}^{a})}^{- 1}) + B (g_{k + 1}^{a}, {(g_{k}^{a})}^{- 1})) \\ w_{k}^{a} & = \frac{1}{Δ y} (α_{k}^{a + 1} - α_{k}^{a} - B (g_{k}^{a}, {(g_{k}^{a})}^{- 1}) + B (g_{k}^{a + 1}, {(g_{k}^{a})}^{- 1})) . \end{matrix}

(121)
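The explicit expression for $u_{k}^{a}$ in Equation (121) can be compared with a direct computation in the extended group. The sketch below assumes the group law $(g, α) (h, β) = (g h, α + β + B (g, h))$ on $\hat{G}$ , takes $G = (R^{2}, +)$ with the 2-cocycle $B (g, h) = g_{1} h_{2}$ (a polarized Heisenberg group), and uses the identity chart as retraction; these choices and all function names are illustrative assumptions, not taken from the text.

```python
def cocycle(g, h):
    """Assumed group 2-cocycle on G = (R^2, +): B(g, h) = g_1 h_2."""
    return g[0] * h[1]


def multiply(p, q):
    """Assumed product on the central extension: (g, alpha)(h, beta)."""
    (g, alpha), (h, beta) = p, q
    return ((g[0] + h[0], g[1] + h[1]), alpha + beta + cocycle(g, h))


def inverse(p):
    """Inverse in the extension: (g, alpha)^{-1} = (g^{-1}, -alpha - B(g, g^{-1}))."""
    g, alpha = p
    g_inv = (-g[0], -g[1])
    return (g_inv, -alpha - cocycle(g, g_inv))


def reduced_increment(p_next, p, step):
    """(1 / step) * (p_next * p^{-1}), using the identity chart as retraction."""
    dg, dalpha = multiply(p_next, inverse(p))
    return ((dg[0] / step, dg[1] / step), dalpha / step)


def explicit_u(p_next, p, step):
    """Second component written out as in Equation (121)."""
    (g_next, alpha_next), (g, alpha) = p_next, p
    g_inv = (-g[0], -g[1])
    return (alpha_next - alpha - cocycle(g, g_inv) + cocycle(g_next, g_inv)) / step


if __name__ == "__main__":
    node = ((1.0, 2.0), 0.5)        # (g_k^a, alpha_k^a)
    neighbour = ((1.5, 2.5), 0.7)   # (g_{k+1}^a, alpha_{k+1}^a)
    xi, u = reduced_increment(neighbour, node, 0.1)
    print("xi:", xi, "u:", u, "explicit u:", explicit_u(neighbour, node, 0.1))
```

The same functions give $(ζ_{k}^{a}, w_{k}^{a})$ when the neighbour in the y direction and the step $Δ y$ are passed instead.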

Similarly, the variations $η_{k}^{a}, v_{k}^{a}$ used in the discrete Euler-Poincaré variational principle are related to the variations $δ g_{k}^{a}$ , $δ α_{k}^{a}$ used in the discrete Hamilton principle via the equality $(η_{k}^{a}, v_{k}^{a}) = (δ g_{k}^{a}, δ α_{k}^{a}) {(g_{k}^{a}, α_{k}^{a})}^{- 1} = (δ g_{k}^{a} {(g_{k}^{a})}^{- 1}, δ α_{k}^{a} + D_{1} B (g_{k}^{a}, {(g_{k}^{a})}^{- 1}) \cdot δ g_{k}^{a})$ .

The discrete field momentum maps are computed as

\begin{matrix} J_{L_{d}}^{1} ((g_{k}^{a}, α_{k}^{a}), (g_{k + 1}^{a}, α_{k + 1}^{a}), (g_{k}^{a + 1}, α_{k}^{a + 1})) & = - \frac{1}{Δ x} ({Ad}_{g_{k}^{a}}^{*} μ_{k}^{a} + a_{k}^{a} θ ({(g_{k}^{a})}^{- 1}), a_{k}^{a}) \\ & \quad - \frac{1}{Δ y} ({Ad}_{g_{k}^{a}}^{*} ν_{k}^{a} + b_{k}^{a} θ ({(g_{k}^{a})}^{- 1}), b_{k}^{a}) \\ J_{L_{d}}^{2} ((g_{k}^{a}, α_{k}^{a}), (g_{k + 1}^{a}, α_{k + 1}^{a}), (g_{k}^{a + 1}, α_{k}^{a + 1})) & = \frac{1}{Δ x} ({Ad}_{g_{k}^{a}}^{*} μ_{k}^{a} + a_{k}^{a} θ ({(g_{k}^{a})}^{- 1}), a_{k}^{a}) \\ J_{L_{d}}^{3} ((g_{k}^{a}, α_{k}^{a}), (g_{k + 1}^{a}, α_{k + 1}^{a}), (g_{k}^{a + 1}, α_{k}^{a + 1})) & = \frac{1}{Δ y} ({Ad}_{g_{k}^{a}}^{*} ν_{k}^{a} + b_{k}^{a} θ ({(g_{k}^{a})}^{- 1}), b_{k}^{a}) \end{matrix}

from which the discrete field Noether theorem can be stated for the solutions of Equation (119).

The multisymplectic integrator for the Lie-Poisson field equations with cocycle is obtained in the next Proposition, which is the multisymplectic analogue to Proposition 16.

Proposition 20.

(Multisymplectic integrator for Lie-Poisson field equations with cocycle) Let $h : g^{*} \times g^{*} \to R$ be a Hamiltonian assumed to be hyperregular, with associated Lagrangian $ℓ : g \times g \to R$ . Then the numerical scheme

$\begin{matrix} \frac{1}{Δ x} ({Ad}_{τ {(Δ x ξ_{k - 1}^{a})}^{- 1}}^{*} μ_{k - 1}^{a} + θ (\bar{τ} (Δ x ξ_{k - 1}^{a})) - μ_{k}^{a}) \\ + \frac{1}{Δ y} ({Ad}_{τ {(Δ y ζ_{k}^{a - 1})}^{- 1}}^{*} ν_{k}^{a - 1} + θ (\bar{τ} (Δ y ζ_{k}^{a - 1})) - ν_{k}^{a}) = 0 \end{matrix}$ (122)

with

$\{\begin{matrix} μ_{k}^{a} = d^{L} {\bar{τ}}^{- 1} {(Δ x ξ_{k}^{a})}^{*} \frac{δ ℓ}{δ ξ_{k}^{a}} + D_{2} B (\bar{τ} (Δ x ξ_{k}^{a}), e) \\ ν_{k}^{a} = d^{L} {\bar{τ}}^{- 1} {(Δ y ζ_{k}^{a})}^{*} \frac{δ ℓ}{δ ζ_{k}^{a}} + D_{2} B (\bar{τ} (Δ y ζ_{k}^{a}), e) \end{matrix}$ (123)

is a multisymplectic scheme for the Lie-Poisson field equations with cocycle

$\partial_{x} μ + \partial_{y} ν + {ad}_{\frac{δ h}{δ μ}}^{*} μ + {ad}_{\frac{δ h}{δ ν}}^{*} ν = Θ (\frac{δ h}{δ μ}, \cdot) + Θ (\frac{δ h}{δ ν}, \cdot) .$ (124)

Proof.

This follows from Proposition 19 and the choice Equation (112). ☐

If the retraction map $τ$ satisfies $τ (- ξ) τ (ξ) = e$ , the scheme can be rewritten in a simpler way, as done in Equation (107) in the symplectic case.

The benefit of the structure preserving properties of the proposed numerical schemes will be exploited in a future work.

5. Conclusions

In artificial intelligence, machine learning algorithms increasingly rely on methodological tools from physics and statistical mechanics. The laws and principles that underpin this physics can shed new light on the conceptual basis of artificial intelligence. In particular, the principle of maximum entropy and François Massieu’s notion of characteristic functions enrich the variational formalism of machine learning. Conversely, the obstacles that artificial intelligence encounters when extending its application domains raise questions about the foundations of statistical physics, such as how to generalize Gibbs densities to more elaborate representation spaces, for example data on homogeneous symplectic manifolds and Lie groups. The porosity between the two disciplines dates back to the birth of artificial intelligence, with the use of Boltzmann machines and the problem of finding robust methods for computing the partition function. More recently, gradient algorithms for neural network learning have used large-scale robust extensions of the natural gradient of Fisher-based information geometry (to ensure reparameterization invariance), stochastic gradients based on the Langevin equation (to ensure regularization), or their coupling, called “Natural Langevin Dynamics”. Concomitantly, over the last fifty years, statistical physics has been the object of new geometric formalizations (contact, Dirac, or symplectic geometry, variational principles, etc.) that aim to give a covariant formulation of the thermodynamics of dynamical systems, such as Lie group thermodynamics. Finally, the study of geometric integrators, such as symplectic integrators with good covariance and stability properties (use of symmetries, preservation of invariants and momentum maps), will open the door to a new generation of numerical schemes.
Machine learning inference processes are just beginning to adapt these new integration schemes and their remarkable stability properties to increasingly abstract data representation spaces. Artificial intelligence currently uses only a very limited portion of the conceptual and methodological tools of statistical physics. The purpose of this paper was to encourage constructive dialogue around a common foundation, to allow the establishment of new principles and laws governing the two disciplines in a unified approach.

Author Contributions

Writing—original draft preparation, F.B., F.G.-B.; The authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Souriau J.-M. Structure des Systèmes Dynamiques. Dunod; Paris, France: 1969.
2. Marle C.-M. From tools in symplectic and Poisson geometry to J.-M. Souriau’s theories of statistical mechanics and thermodynamics. Entropy. 2016;18:370. doi: 10.3390/e18100370.
3. Barbaresco F. Koszul information geometry and Souriau geometric temperature/capacity of Lie Group Thermodynamics. Entropy. 2014;16:4521–4565. doi: 10.3390/e16084521.
4. Barbaresco F. Geometric theory of heat from Souriau Lie groups thermodynamics and Koszul Hessian geometry: Applications in information geometry for exponential families. Entropy. 2016;18:386. doi: 10.3390/e18110386.
5. Barbaresco F. Higher order geometric theory of information and heat based on polysymplectic geometry of Souriau Lie groups thermodynamics and their contextures: The bedrock for Lie Group machine learning. Entropy. 2018;20:840. doi: 10.3390/e20110840.
6. Barbaresco F. Jean-Louis Koszul and the Elementary Structures of Information Geometry. In: Geometric Structures of Information. Springer; Berlin, Germany: 2018. pp. 333–392.
7. Barbaresco F. Souriau exponential map algorithm for machine learning on matrix Lie groups. In: Nielsen F., Barbaresco F., editors. GSI 2019. LNCS. Volume 11712. Springer; Berlin/Heidelberg, Germany: 2019.
8. Barbaresco F. Lie group machine learning and Gibbs density on Poincaré unit disk from Souriau Lie groups thermodynamics and SU(1,1) coadjoint orbits. In: Nielsen F., Barbaresco F., editors. GSI 2019. LNCS. Volume 11712. Springer; Cham, Switzerland: 2019.
9. Barbaresco F. Application Exponentielle de Matrice par l’extension de l’algorithme de Jean-Marie Souriau, Utilisable pour le tir Géodésique et l’apprentissage Machine pour les Groupes de Lie. Colloque GRETSI 2019. 2019. Available online: http://gretsi.fr/colloque2019/ (accessed on 20 April 2020).
10. Barbaresco F. Les Structures Géométriques de l’information de Jean-Louis Koszul. Colloque GRETSI. 2019. Available online: http://gretsi.fr/colloque2019/ (accessed on 20 April 2020).
11. Miolane N., Le Brigant A., Cabanes Y. Geomstats: A Python Package for Riemannian Geometry in Machine Learning. 2020. Available online: https://hal.inria.fr/hal-02536154/file/main.pdf (accessed on 20 April 2020).
12. Berthet Q., Blondel M., Teboul O., Cuturi M., Vert J.-P., Bach F. Learning with Differentiable Perturbed Optimizers. arXiv. 2020;2002.08676.
13. Blondel M., Martins A.F.T., Niculae V. Learning with Fenchel-Young Losses. J. Mach. Learn. Res. 2020;21:1–69.
14. Wainwright M.J., Jordan M.I. Graphical models, exponential families, and variational inference. Found. Trends Mach. Learn. 2008;1:1–305. doi: 10.1561/2200000001.
15. Fréchet M.R. Sur l’extension de certaines évaluations statistiques au cas de petits échantillons. Rev. Inst. Int. Stat. 1943;11:182–205. doi: 10.2307/1401114. (In French)
16. Jaynes E.T. Information theory and statistical mechanics I, II. Phys. Rev. 1957;106:620. doi: 10.1103/PhysRev.106.620.
17. Koszul J.L. Variétés localement plates et convexité. Osaka J. Math. 1965;2:285–290. (In French)
18. Koszul J.L. Exposés sur les Espaces Homogènes Symétriques. Publicação da Sociedade de Matematica de São Paulo; São Paulo, Brazil: 1959. (In French)
19. Koszul J.L. Déformations des variétés localement plates. Ann. Inst. Fourier. 1968;18:103–114. doi: 10.5802/aif.279. (In French)
20. Vinberg E.B. Homogeneous convex cones. Trans. Mosc. Math. Soc. 1963;12:340–363.
21. Vinberg E.B. The Theory of Homogeneous Convex Cones. Tr. Mosk. Mat. Obs. 1963;12:303–358.
22. Koszul J.L. Introduction to Symplectic Geometry. Science Press; Beijing, China: 1986. (In Chinese)
23. Koszul J.L. Selected Papers. Series in Pure Mathematics. Volume 17. World Scientific Publishing; Singapore: 1994.
24. Souriau J.-M. Mécanique Statistique, Groupes de Lie et Cosmologie. 2020. Available online: https://www.academia.edu/42630654/Statistical_Mechanics_Lie_Group_and_Cosmology_1_st_part_Symplectic_Model_of_Statistical_Mechanics (accessed on 20 April 2020).
25. Nencka H., Streater R.F. Information geometry for some Lie algebras. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 1999;2:441–460. doi: 10.1142/S0219025799000254.
26. Gay-Balmaz F., Holm D.D. Selective decay by Casimir dissipation in inviscid fluids. Nonlinearity. 2013;26:495–524. doi: 10.1088/0951-7715/26/2/495.
27. Gay-Balmaz F., Holm D.D. A geometric theory of selective decay with applications in MHD. Nonlinearity. 2014;27:1747–1777. doi: 10.1088/0951-7715/27/8/1747.
28. Balian R., Alhassid Y., Reinhardt H. Dissipation in many-body systems: A geometric approach based on information theory. Phys. Rep. 1986;131:1–146. doi: 10.1016/0370-1573(86)90005-0.
29. Libermann P., Marle C.-M. Symplectic Geometry and Analytical Mechanics. Reidel; Kufstein, Austria: 1987.
30. Abraham R., Marsden J.E. Foundations of Mechanics. Benjamin-Cummings Publ. Co.; San Francisco, CA, USA: 1978.
31. Marsden J.E., Ratiu T.S. Introduction to Mechanics and Symmetry. Springer; Berlin, Germany: 2003.
32. Chirco G., Laudato M., Mele F.M. Covariant momentum map thermodynamics for parametrized field theories. arXiv. 2019;1911.06224.
33. Gay-Balmaz F., Ratiu T.S. Affine Lie-Poisson reduction, Yang-Mills magnetohydrodynamics, and superfluids. J. Phys. A Math. Theor. 2008;41:344007. doi: 10.1088/1751-8113/41/34/344007.
34. Gay-Balmaz F., Ratiu T.S. The geometric structure of complex fluids. Adv. Appl. Math. 2009;42:176–275. doi: 10.1016/j.aam.2008.06.002.
35. Gay-Balmaz F., Ratiu T.S., Tronci C. Equivalent theories of liquid crystal dynamics. Arch. Ration. Mech. Anal. 2013;210:773–811. doi: 10.1007/s00205-013-0673-1.
36. Ellis D.C.P., Gay-Balmaz F., Holm D.D., Putkaradze V., Ratiu T.S. Symmetry reduced dynamics of charged molecular strands. Arch. Ration. Mech. Anal. 2009;197:811–902. doi: 10.1007/s00205-010-0305-y.
37. Gay-Balmaz F., Holm D.D., Ratiu T.S. Variational principles for spin systems and the Kirchhoff rod. J. Geom. Mech. 2009;1:417–444. doi: 10.3934/jgm.2009.1.417.
38. de Saxcé G. Euler-Poincaré equation for Lie groups with non null symplectic cohomology. Application to the mechanics. In: Nielsen F., Barbaresco F., editors. GSI 2019. LNCS. Volume 11712. Springer; Berlin, Germany: 2019.
39. Marle C.-M. Projection Stéréographique et Moments. 2019. Available online: https://hal.archives-ouvertes.fr/hal-02157930/ (accessed on 20 April 2020).
40. Bismut J.-M. Mécanique Aléatoire. Lecture Notes in Math. Volume 866. Springer; Berlin/Heidelberg, Germany; New York, NY, USA: 1981.
41. Lazaro-Cami J.A., Ortega J.-P. Stochastic Hamiltonian dynamical systems. Rep. Math. Phys. 2008;61:65–122. doi: 10.1016/S0034-4877(08)80003-1.
42. Bou-Rabee N., Owhadi H. Stochastic variational integrators. IMA J. Numer. Anal. 2009;29:421–443. doi: 10.1093/imanum/drn018.
43. Holm D.D. Variational principles for stochastic fluid dynamics. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 2015;471:2176. doi: 10.1098/rspa.2014.0963.
44. Gay-Balmaz F., Holm D.D. Stochastic geometric models with non-stationary spatial correlations in Lagrangian fluid flows. J. Nonlin. Sci. 2018;28:873–904. doi: 10.1007/s00332-017-9431-0.
45. Gay-Balmaz F., Holm D.D. Predicting uncertainty in geometric fluid mechanics. Disc. Cont. Dyn. Syst. Ser. S. 2020;13:1229–1242. doi: 10.3934/dcdss.2020071.
46. Arnaudon A., De Castro A.L., Holm D.D. Noise and Dissipation on Coadjoint Orbits. J. Nonlinear Sci. 2018;28:91–145. doi: 10.1007/s00332-017-9404-3.
47. Günther C. The polysymplectic Hamiltonian formalism in field theory and calculus of variations I: The local case. J. Diff. Geom. 1987;25:23–53. doi: 10.4310/jdg/1214440723.
48. Gotay M.J., Isenberg J., Marsden J.E., Montgomery R., Sniatycki J., Yasskin P.B. Momentum maps and classical fields. Part I: Covariant field theory. arXiv. 1997;physics/9801019v2.
49. Ingarden R.S., Nakagomi T. The second order extension of the Gibbs state. Open Syst. Inf. Dyn. 1992;1:259–268. doi: 10.1007/BF02228947.
50. Ingarden R.S., Kossakowski A., Ohya M. Information Dynamics and Open Systems: Classical and Quantum Approach. Fundamental Theories of Physics. Volume 86. Springer; Berlin, Germany: 1997.
51. Ingarden R.S., Meller J. Temperatures in linguistics as a model of thermodynamics. Open Syst. Inf. Dyn. 1994;2:211–230. doi: 10.1007/BF02228965.
52. Jaworski W., Ingarden R.S. On the partition function in information thermodynamics with higher order temperatures. Bull. Acad. Pol. Sci. Sér. Phys. Astron. 1980;1:28–119.
53. Jaworski W. Investigation of the Thermodynamic Limit for the States Maximizing Entropy under Auxiliary Conditions for Higher-Order Statistical Moments. Ph.D. Thesis. Institute of Physics, Nicolaus Copernicus University; Torun, Poland: 1983. (In Polish)
54. Jaworski W. On the thermodynamic limit in information thermodynamics with higher-order temperatures. Acta Phys. Pol. 1983;A63:3–19.
55. Eriksen P.S. Geodesics Connected with the Fisher Metric on the Multivariate Normal Manifold. Technical Report 86-13. Institute of Electronic Systems, Aalborg University; Aalborg, Denmark: 1986.
56. Eriksen P.S. Geodesics connected with the Fisher metric on the multivariate normal manifold. In: Proceedings of the GST Workshop. University of Lancaster; Lancaster, UK: 1987.
57. Balian R. The entropy-based quantum metric. Entropy. 2014;16:3878–3888. doi: 10.3390/e16073878.
58. Hairer E., Lubich C., Wanner G. Geometric Numerical Integration, Structure-Preserving Algorithms for Ordinary Differential Equations. Springer Series in Computational Mathematics. Volume 31. Springer; Berlin, Germany: 2010.
59. Marsden J.E., West M. Discrete mechanics and variational integrators. Acta Numer. 2001;10:357–514. doi: 10.1017/S096249290100006X.
60. Marsden J.E., Pekarsky S., Shkoller S. Discrete Euler-Poincaré and Lie-Poisson equations. Nonlinearity. 1998;12:1647–1662. doi: 10.1088/0951-7715/12/6/314.
61. Marsden J.E., Patrick G.W., Shkoller S. Multisymplectic geometry, variational integrators, and nonlinear PDEs. Commun. Math. Phys. 1998;199:351–395. doi: 10.1007/s002200050505.
62. Lew A., Marsden J.E., Ortiz M., West M. Asynchronous variational integrators. Arch. Ration. Mech. Anal. 2003;167:85–146. doi: 10.1007/s00205-002-0212-y.
63. Demoures F., Gay-Balmaz F., Kobilarov M., Ratiu T.S. Multisymplectic Lie group variational integrators for a geometrically exact beam in $R^{3}$. Commun. Nonlinear Sci. Numer. Simul. 2014;19:3492–3512. doi: 10.1016/j.cnsns.2014.02.032.
64. Demoures F., Gay-Balmaz F., Ratiu T.S. Multisymplectic variational integrators for nonsmooth Lagrangian continuum mechanics. Forum of Mathematics, Sigma. Volume 4. Cambridge University Press; Cambridge, UK: 2016. 54p.
65. Demoures F., Gay-Balmaz F., Desbrun M., Ratiu T.S., Aragón A. A multisymplectic integrator for elastodynamic frictionless impact problems. Comput. Methods Appl. Mech. Eng. 2017;315:1025–1052. doi: 10.1016/j.cma.2016.11.011.
66. Gay-Balmaz F., Putkaradze V. Variational discretizations for the dynamics of fluid-conveying flexible tubes. Comptes Rendus Mécanique. 2016;344:769–775. doi: 10.1016/j.crme.2016.08.004.
67. Bobenko A.I., Suris Y.B. Discrete Lagrangian reduction, discrete Euler-Poincaré equations, and semi-direct products. Lett. Math. Phys. 1999;49:79–93. doi: 10.1023/A:1007654605901.
68. Iserles A., Munthe-Kaas H.Z., Nørsett S.P., Zanna A. Lie-group methods. Acta Numer. 2000;9:215–365. doi: 10.1017/S0962492900002154.
69. Bou-Rabee N., Marsden J.E. Hamilton-Pontryagin integrators on Lie groups. Found. Comput. Math. 2009;9:197–219. doi: 10.1007/s10208-008-9030-4.
70. Kobilarov M.B., Marsden J.E. Discrete geometric optimal control on Lie groups. IEEE Trans. Robot. 2011;27:641–655. doi: 10.1109/TRO.2011.2139130.
71. Demoures F., Gay-Balmaz F., Ratiu T.S. Multisymplectic variational integrators and space/time symplecticity. Anal. Appl. 2016;14:341–391. doi: 10.1142/S0219530515500025.


[B27-entropy-22-00498] 27.Gay-Balmaz F., Holm D.D. A geometric theory of selective decay with applications in MHD. Nonlinearity. 2014;27:1747–1777. doi: 10.1088/0951-7715/27/8/1747. [DOI] [Google Scholar]

[B28-entropy-22-00498] 28.Balian R., Alhassid Y., Reinhardt H. Dissipation in many-body systems: A geometric approach based on information theory. Phys. Rep. 1986;131:1–146. doi: 10.1016/0370-1573(86)90005-0. [DOI] [Google Scholar]

[B29-entropy-22-00498] 29.Libermann P., Marle C.-M. Symplectic Geometry and Analytical Mechanics. Reidel; Kufstein, Austria: 1987. [Google Scholar]

[B30-entropy-22-00498] 30.Abraham R., Marsden J.E. Foundations of Mechanics. Benjamin-Cummings Publ. Co.; San Francisco, CA, USA: 1978. [Google Scholar]

[B31-entropy-22-00498] 31.Marsden J.E., Ratiu T.S. Introduction to Mechanics and Symmetry. Springer; Berlin, Germany: 2003. [Google Scholar]

[B32-entropy-22-00498] 32.Chirco G., Laudato M., Mele F.M. Covariant momentum map thermodynamics for parametrized field theories. arXiv. 2019abs/1911.06224 [Google Scholar]

[B33-entropy-22-00498] 33.Gay-Balmaz F., Ratiu T.S. Affine Lie-Poisson reduction, Yang-Mills magnetohydrodynamics, and superfluids. J. Phys. A Math. Theor. 2008;41:344007. doi: 10.1088/1751-8113/41/34/344007. [DOI] [Google Scholar]

[B34-entropy-22-00498] 34.Gay-Balmaz F., Ratiu T.S. The geometric structure of complex fluids. Adv. Appl. Math. 2009;42:176–275. doi: 10.1016/j.aam.2008.06.002. [DOI] [Google Scholar]

[B35-entropy-22-00498] 35.Gay-Balmaz F., Ratiu T.S., Tronci C. Equivalent theories of liquid crystal dynamics. Arch. Ration. Mech. Anal. 2013;210:773–811. doi: 10.1007/s00205-013-0673-1. [DOI] [Google Scholar]

[B36-entropy-22-00498] 36.Ellis D.C.P., Gay-Balmaz F., Holm D.D., Putkaradze V., Ratiu T.S. Symmetry reduced dynamics of charged molecular strands. Arch. Ration. Mech. Anal. 2009;197:811–902. doi: 10.1007/s00205-010-0305-y. [DOI] [Google Scholar]

[B37-entropy-22-00498] 37.Gay-Balmaz F., Holm D.D., Ratiu T.S. Variational principles for spin systems and the Kirchhoff rod. J. Geom. Mech. 2009;1:417–444. doi: 10.3934/jgm.2009.1.417. [DOI] [Google Scholar]

[B38-entropy-22-00498] 38.de Saxcé G. Euler-Poincaré equation for Lie groups with non null symplectic cohomology. Application to the mechanics. In: Nielsen F., Barbaresco F., editors. GSI 2019. LNCS. Volume 11712 Springer; Berlin, Germany: 2019. [Google Scholar]

[B39-entropy-22-00498] 39.Marle C.-M. Projection Stéréographique et Moments. [(accessed on 20 April 2020)];2019 Available online: https://hal.archives-ouvertes.fr/hal-02157930/

[B40-entropy-22-00498] 40.Bismut J.-M. Mécanique Aléatoire. Volume 866 Springer; Berlin/Heidelberg, Germany: New York, NY, USA: 1981. Lecture Notes in Math. [Google Scholar]

[B41-entropy-22-00498] 41.Lazaro-Cami J.A., Ortega J.-P. Stochastic Hamiltonian dynamical systems. Rep. Math. Phys. 2008;61:65–122. doi: 10.1016/S0034-4877(08)80003-1. [DOI] [Google Scholar]

[B42-entropy-22-00498] 42.Bou-Rabee N., Owhadi H. Stochastic variational integrators. IMA J. Numer. Anal. 2009;29:421–443. doi: 10.1093/imanum/drn018. [DOI] [Google Scholar]

[B43-entropy-22-00498] 43.Holm D.D. Variational principles for stochastic fluid dynamics. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 2015;471:2176. doi: 10.1098/rspa.2014.0963. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B44-entropy-22-00498] 44.Gay-Balmaz F., Holm D.D. Stochastic geometric models with non-stationary spatial correlations in Lagrangian fluid flows. J. Nonlin. Sci. 2018;28:873–904. doi: 10.1007/s00332-017-9431-0. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B45-entropy-22-00498] 45.Gay-Balmaz F., Holm D.D. Predicting uncertainty in geometric fluid mechanics. Disc. Cont. Dyn. Syst. Ser. S. 2020;13:1229–1242. doi: 10.3934/dcdss.2020071. [DOI] [Google Scholar]

[B46-entropy-22-00498] 46.Arnaudon A., De Castro A.L., Holm D.D. Noise and Dissipation on Coadjoint Orbits. J. Nonlinear Sci. 2018;28:91–145. doi: 10.1007/s00332-017-9404-3. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B47-entropy-22-00498] 47.Günther C. The polysymplectic Hamiltonian formalism in field theory and calculus of variations I: The local case. J. Diff. Geom. 1987;25:23–53. doi: 10.4310/jdg/1214440723. [DOI] [Google Scholar]

[B48-entropy-22-00498] 48.Gotay M.J., Isenberg J., Marsden J.E., Montgomery R., Sniatycki J., Yasskin P.B. Momentum maps and classical fields. Part I: Covariant field theory. arXiv. 1997physics/9801019v2 [Google Scholar]

[B49-entropy-22-00498] 49.Ingarden R.S., Nakagomi T. The second order extension of the Gibbs state. Open Syst. Inf. Dyn. 1992;1:259–268. doi: 10.1007/BF02228947. [DOI] [Google Scholar]

[B50-entropy-22-00498] 50.Ingarden R.S., Kossakowski A., Ohya M. Information Dynamics and Open Systems. Volume 86 Springer; Berlin, Germany: 1997. Classical and Quantum Approach, Fundamental Theories of Physics. [Google Scholar]

[B51-entropy-22-00498] 51.Ingarden R.S., Meller J. Temperatures in linguistics as a model of thermodynamics. Open Syst. Inf. Dyn. 1994;2:211–230. doi: 10.1007/BF02228965. [DOI] [Google Scholar]

[B52-entropy-22-00498] 52.Jaworski W., Lngarden R.S. On the partition function in information thermodynamics with higher order temperatures. Bull. Acad. Pol. Sci. Sér. Phys. Astron. 1980;1:28–119. [Google Scholar]

[B53-entropy-22-00498] 53.Jaworski W. Ph.D. Thesis. Institute of Physics, Nicolaus Copernicus University; Torun, Poland: 1983. Investigation of the Thermodynamic Limit for the States Maximizing Entropy under Auxiliary Conditions for Higher-Order Statistical Moments. (In Polish) [Google Scholar]

[B54-entropy-22-00498] 54.Jaworski W. On the thermodynamic limit in information thermodynamics with higher-order temperatures. Acta Phys. Pol. 1983;A63:3–19. [Google Scholar]

[B55-entropy-22-00498] 55.Eriksen P.S. Geodesics Connected with the Fisher Metric on the Multivariate Normal Manifold. Institute of Electronic Systems, Aalborg University; Aalborg, Denmark: 1986. Technical Report 86-13. [Google Scholar]

[B56-entropy-22-00498] 56.Eriksen P.S. Proceedings of the GST Workshop. University of Lancaster; Lancaster, UK: 1987. Geodesics connected with the Fisher metric on the multivariate normal manifold. [Google Scholar]

[B57-entropy-22-00498] 57.Balian R. The entropy-based quantum metric. Entropy. 2014;16:3878–3888. doi: 10.3390/e16073878. [DOI] [Google Scholar]

[B58-entropy-22-00498] 58.Hairer E., Lubich C., Wanner G. Geometric Numerical Integration, Structure-Preserving Algorithms for Ordinary Differential Equations. Volume 31 Springer; Berlin, Germany: 2010. (Springer Series in Computational Mathematics). [Google Scholar]

[B59-entropy-22-00498] 59.Marsden J.E., West M. Discrete mechanics and variational integrators. Acta Numer. 2001;10:357–514. doi: 10.1017/S096249290100006X. [DOI] [Google Scholar]

[B60-entropy-22-00498] 60.Marsden J.E., Pekarsky S., Shkoller S. Discrete Euler-Poincaré and Lie-Poisson equations. Nonlinearity. 1998;12:1647–1662. doi: 10.1088/0951-7715/12/6/314. [DOI] [Google Scholar]

[B61-entropy-22-00498] 61.Marsden J.E., Patrick G.W., Shkoller S. Multisymplectic geometry, variational integrators, and nonlinear PDEs. Commun. Math. Phys. 1998;199:351–395. doi: 10.1007/s002200050505. [DOI] [Google Scholar]

[B62-entropy-22-00498] 62.Lew A., Marsden J.E., Ortiz M., West M. Asynchronous variational integrators. Arch. Ration. Mech. Anal. 2003;167:85–146. doi: 10.1007/s00205-002-0212-y. [DOI] [Google Scholar]

[B63-entropy-22-00498] 63.Demoures F., Gay-Balmaz F., Kobilarov M., Ratiu T.S. Multisymplectic Lie group variational integrators for a geometrically exact beam in $R^{3}$ . Commun. Nonlinear Sci. Numer. Simul. 2014;19:3492–3512. doi: 10.1016/j.cnsns.2014.02.032. [DOI] [Google Scholar]

[B64-entropy-22-00498] 64.Demoures F., Gay-Balmaz F., Ratiu T.S. Forum of Mathematics, Sigma. Volume 4. Cambridge University Press; Cambridge, UK: 2016. Multisymplectic variational integrators for nonsmooth Lagrangian continuum mechanics.54p [Google Scholar]

[B65-entropy-22-00498] 65.Demoures F., Gay-Balmaz F., Desbrun M., Ratiu T.S., Aragón A. A multisymplectic integrator for elastodynamic frictionless impact problems. Comput. Methods Appl. Mech. Eng. 2017;315:1025–1052. doi: 10.1016/j.cma.2016.11.011. [DOI] [Google Scholar]

[B66-entropy-22-00498] 66.Gay-Bamaz F., Putkaradze V. Variational discretizations for the dynamics of fluid-conveying flexible tubes. Comptes Rendus Mécanique. 2016;344:769–775. doi: 10.1016/j.crme.2016.08.004. [DOI] [Google Scholar]

[B67-entropy-22-00498] 67.Bobenko A.I., Suris Y.B. Discrete Lagrangian reduction, discrete Euler-Poincaré equations, and semi-direct products. Lett. Math. Phys. 1999;49:79–93. doi: 10.1023/A:1007654605901. [DOI] [Google Scholar]

[B68-entropy-22-00498] 68.Iserles A., Munthe-Kaas H.Z., Nørsett S.P., Zanna A. Lie-group methods. Acta Numer. 2000;9:215–365. doi: 10.1017/S0962492900002154. [DOI] [Google Scholar]

[B69-entropy-22-00498] 69.Bou-Rabee N., Marsden J.E. Hamilton-Pontryagin integrators on Lie groups. Found. Comput. Math. 2009;9:197–219. doi: 10.1007/s10208-008-9030-4. [DOI] [Google Scholar]

[B70-entropy-22-00498] 70.Kobilarov M.B., Marsden J.E. Discrete geometric optimal control on Lie groups. IEEE Trans. Robot. 2011;27:641–655. doi: 10.1109/TRO.2011.2139130. [DOI] [Google Scholar]

[B71-entropy-22-00498] 71.Demoures F., Gay-Balmaz F., Ratiu T.S. Multisymplectic variational integrators and space/time symplecticity. Anal. Appl. 2016;14:341–391. doi: 10.1142/S0219530515500025. [DOI] [Google Scholar]
