Abstract
Two-field functional integrals (2FFI) are an important class of solution methods for generating functions of dissipative processes, including discrete-state stochastic processes, dissipative dynamical systems, and decohering quantum densities. The stationary trajectories of these integrals describe a conserved current by Liouville’s theorem, despite the absence of a conserved kinematic phase space current in the underlying stochastic process. We develop the information geometry of generating functions for discrete-state classical stochastic processes in the Doi-Peliti 2FFI form, and exhibit two quantities conserved along stationary trajectories. One is a Wigner function, familiar as a semiclassical density from quantum-mechanical time-dependent density-matrix methods. The second is an overlap function, between directions of variation in an underlying distribution and those in the directions of relative large-deviation probability that can be used to interrogate the distribution, and expressed as an inner product of vector fields in the Fisher information metric. To give an interpretation to the time invertibility implied by current conservation, we use generating functions to represent importance sampling protocols, and show that the conserved Fisher information is the differential of a sample volume under deformations of the nominal distribution and the likelihood ratio. We derive a pair of dual affine connections particular to Doi-Peliti theory for the way they separate the roles of the nominal distribution and likelihood ratio, distinguishing them from the standard dually-flat connection of Nagaoka and Amari defined on the importance distribution, and show that dual flatness in the affine coordinates of the coherent-state basis captures the special role played by coherent states in Doi-Peliti theory.
Keywords: Information geometry, Doi-Peliti theory, Liouville’s theorem, Fisher information, importance sampling, duality
Introduction: understanding the Liouville theorems that emerge in the probability analysis of dissipative systems
Hamiltonian state spaces, conserved phase-space densities, and two-field functional integral representations
The defining feature of dissipative systems, whether classical or quantum, is that trajectories initially distinct can merge and that distributions or densities that differ at their initial conditions become more similar over time as they are increasingly governed by local generating parameters at the expense of memory. Because such systems are intrinsically irreversible, they obey no kinematic Liouville theorem (see [35]) describing phase space densities that are conserved along flow lines. For Markov jump processes on discrete state spaces, a kinematic phase space is not defined, while for stochastically perturbed dynamical systems, its volume is not preserved under time evolution. Yet in the study of the time evolution of probability distributions for stochastic processes, a conserved volume element does arise, through a Hamilton–Jacobi theory that evolves the gradients of log-probabilities together with their values [13, 69].
In the classical dynamics of Hamiltonian systems, conserved densities in phase space are associated with reversibility and made possible by the presence of canonically conjugate pairs of momentum and position coordinates specifying states. Convergence of trajectories in the projection onto either component in such a pair is disambiguated by separation in the conjugate variable, such that time evolution defines a symplectomorphism on the state space, and the data specifying a state or distribution at one time are the image under a coordinate transformation of the data at any other time.
In the Hamilton–Jacobi theory that describes distributions evolving under stochastic processes, the role of the log-probability gradient as a conjugate momentum suggests that it is the resolution in continuous-valued probability distributions that can preserve the data required for backward- as well as forward-time evolution from any instant. Two questions that remain to be clarified, then, are: (1) what is the general dynamical role of these gradients as conjugate variables in a state space, in relation to the “coordinate” variables that appear in the Hamilton–Jacobi theory; and (2) what is the conceptual interpretation of conserved densities in this phase space?
An important insight into the abstract role of dual fields was given in early work by Martin, Siggia, and Rose [52], who observed that a complete system of equations similar to the Dyson equations of quantum field theory could not be formulated for stochastically perturbed dynamical systems from functions only of the coordinate field that is dynamically evolved. Their solution—appealing explicitly to quantum field theory as a model—was to re-interpret the coordinate variables in a perturbed dynamical system as operator-valued fields and to introduce a parallel set of dual fields, non-commuting with the coordinates, with the interpretation of sources of perturbation. From the two fields together, they obtained a system of relations equivalent to a Dyson equation between the advanced and retarded “response functions” to perturbations, and the correlation function of fluctuations of the coordinate variables.
The coordinate field along with the response field introduced in [52] can be re-cast from operators back to ordinary classical fields if the classical fields become variables of integration in a path integral, through the equivalence of operator formulations and path-integral formulations known in quantum mechanics since Feynman and Hibbs [28]. An especially useful observation by Kamenev [42] is that the response and correlation functions of classical stochastic processes have exactly the algebra of “observable” and “response” fields defined by Keldysh [43] as a rotation of field variables from the pairs of forward- and backward-evolving state fields in Schwinger’s time-loop formulation [63] of time-dependent quantum density matrices. Through this algebraic equivalence the role of systems of response fields introduced as duals to coordinate fields is represented generally in what we may call two-field functional integrals (2FFI). The path-integral formulation of the Schwinger time-loop was the original example, which the operator methods of [52] show may be extended to quite general stochastically perturbed dynamical systems.
For a certain class of Markov jump processes on discrete state spaces, which we may generally describe as stochastic population processes, a direct construction of the 2FFI representation for generating functions, from the transition matrix of the master equation, was worked out by Doi [21, 22] and Peliti [57, 58]. (This case was the path integral studied in [42].) The direct construction [44, 68, 71] makes clear the correspondence between the response field and the jump operator in the master equation, giving a mechanistic meaning to the operator-field interpretation of [52]. The Doi-Peliti integral representation also shows that it is the Liouville operator—the representation of the generator of time translation acting in the space of generating functions [74]—that is the Hamiltonian of the Hamilton–Jacobi theory [13, 69].
In the large-deviation limit, which will be the subject of this paper, the Liouville operator can be replaced with a function on deterministic phase-space trajectories, which becomes the generating function for the Lagrange–Hamilton duality between the tangent and cotangent bundles on the coordinate manifold of first moments of the dynamical distribution. Doi-Peliti theory is particularly expressive of the origin and nature of two-field duality because Peliti’s construction [57, 58] of the functional integral employs a representation of unity in the space of generating functions in terms of outer products of basis functions parametrized by the observable field, and projection operators parametrized by the response field introduced in [52]. The conservation of “data” entailed by a Liouville theorem is nothing other than the requirement of completeness in a representation of unity in a Hilbert space of generating functions.
The information geometry induced by symplectomorphisms acting on probability distributions evolved under stochastic processes
The 2FFI representation of Hamilton–Jacobi theory will give a direct mathematical interpretation of the conserved phase space density, as the Wigner function of the two-field integral (see Sect. 3). This function is known in quantum mechanics as a semiclassical approximation to quantum density matrices in time-dependent thermal ensembles.
We may better understand its statistical interpretation and its relation to time reversal, however, by going beyond the simple Hamiltonian volume conservation already inherent in 2FFI theories, to derive dual parallel transport relations for vector fields induced by symplectomorphisms on the cotangent bundle, as these relate to the Riemannian geometry induced by cumulant generating functions and their Legendre duals, the large-deviation functions [73]. The construction of those aspects of the information geometry of two-field integrals is the main work of this paper.
Information geometry derives from divergence functions between probability distributions [2, 7], and provides coordinate-independent constructions for the metric distance between pairs of distributions, or the overlap between two directions of change in the form of the extended Pythagorean theorem [3, 54]. In particular, these measures are invariant under the symplectomorphisms generated by time translation, so that dynamical conservation is implied by coordinate invariance.
The probability distributions corresponding to phase-space points along the large-deviation trajectories in Doi-Peliti theory may vary either through change of the underlying probability distribution (e.g., through change in initial conditions) or through change of the argument of the generating function at which that underlying distribution is being studied. The second source of change has an interpretation in terms of biased sampling from the underlying distribution, and establishes a connection between generating functions and methods of importance sampling [56] in statistical inference.
Inference is the pertinent context within which to understand the time-reversal of Hamilton–Jacobi equations for stochastic processes, because it relates directly to the adjoint duality between the Kolmogorov forward and backward equations [33], equivalently either evolving probability distributions forward in time or operators corresponding to observables backward in time. The representation of adjoint duality in terms of time-reversed evolution has been studied extensively [37, 38, 69] in the context of fluctuation theorems [64].
Here we will study shifts in the underlying distributions evolving under some stochastic process, and shifts in the sampling bias in generating functions, as sources of vector fields which are parallel transported under the canonical transformations on phase space induced by time translation. In particular we derive affine connections under which, as differences in underlying distributions are attenuated over time due to dissipation, differences between degrees of sampling bias are amplified in the Doi-Peliti integral. The exact compensation of these two effects in Hamilton–Jacobi theory identifies conserved phase-space densities as densities of the information relevant to inference between protocols for biased sampling and features of underlying distributions.
We derive a particular form of this relation, in which the information delivered under varying degrees of sampling bias about the difference between nearby distributions is the inner product of two vector fields in the Fisher information metric. The conserved densities of sample-bias information under time translation are none other than the eigenvalues of the Fisher information. Dual parallel transport of the vector fields describing shifts in the underlying distribution and shifts in the sample-bias factor preserves this inner product along the one-parameter family of symplectomorphisms in the phase space generated by time translation, making the sample information about distribution change, like the phase-space density, an invariant of motion.
Organization of the presentation
Section 2 opens with brief reviews of the three background areas needed in this work, beginning with spaces and dualities defined at a single time and then continuing to those added with time evolution. We first consider the way general probability distributions are represented by Laplace transforms as contours in phase space, on which the fundamental Riemannian metric of information geometry, and Legendre duality between conjugate phase-space coordinates, are defined. We then review the correspondence of a generating function to an exponential family of tilted distributions with the interpretation of importance distributions produced by biased sampling of the base distribution. Third, we introduce time evolution under a stochastic process generator: we review the construction of the Doi-Peliti two-field integral representation of generating functions, followed by the large-deviation limit in which the Liouville operator becomes a classical Hamiltonian and the generating function for duality between coordinates in the tangent and cotangent bundles over the configuration space.
The main results of the paper are derived in Sect. 3. The Wigner function is defined and shown to be the phase space density conserved under Liouville’s theorem. The two vector fields associated with the dual coordinates in phase space are constructed, to describe variations within exponential families through change of the sample bias, and change across families through change in the base distribution. The transport law for the Fisher metric, and dual affine connections respecting the symplectic structure, are then derived.
Section 4 contains a simple worked example illustrating all aspects of the 2FFI Liouville theorem and its associated dual geometry. Section 5 concludes, relating the familiar treatment of adjoint duality in terms of time reversal [37] to the interpretation developed here as a duality between dynamics and statistical inference.
Review: Legendre duality and information geometry, importance sampling, and Lagrange/Hamilton duality
Phase spaces with conjugate coordinates play several roles in the constructions to follow. At single times, they host the Legendre duality and Riemannian geometry of surfaces representing probability distributions through their Laplace transforms. Within the exponential family over any one distribution, conjugate fields separate the nominal distribution from other members of the family that take on the interpretation of importance distributions. When probability distributions are evolved in time, the phase spaces in which they are embedded become cotangent bundles of the dynamics, with Lagrangian–Hamiltonian dualities to the tangent bundles that depend on the generator of time translation but not on the states being evolved. In this section we establish notations for all three roles and show how they are instantiated in Doi-Peliti functional integrals.
Phase space, foliation by exponential families, Legendre duality, and geometric relations
Embedding of probability distributions as surfaces in phase space, and the geometry in those surfaces
The state spaces on which the Doi-Peliti construction can be carried out to produce generating functions and functionals are non-negative integer lattices in some number of dimensions D, which here will be assumed finite for simplicity. Dimensions are indexed , and lattice points in the state space are indexed by vectors with integer components with each . Examples include such phenomena as stochastic population processes, in which i indexes the types of members in the population and each is the count of individuals of type i. Well-developed applications include evolutionary populations [70] and chemical reaction networks [44, 71]. The object of study will be continuous-valued probability distributions over states , indexed .
The Laplace transform (for a countable basis , also termed the z-transform) of is the moment-generating function
| 1 |
If we write each component , is the cumulant-generating function (CGF), which will be convex when written in coordinates . As a shorthand we have written as the component-wise th power of z. If is regarded as a column vector and a row vector, the Cartesian inner product is abbreviated , adopting the Einstein summation convention.
The gradient of , which we denote
| 2 |
is the expectation of in a distribution
| 3 |
tilted with weight , which we will interpret later as a sampling bias or likelihood function.
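As a minimal numerical sketch of Eqs. (1) to (3), and not part of the original derivation, the following Python fragment checks on a truncated one-dimensional lattice that the gradient of the CGF equals the mean of the tilted distribution. The Poisson base distribution, the truncation `n_max`, the function names, and the parametrization of the tilt weight as exp(theta * n) are illustrative assumptions.

```python
import math

def poisson_pmf(lam, n_max):
    """Poisson probabilities for n = 0..n_max (truncated lattice, assumed base)."""
    probs = [math.exp(-lam)]
    for n in range(1, n_max + 1):
        probs.append(probs[-1] * lam / n)
    return probs

def cgf(theta, rho):
    """Cumulant-generating function: log of sum_n exp(theta * n) * rho_n."""
    return math.log(sum(math.exp(theta * n) * p for n, p in enumerate(rho)))

def tilted(theta, rho):
    """Base distribution reweighted by exp(theta * n) and renormalized."""
    weights = [math.exp(theta * n) * p for n, p in enumerate(rho)]
    total = sum(weights)
    return [w / total for w in weights]

def mean(dist):
    return sum(n * p for n, p in enumerate(dist))

rho = poisson_pmf(2.0, 80)   # assumed base distribution with mean 2
theta = 0.4                  # assumed tilt parameter
h = 1e-5                     # step for the central finite difference

grad_fd = (cgf(theta + h, rho) - cgf(theta - h, rho)) / (2 * h)
tilted_mean = mean(tilted(theta, rho))
print(grad_fd, tilted_mean)
```

For a Poisson base with mean 2, both numbers should agree with 2 exp(theta) up to the finite-difference error.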
The continuous coordinate vectors n make up the points in a manifold Q that will be the coordinate manifold with respect to which geometries and dynamics are defined. Pairs of values define a 2D-dimensional phase space over the coordinate manifold Q. Later, when dynamics is introduced, that phase space will become the cotangent bundle over manifold Q, on which a Lagrangian–Hamiltonian duality is defined. The first relation we study in this phase space, however, does not depend on its interpretation as a cotangent bundle; it is Legendre duality between values and n induced by a convex CGF .
For convex , the function is invertible, and the inverse function is a D-dimensional surface, single-valued at each n, in the phase space. Each such surface has an intrinsic Riemannian geometry induced by the Hessian of ,
| 4 |
known as the Fisher tensor, introduced in the interpretation as a Riemannian metric by Rao [61] (reprinted as [62]). Using the differential geometry notation in which is the set of basis elements in the tangent space to the exponential family, the Fisher metric is an inner product, which we denote
| 5 |
The manifold Q with metric (4) admits a variety of dual transport relations induced by pairs of connections compatible with that metric, as developed by Amari and Nagaoka [3, 54]. Each hypersurface is generated by a single underlying distribution (which we will call the “base distribution”). The whole CGF is thus identified with an exponential family of distributions tilted from at each value of . The dual connections of information geometry provide a way to understand the information divergences of distributions in such a family with others either inside or outside the family, in familiar geometric terms such as least-distance projections onto the family.
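The identification of the Fisher tensor with the Hessian of the CGF can be checked numerically in one dimension, where the Hessian is the variance of the tilted distribution. The sketch below is illustrative only: the Poisson base distribution, the lattice truncation, and all names are assumptions rather than notation taken from the text.

```python
import math

def poisson_pmf(lam, n_max):
    """Poisson probabilities for n = 0..n_max (assumed base distribution)."""
    probs = [math.exp(-lam)]
    for n in range(1, n_max + 1):
        probs.append(probs[-1] * lam / n)
    return probs

def cgf(theta, rho):
    """log of sum_n exp(theta * n) * rho_n."""
    return math.log(sum(math.exp(theta * n) * p for n, p in enumerate(rho)))

def tilted_variance(theta, rho):
    """Variance of n in the distribution tilted by exp(theta * n)."""
    weights = [math.exp(theta * n) * p for n, p in enumerate(rho)]
    total = sum(weights)
    m1 = sum(n * w for n, w in enumerate(weights)) / total
    m2 = sum(n * n * w for n, w in enumerate(weights)) / total
    return m2 - m1 * m1

rho = poisson_pmf(2.0, 80)
theta, h = 0.4, 1e-4

# Second central difference approximates the Hessian of the CGF.
hessian_fd = (cgf(theta + h, rho) - 2.0 * cgf(theta, rho) + cgf(theta - h, rho)) / h**2
fisher = tilted_variance(theta, rho)
print(hessian_fd, fisher)
```

In this one-dimensional case the metric is a single positive number, so convexity of the CGF and invertibility of the map between the tilt parameter and the tilted mean are the same statement.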
Foliations in phase space and the inner product between variations within and between exponential families
We will be interested here in families of base distributions, denoted , for which the exponential families defining the CGFs, over all values of the index , provide a foliation of the phase space, as shown in Fig. 1. The CGFs and tilted distributions that make up each leaf are denoted
| 6 |
The index is the value of at in the distribution .
Fig. 1. Foliation of a 1-species phase space by the contours in the exponential families over each of a sequence of Poisson base distributions with mean values (at ). The exponential family and its associated Legendre-dual function for one value of is drawn bold and labeled. Overcomplete bases of such Poisson distributions with continuous Poisson parameter are used in the construction of a representation of unity in the Peliti [57, 58] functional integral construction. Covariant vector fields corresponding to (within leaves) characterize change in tilted distributions driven through sampling bias, while contravariant vector fields corresponding to (across leaves) characterize change in the base distribution. The inner product will be preserved by parallel transport of these vectors under maps of the phase space generated by time translation.
Equation (2) implies that the Fisher metric (4) is a coordinate transformation from contravariant to covariant coordinates:
| 7 |
If is convex, the transformation (7) is invertible. The inverse of the coordinate transform, and with it the Fisher metric, is obtained from the Legendre transform of ,
| 8 |
where is the maximizer of the argument in Eq. (8) over values. is the large-deviation function (LDF). By construction its gradient gives the inverse function
| 9 |
From the definition (1) of the generating function and the extremization condition (8), is seen to be, in leading exponential approximation, a kind of convex approximation to for , of which is the gradient. These interpretations help to give meaning to the evolution of under a Hamilton–Jacobi equation derived below in Sect. 2.3.4.
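The Legendre relation between the CGF and the LDF can be illustrated with a short numerical sketch. It assumes a Poisson base distribution, for which the CGF has a closed form, and assumes the convention that the LDF is the maximum over the tilt parameter of (theta * n minus the CGF); the sign convention and all names are assumptions and may differ from those of Eq. (8).

```python
import math

LAM = 2.0  # assumed mean of the Poisson base distribution

def cgf(theta):
    """Closed-form CGF of a Poisson distribution with mean LAM."""
    return LAM * (math.exp(theta) - 1.0)

def cgf_grad(theta):
    return LAM * math.exp(theta)

def legendre(n, lo=-20.0, hi=20.0, iters=200):
    """Solve cgf_grad(theta) = n by bisection; return (theta_star, LDF value)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cgf_grad(mid) < n:
            lo = mid
        else:
            hi = mid
    theta_star = 0.5 * (lo + hi)
    return theta_star, theta_star * n - cgf(theta_star)

n = 5.0
theta_star, ldf = legendre(n)
ldf_closed = n * math.log(n / LAM) - (n - LAM)

# The gradient of the LDF should return the maximizing tilt.
h = 1e-5
ldf_grad_fd = (legendre(n + h)[1] - legendre(n - h)[1]) / (2 * h)
print(theta_star, ldf, ldf_closed, ldf_grad_fd)
```

Bisection is used here only because the gradient of a convex CGF is monotone; any root finder would serve.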
From Eq. (9), the inverse coordinate transform
| 10 |
is the inverse of from Eq. (7). The distance between two distributions in the exponential family is then written in either contravariant coordinate differentials or covariant coordinate differentials as
| 11 |
The Fisher metric can be obtained as the projection of the Euclidean metric in under a spherical embedding of the distribution , briefly reviewed in Appendix A.1, providing a third set of coordinates for the tilted distribution . We note this embedding because, for the exponential families that will play an important role in stationary-point methods later, distances between distributions on the Fisher sphere in potentially infinite dimensions reduce to divergences of the same form on the phase space, as shown in Appendix A.2.
The distance element (11) measures the divergence between two nearby distributions in an exponential family in the geometry derived from . For two distributions related to a common base distribution through shifts respectively in or in , we seek a measure of the degree to which the two changes overlap. The extended Pythagorean theorem [3, 54] for Kullback–Leibler (KL) divergences provides such a measure, where in the small-coordinate limit it becomes an inner product of the two coordinate displacements in the Fisher metric:
| 12 |
Recognizing that , the inner product in the final line of Eq. (12) is just the sensitivity
| 13 |
of the mean in the tilted distribution to variations in the base.
Preservation of the inner product in connection with Liouville’s theorem
A coordinate change from the mean in the base distribution to the mean in the tilted exponential distribution produces the metric in mixed coordinates that by construction is the Kronecker ,
| 14 |
and the coordinate inner product
| 15 |
We may ask, for what one-parameter families of coordinate systems is the coordinate inner product (15) conserved across the family? A one-parameter family of coordinates generates a one-parameter family of maps of vector fields and by the action
| 16 |
The condition
| 17 |
will be met whenever
| 18 |
where denotes d/dt. Eq. (18) is satisfied if there is a symplectic form in terms of which the velocity vectors along trajectories can be written
| 19 |
Alternatively, if Eq. (18) holds everywhere, the form can be constructed by integration.
Eq. (18) relates the dual contravariant and covariant coordinates under the Fisher metric as canonically conjugate variables in a Hamiltonian dynamical system. We return in Sect. 2.3 to derive symplectic forms from the generators of stochastic processes.
Tilted distributions as importance distributions, and particular application to large deviations
In order to furnish an interpretation of the 2FFI symplectic transport structure, we appeal to statistical inference to assign informational meanings to the conjugate fields z that exponentially tilt the base distributions in generating functions. In importance sampling, z takes on the interpretation of a sampling bias known as a likelihood ratio, which transforms the measure for samples.
In the terminology of importance sampling [56], the base distribution corresponds to the nominal distribution, and the tilted distribution of Eq. (6) plays the role of an importance distribution. The normalized exponential tilt is the corresponding likelihood ratio, also called the Radon–Nikodym derivative of the measure between the base and the importance distributions.
Importance distributions can be chosen to concentrate the density of samples away from the mode of to values of that are more informative about observables of interest. Tilts are typically chosen to minimize some cost function, such as the variance of samples. The large-deviation function can be derived as a leading exponential approximation to the tail weight of the base distribution, in a protocol tuned to minimize sample variance, as shown in the following construction from [66].
To illustrate with an example in one dimension, an estimate of the probability that a particle count exceeds some bound can be obtained by sampling values of the random variable
| 20 |
the indicator function for . In the base distribution , the probability for is
| 21 |
An unbiased estimator for can be obtained by using the tilted distribution of Eq. (6) and instead of accumulating the values of the indicator , accumulating values of the tilted observable
| 22 |
The tilted estimator is unbiased because
| 23 |
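The unbiasedness of the tilted estimator, and the variance reduction that motivates it, can be verified by exact summation rather than by sampling. The sketch below is a hedged illustration: the Poisson base distribution, the threshold `n_bar`, the choice of tilt that moves the mean to the threshold, and the convention that the tail event is n >= n_bar are all assumptions.

```python
import math

def poisson_pmf(lam, n_max):
    """Poisson probabilities for n = 0..n_max (assumed nominal distribution)."""
    probs = [math.exp(-lam)]
    for n in range(1, n_max + 1):
        probs.append(probs[-1] * lam / n)
    return probs

lam, n_bar, n_max = 2.0, 8, 120
rho = poisson_pmf(lam, n_max)

theta = math.log(n_bar / lam)            # tilt placing the tilted mean at n_bar
weights = [math.exp(theta * n) * p for n, p in enumerate(rho)]
norm = sum(weights)                      # moment-generating function at exp(theta)
rho_tilted = [w / norm for w in weights]

# Reweighting factor (nominal over importance probability) at each n.
reweight = [norm * math.exp(-theta * n) for n in range(n_max + 1)]

# Tail probability directly, and as an expectation in the importance distribution.
p_direct = sum(p for n, p in enumerate(rho) if n >= n_bar)
p_tilted = sum(q * reweight[n] for n, q in enumerate(rho_tilted) if n >= n_bar)

# Per-sample variances of the plain indicator and of the reweighted indicator.
var_plain = p_direct * (1.0 - p_direct)
var_tilted = sum(q * reweight[n] ** 2
                 for n, q in enumerate(rho_tilted) if n >= n_bar) - p_tilted ** 2
print(p_direct, p_tilted, var_plain, var_tilted)
```

The two tail estimates coincide term by term, while the per-sample variance of the reweighted indicator is much smaller because samples are concentrated near the threshold.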
A few lines of algebra, provided in Appendix B, show that an exponential bound for the estimator at any choices of and is given by
| 24 |
The variance of the same sample estimator has a corresponding bound (see Eq. (146))
| 25 |
The parameter that minimizes the bound on sample variance (25) also gives the tightest bound (24) on the tail weight. It is the minimizing argument of Eq. (8), so the bound is given in terms of the LDF as
| 26 |
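A short numerical sketch can illustrate how the tightest exponential bound is controlled by the LDF. It assumes that the bound has the standard Chernoff form, exp(-(theta * n_bar - CGF(theta))), which we take to be the content of Eq. (24) without having verified the exact expression; the Poisson base distribution and all names are likewise assumptions.

```python
import math

LAM = 2.0    # assumed Poisson mean of the base distribution
N_BAR = 8    # assumed threshold for the tail event n >= N_BAR

def cgf(theta):
    """Closed-form CGF of a Poisson distribution with mean LAM."""
    return LAM * (math.exp(theta) - 1.0)

def chernoff_bound(theta):
    """Assumed exponential bound on the tail weight at tilt theta >= 0."""
    return math.exp(-(theta * N_BAR - cgf(theta)))

def tail_probability(n_max=200):
    """Exact tail weight of the base distribution by direct summation."""
    p, total = math.exp(-LAM), 0.0
    for n in range(n_max + 1):
        if n >= N_BAR:
            total += p
        p *= LAM / (n + 1)
    return total

theta_star = math.log(N_BAR / LAM)           # tilt at which the tilted mean is N_BAR
ldf = theta_star * N_BAR - cgf(theta_star)   # Legendre transform of the CGF at N_BAR
tail = tail_probability()
bound_opt = chernoff_bound(theta_star)
looser = [chernoff_bound(t) for t in (0.5, 1.0, 2.0)]
print(tail, bound_opt, looser)
```

At the optimizing tilt the bound equals exp(-LDF), and every other tilt tried gives a weaker bound.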
Without further assumptions about it is not possible to say more about the ratio . The relevant additional property, which is also associated with the use and tightness of saddle-point approximations [36] in Doi-Peliti theory, is the onset of large-deviations scaling; that is, if and are increased together in proportion to some scale factor N as , , the following two limits should exist:
| 27 |
Then the variance-minimizing tilt likewise has a limit, the variance in Eq. (4) scales as N, and the relative variance scales as 1/N. Appendix B shows that in this limit the log ratio , compared to .
The Doi-Peliti integral construction of generating functions for time-dependent probability distributions
Canonical transformations on phase space [35] are introduced when distributions are evolved under the generators of stochastic processes. A formally exact representation of the time-translation of distributions over finite intervals, acting directly in the representation by generating functions, is given by the Doi-Peliti construction [21, 22, 57, 58]. (See [9, 42, 68] for other self-contained introductions.) Phase-space coordinates in Doi-Peliti theory take on interpretations in terms of distributions and dual projection operators, equivalent in the path integral to the non-commuting operator fields of [52], through Peliti’s construction [57, 58] of a representation of the identity in the Hilbert space of generating functions.
In the path integral, tangent vectors to configurations, and phase-space dual coordinates , are independent dummy variables of integration. It is the large-deviation limit, realized in Doi-Peliti theory by saddle-point evaluation of the functional integral, that couples coordinate pairs and as Lagrangian and Hamiltonian dual coordinate systems, with the Liouville operator-representation of the stochastic process generator as the Hamiltonian and the generating function of the coordinate transformation. The tangent vectors to stationary trajectories generate a family of symplectomorphisms in phase space, and the LDF evolves as a solution to the corresponding Hamilton–Jacobi equation [13, 69].
A tutorial on the defining steps in Doi-Peliti theory is given in this subsection. Supporting algebra, where needed to provide a complete self-contained definition of the method, is provided in Appendix C. Doi-Peliti theory was invented to solve Markov jump processes on integer lattices of the kind that describe population processes. Although it provides a definite notation and a formally exact integral solution, neither the indexing nor the particular exponential generating functions used here are meant to be a general notation describing all 2FFIs. Appendix D, working in the simplifying limit of Gaussian fluctuations, provides derivations starting from Doi-Peliti theory of equivalent representations in terms of the Langevin stochastic differential equation and the Fokker–Planck equation [74], which manifestly have no restriction to the specific assumptions made in the Doi-Peliti construction. The derivations leading to them may be “inverted” to equivalence classes of 2FFI path integrals in the Gaussian limit, defined by a common integral kernel form explained in [42], and equivalent to the Dyson equations for general stochastic processes derived in [52]. Saddle-point methods in Doi-Peliti theory are seen to be one instance of the eikonal approximations for large-deviation functions, developed in general form by Freidlin and Wentzell [30], and the forms derived from the action here are related to specific forms arising in that work.
Generator, Hilbert space, and quadrature
The starting representation for time evolution is the continuous-time master equation of the probability distribution
| 28 |
is known as the transition matrix, and is the representation of the generator of the stochastic process acting in the space of probability distributions. It is left-stochastic, meaning , , ensuring conservation of probability. The matrix elements can depend on the time t, though in the example developed in Sect. 4 we will use a time-independent generator for simplicity.
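To make the role of the transition matrix concrete, the following sketch builds one for a hypothetical birth-death process (constant birth rate, death rate proportional to the count) on a truncated lattice, checks that its columns sum to zero, and integrates the master equation with explicit Euler steps. The process, the truncation, the step size, and all names are assumptions chosen for illustration and are not the example of Sect. 4.

```python
import math

def rate_matrix(birth, n_max):
    """Transition matrix T[n_new][n_old] for birth at constant rate, death at rate n."""
    size = n_max + 1
    T = [[0.0] * size for _ in range(size)]
    for n in range(size):
        if n < n_max:                 # birth n -> n + 1 (blocked at the truncation)
            T[n + 1][n] += birth
            T[n][n] -= birth
        if n > 0:                     # death n -> n - 1
            T[n - 1][n] += float(n)
            T[n][n] -= float(n)
    return T

def euler_step(T, rho, dt):
    """One explicit Euler step of d rho / dt = T rho."""
    size = len(rho)
    return [rho[i] + dt * sum(T[i][j] * rho[j] for j in range(size))
            for i in range(size)]

birth, n_max, dt, steps = 2.0, 25, 0.004, 1000
T = rate_matrix(birth, n_max)
column_sums = [sum(T[i][j] for i in range(n_max + 1)) for j in range(n_max + 1)]

rho = [1.0] + [0.0] * n_max           # start with all probability at n = 0
for _ in range(steps):
    rho = euler_step(T, rho, dt)

total = sum(rho)
mean_n = sum(n * p for n, p in enumerate(rho))
mean_expected = birth * (1.0 - math.exp(-dt * steps))   # mean of the untruncated process
print(max(abs(s) for s in column_sums), total, mean_n, mean_expected)
```

Vanishing column sums are what keep the total probability equal to one at every step, independently of the accuracy of the integrator.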
The z-transform (1) induces a representation of the generator of time translation acting in the space of moment generating functions in the form of a Liouville equation
| 29 |
, called the Liouville operator, is conventionally defined with the minus sign of Eq. (29) because its spectrum is non-negative.
For many purposes, the properties of the generating function as an analytic function of a complex variable [29] are not needed, and the algebra of the MGF as a formal power series [77] is sufficient. An operator algebra due to Doi [21, 22] replaces the variable z and derivative with formal raising and lowering operators
| 30 |
In the condensed notation (1) for vector inner products, we will regard a as a column vector and as a row vector.
Associated with operators and are bilinear number operators (no Einstein sum), of which the basis monomials under the mapping (30) correspond to number eigenstates. Number states are denoted , and the MGF is represented as a state vector defined in terms of number states as
| 31 |
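The Doi algebra can be made concrete by acting on the coefficient list of a formal power series, with the raising operator implemented as multiplication by z and the lowering operator as differentiation with respect to z. This is only a sketch under that stated identification; the list representation and the function names are assumptions.

```python
def raise_op(coeffs):
    """Raising operator: multiply a power series by z (shift coefficients up)."""
    return [0] + list(coeffs)

def lower_op(coeffs):
    """Lowering operator: differentiate a power series with respect to z."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def subtract(f, g):
    """Coefficient-wise difference of two series, padding the shorter one."""
    size = max(len(f), len(g))
    f = list(f) + [0] * (size - len(f))
    g = list(g) + [0] * (size - len(g))
    return [x - y for x, y in zip(f, g)]

def number_state(n):
    """Coefficient list of the basis monomial z**n."""
    return [0] * n + [1]

# The commutator of lowering and raising operators acts as the identity.
f = [3, 1, 4, 1, 5]
commutator_f = subtract(lower_op(raise_op(f)), raise_op(lower_op(f)))

# The number operator (raise after lower) has z**3 as an eigenstate with eigenvalue 3.
number_on_z3 = raise_op(lower_op(number_state(3)))
print(commutator_f, number_on_z3)
```

In this representation the coefficient list of a generating function is the probability vector itself, which is the sense in which the MGF is a state vector expanded in number states.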
The Liouville equation (29) becomes
| 32 |
in which the Liouville operator is the former function under the substitution (30).
An inner product must be defined to make generating functions states in a Hilbert space. In the Doi theory this is done formally by introducing a dual state satisfying . Conjugate number states with inner product are then built up using lowering operators. The correspondence of the Doi dual ground state with a projector on analytic generating functions, and the inner product known as the Glauber norm, corresponding to the evaluation of any MGF at argument , are given in Appendix C.1. The analytic form (1) of the MGF can be recovered using a variant of the Glauber norm, as
| 33 |
The objective in introducing the Doi operator formalism is to more conveniently compute the quadrature of the Liouville equation (29), formally written
| 34 |
is the generating function for the distribution evolved to time from the generating function for an initial distribution given at time . denotes time-ordering of the exponential integral, defined operationally in the second line of Eq. (34), in terms of a time-ordered product of applications of evaluated at the sequence of times .
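The operational meaning of time ordering can be illustrated with a two-state process whose generator depends on time. The sketch below works with the probability vector rather than the generating function, and approximates the ordered exponential by a product of small-step factors applied in order of increasing time; the two-state model, its rates, and all names are assumptions introduced only for illustration.

```python
def generator(t):
    """Assumed time-dependent two-state transition matrix; columns sum to zero."""
    a = 5.0 * t    # rate from state 0 to state 1, growing with time
    b = 1.0        # rate from state 1 to state 0
    return [[-a, b], [a, -b]]

def apply_step(rho, t, dt):
    """Apply the factor (1 + dt * T(t)) to the probability vector."""
    T = generator(t)
    return [rho[i] + dt * (T[i][0] * rho[0] + T[i][1] * rho[1]) for i in range(2)]

def evolve(rho0, times, dt):
    """Apply the factors in the order given, so earlier entries act first."""
    rho = list(rho0)
    for t in times:
        rho = apply_step(rho, t, dt)
    return rho

dt, steps = 0.001, 1000
times = [k * dt for k in range(steps)]

rho_ordered = evolve([1.0, 0.0], times, dt)                    # time-ordered product
rho_reversed = evolve([1.0, 0.0], list(reversed(times)), dt)   # same factors, wrong order
print(rho_ordered, rho_reversed)
```

Because the generators at different times do not commute, applying the same factors in reversed order gives a visibly different final distribution, which is why the ordering prescription is part of the definition of the quadrature.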
The Peliti coherent-state expansion and two-field functional integral
The implications for correlation functions of the time-ordered operator algebra in the quadrature (34) may be difficult to derive or approximate. The purpose of Peliti’s functional-integral construction is to supply evaluation methods and, more importantly, approximation methods, including saddle-point evaluations and a systematic approach to perturbation theory such as exists for path-integral formulations in quantum mechanics [28].
A straightforward step in the Peliti construction is to expand arbitrary generating functions in a basis of eigenstates of the lowering operator a. However, whereas the indices are countable, the eigenvalues of a are continuous, so such a basis is overcomplete. The important and non-trivial step in the Peliti construction is to identify a representation of the identity operator in the space of generating functions, which requires defining projectors that are left-eigenstates of the raising operator , and showing that an expansion in these two continuously-indexed sets of states is equivalent to the identity operator on number states . It is in this second step that the response field in phase space enters the functional-integral construction for stochastic processes.
The eigenstates of the lowering operator will be the moment-generating functions of products of Poisson distributions, and these distributions will play a central role in the interpretation of stationary-point approximations in the functional integral. In the Poisson distribution with mean n,
| 35 |
the expectations of factorial moments [10], defined (again, component-wise) as , are for all k. Products of Poisson marginals, or multinomial distributions that are sections through such products at fixed ,
| 36 |
serve as saddle-point approximations to more general distributions in the Doi-Peliti integral, and also form an important class of exact solutions for some applications such as chemical reaction network models [4]. Tilts of Poisson (35) or multinomial (36) distributions remain of the same form, so that nominal and importance distributions will have a uniform relation to phase-space points at all pairs in Fig. 1.
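The statement that every factorial moment of a Poisson distribution is a power of the mean can be checked directly. The sketch below uses a single component, an illustrative mean of 2.5, and a truncation at `n_max = 80`, all of which are assumptions made only for the example.

```python
import math

def poisson_pmf(mean, n_max):
    """Poisson probabilities for n = 0..n_max (the truncated tail is negligible here)."""
    return [math.exp(-mean) * mean**n / math.factorial(n) for n in range(n_max + 1)]

def falling_factorial(n, k):
    """n (n - 1) ... (n - k + 1), which vanishes for n < k."""
    out = 1
    for j in range(k):
        out *= (n - j)
    return out

def factorial_moment(pmf, k):
    """Expectation of the k-th falling factorial of n under pmf."""
    return sum(p * falling_factorial(n, k) for n, p in enumerate(pmf))

mean = 2.5
pmf = poisson_pmf(mean, n_max=80)
moments = [factorial_moment(pmf, k) for k in range(5)]
powers = [mean**k for k in range(5)]   # the k-th factorial moment should equal mean^k
```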
The fixed form of all moments in Poisson and multinomial distributions as functions of the mean makes these minimum-information distributions. For these distributions the Fisher spherical embedding of divergences of distributions on the infinite index set can be reduced to only D dimensions in the coordinates , as shown in Appendix A.2. Related simplifications, from forms of adjoint duality [38] that generally require infinitely many parameters, to simple coordinate transforms in the Doi-Peliti functional integral, are reviewed below in Sect. 2.3.6.
The generating function of a product distribution with a vector of Poisson parameters corresponds to the state
| 37 |
as may be checked by an elementary expansion of the Taylor’s series for the exponential. These states are eigenstates of the lowering operators:
| 38 |
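The eigenstate property can be illustrated in a truncated single-component representation, in which a state is the vector of power-series coefficients and the lowering operator is the matrix form of differentiation. The parameter value `phi = 1.7` and the cutoff `n_max` are assumptions for illustration.

```python
import math
import numpy as np

n_max = 60
phi = 1.7   # illustrative Poisson parameter

# Lowering operator on coefficient vectors: (a p)_n = (n + 1) p_{n+1}.
lowering = np.diag(np.arange(1, n_max + 1, dtype=float), k=1)

# Coefficients of the generating function of a Poisson distribution with mean phi.
poisson = np.array([math.exp(-phi) * phi**n / math.factorial(n)
                    for n in range(n_max + 1)])

# Eigenvalue equation a p = phi p, exact up to the negligible truncated tail.
residual = lowering @ poisson - phi * poisson
```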
From the inner product of the base projection operator and number states introduced above, a similar Taylor’s series expansion of the exponential verifies that left eigenstates of the raising operator must take the form
| 39 |
The normalization in Eq. (39) has been chosen for convenience, so that the inner product of two such coherent states will evaluate to
| 40 |
and in particular .
Note that the inner product (40), if expanded in the power series for the two coherent states, identifies a tilted distribution
| 41 |
so we recognize that the coherent-state parameters index a space of base distributions, and the dual fields are the biased-sampling weights in an exponential family of generating functions in which takes the place of the coordinate in Eq. (3). In the language of importance sampling from Sect. 2.2, Eq. (41) are importance distributions with mean parameters .
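A short numerical sketch of the tilting relation, for one component: reweighting a Poisson base distribution with parameter `phi` by the n-th power of a weight `phi_dag` and renormalizing gives a Poisson distribution with mean `phi_dag * phi`. The variable names and values are hypothetical, and the normalizer computed here is specific to this unit-normalized base distribution; it is not claimed to reproduce the normalization convention of Eq. (40).

```python
import math

def poisson_pmf(mean, n_max):
    """Poisson probabilities for n = 0..n_max (the truncated tail is negligible here)."""
    return [math.exp(-mean) * mean**n / math.factorial(n) for n in range(n_max + 1)]

def tilt(pmf, weight):
    """Reweight p_n by weight**n and renormalize; also return the normalizer."""
    raw = [p * weight**n for n, p in enumerate(pmf)]
    norm = sum(raw)
    return [r / norm for r in raw], norm

phi, phi_dag = 2.0, 1.5          # hypothetical base parameter and dual weight
base = poisson_pmf(phi, 120)
tilted, norm = tilt(base, phi_dag)

target = poisson_pmf(phi_dag * phi, 120)               # Poisson with the bilinear mean
tilted_mean = sum(n * p for n, p in enumerate(tilted))
max_gap = max(abs(x - y) for x, y in zip(tilted, target))
```

The mean of the importance distribution is bilinear in the two parameters, which is the inconvenience that motivates a change of variables to a coordinate that carries the mean directly.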
The result enabling the Peliti functional integral construction is that the following integral of outer products constitutes a representation of the identity in the space of generating functions:
| 42 |
a result demonstrated in Eq. (160) of Appendix C.2.
From the insertion of copies of the identity (42) at a sequence of times for some small into the quadrature (34), some algebra gives the generating function (1) in the form of a functional integral with what Feynman and Hibbs [28] term a “skeletonized” functional measure (defined in Eq. (162)), as
| 43 |
is the initial generating function in Eq. (34). S in Eq. (43) has the form of a Lagrange–Hamilton action functional,
| 44 |
in which the “kinetic term” that creates a Doi-Peliti Lagrangian comes from the inner product (40) between projectors and states in two expansions for the identity operator at closely-spaced times.
While the CGF (43) is formally a function of the coordinate , the functional integral (43) and action (44) are at the moment expressed in terms of coherent-state fields . As noted in Eq. (41), these variables are on one hand mathematically expressive, as they distinguish the mean in a base distribution from the likelihood ratio in an importance distribution. On the other hand they make calculation of the mean in the importance distribution inconvenient because it is a bilinear form in and , and more importantly, the Hessian of the function is not generally positive-definite in coordinates z.
Therefore we introduce the first canonical transformation from the Peliti coherent-state coordinates that will be of interest in this work, which is a simple logarithmic point transformation
| 45 |
The fields n and would have the interpretation of a mean molecule number and conjugate chemical potential in a chemical-reaction model, so we refer to these as number-potential coordinates, distinguishing them from coherent-state coordinates. The action (44) becomes
| 46 |
in which is with and written as functions of n and by Eq. (45).
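That a logarithmic point transformation of this kind is canonical can be checked numerically for a single component. The explicit form used below, a potential equal to the logarithm of the dual coherent-state field and a number equal to the product of the two fields, is our assumption about the content of Eq. (45), and the function and variable names are hypothetical. A unit Jacobian determinant is equivalent, for one degree of freedom, to preservation of the symplectic form.

```python
import numpy as np

def to_number_potential(phi_dag, phi):
    """Assumed logarithmic point transformation (hypothetical form and names)."""
    theta = np.log(phi_dag)   # potential conjugate to the number field
    n = phi_dag * phi         # mean number in the importance distribution
    return theta, n

def jacobian(phi_dag, phi, eps=1e-6):
    """Central-difference Jacobian d(theta, n) / d(phi_dag, phi) for D = 1."""
    base = np.array([phi_dag, phi])
    cols = []
    for i in range(2):
        step = np.zeros(2)
        step[i] = eps
        plus = np.array(to_number_potential(*(base + step)))
        minus = np.array(to_number_potential(*(base - step)))
        cols.append((plus - minus) / (2 * eps))
    return np.column_stack(cols)

J = jacobian(1.3, 0.8)
det_J = np.linalg.det(J)   # analytically (1 / phi_dag) * phi_dag = 1
```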
The large-deviation limit and Hamiltonian stationary trajectories
The large-deviation limit of the path integral (43) is a leading-exponential approximation in some value such as that becomes large to reflect large population size or a similar scale variable. In that limit the integral is approximated by the value of the exponential kernel at its saddle point, which is any stationary solution of the action S and the initial and final boundary terms. The stationary trajectories for S in the form (44) satisfy the pair of equations of motion
| 47 |
The final-time boundary value for the field is given by the vanishing derivative of the exponent in Eq. (43) with respect to , resulting in . The initial-time boundary value for is given, after an integration by parts, by the vanishing derivative with respect to , and depends on the form of the CGF and the stationary-path value of its argument . The two conditions are solved self-consistently through Eq. (47).
Joint stationary values for at intermediate times t, in a generating function with argument z imposed at a final time T, represent the bundle of rays in the statistical model that dominate the contribution to the importance distribution at later times. For this reason the stationary trajectory in the base distribution is not generally independent of the trajectory for the tilt in systems with non-linear equations of motion.
The stationary-path equations corresponding to Eq. (47) in the logarithmic number-potential coordinates (45) are
| 48 |
These equations are instances of the symplectomorphisms (19). The triangle inequality of divergences (12) within and across exponential families is therefore an invariant of motion.
Duality with the Lagrangian, and the Hamilton–Jacobi equation
The equation of motion (48) for n introduces a coordinate transform from phase-space coordinates to coordinates in the tangent bundle TQ over manifold Q of n values. is the generating function for the transformation and the Hamiltonian on phase space.
The Lagrangian dual to ,
| 49 |
is the argument of the action S up to a total time derivative in . By construction, L is the generating function of the inverse coordinate transform, satisfying
| 50 |
Thus Lagrange–Hamilton duality identifies the phase space with the cotangent bundle to the coordinate manifold.
For any initial distribution encoded in the initial-time generating function in Eq. (43), we may write the action (46) evaluated along a stationary trajectory as
| 51 |
After performing the Legendre transform (8) to the LDF to cancel the surface term in Eq. (51), we find that , at any time, satisfies the pair of equations
| 52 |
The Hamiltonian equations (48) generate a family of maps of phase space, and under any of these maps the initial-value contour for any distribution is mapped to a contour , along which is a solution to the Hamilton–Jacobi equation (52).
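The Legendre duality between a CGF and its large-deviation function, invoked above, can be illustrated on the simplest case. The sketch assumes a single Poisson distribution with mean `lam`, for which both functions are known in closed form, and compares the closed form with a brute-force supremum over a grid of tilt values. It is not a solution of the Hamilton–Jacobi equation (52), only an instance of the transform.

```python
import numpy as np

lam = 3.0   # Poisson mean, an illustrative value

def cgf(theta):
    """Cumulant-generating function of a Poisson distribution with mean lam."""
    return lam * (np.exp(theta) - 1.0)

def ldf_closed_form(n):
    """Legendre transform of cgf, evaluated analytically."""
    return n * np.log(n / lam) - n + lam

def ldf_numeric(n, thetas):
    """Legendre transform sup_theta [theta * n - cgf(theta)] on a grid."""
    return np.max(thetas * n - cgf(thetas))

thetas = np.linspace(-5.0, 5.0, 200001)
n_values = np.array([0.5, 3.0, 7.0])
numeric = np.array([ldf_numeric(n, thetas) for n in n_values])
closed = ldf_closed_form(n_values)
```

The large-deviation function vanishes at the mean and is positive elsewhere, as expected of a divergence from the base distribution.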
The commutative diagram of maps from Legendre and Lagrangian–Hamiltonian dualities
The Legendre and Lagrange–Hamilton dualities of Doi-Peliti theory define a system of D-dimensional coordinate transformations that form a commutative diagram, summarized by Leok and Zhang [47]. Equations (48, 49) define a map between the tangent and cotangent bundles, , describing the same trajectories in Lagrangian and Hamiltonian coordinates, respectively. Integration of these equations over any finite time interval t defines a map from pairs of initial and final points in to points in TQ at either initial or final times, by assigning initial and final velocities at those points. Combined with the maps at both times, the Lagrangian map between times generates a symplectomorphism in the cotangent bundle equal to the direct integration along stationary trajectories (48).
The Legendre duality at a single time is defined separately on the exponential families over each of the base distributions that form the leaves in the foliation of Fig. 1, but does not depend on the generator of time translation. The Lagrange–Hamilton duality depends on the generator of time translations and not on the choice of exponential families used to foliate the cotangent bundle. Because points on stationary trajectories in Doi-Peliti theory correspond to tilted distributions over Poisson base distributions (35), the canonical foliation of phase space in Fig. 1 is singled out by the Peliti representation of unity (42). Note that these distributions constitute a 2D-dimensional subspace of the space of all distributions on the lattice . It is shown in [69] that these special distributions, because they are minimum-information distributions, have the interpretation of macrostates for a general equilibrium or non-equilibrium thermodynamics.
In the framework of [47], the Doi-Peliti integral (43) and other two-field functional integrals are defined on the Pontryagin bundle of triples in which all three variables are independent. It is only with the reduction to saddle-point trajectories (and only if the Doi-Peliti integral is evaluated in its continuous-time limit, and not with a finite time-step in the inner product (40)), that these triples are reduced to Lagrangian and Hamiltonian descriptions of the same trajectories with no “duality gap”.
Coordinate transformations in phase space expressing adjoint duality as time reversal
A large body of work on “fluctuation theorems” [64], although not required to derive the foregoing dualities or their implied conservation laws, is nonetheless helpful in interpreting the meaning of the conserved volumes in Doi-Peliti theory. Fluctuation theorems may be seen as extensions to finite time intervals of the study of instantaneous adjoint duality between the Kolmogorov-backward and -forward evolution [33] under the transition matrix .
Hatano and Sasa [38] recognized that application of a suitable similarity transform to the transition matrix exchanges the forms of and its adjoint: that is, the similarity transform exchanges the forms of the Kolmogorov-backward and -forward equations, making the evolution of distributions appear like that of observables and vice versa. Performing the similarity transform at every moment in an extended-time quadrature of the form (34) has the effect of multiplying each path of the stochastic coordinate by a weight [65], creating an extended-time generating functional for paths.1 The required similarity transform is generally non-locally determined and requires solution for stationary distributions on the whole state space.
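One common form of such a similarity transform can be sketched for a small discrete state space. The three-state rate matrix below is an arbitrary illustration without detailed balance, and the construction (conjugating the transposed generator by the diagonal matrix of stationary probabilities) is our assumption about the kind of transform meant. It shows why the stationary distribution on the whole state space must be solved for first.

```python
import numpy as np

# Three-state generator without detailed balance; rates[i, j] is the rate j -> i.
rates = np.array([[0.0, 1.0, 3.0],
                  [2.0, 0.0, 0.5],
                  [0.7, 1.5, 0.0]])
T = rates - np.diag(rates.sum(axis=0))   # columns sum to zero

# Stationary distribution: the null vector of T, normalized to unit sum.
eigvals, eigvecs = np.linalg.eig(T)
pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals))])
pi = pi / pi.sum()

# Similarity transform of the transpose by the stationary weights.
T_dual = np.diag(pi) @ T.T @ np.diag(1.0 / pi)

off_diagonal = T_dual - np.diag(np.diag(T_dual))
```

The transformed matrix is again a valid generator with the same stationary distribution, so the backward evolution can be read as a forward stochastic process.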
An exception arises for cases in which the stationary distributions are minimum-information distributions such as the Poisson (35), for which the phase-space coordinates determine all fluctuation moments. In these cases the similarity transform in the potentially-infinite-dimensional state space can be projected to a coordinate transform in the phase space [69]. Moreover, the required path weights of [65] for all paths are given by a D-dimensional, time-local correction to the Liouville function , making evolution under the adjoint generator appear like ordinary stochastic evolution in reverse time.
One such coordinate transformation, exchanging the roles of coordinate and response fields in phase space, was first used in Doi-Peliti integrals by Baish [11]. Let be the saddle-point value of the field n in Eq. (45) in the steady state that would be annihilated by at the parameters it possesses at some time. If and thus is explicitly time-dependent, then the scale factor will generally be different for each time. Define coherent state fields rescaled locally by as
| 53 |
The form of the action in fields remains as in Eq. (44) but the Liouville function is replaced by a possibly-shifted function
| 54 |
If we follow the Baish transformation (53) with a logarithmic transform equivalent to (45), but in the descaled fields,
| 55 |
we coordinatize the base distributions in Fig. 1 as exponential families, and the dependence of number n on in mixture families. The evaluation of the triangle inequality (12) is unchanged, but the coordinates in which vector fields are expanded are transformed.
The action (44) in the new variables becomes
| 56 |
where the modified Liouville function from Eq. (54) must be used, such that in the new variables
| 57 |
The symmetric exponential families on phase space represented by the coordinate systems from Eq. (45) and from Eq. (55) provide natural variables in which to expand the coefficients of affine connections constructed from the -divergence below in Sect. 3.3.
The Liouville theorem connecting dynamics to inference induced by two-field stationary trajectories
In the review of standard results in Sect. 2, we have shown where a phase space with conserved volume under mappings along Hamiltonian stationary trajectories originates within Doi-Peliti theory, and identified a particular form (15) for the volume element preserved by a mapping of coordinate differentials along these trajectories. In this section we compute two quantities that are conserved under time translation and develop their interpretations as information measures. The first is a scalar phase-space density, which we identify as the Wigner function of semiclassical Liouville evolution. The second is an inner product of vector fields, which we express in coordinate-invariant terms by defining affine connections for dual parallel transport compatible with the Riemannian metric (4).
The distinctive feature of the parallel transport derived here is that it is affine-flat in the coherent-state variables of Sect. 2.3.2, rather than in the exponential coordinates normally used to define dually-flat parallel transport, as in Ch. 6 of [3, 54]. The importance-sampling interpretation of Sect. 2.2 identifies a geometric role for coherent-state variables that we believe has not been recognized. Dual parallel transport under these affine connections expresses conservation of the Liouville volume element as a result of compensation, in the flow of phase-space trajectories, between the resolution of underlying base distributions and the discrimination imposed by sampling bias. The information available from the large-deviation probability as a sample estimator, about differences between possible initial distributions, is carried by the eigenvalues of the Fisher information metric, which are transported as invariants along stationary trajectories.
The canonical transformation of Sect. 2.3.6, which exchanges the roles of fields representing the base distribution and the sampling bias in the Doi-Peliti integral, is used to express the conserved information volume in symmetric form in terms of the Fisher metric. The symmetric representation of adjoint duality [37, 38] in terms of apparent time-reversal in the Doi-Peliti integral makes clear that the “reversibility” entailed by conserved phase-space volumes in the Hamilton–Jacobi equation should be understood as a reflection of duality between dynamics (the forward propagation of base distributions) and inference (the backward propagation of sample biases).
We begin with the conserved scalar density and its interpretation as a statistical model in the terms of importance sampling, and then define vector fields corresponding to the extended Pythagorean theorem (12) and derive appropriate affine connections for them.
The Peliti functional integral as a statistical model
The Peliti basis of coherent states (37) defines a statistical model for the stochastic process. The role of the projection operators in the representation of unity (42) in populating the model can be clarified by splitting the functional integral (44) at any intermediate time t, in the same fashion as the Chapman–Kolmogorov equation splits the time evolution of a probability distribution by a sum over intermediate states. This is done by integrating Eq. (43) up to time t, inserting an explicit representation of unity in terms of a pair of fields , and resuming the functional integral on the generating function extracted by that representation of unity:
| 58 |
The argument is the CGF coordinate for an exponential family of distributions tilted from whatever base distribution the functional integral produces at time t. The inner product is a dual mixture coordinate, corresponding to the mean of n in the importance distribution with as the base distribution and as the likelihood ratio. It is the mean of this importance distribution, together with the log-likelihood that is its dual coordinate in the exponential family, that must transform under symplectomorphism to satisfy the condition (19) for preservation of inner products. We show next that the stationary-path conditions provide the necessary mapping.
The Wigner function from the two-field identity operator plays the role of a phase-space density
The scalar density in 2FFI that fills the role of a phase space density in classical Hamiltonian mechanics is the Wigner function [76], of which versions exist for both classical and quantum systems.2 It is defined in terms of the representation of unity in Eq. (58), as
| 59 |
Eq. (59) implies that, for at any time,
| 60 |
In a saddle-point approximation, one identifies arguments for which, to leading exponential order,
| 61 |
Since Eq. (61) approximates the same function at any time t, its total time derivative along the sequence of stationary points must vanish,
| 62 |
Moreover, the stationary points should coincide with values along the stationary trajectories (47) of the functional integral (58), which satisfy
| 63 |
Equation (62) may thus be recast as the conservation law for a 2D-dimensional current ,
| 64 |
which is Liouville’s theorem.
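Volume conservation under the stationary-path flow can be sketched for a linear two-state hopping model. The Liouville function used below, the bilinear form of the dual field, the mean-field rate matrix `A`, and the base field with an overall minus sign, together with the sign convention of the equations of motion, is an assumption made for illustration and is not taken from Eq. (47). Because the resulting flow generator is traceless, the Jacobian of the flow map has unit determinant.

```python
import numpy as np

k1, k2 = 1.0, 2.0                          # hypothetical hopping rates
A = np.array([[-k1, k2], [k1, -k2]])       # mean-field rate matrix, d(phi)/dt = A phi
# Assumed dual equation d(phi_dag)/dt = -A^T phi_dag gives a traceless generator.
M = np.block([[A, np.zeros((2, 2))],
              [np.zeros((2, 2)), -A.T]])

def liouville(x):
    """Assumed Liouville function -phi_dag . A . phi for x = (phi, phi_dag)."""
    phi, phi_dag = x[:2], x[2:]
    return -phi_dag @ A @ phi

def flow_map(t, terms=40):
    """exp(M t) by Taylor series; adequate for the small matrices used here."""
    out, term = np.eye(4), np.eye(4)
    for k in range(1, terms):
        term = term @ (M * t) / k
        out = out + term
    return out

x0 = np.array([0.9, 0.1, 1.2, 0.7])
F = flow_map(0.8)          # Jacobian of the time-0.8 flow (the flow is linear)
x1 = F @ x0
det_F = np.linalg.det(F)
```

In this linear case the Liouville function is also constant along the trajectory, as expected of a time-independent Hamiltonian.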
is a density of rays for joint base distributions and likelihood ratios that is conserved along the Doi-Peliti stationary trajectories. is the leading exponential approximation to the value of the CGF. It therefore integrates information along the trajectory from the final-time imposed value of z and the initial-time structure of the generating function .
The indirect definition (59) of the Wigner function in terms of the functional integral is convenient to manipulate but perhaps not very self-explanatory. Appendix E gives a direct construction of the stationary-point approximation in terms of a density over the basis of coherent states and their Laplace transforms, and verifies that the sequence of stationary points does indeed coincide with the equations of motion (47).
Constraints and conserved current flows in reduced dimensions
Often systems of interest will evolve under constraints arising from conservation laws, such as conserved quantities of the stoichiometry in chemical reaction networks [44, 60, 71]. Conserved quantities result in flat directions in the CGF and zero eigenvalues of the Fisher metric. Since generally the constraints will involve multiple species, and because the logarithmic canonical transform (45) is defined in the species basis, it will not be possible to factor out non-dynamical combinations. Then the transport equation (64) for the current of the Wigner density will occupy only a sub-manifold of the 2D-dimensional Doi-Peliti coordinate space needed to define the system.
A convenient way to handle constraints is to work in the eigenbasis of the Fisher metric which we will index with subscript , where a number-potential counterpart to the transport equation (64) reads
| 65 |
The picture of the Liouville equation as implying a conserved volume element
| 66 |
with the product index taken only over nonzero eigenvalues of the Fisher metric, remains nondegenerate and has a direct interpretation in terms of the product of eigenvectors of the Fisher inner product in independent dimensions.
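The appearance of a zero eigenvalue under a conservation law can be illustrated with a multinomial distribution, whose count covariance is the Hessian of its CGF at zero tilt. The total `N` and probabilities `p` are illustrative assumptions, and the product over nonzero eigenvalues computed below is a generic pseudo-determinant; it is not claimed to be the volume element of Eq. (66).

```python
import numpy as np

N = 50
p = np.array([0.2, 0.3, 0.5])

# Covariance of multinomial counts at fixed total N.
fisher = N * (np.diag(p) - np.outer(p, p))

# Symmetric eigendecomposition; eigenvalues are returned in ascending order.
eigvals, eigvecs = np.linalg.eigh(fisher)
null_vector = eigvecs[:, 0]     # flat direction: a uniform tilt of all species

# Restrict the product to the nonzero eigenvalues.
nonzero = eigvals[eigvals > 1e-9 * eigvals.max()]
pseudo_det = np.prod(nonzero)
```

The flat direction is the uniform vector because shifting every tilt coordinate by the same amount only rescales the generating function when the total count is conserved.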
The Fisher metric and cubic tensor in dual canonical coordinates
The leading-exponential equivalence of the Wigner density to the CGF from Eq. (61) suggests that the 2D-dimensional differential of the stationary-point CGF should likewise obey a symplectic transport law, implying a transport law for the Fisher metric. To derive those results we return to the expression of the differential of the CGF in terms of the generalized Pythagorean theorem (12), and derive the Fisher metric from the -divergence following Amari [2, Sec. 6.2].
Base distributions corresponding to points along stationary paths under the action (44) form exponential families, because they are in the class of coherent-state distributions described in Appendix A.2. Therefore label importance distributions (6) symmetrically as with the exponential coordinates in the two logarithmic transformations (45) and (55). To study their independent variations about a reference value , introduce two distinct exponential families, labeled
| 67 |
The -divergence , a Bregman divergence of the CGF, is related to the Kullback–Leibler divergence of from as
| 68 |
The mixed second partial derivative of gives the same variance that defines the Fisher metric. At general , , it is labeled
| 69 |
At , , the second line of Eq. (69) recovers exactly the differential form of the Pythagorean theorem of Eq. (12) in dual exponential coordinates.
Two third-order mixed partials define the connection coefficients for Amari’s dually-flat connections on exponential and mixture coordinates. Written in all-contravariant indices,3 these are given by
| 70 |
Evaluated at and ,
| 71 |
is the Fisher metric introduced in Eq. (4), and is the cubic tensor, also called the Amari–Chentsov tensor [2]. Below we remove the subscript R and write and as the arguments of these tensors.
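The relations among the CGF, its Bregman divergence, the Fisher metric, and the cubic tensor can be checked in one dimension. The sketch assumes a scalar exponential family built by tilting a binomial base distribution (values of `N`, `p0`, and the two tilt coordinates are illustrative), and uses ordinary derivatives of the CGF in place of the mixed partials of the divergence used in the text.

```python
import numpy as np
from math import comb

N, p0 = 10, 0.3
x = np.arange(N + 1)
base = np.array([comb(N, k) * p0**k * (1 - p0)**(N - k) for k in x])

def cgf(theta):
    return np.log(np.sum(base * np.exp(theta * x)))

def tilted(theta):
    """Member of the exponential family over the base distribution."""
    return base * np.exp(theta * x - cgf(theta))

def central_moment(theta, order):
    q = tilted(theta)
    mean = np.sum(q * x)
    return np.sum(q * (x - mean) ** order)

def kl(q, r):
    return np.sum(q * np.log(q / r))

theta_a, theta_b = 0.4, -0.2
mean_a = np.sum(tilted(theta_a) * x)
# Bregman divergence of the CGF, expanded about theta_a.
bregman = cgf(theta_b) - cgf(theta_a) - (theta_b - theta_a) * mean_a
kl_ab = kl(tilted(theta_a), tilted(theta_b))

# Central finite differences for the second and third derivatives of the CGF.
h = 1e-3
second = (cgf(theta_a + h) - 2 * cgf(theta_a) + cgf(theta_a - h)) / h**2
third = (cgf(theta_a + 2 * h) - 2 * cgf(theta_a + h)
         + 2 * cgf(theta_a - h) - cgf(theta_a - 2 * h)) / (2 * h**3)
```

The second derivative reproduces the variance (the Fisher metric in this coordinate) and the third derivative reproduces the third central moment, the one-dimensional counterpart of the cubic tensor.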
The dual vector fields induced by base-distribution initial conditions, and final-time tilts
From the construction of in Sect. 3.3, we can see how to use the stationary-path equations of motion to induce two mappings of vector fields that respect the dual arguments of the -divergence. Variations in the likelihood act on the argument, while variations in the base distribution act on the argument, in Eq. (67). The stationary-path equations are then used to define a 1-parameter family of maps of any basis of dual variations in initial base and final tilt parameters to intermediate times. Conservation of the Liouville volume then translates to a conserved inner product of pairs of vector fields transported respectively under the two branches of the dual mapping. Conservation of the inner product will imply a transport equation for the Fisher metric corresponding to Eq. (65) for the Wigner function.
Introduce two vector fields corresponding to variations in at the final time T, and to variations in at the initial time 0. The first can be independently imposed through the arguments in , while the second can be independently imposed in the initial data. Fields and are written in components as
| 72 |
The stationary-path conditions map the dual initial and final coordinates to pairs of coordinates at any intermediate time, which we denote , . A one-parameter family of vector fields is defined by assigning to each such coordinate image the field values
| 73 |
Under the change of coordinates from to at each time t, the vector fields (73) may be written in terms of the local coordinate differentials as
| 74 |
Below we suppress the explicit coordinate and time arguments of and , and indicate the time t in a subscript only where it is needed to avoid confusion.
The vector fields (74) have a time dependence that can be defined through the dependences of on the boundary coordinates and then transformed to the local coordinate system, becoming
| 75 |
Eq. (48) is used to arrive at the third form of each equation in terms of mixed partials of and .
The coordinate transformation (7) from contravariant exponential coordinates to covariant mixture coordinates may be used in two ways to write the inner product of vector fields and in mixed form. From the definition of the inner product in terms of in Eq. (69) and its equivalence to the Hessian definition of g in Eq. (71),
| 76 |
Although the field variable n is the same in either log-transform (45) or (55), the two displacements and are independent vector fields.
The conserved inner product of dual vector fields, and directional transport of the metric
Equation (75) has a symmetric form but evolves and respectively using and , so it is not immediately apparent that the inner product is preserved. Writing the field in its dual mixture coordinate as in the first line of Eq. (76), the time derivative becomes
| 77 |
The condition (18) is met and we have
| 78 |
Using Eq. (78) to evaluate the change in the inner product written as , substituting the derivatives (48) for and , and grouping terms, we obtain the transport equation for the metric along stationary paths
| 79 |
The tensor transport equation from Eq. (79) can be compared to Eq. (65) for the transport of the Wigner density.
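The mechanism behind a conserved inner product can be sketched in the simplest linear setting. We assume, for illustration only, that one variation is transported by a rate matrix `A` and the dual variation by minus its transpose, as would hold for coherent-state fields of a linear hopping model; both variations are referred to time 0 here, rather than imposed at opposite ends of the interval as in the text. All names and values are hypothetical.

```python
import numpy as np

k1, k2 = 1.0, 2.0
A = np.array([[-k1, k2], [k1, -k2]])   # hypothetical two-state rate matrix

def expm_taylor(m, terms=60):
    """Matrix exponential by truncated Taylor series."""
    out, term = np.eye(m.shape[0]), np.eye(m.shape[0])
    for k in range(1, terms):
        term = term @ m / k
        out = out + term
    return out

dv0 = np.array([0.3, -0.1])   # variation of the base field at time 0
dw0 = np.array([0.5, 0.2])    # variation of the dual field, referred to time 0

times = np.linspace(0.0, 2.0, 9)
# One branch contracts while the other expands, so the pairing is constant.
overlaps = np.array([(expm_taylor(-A.T * t) @ dw0) @ (expm_taylor(A * t) @ dv0)
                     for t in times])
```

Neither variation is individually conserved; only their pairing is, which is the compensation between the two branches of the dual mapping.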
Dual connections respecting the symplectic structure of canonical transformations in the two-field system
The transport relations derived so far make use of the symplectic structure of maps generated by time-translation along Doi-Peliti stationary paths, but they are not specifically geometric. We now turn to geometric constructions that respect the symplectic structure, render its maps coordinate invariant under canonical transformations, and express the special roles of affine transport in some coordinates such as coherent states through the definition of appropriate Riemannian connections.
Conservation of the inner product through the combined effects of two maps
The inner product (76) is preserved through the complementary action of two maps, one generated by the time-dependence of , and the other by the time-dependence of . By construction, depends on time only through , and only through , while the metric has no explicit time dependence but changes under both maps as the location changes. Denoting by and these separate components of change, the time derivative of the inner product can be partitioned into two canceling terms:
| 80 |
Connection coefficients may be added within either or to make the components of change in the vector field and metric coordinate-invariant, without altering the duality between independent variations in the base distribution and in the likelihood ratio.
To introduce a connection we first replace the total derivative d/dt with a partial-derivative decomposition expressing the same transformation as a flow:
| 81 |
Connection coefficients are defined from the pullbacks or of infinitesimally transformed basis vectors in the tangent spaces to the two exponential families,
| 82 |
Superscripts or refer to the subspace of basis vectors or being pulled back, and the designation or distinguishes the connection associated with displacement or displacement, respectively. Because the component of time translation does not act in and vice versa, we set connection coefficients and to zero.
Covariant derivatives of the vector fields and in the connections , of Eq. (82) are defined as
| 83 |
The covariant part of the flow decomposition in Eq. (81) is defined by subtraction of the nonzero connection coefficients from the total derivatives (75), as
| 84 |
Compensating covariant derivatives of the metric are
| 85 |
Equation (84) extracts a coordinate-invariant component of the time derivative of vector fields and under canonical transformations, while Eq. (85) extracts the corresponding coordinate-invariant part of the change in the Fisher metric.
Referencing arbitrary dual connections to dually flat connections in the exponential family
The manifold for a Doi-Peliti system with D independent components has dimension 2D, with parallel subspaces for the base distribution and likelihood ratio. The dual connections (82) act within these two independent subspaces, in contrast to the dually-flat connections and of Eq. (70), which act within the same D-dimensional exponential family. Although the Fisher metric is a function only of the overall importance distribution, which aggregates dependence from the base distribution and likelihood, the symplectic transformations from translation along stationary paths separate components of variation from within the two independent subspaces. The subspace decomposition cannot be recovered from the importance distribution alone, and thus no connection defined only from the properties of the Fisher metric is sufficient to identify the dual symplectic connections for a Doi-Peliti system.
Nonetheless, we may relate the symplectic dual connections to Amari’s dually flat connections and the Amari–Chentsov tensor through the relation (see [2, Eq. (6.27)])
| 86 |
Substituting Eq. (86) into Eq. (85) gives expressions for the dual covariant derivatives of the Fisher metric
| 87 |
Flat connections for coherent-state coordinates
Of particular interest in Doi-Peliti theory will be the canonical transformations (45) and (55) between coherent-state and number-potential coordinates. We note that the forms of the connection coefficients for which affine transport in fields is flat in the likelihood subspace, and affine transport in fields is flat in the base-distribution subspace, are4
| 89 |
On the roles of coherent-state versus number-potential coordinates in the Doi-Peliti representation
The Doi-Peliti solution method is almost always introduced through the coherent-state representation [9, 42, 53], and for many applications such as chemical reaction networks [8, 10, 44, 71] or evolutionary population processes [70], coherent states are also the “native” representation in the sense that the Liouville operator is a finite-order (generally low-order) polynomial in fields. Moreover, for the importance-sampling interpretation emphasized in this paper, the coherent-state representation separates the nominal distribution and likelihood ratio.
On the other hand, Legendre duality is defined with respect to potential fields, which are the tilt coordinates in the exponential family of importance distributions, and it is in these coordinates, not the coherent-state coordinates, that the Fisher metric corresponds to the Hessian of the CGF. Indeed, it is not generally possible to define a dual coordinate system from the Hessian of the CGF in coherent-state fields, as we illustrate for the worked example in Sect. 4.5.
The use of Riemannian connections neatly expresses the role of each coordinate system. The elementary eigenvalues of divergence or convergence of bases and tilts, and of information susceptibilities, are often simple in coherent-state coordinates, where they are eigenvalues of coordinate divergence or convergence. In the dual connections (90), covariant derivatives retain these elementary eigenvalues, while inheriting from the exponential family the Fisher geometry that defines contravariant/covariant coordinate duality. A concrete example is given in the next section.
A worked example: the two-state linear system
The foregoing constructions are nicely illustrated in minimal form in a simple, exactly solvable model. It is the stochastic process for N independent random walkers on a network with two states and bidirectional hopping between them. The statistical mechanics of transients, time-dependent generating functionals, and large deviations for this system has been didactically covered within the Doi-Peliti framework in [68]. Though simple, the model is nonetheless rich enough to illustrate the complementary roles of coherent states and number-potential coordinates in Doi-Peliti theory—the former as the “native” coordinates in which the system is simple, and the latter as the coordinate system carrying the Fisher geometry—and the way this relation is captured by the dual coherent-state connection (90) different from both the Levi-Civita connection and the dually-flat connections (71) of Nagaoka and Amari [3, 54].
Two-argument and one-argument generating functions on distributions with a conserved quantity
The two-state model describes (for example) a one-particle chemical reaction in a well-mixed reactor with the schema
| 91 |
The probability per unit time for a reaction event is given by rate constants and , and proportional sampling (the microphysics underlying mass-action rate laws).
A distribution initially in binomial form (36) will retain that form at all times under the master equation for the schema (91), even with time-dependent coefficients. Here for simplicity we will take and to be fixed. Therefore the distribution at any time is specified by descaled mean values , , with .
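The closure of the binomial family can be checked numerically. The following is a minimal sketch, assuming per-particle hopping rates with the hypothetical names `k_ab` (a to b) and `k_ba` (b to a), fixed in time, and tracking only the count of species a; it integrates the master equation directly and compares the result with a binomial whose mean fraction relaxes exponentially at the summed rate.

```python
import numpy as np
from math import comb

def generator(N, k_ab, k_ba):
    """Rate matrix W for dp/dt = W p, indexed by n_a = 0..N (n_b = N - n_a).

    k_ab and k_ba are hypothetical names for the per-particle hopping rates;
    proportional sampling gives total event rates n_a*k_ab and n_b*k_ba.
    """
    W = np.zeros((N + 1, N + 1))
    for n in range(N + 1):
        if n > 0:                      # one a hops to b: n -> n - 1
            W[n - 1, n] += n * k_ab
            W[n, n] -= n * k_ab
        if n < N:                      # one b hops to a: n -> n + 1
            W[n + 1, n] += (N - n) * k_ba
            W[n, n] -= (N - n) * k_ba
    return W

def binomial(N, nu_a):
    """Binomial distribution over n_a with mean fraction nu_a."""
    return np.array([comb(N, n) * nu_a**n * (1 - nu_a)**(N - n)
                     for n in range(N + 1)])

def evolve(p0, W, t, steps=2000):
    """Integrate the linear master equation with classical fourth-order Runge-Kutta."""
    p, dt = p0.copy(), t / steps
    for _ in range(steps):
        k1 = W @ p
        k2 = W @ (p + 0.5 * dt * k1)
        k3 = W @ (p + 0.5 * dt * k2)
        k4 = W @ (p + dt * k3)
        p = p + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return p

N, k_ab, k_ba, nu0, t = 10, 1.0, 2.0, 0.9, 0.7
p_t = evolve(binomial(N, nu0), generator(N, k_ab, k_ba), t)

# Single-particle relaxation of the mean fraction toward its steady state.
nu_eq = k_ba / (k_ab + k_ba)
nu_t = nu_eq + (nu0 - nu_eq) * np.exp(-(k_ab + k_ba) * t)

# Largest pointwise deviation of the evolved distribution from binomial form.
max_dev = np.max(np.abs(p_t - binomial(N, nu_t)))
```

The deviation `max_dev` should be at the level of integration error only, because independent walkers keep the distribution in product form.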
Although the system has only one dynamical degree of freedom, it is instructive to compute both the two-argument generating function with independent weights on and on , and the 1-argument generating function for the difference coordinate , to illustrate the role of conservation laws and the geometry of the coherent-state connection. The two-argument CGF (1) for the binomial distribution is
| 92 |
Because the total number is fixed, the normalized 1-variable distribution may be written
| 93 |
and the terms in the generating function (92) regrouped as
| 94 |
Introducing rotated coordinates on the exponential family of tilts
| 95 |
and dividing the two-argument MGF (94) by , we obtain an expression for the one-argument MGF in the difference coordinate :
| 96 |
In what follows, will always be used to refer to the two-argument CGF (92), and the 1-argument generating function, when needed, will be written out explicitly as , as in Eq. (96).
Generator and conserved volume element in coherent-state coordinates
The master equation for the two-state system is developed in [68]; it requires further notation and will not be needed here. We move directly to the expression for the Liouville function of Eq. (44) after conversion to field variables, which is
| 97 |
In what follows, math boldface will be reserved for parameters in the generator such as or functions of these such as the associated steady states used in Eq. (53).
Two descalings reduce the problem to parameters which are dimensionless ratios. The first defines a time coordinate in units of the sum of rate parameters,
| 98 |
The second expresses the equilibrium steady state under generator (97) in terms of relative hopping rates,
| 99 |
As for the discrete index , define .
Conservation of total number N results in a generator that is a function only of the difference variable . Therefore it is natural to rotate the coherent-state fields to components corresponding to conserved N and dynamical , and their dual coordinates in the generating-function argument z:
| 100 |
In rotated fields (100) the action (44) becomes
| 101 |
A descaled Liouville function has been introduced as . Absence of the field from implies constancy of the expectation for .5
Splitting the symplectic structure between coherent-state conjugate field pairs
Although obeys certain time-translation invariances in correlation functions, its value even along stationary paths will not generally be 1. Therefore the coherent-state variables cannot directly be interpreted as mean values of number variables in the nominal distribution or mean weights in its dual likelihood ratio. To express the functions that are these expectation values, we recall the mean number components in the importance distribution, which are bilinear quantities in and , and then introduce a pair of dual number coordinates that, while not linear functions of the coherent-state fields, are functions respectively of or of extracted by making use of the steady-state measure under the instantaneous value in the generator (97). (Along stationary paths, where some components of or are invariant, these dual number fields will become linear functions of the remaining dynamical components of or , as we show below.)
The two components of the normalized number field in number-potential coordinates (45) are given by
| 102 |
Recall that the instantaneous steady state under the generating process is the scale variable for the dualizing canonical transform (53). To see how this reference steady state is used to separate the two conjugate variables (base and tilt) in the symplectic transformations, it is helpful to recast Eq. (102) as
| 103 |
The action of the tilt alone can be isolated, without regard to the underlying nominal distribution, by referencing the action of the fields to the steady state rather than to , defining an offset as
| 104 |
Likewise, the mean value of in the base (nominal) distribution is isolated by referencing the value of to the uniform measure 1 instead of the dynamic measure , as
| 105 |
Stationary-path solutions and Liouville volume element
Solutions to the stationary-path equations of motion (47) for the Liouville function (97) are evaluated in Appendix F.1.
Stationary-path approximations to the time-dependent density would be binomial distributions even if the exact were not (the stationary point is always a pure coherent state), so the CGF at any time has the form (92), with fields z replaced by the stationary-path values of and the mean values from Eq. (93) replaced by corresponding components of .
In particular, the initial-time generating function appearing in Eq. (43) carries the mean value in the starting density , imposed as an initial-data parameter. It is through this function that the final-time tilt data in the form of the parameter , propagated backward to the stationary-path values of and , determines the stationary path values for the fields of the base distribution, establishing the potential for information coupling between initial properties of the base distribution and final-time queries in the generating function .
is evaluated in Eq. (190), and the value is shown to depend only on an overlap parameter between initial and final data which we denote
| 106 |
The stationary-path values of the displacement coordinates (104) and (105) are shown in Eq. (194) to follow simple exponential laws
| 107 |
Thus under independent variations of and as described in Sect. 3.4, the trajectories of the coherent state fields and trace out an invariant volume, illustrated in Fig. 2.
Fig. 2. Four trajectories (heavy blue contours), plotted in coordinates , that bound a region specified by , . The steady state under the generating process sets . A time interval between the input distribution and the final-time generating function is shown. Small rectangles (heavy dark red) show the area at five equally spaced times from start to end. The outer four trajectories (thin blue) show the possible range of joint images of and . Large rectangles (thin green) show the constriction of the possible range . Projections of the total range and the inner trajectories are shown in thin lines on the base plane. Shading of the base plane is a grayscale plot of , which is constant along trajectories but variable over the coordinate range. Min and max of are respectively 0.83 and 1.1.
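The conservation of area under paired exponential laws can be illustrated with a few lines of arithmetic. This is a sketch under stated assumptions, not a reproduction of Eq. (107) or of the figure: time is taken in descaled units, the base offset is assumed to contract forward from time 0 and the tilt offset to contract backward from time T at unit rate, and the interval widths `du0` and `dvT` are hypothetical values.

```python
import numpy as np

T = 3.0                                   # descaled interval between input and query
times = np.linspace(0.0, T, 5)            # five equally spaced times, as in the figure

du0 = 0.4    # hypothetical width of the base-offset interval, fixed at time 0
dvT = 0.25   # hypothetical width of the tilt-offset interval, fixed at time T

# Assumed exponential laws: the base offset decays forward in time,
# the tilt offset decays backward in time from the final time T.
base_width = du0 * np.exp(-times)
tilt_width = dvT * np.exp(-(T - times))

# The rectangle changes shape along the path, but its area does not.
area = base_width * tilt_width
```

Every entry of `area` equals `du0 * dvT * exp(-T)`: the rectangle is transported at fixed area for a given T, while the area available from fixed initial and final ranges shrinks exponentially as T grows.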
Invariant cumulant-generating function and the incompressible phase-space density
The stationary value of , obtained from the gradient of with respect to the components and , is computed in Eq. (193). It differs from unity—the reason constructions (104) and (105) were needed—and it is equal to the stationary value of at all times as a consequence of conservation of total number N. The value depends only on and T in the combination
| 108 |
Moreover, as a consequence of the conserved Liouville volume element from Eq. (107), the stationary-point evaluation of the CGF at all times takes the same form as Eq. (92) and evaluates to the constant
| 109 |
 in Eq. (108) is the basis for all information densities in this simple linear system. Through the stationary-point relation (61) between the Wigner function and the CGF, is the incompressible phase-space density convected along stationary trajectories by Eq. (64). As shown below, it is also the geometrically invariant part of the sole nonzero eigenvalue of the Fisher metric.
Fisher metric
The Fisher metric (5) for the two-state system evaluates, along the stationary path at any time, to
| 110 |
The nonzero eigenvalue comes from the single-argument generating function in Eq. (96) for the difference coordinate , and the zero eigenvalue comes from the linear CGF hN for the conserved quantity N.
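A quick numerical check of this eigenvalue structure is sketched here; it is not a reproduction of Eq. (110). It assumes the standard two-argument binomial CGF, N log(ν_a e^{θ_a} + ν_b e^{θ_b}), in unrotated tilt coordinates, with the names `nu`, `theta`, and `hessian_fd` chosen for illustration. In these coordinates the Hessian should have a zero eigenvalue along the direction (1, 1), which couples only to the conserved N, and a single nonzero eigenvalue 2N p_a p_b, where p is the tilted distribution. The normalization may differ from the descaled, rotated form used in the text.

```python
import numpy as np

def cgf(theta, nu, N):
    """Two-argument CGF of a binomial with mean fractions nu (assumed standard form)."""
    return N * np.log(np.dot(nu, np.exp(theta)))

def hessian_fd(f, x, eps=1e-4):
    """Central finite-difference Hessian of a scalar function f at x."""
    D = len(x)
    H = np.zeros((D, D))
    E = np.eye(D) * eps
    for i in range(D):
        for j in range(D):
            H[i, j] = (f(x + E[i] + E[j]) - f(x + E[i] - E[j])
                       - f(x - E[i] + E[j]) + f(x - E[i] - E[j])) / (4 * eps**2)
    return H

N = 20
nu = np.array([0.3, 0.7])          # mean fractions in the base distribution
theta = np.array([0.4, -0.2])      # tilt coordinates

H = hessian_fd(lambda th: cgf(th, nu, N), theta)

# Tilted (importance) distribution over the two states.
p = nu * np.exp(theta)
p = p / p.sum()

# eigh returns eigenvalues in ascending order: the null one first.
eigvals, eigvecs = np.linalg.eigh(H)
```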
The term in Eq. (110) may be converted, after some algebra, to the form
| 111 |
The measure terms and appearing in Eq. (111) follow the divisions (104) and (105) into independent dimensions of base and tilt variation, and we will show that their effects are canceled in an appropriate covariant derivative. The remaining dependence of the eigenvalue on the initial and final data is all carried in .
Dual coordinates for base and tilt, and the additive exponential family
To relate the Fisher metric in Eq. (110) to the construction of Sect. 3.3 from the -divergence and to dually-symplectic parallel transport, we first express the base and tilt displacements (104) and (105) in terms of the coordinates in their respective exponential families.
Introduce reference values for the fields and h defined in Eq. (95), corresponding to the steady-state measure under the parameters of the generating process, denoted
| 112 |
It is clear, in the two-argument generating function (92), that one component of variation in z couples only to the conserved quantity N and is not needed. It is sufficient therefore to vary along an affine coordinate in z that couples to the dynamical argument , and the natural choice is to fix the component of z corresponding to the component of that is invariant under the stationary-path equations of motion, given in Eq. (189). The resulting contour for z at final time T becomes
| 113 |
The quantity in the first line of Eq. (113) is preserved at all times by Eq. (189), and the quantity in the second line obeys the exponential law of Eq. (194) repeated as (107).
By the definition (104), the contour (113) which is affine in coherent-state fields is written in the coordinates on the exponential family of tilts as
| 114 |
Likewise, in the dual exponential representation (55) of the family of base distributions, the definition (105) giving the mean number offset in the nominal distribution is expressed
| 115 |
in which in the dual number-potential system (55), analogously to in Eq. (95).
Because the two exponential coordinates (base and tilt) are additive, the mean of samples in the importance distribution can likewise be written
| 116 |
It follows that the eigenvalue (111) in the Fisher metric also has the simple expression
| 117 |
exhibiting the equivalence of the -divergence expression (69) and the Hessian (71) for this quantity.
Why coherent-state fields do not generally produce invertible coordinate transformations
The Hessian matrix is not a tensor under coordinate transform, so it is clear that the Hessian of with respect to the argument z equivalent to the coherent-state response field will not be the Fisher metric. However, since coherent states are in many ways a native basis for Doi-Peliti theory, as noted in Sect. 3.6, we may ask whether some other coordinate duality can be defined from the coherent-state Hessian of . In fact such a duality cannot generally be defined, and it is instructive to see where it fails, to better understand why the affine connection (90) and not the Fisher geometry captures the special role of coherent states.
A divergence under the Hessian of in coherent-state variables, which we will denote for reasons to become clear in a moment, if converted from the coordinates to coordinates along the z-affine contour (114), evaluates as
| 118 |
Unlike the Fisher metric, Eq. (118) is negative-semidefinite, and degenerates if , which is shown in Eq. (197) to hold for all z if . At degenerate solutions, we cannot use the Hessian of to define a base-field variation as a dual coordinate for a variation produced by a field , as we could use the Hessian in the exponential family to produce a variation as a dual coordinate to a variation .
The source of the degeneration has a nice description in terms of intrinsic and extrinsic curvatures, and advection, in the natural geometry on the exponential family. The geometric distance element (4), with and h varied independently, is
| 119 |
The z-affine contour (114) specifies a function with extrinsic curvature in the affine coordinate manifold of the exponential family, along which the distance element is
| 120 |
The second coherent-state coordinate derivative of along the contour (114) can be decomposed as
| 121 |
With some algebra, the expression (121) is shown to equal that in Eq. (118). The extrinsic curvature of the embedded contour and the convected quantity cancel against the intrinsic Fisher-Rao curvature, rendering the duality invisible to the fields at degenerate points.
Flat transport in the coherent-state connection
The correct way to capture the simplifying role of coherent-state coordinates for simple models such as the two-state system is with the dual connections of Sect. 3.5.
We first recognize, from the forms (106) or (195) of , a completely-descaled coordinate system for the dynamical parts of the coherent-state fields, defined by
| 122 |
The eigenvalue of the Fisher metric in Eq. (111) then reduces to
| 123 |
The role of the factors and in Eq. (111) as measure terms is now explicit, and they can be absorbed by a change of variables to u and v. By Eq. (195) and the definitions (122) and (108), , so the Fisher inner product (12) may be written
| 124 |
Connection coefficients and absorption of measure terms
In this linear model, time evolution of and has no cross-dependence once the initial values have been fixed through the gradients of as explained in Sect. 4.2.2. Thus and .
Appendix F.4 computes connection coefficients and covariant derivatives for the vector fields corresponding to Eq. (84), and for the metric tensor corresponding to Eq. (85). Eq. (205) in the appendix gives the covariant part of the time derivatives of and as
| 125 |
capturing the simple exponential scaling (107) of the coherent-state fields in the exponential-family coordinates.
The covariant part of the change in the Fisher metric, from Eq. (85) is computed in Eq. (207) to be
| 126 |
Only the dependence in the Fisher eigenvalue from Eq. (124) appears.
The two lines of Eq. (126) (which are equal and opposite) scale as , and have an interpretation similar to that of a Le Chatelier principle. The term in Eq. (108) for is a susceptibility of the initial stationary value to the perturbation by the tilt variable , attenuated exponentially from time T to time 0. The role of this attenuation, which takes as , becomes clearer as a constraint on the total extractable information when we consider in Sect. 4.7 the range of all initial distributions and all tilts z.
Duality of dynamics and inference in Doi-Peliti theory
The natural separation of the coordinate transformation of the inner product of vector fields and generated by time translation is not between exponential and mixture coordinates, as in the dually-flat connections of Amari [2], but rather between the symplectically dual contributions from changes in and in . The two contributions group as
| 127 |
The two rows of Eq. (127) add covariant contributions from Eq. (125) and Eq. (126) in the combinations
| 128 |
Eq. (128) captures in the clearest way possible the symplectic balance of distribution dynamics (through ) and inference (through ) in Doi-Peliti theory, through both the direct effects of the exponential growth and decay eigenvalues and the Le Chatelier-like susceptibility of the density .
The Fisher information density and large-deviation ratios as sample estimators
The interpretation of the vector inner product as a convected density of information can be illustrated by using ratios of large-deviation probabilities to define a sample estimator for differences in the tilt coordinate between two base distributions.
Suppose that we sample from a binomial nominal distribution at a parameter that is to be estimated. Recall from Eq. (24) that the probability for the value of a sample to exceed a threshold n is given in terms of the large-deviation function by
| 129 |
In a 1-dimensional system,6 for two threshold values , the conditional probability for to surpass given that it has surpassed is the ratio
| 130 |
The ratio (130) can be estimated from samples of the indicator function for thresholds n as described in Sect. 2.2.
Appendix F.5 shows that if two such conditional probabilities are compared from distributions at unknown parameters and , the log ratio is related to the large-deviation thresholds and the values as
| 131 |
where is one of the two forms of the (differential) inner product appearing in Eq. (76).
Thus
| 132 |
is a sample estimator for the difference of exponential parameters in the two underlying distributions.
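The sampling step behind this estimator can be sketched as follows. The sketch assumes binomial nominal distributions at two illustrative parameter values, thresholds `n1 < n2`, and the hypothetical helper names `exact_tail` and `conditional_exceedance`; it computes only the log ratio of conditional exceedance frequencies on the left-hand side of Eq. (131), not the normalization by the inner product that converts it into the estimator (132).

```python
import numpy as np
from math import comb

def exact_tail(N, nu, n):
    """Exact probability that a binomial(N, nu) sample strictly exceeds n."""
    return sum(comb(N, k) * nu**k * (1 - nu)**(N - k) for k in range(n + 1, N + 1))

def conditional_exceedance(samples, n1, n2):
    """Estimate P(sample > n2 | sample > n1) from threshold indicator functions."""
    above1 = samples > n1
    above2 = samples > n2
    return above2.sum() / above1.sum()

rng = np.random.default_rng(1)
N, n1, n2 = 50, 25, 30

# Conditional exceedance ratios in two nominal distributions whose
# parameters (0.5 and 0.55) play the role of the unknowns to be compared.
est = {}
for nu in (0.5, 0.55):
    samples = rng.binomial(N, nu, size=200_000)
    est[nu] = conditional_exceedance(samples, n1, n2)

# Log ratio of the two conditional probabilities.
log_ratio = np.log(est[0.55] / est[0.5])
```

The log ratio is positive here because the distribution with the larger parameter places relatively more weight beyond the higher threshold.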
The quantity (131) may be computed at any time, for instance the final time T when the thresholds and are imposed as experimental conditions, and and characterize evolved nominal distributions at time T from any pair of initial conditions at some earlier time . If we use the stationary-path conditions to propagate values of and through time, and define to be the area inside the image of the rectangle in Eq. (131) along these stationary trajectories, time-invariance of the inner product and the Liouville conservation of volume elements in dual coordinates imply that
| 133 |
Note that, with a coordinate transform to coherent-state variables and a corresponding redefinition of the boundary of V, the relation (133) could be recast using Eq. (124) as
| 134 |
which is the conserved integral graphed in Fig. 2.
In Eq. (134) , the 2-dimensional differential of the scaled CGF , appears explicitly as the density of overlap of dv with du that, like itself, is constant along stationary paths. is not independent of the position within the volume V, but because the volume element moves along with the conserved density, the integral measures a fixed quantity of Fisher information as it is transported through different domains of base and tilt.
Although the limits of integration for in Eq. (134) are bounded, the limits on in Eq. (131) are not, so formally the range of the sample estimator (132) remains unbounded over any duration T. However, for any fixed values of and starting uncertainty , the total information obtainable from large-deviations sampling about differences in the initial conditions is finite and decreases as . In Fig. 2 this limit is seen in the way any fixed ranges are squeezed exponentially at the “waist” as . The contraction of boundaries, rather than the asymptotic behavior of the eigenvalue in the Fisher metric, measures the loss of information between initial distributions and final observations with increasing separation between the two.
Conclusions: the duality of dynamics and inference for irreversible and reversible processes
The three-part structure of the Fisher metric, dual Riemannian connections, and symplectic parallel transport of the Wigner density, vector fields, and the metric tensor, elegantly expresses the transport properties along 2FFI stationary paths in terms of geometric invariants. It resolves a feature of two-field constructions that at first seems paradoxical: if memory of initial conditions is continuously lost to dissipation, what concept of time-reversal is implied by invertibility of the map along stationary rays? The answer from the perspective of importance sampling is that, even if samples are finite, their expectations are computed in continuous-valued distributions, and deformations of measure through the Radon–Nikodym derivative can locally compensate for concentration of measure in the nominal distribution by expanding sensitivity of likelihood ratios. Locally in sampling space, then, time is immaterial as it is in Hamiltonian mechanics; the mappings along stationary trajectories make it possible to interpret sampling protocols from different times in an evolving distribution simply as coordinate transformations of a fixed sampling protocol on the original distribution. On the other hand, for any fixed ranges of parameter variation in the initial conditions, and fixed large-deviation thresholds compared at late time, the integrated Liouville density contracts monotonically with the separation between the two times, reflecting the absolute loss of information that can be recovered.
We have sought to establish a concrete interpretation of time-duality in 2FFI theories as a duality of dynamics and inference, to provide an alternative to the interpretation in terms of physical reversal of paths that is the starting point in most of the literature on fluctuation theorems in stochastic thermodynamics. Microscopic reversibility can always be added later to any class of 2FFI constructions as a restriction on the scope of phenomena under study, and both stronger conclusions and additional interpretations will then follow from the added constraints. Where the existence of a duality in the mathematics itself does not depend on any such additional assumptions, taking the inference interpretation to reflect the core concepts, directly expressing Kolmogorov’s forward/backward adjoint duality, frames the special case of microscopic reversibility as one in which the system’s own dynamics contains an image of certain sampling protocols over itself.
Even if one only cares about microscopically reversible processes, making explicit the step of self-modeling, and having a concrete interpretation of conserved densities such as the Fisher information constructed here, provides a bridge between trajectory reversal in low-level mechanics and operations for sample estimation of the kind that are used by control systems. Linking limitations from path probability in a system’s autonomous dynamics to concepts of information capacity in control loops [5, 6, 18] promises a way to study the limits on spontaneous emergence of dynamical hierarchy, which has been a desired application for stochastic thermodynamics [23, 59]. These are intended topics for future work.
Acknowledgements
The author thanks Supriya Krishnamurthy for ongoing collaboration and the Stockholm University Physics Department for hospitality while much of this work was done, and Nathaniel Virgo for helpful discussion. The work was supported in part by NASA Astrobiology CAN-7 award NNA17BB05A through the College of Science, Georgia Institute of Technology, and by the Japanese Ministry of Education, Culture, Sports, Science, and Technology (MEXT) through the World Premier International program. A revision was completed under generous support of the Program of Interdisciplinary Studies at the Institute for Advanced Study.
Fisher spherical embeddings
The embedding for general distributions on finite state spaces
Equation (4) in the text can be written in the form
| 135 |
Let be the cardinality of the set of states on which is defined (for example, in chemistry, only a sub-lattice of all integer-valued vectors in the positive orthant may ever be accessible as counts, given a system’s stoichiometry and conserved quantities). Suppose is finite in order to illustrate the Fisher embedding geometry for distributions over finite state spaces. All possible base distributions fall within the simplex of dimension .
Now let be angles associated with independent rotation axes in . Any distribution can be embedded in by arranging the states in an (arbitrary) order , and writing
| 136 |
A recursive calculation gives the line element (135) in terms of the angle coordinates on the radius-2 sphere as
| 137 |
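The radius-2 embedding can be checked numerically without the angle coordinates. The sketch below assumes the standard square-root embedding x_i = 2 sqrt(p_i), which places every distribution on the sphere of radius 2, and compares the Fisher line element, the sum of dp_i^2 / p_i, with the squared Euclidean chord between nearby embedded points. The distribution `p` and displacement `dp` are arbitrary illustrative values; the recursive angle parametrization of Eq. (136) is not reproduced.

```python
import numpy as np

# An arbitrary distribution on five states and a small displacement
# that preserves normalization (the entries of dp sum to zero).
p = np.array([0.10, 0.20, 0.30, 0.25, 0.15])
dp = 1e-5 * np.array([1.0, -2.0, 0.5, 1.5, -1.0])

# Square-root embedding: points lie on the sphere of radius 2.
x0 = 2.0 * np.sqrt(p)
x1 = 2.0 * np.sqrt(p + dp)
radius_sq = np.sum(x0**2)

# Fisher line element versus squared Euclidean distance on the sphere.
ds2_fisher = np.sum(dp**2 / p)
ds2_chord = np.sum((x1 - x0)**2)
```

The two line elements agree to leading order in the displacement, with corrections of relative size dp/p.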
Embeddings in reduced dimension for exponential families on the multinomial
The Poisson (35) and multinomial (36) distributions are both in a class recognized by Anderson, Craciun, and Kurtz (ACK) [4] in connection with uniqueness of stationary solutions for chemical reaction networks. All factorial moments are powers of their first moments, causing the CGF for many particles to scale as a multiple of a single-particle CGF. It is not then surprising that the expression (4) for the Fisher metric in terms of a distribution with possibly indefinitely many independent terms, for the ACK distributions projects to a function of the same form in terms of expected numbers over the D independent species.
To see how this works for a distribution with multinomial form (36), express the expected number fractions as
| 138 |
Then the mean in the distribution is
| 139 |
and the CGF evaluates (up to a constant offset) to
| 140 |
The Hessian giving the Fisher metric is
| 141 |
where is the function of in Eq. (139). Inverting Eq. (141), and projecting onto the plane to fix the undetermined component of , gives the inverse
| 142 |
One can check that both and sum to zero on either index, and the product
| 143 |
is the identity on the subspace or .
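These properties are easy to confirm numerically. The following sketch assumes the Fisher metric of a multinomial in the standard form N(diag(ν) − ν νᵀ) and uses the Moore-Penrose pseudo-inverse from NumPy as a stand-in for the projected inverse of Eq. (142); the two coincide when the inverse is required to sum to zero on either index, although the normalization in the text may differ. The values of `N` and `nu` are illustrative.

```python
import numpy as np

N = 30
nu = np.array([0.2, 0.5, 0.3])            # expected number fractions (illustrative)
D = len(nu)

# Hessian of the multinomial CGF in tilt coordinates (assumed standard form).
g = N * (np.diag(nu) - np.outer(nu, nu))

# Pseudo-inverse, used here as a stand-in for the projected inverse.
g_inv = np.linalg.pinv(g)

# Projector onto the subspace of displacements whose components sum to zero.
P = np.eye(D) - np.ones((D, D)) / D

product = g @ g_inv
```

Because the uniform vector is the only null direction of the metric, the product of the metric and its projected inverse is the projector `P` rather than the full identity.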
If a shift of the tilted distribution in the exponential family is indexed with coordinate , with , the Fisher distance element from Eq. (11) becomes
| 144 |
which is the same function of n as the function of in the third line of Eq. (4).
Sample means and variances in the large-deviation approximation to threshold indicator expectations
The expectation of the tilted indicator function from Eq. (23) may be written in a series of inequalities culminating in the expression for the CGF, as
| 145 |
providing Eq. (24) in the text.
The variance of the same sample estimator has a corresponding bound
| 146 |
giving Eq. (25) in the text.
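The first of these bounds can be illustrated for a binomial nominal distribution. The sketch below assumes the usual Chernoff form, in which the tail probability is bounded by exp(ψ(θ) − θ n) with ψ the one-argument CGF, evaluated at the tilt θ that shifts the mean to the threshold; the function names and parameter values are hypothetical.

```python
from math import comb, exp, log

def tail_prob(N, nu, n_bar):
    """Exact probability that a binomial(N, nu) sample is at least n_bar."""
    return sum(comb(N, n) * nu**n * (1 - nu)**(N - n) for n in range(n_bar, N + 1))

def chernoff_bound(N, nu, n_bar):
    """Large-deviation bound exp(psi(theta) - theta * n_bar) at the optimal tilt."""
    x = n_bar / N
    # Tilt that moves the mean of the tilted distribution to the threshold.
    theta = log(x * (1 - nu) / (nu * (1 - x)))
    # One-argument CGF of the binomial at that tilt.
    psi = N * log(1 - nu + nu * exp(theta))
    return exp(psi - theta * n_bar), theta

N, nu, n_bar = 100, 0.3, 45
exact = tail_prob(N, nu, n_bar)
bound, theta_star = chernoff_bound(N, nu, n_bar)
```

The exact tail lies below the bound, and the optimal tilt is positive because the threshold lies above the mean.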
To estimate the tightness of the bounds, begin by observing that in the large-deviation scaling regime (27), with all cumulants generated as derivatives of , the expansion of central moments in terms of cumulants bounds the scaling of the kth central moment as
| 147 |
The log ratio we wish to bound is the Bregman divergence
| 148 |
The maximum of Eq. (148) occurs at from Eq. (8), and the width of the transition for the log ratio to change by is given by
| 149 |
To estimate its maximum value we write Eq. (148) as the sum of log ratios of the two inequalities in Eq. (145), and observe that they have boundary values
| 150 |
The values of these ratios at intermediate are then obtained by integrating the derivatives
| 151 |
Both log derivatives in Eq. (151) are monotone if is convex, and their values sum to , the derivative of Eq. (148), where .
Next, observe the leading-order scaling of the expectation on the half-line (as for any second moment), and likewise for .
At , where , the two derivatives (151) are equal and opposite, and the boundary of the half-line is also the symmetry point of . Because the skewness and higher-order central moments grow more slowly than by Eq. (147), , and given the large-deviations scaling of central moments (147), the derivatives in Eq. (151) scale as
| 152 |
Over the range from Eq. (149), where the total log ratio changes by , the integral of the first derivative (152) saturates the lower limit in the first line of Eq. (150), and the upper limit in the second line of Eq. (150), to within , implying that the log-ratios themselves at the midpoint scale as . Hence also the total log ratio that is their sum scales as , the result used in the text.
Review of Doi Hilbert space and Peliti functional integral constructions
Doi operator algebra and inner product
The main constructs in the Doi operator formulation [21, 22] of moment-generating functions as formal power series are as follows:
The identification (30) of z and with raising and lowering operators and a allows the commutator algebra
| 153 |
to stand for the commutator algebra between components of and factors of z, applied by function composition acting to the right on MGFs.
Monomials in Eq. (1) are basis elements in a linear space of MGFs, built up by multiplication on the number 1. A bracket notation for states and an inner product are introduced by the pair of denotations
| 154 |
Each monomial is denoted as a number state
| 155 |
The number states are eigenstates of the set of number operators (no Einstein sum):
| 156 |
Dual to each number state is a conjugate projection operator
| 157 |
From the commutation relations of variables and their derivatives it follows that the number states and projectors have overlap
| 158 |
the Kronecker symbol on D indices. The number states and projectors are complete, and a sum of Eq. (158) on is the Glauber norm
| 159 |
which defines the asymmetric inner product on the Hilbert space of generating functions.
Replacing the uniform measure in Eq. (159) with the scalar product za gives the map (33) to in the main text. The Glauber norm of the Laplace transform of any normalized distribution is the trace of the probability distribution .
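The constructs listed above can be realized concretely for a single species by truncating the space of monomials at a maximum power. The sketch below assumes the usual Doi convention in which the raising operator acts as multiplication by z and the lowering operator as differentiation with respect to z; the truncation order `M` and all variable names are illustrative, and the commutator is exact only below the truncation boundary.

```python
import numpy as np
from math import factorial

M = 8                                      # highest monomial power kept

# Basis vectors are the monomials z^n, n = 0..M.
a_dag = np.diag(np.ones(M), k=-1)                       # z * z^n = z^(n+1)
a = np.diag(np.arange(1, M + 1, dtype=float), k=1)      # d/dz z^n = n z^(n-1)

# Commutator [a, a_dag]: the identity away from the truncation boundary.
comm = a @ a_dag - a_dag @ a

# Number operator a_dag a has the monomials as eigenstates with eigenvalue n.
number_op = a_dag @ a

# exp(a) as a finite sum, since a is nilpotent on the truncated space.
exp_a = sum(np.linalg.matrix_power(a, k) / factorial(k) for k in range(M + 1))

# Glauber inner product: projection of exp(a) applied to a state onto z^0.
glauber_row = exp_a[0]

# A normalized distribution written as a generating function has unit norm.
p = np.array([0.1, 0.3, 0.2, 0.15, 0.1, 0.05, 0.05, 0.03, 0.02])
norm = glauber_row @ p
```

The first row of `exp_a` consists of ones, so the Glauber norm of a generating function is the sum of its coefficients, which is the trace of the underlying distribution.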
Coherent states and Peliti functional integral
The uniform measure in the 2D-dimensional integral for the representation of unity (42) in the main text is known as the Haar measure. Using the definition (38) for coherent states and (39) for their dual projectors, and expanding the exponential functions as sums,
| 160 |
The phase component in each integral vanishes unless , and the remaining modulus component produces a Gamma-function canceling the . Thus the Haar measure on coherent states is equivalent to the uniform measure on states of the classical probability distribution.7
Mapping backward through the correspondences between analytic functions and Doi state vectors from Appendix C.1, an evaluation of the integral in terms of Dirac δ-functions8 shows that the effect of the representation of unity is the map
| 161 |
To keep track of the scoping rules for application of complex functions and derivatives would require introducing a distinct set of variables for each interval in the quadrature (34). In the Doi operator algebra this scoping is handled by the bracket inner product, and the map (161) becomes simply the identity map on and a.
The integration measure that results from inserting a copy of the representation of unity (42) between each interval of time evolution in Eq. (34) is called a skeletonized measure. Its limit as the interval length
| 162 |
defines the functional integration measure used in Eq. (43) and elsewhere.
Gaussian-order fluctuations, diffusions, and stochastic differential equations
The Doi-Peliti construction used in this paper affords a point of departure to many other representations of perturbed dynamical systems. It is important to emphasize that the motivation and form of 2FFI representations do not depend on the integer-lattice state space and exponential generating function that enable the specific Doi-Peliti construction. To show this we reproduce from [42, 68] a variety of reductions from the full field theory to the limit of Gaussian-order fluctuations that is specified by a drift velocity and diffusion tensor that are functions only of the configuration variable n. These are “classical” representations of perturbed dynamical systems ostensibly lacking two-field structure. We will show how the inverse of the steps reducing Doi-Peliti to these classical limits has been used in [30, 52] to construct two-field theories or their large-deviation asymptotics in the form produced by the Doi-Peliti 2FFI. These classical descriptions from functions on n no longer reference lattice state spaces or population processes, and the 2FFI theories to which they correspond, like the operator formulation of [30, 52], are expected to apply to much more general perturbed dynamical systems.
The Gaussian-order reduction of the general action (46) is obtained by writing the pair , where are a stationary trajectory, and expanding S to quadratic order in primed fields as
| 163 |
Overbar in Eq. (163) indicates evaluation of S or the derivatives of at .
The Hessian is negative-definite and time-local, so the functional determinant of its time integral may be written as a Gaussian integral over a newly-introduced auxiliary field with integrand , where
| 164 |
We may insert the number 1 into the path integral (43) by inserting the explicit functional integral over and dividing by the determinant that is its evaluation, producing a new functional integral with integrand . The quadratic-order expansion of S shifted by the auxiliary-field action satisfies
| 165 |
The action (165) is linear in , meaning that the functional integral over along the imaginary axis at each time-index t produces a functional Dirac δ-function for the quantity multiplying . The remaining terms are quadratic forms in and . In general, the δ-functional produced by integration over connects these two fields by a time derivative, so the implied fluctuation distribution is not simple to compute. However, about ordinary relaxation trajectories that are the classical solutions to the equations of motion, for which , both and also . (See [42] for extensive explanations of why this is the case in general, and of its interpretation in terms of causality.) For this case the functional integral (43) (which we evaluate at for simplicity) becomes
| 166 |
The integral (166) is over an ensemble of trajectories each satisfying
| 167 |
Eq. (167) is the approximation to linear order in the perturbing field of the stochastic differential equation
| 168 |
Because the sole remaining Gaussian kernel in Eq. (166) is time-local, is a Langevin white noise with correlation function
| 169 |
The noise covariance (169), although expressed in terms of the value along a stationary trajectory at , may be understood as a function of the local coordinates n in the underlying manifold, since for classical solutions stationary trajectories form a foliation of the configuration manifold.
The Gaussian-order fluctuation limit corresponding to the Langevin equation (168, 169) produces, for the continuum limit of the lattice of states , the diffusion equation for a density indexed by continuous index , the master equation
| 170 |
known as the Fokker–Planck equation [74]. (This form may be obtained directly from the transition matrix corresponding to the Liouville operator—for example, as written in [44, 69, 71] for chemical reaction networks—by writing discrete shifts of the index as if they were exponentiated operators acting on a continuously-indexed density , and then “expanding the Taylor’s series for the exponential” to second order.)
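The correspondence between the Langevin and Fokker–Planck descriptions can be checked numerically in a simple case. The following is a minimal sketch under stated assumptions, not the construction of Eqs. (168–170): we assume a one-dimensional linear drift v(n) = −k n, a constant diffusion coefficient D, the Itô convention dn = v dt + sqrt(D) dW, and hypothetical function names. The sampled stationary variance of the Langevin trajectories is compared with the value D/(2k) implied by the corresponding Fokker–Planck equation.

```python
import numpy as np

def simulate_langevin(k=1.0, D=0.5, dt=1e-2, n_steps=1000, n_paths=20000, seed=0):
    """Euler-Maruyama integration of dn = -k n dt + sqrt(D) dW (Ito convention)."""
    rng = np.random.default_rng(seed)
    n = np.zeros(n_paths)  # every trajectory starts at the fixed point n = 0
    for _ in range(n_steps):
        noise = rng.standard_normal(n_paths) * np.sqrt(D * dt)
        n = n - k * n * dt + noise
    return n

def stationary_variance_fokker_planck(k=1.0, D=0.5):
    """Stationary variance of d(rho)/dt = d/dn (k n rho) + (D/2) d^2(rho)/dn^2."""
    return D / (2.0 * k)

samples = simulate_langevin()
print(samples.var(), stationary_variance_fokker_planck())
```

The two printed numbers should agree up to sampling error and a small time-discretization bias of relative order k dt.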
Here we have derived the three equivalent representations of the Gaussian-fluctuation limit specified entirely by a drift velocity and diffusion kernel as functions of n—the quadratic action (163), the Langevin equation (168, 169), and the Fokker–Planck equation (170)—from a Doi-Peliti starting point. The reverse of the derivation from Eqs. (163–170) may be followed to arrive at either two-field forms or their large-deviation asymptotics starting from the stochastic differential equation or Fokker–Planck equation. Kamenev [42] has shown that the “tri-diagonal” arrangement of response and correlation functions in Eq. (4.1) of [52], equivalent to the quadratic action (163) here, is in fact the common motif in 2FFI for stochastic classical mechanics, and for the Schwinger-Keldysh time-loop in quantum mechanics [43, 63], allowing us to bypass the operator formalism altogether and proceed directly from stochastic differential equations (168, 169) to the path integral up to Gaussian order. For a development of correspondence principles, where both the quantum Schwinger-Keldysh time loop and the stochastic two-field integral appear, see [67].
It is central to the aims of [52] that the tri-diagonal response and correlation functions are not limited to the “bare” constitutive parameters of the underlying stochastic differential equation, but rather are a general form for Gaussian order fluctuations in an effective theory where the drift velocity and fluctuation covariance are generally renormalized quantities, in this way subsuming higher-order fluctuation effects, which may appear explicitly in the Doi-Peliti action. Renormalization of the stochastic action as an effective action may also be done directly from the path integral with diagrammatic methods of the type appearing in [52]. For an application to evolutionary games see [70, Ch. 7].
Alternatively, the large-deviation asymptotics for excursions within, or for escapes from, a domain, may be derived directly from the stochastic differential equation, and related to boundary-value problems for solutions of diffusions such as the Fokker–Planck equation. This program, starting from representations identical to Eqs. (168–170) for Gaussian fluctuations, and covering several other classes of perturbations as well, is carried out by Freidlin and Wentzell [30], and has been applied to a wide range of first-passage and escape problems [48–51].
The connection to 2FFI methods is that an action arises as their large-deviation function from the drift and diffusion functions of the Gaussian limit, which is simply obtained from the Doi-Peliti action. By completing the square in (which is in a neighborhood of ), the quadratic action (163) is put into the form
| 171 |
The quadratic form in is integrated out in the functional integral to produce a functional determinant that depends on the background only logarithmically. Again, the term in in Eq. (171) is the leading-linear expansion of the equations of motion in small fluctuations, so at the Gaussian order to which it is defined, Eq. (171) may be approximated as
| 172 |
a form first due to Onsager and Machlup [55], and appearing in Theorem 1.1 on p. 86 of [30].
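As an illustration of how an action of this Onsager–Machlup type assigns weight to trajectories, the sketch below evaluates a discretized action of the generic form ∫ dt (dn/dt − v(n))² / (2 D(n)) on two paths. The normalization of the action, the linear drift v(n) = −k n, the constant D, and the function names are assumptions made for illustration and need not match the conventions of Eq. (172). The relaxation path, which solves dn/dt = v(n), has vanishing action, while its time reverse, which climbs against the drift, has action k (n_T² − n_0²) / D.

```python
import numpy as np

def onsager_machlup_action(path, dt, drift, diffusion):
    """Midpoint-rule estimate of the integral of (dn/dt - v(n))^2 / (2 D(n)) dt."""
    path = np.asarray(path, dtype=float)
    velocity = np.diff(path) / dt
    midpoint = 0.5 * (path[1:] + path[:-1])
    residual = velocity - drift(midpoint)
    return float(np.sum(residual**2 / (2.0 * diffusion(midpoint))) * dt)

k, D = 1.0, 0.5                       # hypothetical constitutive parameters
drift = lambda n: -k * n
diffusion = lambda n: D * np.ones_like(n)

t = np.linspace(0.0, 2.0, 2001)
dt = t[1] - t[0]
relaxation = np.exp(-k * t)           # solves dn/dt = v(n), so the action vanishes
escape = relaxation[::-1]             # time-reversed path, climbing against the drift

action_relaxation = onsager_machlup_action(relaxation, dt, drift, diffusion)
action_escape = onsager_machlup_action(escape, dt, drift, diffusion)
expected_escape = k * (escape[-1]**2 - escape[0]**2) / D
print(action_relaxation, action_escape, expected_escape)
```

In this linear example the escape action equals the difference of the quasi-potential k n² / D between the endpoints, consistent with a stationary density of variance D/(2k).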
Two-field constructions originating in classical perturbed dynamical systems (168–170) in the continuous variable n may readily be extended to continuously-indexed state spaces with topologies, on which the drift velocity may include spatial gradients, as in the case of reaction-diffusion systems, without altering the program for construction of the corresponding two-field representation.
Stationary-point approximations to the Wigner function
The functional integral provides the most direct route to the current conservation law (62) for the Wigner function. It is possible, with somewhat more work, to derive the same relations directly from stationary variation of the generating function, and in the process to gain some more intuition for what the Wigner function quantifies.
The Wigner function in terms of an explicit density over coherent-state parameters
Begin by writing any state vector as the integral of a density in the coherent-state basis:
| 173 |
The generalized Glauber norm (33) returns the analytic representation of the MGF:
| 174 |
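The content of this representation can be illustrated with a small numerical check. The sketch assumes the standard Doi convention, in which a coherent state with real parameter φ represents a Poisson distribution with mean φ and generating function exp(φ (z − 1)); the normalization used in Eqs. (173) and (174) may differ. A density over coherent-state parameters, here a hypothetical two-point density, then represents a Poisson mixture whose generating function is the density-weighted sum of coherent-state generating functions.

```python
import math

def poisson_pmf(n, phi):
    """Poisson probability of n at mean phi, computed in log form for stability."""
    return math.exp(-phi + n * math.log(phi) - math.lgamma(n + 1))

def generating_function_from_pmf(z, pmf, n_max=200):
    """Sum over n of z^n p_n, truncated at n_max."""
    return sum(z**n * pmf(n) for n in range(n_max + 1))

def mixture_generating_function(z, weights, phis):
    """Density-weighted sum of coherent-state generating functions exp(phi (z - 1))."""
    return sum(w * math.exp(phi * (z - 1.0)) for w, phi in zip(weights, phis))

weights, phis = [0.3, 0.7], [2.0, 5.0]   # hypothetical two-point density over phi

def mixture_pmf(n):
    """Probability of n in the Poisson mixture defined by the density above."""
    return sum(w * poisson_pmf(n, phi) for w, phi in zip(weights, phis))

z = 0.8
print(generating_function_from_pmf(z, mixture_pmf),
      mixture_generating_function(z, weights, phis))
```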
Now evaluate the integral in Eq. (59) at time , where the stationary value of the field will coincide with the imposed argument z,
| 175 |
It follows then that
| 176 |
A second integral over the fields yields the two equivalent expressions (60) and (174).
Stationary-point approximations
The stationary value of the tilted density is given by
| 177 |
The two stationarity conditions on the arguments of follow from Eq. (175) as
| 178 |
where the stationary-point approximation (177) to the mean gives the leading exponential approximation in the second expression.
The Wigner function in Eq. (175), at argument z, exactly equals the integral (174)
| 179 |
independent of the value of . While the first line of Eq. (178) shows that z is a stationary-point argument for , the second line shows that only when is the other argument also a stationary value.
Time dependence along a stationary path
Suppose now that from such a compatible pair , we wish to extend z and along a trajectory that preserves stationarity. The total time derivative of with respect to its final-time argument is given by
| 180 |
(Note that Eq. (180) includes only contributions to the derivative of from quantities defined before time T; this derivative is different from the total derivative d/dt of in the functional integral (59), which also includes effects of the functional integral after time t.)
Ensuring that is a stationary value if z is one requires that
| 181 |
The term is obtained by an integration by parts over , and evaluated in the stationary-point approximation. All other terms from Eq. (180) vanish at . Thus preservation of the stationary-argument condition for gives the stationary-path equation for dz/dt.
To identify the time-dependence of the stationary argument , we work directly from the stationary-point condition (177). The total time derivative of that equation is
| 182 |
In passing from the second to the third line of Eq. (182), to obtain an explicit expression for in terms of , we evaluate z as an inverse function of from Eq. (177). This functional dependence contributes the term in the final line, from which we obtain the stationary-path equation for the trajectory of along which is to be evaluated:
| 183 |
Equations (181) and (183) imply the 2FFI counterpart to conservation of energy in Hamiltonian mechanics: along the stationary path if .
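This conservation can be illustrated with a simple example. The sketch below is a hedged illustration, not the paper's two-state model: we assume a birth-death Liouville function L = (φ† − 1)(k φ − c) with hypothetical rate constants, and the Hamilton-like stationary-path equations dφ†/dt = ∂L/∂φ and dφ/dt = −∂L/∂φ†. The sign and time-direction conventions of Eqs. (181) and (183) may differ. Along the numerically integrated path, L remains constant because it has no explicit time dependence.

```python
import numpy as np

K_DECAY, C_BIRTH = 1.0, 3.0   # hypothetical rate constants

def liouville(phi_dag, phi):
    """Liouville function of a birth-death process (assumed example)."""
    return (phi_dag - 1.0) * (K_DECAY * phi - C_BIRTH)

def stationary_flow(state):
    """Right-hand side d(phi_dag, phi)/dt = (dL/dphi, -dL/dphi_dag)."""
    phi_dag, phi = state
    return np.array([K_DECAY * (phi_dag - 1.0), -(K_DECAY * phi - C_BIRTH)])

def integrate_rk4(state, dt, n_steps):
    """Fourth-order Runge-Kutta integration of the stationary-path equations."""
    trajectory = [np.array(state, dtype=float)]
    for _ in range(n_steps):
        s = trajectory[-1]
        k1 = stationary_flow(s)
        k2 = stationary_flow(s + 0.5 * dt * k1)
        k3 = stationary_flow(s + 0.5 * dt * k2)
        k4 = stationary_flow(s + dt * k3)
        trajectory.append(s + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0)
    return np.array(trajectory)

path = integrate_rk4([1.2, 0.5], dt=1e-3, n_steps=2000)
values = liouville(path[:, 0], path[:, 1])
print(values[0], values.max() - values.min())
```

In this example the factor (φ† − 1) grows exponentially while (k φ − c) decays at the same rate, so their product is invariant even though neither field is.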
The stochastic effective action in stationary-path evaluations
Note from Eq. (180) that along the contour identified to preserve stationarity of ,
| 184 |
Therefore the extension of the Wigner function to times must include the stationary-path contribution from the action, which was present for in the functional integral definition (59). thus extended satisfies
| 185 |
recovering Eq. (62).
We have termed the stationary-path evaluation of S the stochastic effective action [68]. It is the functional Legendre transform of the large-deviation functional for trajectories in Doi-Peliti integrals. The approximation (179), with set equal to given , together with the contribution from in Eq. (185), provides the desired interpretation of the Wigner function in terms of densities in the statistical model provided by coherent states, and their exponential tilts by likelihood functions.
Stationary-path solutions for the two-state system
The stationary-path equations and both initial and final values for fields are obtained from vanishing of all terms in the variational derivative of the exponential argument in Eq. (43). We begin with solutions in coherent-state variables, and then present the forms for the descaled number coordinates , , and .
Coherent-state and number-potential solutions
Stationary-path equations and final-time conditions for response fields
The stationary-path equations of motion for the components of the field from Eq. (47), in the rotated basis (100), evaluate to
| 186 |
The final-time values are given by variation of , as
| 187 |
Fixing the magnitude of the combination in Eq. (104) requires varying z along the contour
| 188 |
giving Eq. (113) in the main text. The remaining time-dependent solutions, with time argument denoted explicitly here by subscript , are given by
| 189 |
Initial data are specified in the generating function , which when evaluated at the solutions for become
| 190 |
The final line of Eq. (190) is obtained by combining the two solutions (189) at , and introduces the combination defined in Eq. (106).
Stationary-path equations and initial-time conditions for observable fields
The stationary-path equations of motion for the components of the field from Eq. (47) are obtained by removing a total derivative from the action (44) to shift the derivative onto . In the rotated basis (100), they evaluate to
| 191 |
The total derivative cancels the final-time term from the exponential in Eq. (43) and introduces an initial-time term . Variation of this term against with respect to gives the initial-value conditions for the components of , as
| 192 |
Solutions to the equations of motion (191) from these initial conditions are then
| 193 |
The two displacements defined in Eqs. (104) and (105), characterizing respectively the mean in the nominal distribution and the likelihood ratio applied to the stationary measure, evaluate to
| 194 |
These results are reproduced (dropping the explicit subscripts ) as Eq. (107) in the text. It follows from Eq. (194) that the combination
| 195 |
is invariant at its initial value.
The CGF for a binomial distribution at any time retains the form (92), with and replacing and , and and replacing , and . From Eq. (190) and the invariant form (195), it follows that
| 196 |
giving Eq. (109) in the text.
Finally, the non-linear mean of sample values (102) in the importance distribution can be shown to evaluate to
| 197 |
from which the form (111) for can be derived. The ratio of measures
in Eq. (197), by which the importance distribution responds to variations in the initial data, is the familiar scaling of response functions [68] in the Fluctuation-Dissipation Theorem, because expressions of the form
are the variance of fluctuations in the binomial.
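The relation between response and variance can be verified directly for an exponentially tilted binomial. The following minimal sketch uses hypothetical parameter values and function names, and a generic tilt exp(θ n) rather than the specific likelihood ratio of Eq. (197). It compares the derivative of the tilted mean with respect to the tilt parameter against the variance of the tilted distribution, which has the binomial form N p (1 − p) at the tilted parameter.

```python
import math

def tilted_binomial(N, p, theta):
    """Binomial(N, p) reweighted by exp(theta * n) and renormalized."""
    weights = [math.comb(N, n) * p**n * (1 - p)**(N - n) * math.exp(theta * n)
               for n in range(N + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def mean_and_variance(dist):
    """Mean and variance of a distribution given as a list indexed by n."""
    mean = sum(n * q for n, q in enumerate(dist))
    var = sum((n - mean)**2 * q for n, q in enumerate(dist))
    return mean, var

N, p, theta, eps = 50, 0.3, 0.4, 1e-5
mean_plus, _ = mean_and_variance(tilted_binomial(N, p, theta + eps))
mean_minus, _ = mean_and_variance(tilted_binomial(N, p, theta - eps))
response = (mean_plus - mean_minus) / (2 * eps)       # d<n>/d(theta), central difference
_, variance = mean_and_variance(tilted_binomial(N, p, theta))
p_tilted = p * math.exp(theta) / (1 - p + p * math.exp(theta))
print(response, variance, N * p_tilted * (1 - p_tilted))
```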
Fisher spherical embedding
In one dimension, the Pythagorean theorem (12) for K-L divergences loses the interpretation of a direction cosine between vector fields, but still reflects scale changes between coherent-state or exponential families and the geometric coordinate.
The mean-value Fisher-sphere construction of Sect. A.2, for one variable, is the embedding on a circle:
| 198 |
The coordinate differential is
| 199 |
and the geometric distance element is then
| 200 |
The term in Eq. (124) becomes
| 201 |
the invariant Fisher information in coherent-state coordinates. The equivalence between the two forms (69) and (71) for the Fisher metric is again recovered as
| 202 |
showing the variation of the embedding coordinate of the importance distribution with the tilt multiplied by its variation with the base.
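The circle embedding can be checked numerically for a two-outcome distribution. The sketch below is an illustration under stated assumptions: we embed the probabilities (p, 1 − p) by their square roots on a circle of radius 2, a normalization chosen so that the squared Euclidean distance element equals the Fisher information 1/(p (1 − p)) of a single Bernoulli trial times dp². The radius and coordinates of Eqs. (198–200) may be scaled differently.

```python
import math

def circle_embedding(p):
    """Square-root embedding of a two-outcome distribution on a circle of radius 2."""
    return (2.0 * math.sqrt(p), 2.0 * math.sqrt(1.0 - p))

def fisher_information_bernoulli(p):
    """Fisher information of one Bernoulli trial with respect to p."""
    return 1.0 / (p * (1.0 - p))

p, dp = 0.3, 1e-6
x0, y0 = circle_embedding(p - dp / 2)
x1, y1 = circle_embedding(p + dp / 2)
chord_sq = (x1 - x0)**2 + (y1 - y0)**2      # squared distance between nearby points
metric_estimate = chord_sq / dp**2
print(metric_estimate, fisher_information_bernoulli(p))
```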
Evaluation of the Amari–Chentsov tensor
From the form (122) of the Fisher metric in coordinates , the Amari–Chentsov tensor on all contravariant indices can be computed:
| 203 |
T is symmetric under , but its magnitude is not conserved along the stationary-path trajectories. The measure changes and are the same as those in the Fisher metric, but in addition to these the term is not invariant.
Connection coefficients in the coherent-state connection
Among the nonzero connection coefficients for the dual coherent-state connections (90), the only independent components are for the rotated variables of Eq. (95) and of Eq. (115). They are
| 204 |
The covariant components of the time derivative of vector fields and defined in Eq. (84), for transport respectively along and , evaluate to
| 205 |
The corrections from the coherent-state connection coefficients (204) may be understood immediately by using the first line of Eq. (114) and Eq. (115) together with the time dependencies (107), to write
| 206 |
The geometric invariants (205) capture the exponential growth or decay from Eq. (206), while connection terms remove measure factors
,
for exponential relative to coherent-state coordinates.
The Fisher metric g, unlike and , has no intrinsic time dependence and changes only due to change in the net binomial parameter of the importance distribution. Its covariant derivatives (85), with connection coefficients (204), then produce the two independent components of variation from and of
| 207 |
These are reproduced as Eq. (126) in the text.
Two-dimensional divergences of the large-deviation function and their integrals
In the exponential family of tilted distributions with tilting parameter , over a base distribution with exponential parameter , the large-deviation function of two arguments is constructed as
| 208 |
Its variation with at fixed n is given by
| 209 |
By definition of as the inverse function of , the term at all n. Therefore the second derivative
| 210 |
The third line of Eq. (210) uses additivity of and to cancel the two factors, which equal respectively and g.
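A one-dimensional example makes the relation between the curvature of the large-deviation function and the Fisher metric concrete. The sketch assumes a Poisson base distribution with mean λ and a generic exponential tilt θ, for which the cumulant-generating function is λ (exp(θ) − 1); the family, the parameter values, and the function names are illustrative assumptions rather than the two-argument construction of Eq. (208). The second derivative of the Legendre transform with respect to n equals the inverse of the variance of the tilted distribution whose mean is n.

```python
import math

def cgf(theta, lam):
    """Cumulant-generating function of a Poisson base distribution with mean lam."""
    return lam * (math.exp(theta) - 1.0)

def tilt_for_mean(n, lam):
    """Inverse function theta(n): the tilt at which d(cgf)/d(theta) equals n."""
    return math.log(n / lam)

def large_deviation_function(n, lam):
    """Legendre transform theta(n) * n - cgf(theta(n))."""
    theta = tilt_for_mean(n, lam)
    return theta * n - cgf(theta, lam)

def second_derivative(f, x, h=1e-3):
    """Central finite-difference estimate of the second derivative of f at x."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

lam, n = 4.0, 6.5
curvature = second_derivative(lambda x: large_deviation_function(x, lam), n)
tilted_variance = lam * math.exp(tilt_for_mean(n, lam))   # second derivative of the CGF
print(curvature, 1.0 / tilted_variance)
```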
The log-ratio of the conditional large-deviation probabilities in Eq. (130) may be written as the integral
| 211 |
giving Eq. (131) in the text. The conversion from the fifth to the sixth line in Eq. (211) makes use of the two alternative ways of expressing the inner product in contravariant/covariant coordinates given in Eq. (76).
Footnotes
It is impossible to fairly represent the motivations and scope of what has now become a significant fraction of work spanning dynamical systems and statistical mechanics. The study of generating functions for reverse-time trajectories began in dynamical systems [16, 25, 31, 32], and was later taken up in similar form for non-equilibrium stochastic processes [14, 19, 20, 24, 26, 38–41, 45, 46]. Reviews of parts of this literature from different stages in its development and from different domain perspectives include [15, 27, 37, 64].
Wigner introduced this function to treat quantum density matrices, such as those arising in the Schwinger-Keldysh time-loop 2FFI [43, 63], and it is closely related to the Glauber–Sudarshan “P-representation” [17, 34, 72]. Equivalent constructions are widely applied to problems in time-frequency analysis or optics [12, 75], and an example of the map from quantum to classical 2FFI Wigner functions is given in [67].
Note that it is the dual connection , written in all-covariant indices, which vanishes as the affine connection on the mixture family.
| 90 |
| 88 |
It implies constancy of a tower of higher-order correlation functions expressing exact conservation of the underlying variable N, though we do not develop the 2FFI representation of correlation functions in this paper.
In one dimension, the conditional probability is a ratio because the only way to escape beyond is to have also exceeded . In higher dimensions, a similar construction of the conditional can be made, but escapes must be computed along the local least-action trajectories under the action (44), and conditions computed for thresholds that lie in sequence along those trajectories. The leading exponential approximations to such probabilities are the standard first-passage constructions of Freidlin–Wentzell theory [30].
Aaronson [1, p. 123] has raised this equivalence as one of the reasons only the complex L2 norm of quantum mechanics results in a correspondence principle with the classical laws of probability. It is interesting that the representation of probability components as squared amplitudes (though only real-valued) also underlies the natural spherical embedding of Appendix A for the Fisher metric.
References
- 1. Aaronson S. Quantum Computing Since Democritus. London: Cambridge University Press; 2013.
- 2. Amari, S.-I.: Information Geometry and its Applications. Appl. Math. Sci. vol. 194. Springer Japan (2016)
- 3. Amari S-I, Nagaoka H. Methods of Information Geometry. Oxford: Oxford University Press; 2000.
- 4. Anderson DF, Craciun G, Kurtz TG. Product-form stationary distributions for deficiency zero chemical reaction networks. Bull. Math. Biol. 2010;72:1947–1970. doi: 10.1007/s11538-010-9517-4.
- 5. Ashby WR. An Introduction to Cybernetics. London: Chapman and Hall; 1956.
- 6. Ashby WR. Requisite variety and its implications for the control of complex systems. Cybernetica. 1958;1:83–99.
- 7. Ay N, Jost J, Lê HV, Schwachhöfer L. Information Geometry. Cham: Springer International; 2017.
- 8. Baez, J.C.: Quantum techniques for reaction networks. arXiv:1306.3451
- 9. Baez, J.C., Biamonte, J.D.: Quantum techniques for stochastic mechanics. https://math.ucr.edu/home/baez/stoch_stable.pdf
- 10. Baez, J.C., Fong, B.: Quantum techniques for studying equilibrium in reaction networks. J. Complex Netw. 3, 22–34 (2014). https://academic.oup.com/comnet/article-abstract/3/1/22/490572/Quantum-techniques-for-studying-equilibrium-in?redirectedFrom=fulltext
- 11. Baish, A.J.: Deriving the Jarzynski relation from Doi-Peliti field theory. Bucknell University Honors Thesis (2015)
- 12. Bazarov IV. Synchrotron radiation representation in phase space. Phys. Rev. ST Accel. Beams. 2012;15:050703. doi: 10.1103/PhysRevSTAB.15.050703.
- 13. Bertini L, De Sole A, Gabrielli D, Jona-Lasinio G, Landim C. Towards a nonequilibrium thermodynamics: a self-contained macroscopic description of driven diffusive systems. J. Stat. Phys. 2009;135:857–872. doi: 10.1007/s10955-008-9670-4.
- 14. Chernyak, V., Chertkov, M., Jarzynski, C.: Path-integral analysis of fluctuation theorems for general Langevin processes. J. Stat. Mech., P08001 (2006). doi: 10.1088/1742-5468/2006/08/P08001
- 15. Chetrite R, Gawedzki K. Fluctuation relations for diffusion processes. Commun. Math. Phys. 2008;282:469–518. doi: 10.1007/s00220-008-0502-9.
- 16. Cohen EGD, Gallavotti G. Note on two theorems in nonequilibrium statistical mechanics. J. Stat. Phys. 1999;96:1343–1349. doi: 10.1023/A:1004604804070.
- 17. Cohen L. Generalized phase-space distribution functions. J. Math. Phys. 1966;7:781–786. doi: 10.1063/1.1931206.
- 18. Conant RC, Ashby WR. Every good regulator of a system must be a model of that system. Int. J. Syst. Sci. 1970;1:89–97. doi: 10.1080/00207727008920220.
- 19. Crooks GE. Entropy production fluctuation theorem and the nonequilibrium work relation for free energy differences. Phys. Rev. E. 1999;60:2721–2726. doi: 10.1103/PhysRevE.60.2721.
- 20. Crooks GE. Path-ensemble averages in systems driven far from equilibrium. Phys. Rev. E. 2000;61:2361–2366. doi: 10.1103/PhysRevE.61.2361.
- 21. Doi M. Second quantization representation for classical many-particle system. J. Phys. A. 1976;9:1465–1478. doi: 10.1088/0305-4470/9/9/008.
- 22. Doi, M.: Stochastic theory of diffusion-controlled reaction. J. Phys. A 9, 1479 (1976)
- 23. England JL. Statistical physics of self-replication. J. Chem. Phys. 2013;139:121923. doi: 10.1063/1.4818538.
- 24. Esposito M, Van den Broeck C. Three detailed fluctuation theorems. Phys. Rev. Lett. 2010;104:090601. doi: 10.1103/PhysRevLett.104.090601.
- 25. Evans DJ, Cohen EGD, Morriss GP. Probability of second law violations in shearing steady states. Phys. Rev. Lett. 1993;71:2401–2404. doi: 10.1103/PhysRevLett.71.2401.
- 26. Evans DJ, Searles DJ. Fluctuation theorem for stochastic systems. Phys. Rev. E. 1999;60:159–164. doi: 10.1103/PhysRevE.60.159.
- 27. Evans DJ, Searles DJ. The fluctuation theorem. Adv. Phys. 2002;51:1529–1585. doi: 10.1080/00018730210155133.
- 28. Feynman RP, Hibbs AR. Quantum Mechanics and Path Integrals. New York: McGraw-Hill; 1965.
- 29. Flajolet P, Sedgewick R. Analytic Combinatorics. London: Cambridge University Press; 2009.
- 30. Freidlin MI, Wentzell AD. Random Perturbations of Dynamical Systems. 2nd ed. New York: Springer; 1998.
- 31. Gallavotti G, Cohen EGD. Dynamical ensembles in non-equilibrium statistical mechanics. Phys. Rev. Lett. 1995;74:2694–2697. doi: 10.1103/PhysRevLett.74.2694.
- 32. Gallavotti G, Cohen EGD. Dynamical ensembles in stationary states. J. Stat. Phys. 1995;80:931–970. doi: 10.1007/BF02179860.
- 33. Gardiner C. Stochastic Methods: A Handbook for the Natural and Social Sciences. Heidelberg: Springer; 1996.
- 34. Glauber RJ. Coherent and incoherent states of the radiation field. Phys. Rev. 1963;131:2766–2788. doi: 10.1103/PhysRev.131.2766.
- 35. Goldstein H, Poole CP, Safko JL. Classical Mechanics. 3rd ed. New York: Addison Wesley; 2001.
- 36. Goutis C, Casella G. Explaining the saddlepoint approximation. Am. Stat. 1999;53:216–224.
- 37. Harris, R.J., Schütz, G.M.: Fluctuation theorems for stochastic dynamics. J. Stat. Mech., P07020 (2007). doi: 10.1088/1742-5468/2007/07/P07020
- 38. Hatano T, Sasa S-I. Steady state thermodynamics of Langevin systems. Phys. Rev. Lett. 2001;86:3463–3466. doi: 10.1103/PhysRevLett.86.3463.
- 39. Jarzynski C. Equilibrium free-energy differences from nonequilibrium measurements: a master-equation approach. Phys. Rev. E. 1997;56:5018–5035. doi: 10.1103/PhysRevE.56.5018.
- 40. Jarzynski C. Nonequilibrium equality for free energy differences. Phys. Rev. Lett. 1997;78:2690–2693. doi: 10.1103/PhysRevLett.78.2690.
- 41. Jarzynski C. Nonequilibrium work relations: foundations and applications. Eur. Phys. J. B. 2008;64:331–340. doi: 10.1140/epjb/e2008-00254-2.
- 42. Kamenev, A.: Keldysh and Doi-Peliti techniques for out-of-equilibrium systems. In: Lerner, I.V., Altshuler, B.L., Falko, V.I., Giamarchi, T. (eds.) Strongly Correlated Fermions and Bosons in Low-Dimensional Disordered Systems. Springer, Heidelberg, pp. 313–340 (2002)
- 43. Keldysh LV. Diagram technique for nonequilibrium processes. Sov. Phys. JETP. 1965;20:1018.
- 44. Krishnamurthy S, Smith E. Solving moment hierarchies for chemical reaction networks. J. Phys. A Math. Theor. 2017;50:425002. doi: 10.1088/1751-8121/aa89d0.
- 45. Kurchan J. Fluctuation theorem for stochastic dynamics. J. Phys. A. 1998;31:3719. doi: 10.1088/0305-4470/31/16/003.
- 46. Kurchan J. Non-equilibrium work relations. J. Stat. Mech. 2007;2007:P07005. doi: 10.1088/1742-5468/2007/07/P07005.
- 47. Leok M, Zhang J. Connecting information geometry and geometric mechanics. Entropy. 2017;19:518. doi: 10.3390/e19100518.
- 48. Maier RS, Stein DL. Escape problem for irreversible systems. Phys. Rev. E. 1993;48:931–938. doi: 10.1103/PhysRevE.48.931.
- 49. Maier RS, Stein DL. Oscillatory behavior of the rate of escape through an unstable limit cycle. Phys. Rev. Lett. 1996;77:4860–4863. doi: 10.1103/PhysRevLett.77.4860.
- 50. Maier RS, Stein DL. A scaling theory of bifurcations in the symmetric weak-noise escape problem. J. Stat. Phys. 1996;83:291. doi: 10.1007/BF02183736.
- 51. Maier RS, Stein DL. Asymptotic exit location distributions in the stochastic exit problem. SIAM J. Appl. Math. 1997;57:752. doi: 10.1137/S0036139994271753.
- 52. Martin PC, Siggia ED, Rose HA. Statistical dynamics of classical systems. Phys. Rev. A. 1973;8:423–437. doi: 10.1103/PhysRevA.8.423.
- 53. Mattis DC, Glasser ML. The uses of quantum field theory in diffusion-limited reactions. Rev. Mod. Phys. 1998;70:979–1001. doi: 10.1103/RevModPhys.70.979.
- 54. Nagaoka, H., Amari, S.: Differential geometry of smooth families of probability distributions. Technical Report METR 82-7, U. Tokyo (1982)
- 55. Onsager L, Machlup S. Fluctuations and irreversible processes. Phys. Rev. 1953;91:1505. doi: 10.1103/PhysRev.91.1505.
- 56. Owen, A.B.: Monte Carlo theory, methods and examples. http://statweb.stanford.edu/~owen/mc/ (2013)
- 57. Peliti L. Path-integral approach to birth-death processes on a lattice. J. Phys. 1985;46:1469. doi: 10.1051/jphys:019850046090146900.
- 58. Peliti L. Renormalization of fluctuation effects in the A + A → A reaction. J. Phys. A. 1986;19:L365. doi: 10.1088/0305-4470/19/6/012.
- 59. Perunov N, Marsland R, England J. Statistical physics of adaptation. Phys. Rev. X. 2016;6:021036.
- 60. Polettini, M., Esposito, M.: Irreversible thermodynamics of open chemical networks. I. Emergent cycles and broken conservation laws. J. Chem. Phys. 141, 024117 (2014)
- 61. Rao CR. Information and accuracy attainable in the estimation of statistical parameters. Bull. Calcutta Math. Soc. 1945;37:81–91.
- 62. Rao CR. Information and accuracy attainable in the estimation of statistical parameters. In: Kotz S, Johnson NL, editors. Breakthroughs in Statistics: Springer Series in Statistics. New York: Springer; 1992. pp. 235–247.
- 63. Schwinger J. Brownian motion of a quantum oscillator. J. Math. Phys. 1961;2:407–32. doi: 10.1063/1.1703727.
- 64. Seifert U. Stochastic thermodynamics, fluctuation theorems, and molecular machines. Rep. Prog. Phys. 2012;75:126001. doi: 10.1088/0034-4885/75/12/126001.
- 65. Seifert U, Speck T. Fluctuation-dissipation theorem in nonequilibrium steady states. Europhys. Lett. 2010;89:10007. doi: 10.1209/0295-5075/89/10007.
- 66. Siegmund D. Importance sampling in the Monte Carlo study of sequential tests. Ann. Stat. 1976;4:673–684. doi: 10.1214/aos/1176343541.
- 67. Smith, E.: Quantum-classical correspondence principles for locally non-equilibrium driven systems. Phys. Rev. E 77, 021109 (2008). Originally as SFI preprint # 06-11-040
- 68. Smith, E.: Large-deviation principles, stochastic effective actions, path entropies, and the structure and meaning of thermodynamic descriptions. Rep. Prog. Phys. 74, 046601 (2011). arXiv:1102.3938 [cond-mat.stat-mech]
- 69. Smith E. Intrinsic and extrinsic thermodynamics for stochastic population processes with multi-level large-deviation structure. Entropy. 2020;22:1137. doi: 10.3390/e22101137.
- 70. Smith E, Krishnamurthy S. Symmetry and Collective Fluctuations in Evolutionary Games. Bristol: IOP Press; 2015.
- 71. Smith E, Krishnamurthy S. Flows, scaling, and the control of moment hierarchies for stochastic chemical reaction networks. Phys. Rev. E. 2017;96:062102. doi: 10.1103/PhysRevE.96.062102.
- 72. Sudarshan ECG. Equivalence of semiclassical and quantum mechanical descriptions of statistical light beams. Phys. Rev. Lett. 1963;10:277–279. doi: 10.1103/PhysRevLett.10.277.
- 73. Touchette H. The large deviation approach to statistical mechanics. Phys. Rep. 2009;478:1–69. doi: 10.1016/j.physrep.2009.05.002.
- 74. van Kampen NG. Stochastic Processes in Physics and Chemistry. 3rd ed. Amsterdam: Elsevier; 2007.
- 75. Wahlberg P. The random Wigner distribution of gaussian stochastic processes with covariance in S_0(R^{2d}). J. Funct. Spaces Appl. 2005;3:163–181. doi: 10.1155/2005/252415.
- 76. Wigner E. On the quantum correction for thermodynamic equilibrium. Phys. Rev. 1932;40:749–759. doi: 10.1103/PhysRev.40.749.
- 77. Wilf HS. Generatingfunctionology. 3rd ed. Wellesley: A K Peters; 2006.
