. 2023 Oct 13;25(10):1445. doi: 10.3390/e25101445

Jeffreys Divergence and Generalized Fisher Information Measures on Fokker–Planck Space–Time Random Field

Jiaxing Zhang 1
Editor: Jean-Pierre Gazeau
PMCID: PMC10606917  PMID: 37895566

Abstract

In this paper, we present the derivation of the Jeffreys divergence, generalized Fisher divergence, and the corresponding De Bruijn identities for space–time random fields. First, we establish the connection between the Jeffreys divergence and the generalized Fisher information of a single space–time random field with respect to the time and space variables. Furthermore, we obtain the Jeffreys divergence between two space–time random fields obtained with different parameters under the same type of Fokker–Planck equations. Then, the identities between the partial derivatives of the Jeffreys divergence with respect to the space–time variables and the generalized Fisher divergence are found, also known as the De Bruijn identities. At the end of the paper, we present three examples of Fokker–Planck equations on space–time random fields, identify their density functions, and derive their Jeffreys divergence, generalized Fisher information, generalized Fisher divergence, and the corresponding De Bruijn identities.

Keywords: space–time random field, Fokker–Planck equations, differential entropy, Jeffreys divergence, Fisher information, De Bruijn identities

1. Introduction

Information entropy and Fisher information are quantities that measure random information, and entropy divergence is derived from information entropy to measure the difference between two probability distributions. Formally, we can construct straightforward definitions of entropy divergence and Fisher information for a space–time random field, founded on the classical definitions. The density function in these definitions can be obtained in many different ways. In this paper, the density function of a space–time random field is obtained from Fokker–Planck equations. The traditional Fokker–Planck equation is a partial differential equation that describes the probability density function of a random process [1]; it describes how the density function changes over time. However, the Fokker–Planck equations for random fields, especially for space–time random fields, do not yet possess a definitive form. The classical equation needs to be generalized because the variable changes from time to space–time.

In this paper, we mainly obtain the relation between the Jeffreys divergence and generalized Fisher information measures for space–time random fields generated by Fokker–Planck equations. Jeffreys divergence is a symmetric entropy divergence generalized from the Kullback–Leibler divergence (KL divergence); it is a measure in information theory and statistics that evaluates the difference between two probability distributions. However, if there is no overlap between the two distributions, the outcome will be infinite, which is a limitation of this approach. To prevent infinite results, we examine how the Jeffreys divergence relates to the generalized Fisher information for a space–time random field with slight variations in the space–time parameters.

Moreover, the classical De Bruijn identity describes the relationship between differential entropy and the Fisher information of the Gaussian channel [2], and it can be generalized to other cases [3,4,5,6,7]. With gratitude to their works and following their ideas, we obtain De Bruijn identities on Jeffreys divergence and generalized Fisher information of space–time random fields, whose density functions satisfy Fokker–Planck equations.

1.1. Space–Time Random Field

The random field was first studied by Kolmogorov [8,9,10], and it was gradually developed by Yaglom [11,12,13] in the middle of the last century. A random field with $n\in\mathbb{N}^+$ variables can be expressed as

$$X(t_1,t_2,\ldots,t_n) \tag{1}$$

where $(t_1,t_2,\ldots,t_n)\in\mathbb{R}^n$. We call (1) a generalized random field or a multiparameter stochastic process. In some practical applications, we often use the concept of a space–time random field. The space–time random field on a $d$-dimensional space is expressed as

$$X(t,x) \tag{2}$$

where $(t,x)\in\mathbb{R}_+\times\mathbb{R}^d$ are the space–time variables. It has many applications in statistics, finance, signal processing, stochastic partial differential equations, and other fields [14,15,16,17,18,19,20,21,22,23,24,25,26,27].

1.2. Kramers–Moyal Expansion and Fokker–Planck Equation

In the literature on stochastic processes, the Kramers–Moyal expansion refers to a Taylor series expansion of the master equation, named after Kramers and Moyal [28,29]. The Kramers–Moyal expansion is an infinite-order partial differential equation

$$\partial_t p(u,t)=\sum_{n=1}^{\infty}\frac{(-1)^n}{n!}\partial_u^n\left[K_n(u,t)\,p(u,t)\right] \tag{3}$$

where $p(u,t)$ is the density function and

$$K_n(u,t)=\int_{\mathbb{R}}(u'-u)^n\,W(u'\mid u,t)\,du' \tag{4}$$

is the $n$-th order conditional moment. Here, $W(u'\mid u,t)$ is the transition probability rate. The Fokker–Planck equation is obtained by keeping only the first two terms of the Kramers–Moyal expansion. In statistical mechanics, the Fokker–Planck equation is usually used to describe the time evolution of the probability density function of the velocity of a particle under the influence of drag forces and random forces, as in the famous Brownian motion, and this equation is commonly employed for determining the density function of an Itô stochastic differential equation [1].

1.3. Differential Entropy and De Bruijn Identity

The entropy of a continuous distribution was proposed by Shannon in 1948 and is known as differential entropy [30]:

$$h(X)=-\int_{\mathbb{R}}p(x)\log p(x)\,dx \tag{5}$$

where $h(\cdot)$ represents the differential entropy and $p(\cdot)$ is the probability density function of $X$. However, differential entropy is often difficult to calculate and may fail to exist. There are related studies on the entropy of stochastic processes and continuous systems [31,32,33,34]. If we consider the classical one-dimensional Gaussian channel model

$$Y_t=X+\sqrt{t}\,G \tag{6}$$

where $X$ is the input signal, $G$ is standard Gaussian noise, $t\geq 0$ is the noise strength, and $Y_t$ is the output, we can obtain that the density of $Y_t$ satisfies the following Fokker–Planck equation:

$$\partial_t p(y,t)=\frac{1}{2}\partial_y^2 p(y,t) \tag{7}$$

Furthermore, the differential entropy of $Y_t$ can be calculated, and then its derivative with respect to $t$ can be obtained as

$$\frac{d\,h(Y_t)}{dt}=\frac{1}{2}\,\mathrm{FI}(Y_t) \tag{8}$$

where

$$\mathrm{FI}(Y_t)=\int_{\mathbb{R}}\left(\partial_y\log p(y,t)\right)^2 p(y,t)\,dy \tag{9}$$

is the Fisher information of $Y_t$. Equation (8) is the De Bruijn identity. The De Bruijn identity connects the differential entropy $h(\cdot)$ and the Fisher information $\mathrm{FI}(\cdot)$, which shows that they are different aspects of the concept of "information".
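The identity (8) can be sketched numerically. Assuming an illustrative standard Gaussian input $X\sim N(0,1)$, we have $Y_t\sim N(0,1+t)$, so both $h(Y_t)$ and $\mathrm{FI}(Y_t)$ have closed forms, and a finite difference of the entropy matches $\frac{1}{2}\mathrm{FI}(Y_t)$:

```python
import numpy as np

# Numerical sketch of the classical De Bruijn identity (8) for the Gaussian
# channel Y_t = X + sqrt(t) G. Assuming X ~ N(0,1) (an illustrative input),
# Y_t ~ N(0, 1 + t), so both h(Y_t) and FI(Y_t) have closed forms.

def entropy(t):
    return 0.5 * np.log(2 * np.pi * np.e * (1.0 + t))

def fisher_info(t):
    return 1.0 / (1.0 + t)   # Fisher information of N(0, 1 + t)

t, eps = 1.0, 1e-5
dh_dt = (entropy(t + eps) - entropy(t - eps)) / (2 * eps)
assert abs(dh_dt - 0.5 * fisher_info(t)) < 1e-8
print(dh_dt, 0.5 * fisher_info(t))  # both ~ 0.25
```

The same check works for any input variance, since adding Gaussian noise only shifts the output variance.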

1.4. Entropy Divergence

In information theory and statistics, an entropy divergence is a statistical distance generated from information entropy to measure the difference between two probability distributions. There are various divergences generated by information entropy, such as the Kullback–Leibler divergence [35], Jeffreys divergence [36], Jensen–Shannon divergence [37], and Rényi divergence [38]. These measures are applied in a variety of fields such as finance, economics, biology, signal processing, pattern recognition, and machine learning [39,40,41,42,43,44,45,46,47,48,49]. In this paper, we mainly focus on the Jeffreys divergence of two distributions, formed as

$$\mathrm{JD}(P,Q)=\int_{\mathbb{R}}\left[p(u)-q(u)\right]\log\frac{p(u)}{q(u)}\,d\mu(u) \tag{10}$$

where $\mu$ is a measure on $u$.
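For two Gaussian distributions (with $\mu$ the Lebesgue measure), (10) has a simple closed form, which can be verified by quadrature; the parameter values below are illustrative assumptions:

```python
import numpy as np

# Jeffreys divergence (10) between two Gaussian densities, computed by
# quadrature and by the closed form for N(a1, v1) vs N(a2, v2).
# The parameter values are illustrative only.

def gauss(u, m, v):
    return np.exp(-(u - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

def integrate(y, x):  # trapezoidal rule
    return np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0

a1, v1, a2, v2 = 0.0, 1.0, 0.5, 2.0
u = np.linspace(-15, 15, 400001)
p, q = gauss(u, a1, v1), gauss(u, a2, v2)
jd_quad = integrate((p - q) * np.log(p / q), u)

d2 = (a1 - a2) ** 2
jd_closed = (d2 + v2) / (2 * v1) + (d2 + v1) / (2 * v2) - 1
assert abs(jd_quad - jd_closed) < 1e-6
print(jd_quad, jd_closed)  # ~ 0.4375
```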

2. Notations, Definitions, and Propositions

2.1. Notations and Assumptions

In this paper, we use the subsequent notations and definitions

  • Given a probability space $(\Omega,\mathcal{F},P)$, two real-valued space–time random fields are denoted as $X(\omega;t,x)$, $Y(\omega;s,y)$ or $X(t,x)$, $Y(s,y)$, where $\omega\in\Omega$ and $(t,x),(s,y)\in\mathbb{R}_+\times\mathbb{R}^d$, $d\in\mathbb{N}^+$, are space–time variables.

  • The probability density functions of $P$ and $Q$ are denoted as $p$ and $q$. With $u\in\mathbb{R}$, $p(u;t,x)$ is the density value at $(t,x)$ of $X$ and $q(u;s,y)$ is the density value at $(s,y)$ of $Y$.

  • Unless there are specific restrictions on the ranges of the variables, suppose that the density functions $p(u;t,x)$ and $q(u;s,y)$ belong to $C^{2,1,1}(\mathbb{R}\times\mathbb{R}_+\times\mathbb{R}^d,\mathbb{R})$. This means that $p(u;t,x)$ and $q(u;s,y)$ are partially differentiable twice with respect to $u$ and once with respect to $(t,x)$ or $(s,y)$, respectively.

  • Vectors that differ from $x=(x_1,x_2,\ldots,x_k,\ldots,x_d)$ only in the $k$-th coordinate are denoted $\tilde{x}(k)=(x_1,x_2,\ldots,x_k',\ldots,x_d)$, where the $k$-th coordinates are $x_k$ and $x_k'$, $k=1,2,\ldots,d$.

2.2. Definitions

To obtain the generalized De Bruijn identities between Jeffreys divergence and Fisher divergence, we need to introduce some new definitions and propositions.

The primary and most important measure of information is the Kullback–Leibler divergence for random fields. Definition 1 is easily obtained as follows.

Definition 1.

The Kullback–Leibler divergence between two space–time random fields $X(t,x)$ and $Y(s,y)$, $(t,x),(s,y)\in\mathbb{R}_+\times\mathbb{R}^d$, with density functions $p(u;t,x)$ and $q(u;s,y)$, is defined as

$$\mathrm{KL}(P(t,x)\|Q(s,y))=\int_{\mathbb{R}}p(u;t,x)\log\frac{p(u;t,x)}{q(u;s,y)}\,du \tag{11}$$

Similar to the classical Kullback–Leibler divergence, the Kullback–Leibler divergence on random fields is not symmetric, i.e.,

$$\mathrm{KL}(P(t,x)\|Q(s,y))\neq\mathrm{KL}(Q(s,y)\|P(t,x)) \tag{12}$$

Following the classical definition of Jeffreys divergence on two random variables, we mainly consider Jeffreys divergence for random fields in this paper.

Definition 2.

The Jeffreys divergence between space–time random fields $X(t,x)$ and $Y(s,y)$, $(t,x),(s,y)\in\mathbb{R}_+\times\mathbb{R}^d$, with density functions $p(u;t,x)$ and $q(u;s,y)$, is defined as

$$\mathrm{JD}(P(t,x),Q(s,y))=\mathrm{KL}(P(t,x)\|Q(s,y))+\mathrm{KL}(Q(s,y)\|P(t,x)) \tag{13}$$

Here, we replace $\|$ with a comma in the notation to emphasize the symmetry of the measure.
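The asymmetry of (11) and the symmetry of (13) can be illustrated numerically; two 1-d Gaussian densities (illustrative choices) stand in for $p(u;t,x)$ and $q(u;s,y)$:

```python
import numpy as np

# The KL divergence (11) is asymmetric, while the Jeffreys divergence (13)
# is symmetric by construction: JD = KL(P||Q) + KL(Q||P). Two illustrative
# 1-d Gaussian densities stand in for p(u;t,x) and q(u;s,y).

def gauss(u, m, v):
    return np.exp(-(u - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

def integrate(y, x):  # trapezoidal rule
    return np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0

def kl(p, q, u):
    return integrate(p * np.log(p / q), u)

u = np.linspace(-20, 20, 400001)
p, q = gauss(u, 0.0, 1.0), gauss(u, 1.0, 3.0)
kl_pq, kl_qp = kl(p, q, u), kl(q, p, u)
jd = kl_pq + kl_qp
print(kl_pq, kl_qp, jd)  # the two KL values differ; JD is their symmetric sum
```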

Another significant measure of information is Fisher information. In this paper, we consider the generalized Fisher information of the space–time random field.

Definition 3.

The generalized Fisher information of the space–time random field $X(t,x)$, $(t,x)\in\mathbb{R}_+\times\mathbb{R}^d$, with density function $p(u;t,x)$, defined by a nonnegative function $f(\cdot)$, is formed as

$$\mathrm{FI}_f(P(t,x))=\int_{\mathbb{R}}f(u)\left(\partial_u\log p(u;t,x)\right)^2 p(u;t,x)\,du \tag{14}$$

In the case where $f\equiv 1$, $\mathrm{FI}_1(P(t,x))$ represents the typical Fisher information. In addition to Equation (14), there are similar forms of generalized Fisher information

$$\mathrm{FI}_f^{(t)}(P(t,x))=\int_{\mathbb{R}}f(u)\left(\partial_t\log p(u;t,x)\right)^2 p(u;t,x)\,du \tag{15}$$

and

$$\mathrm{FI}_f^{(x_k)}(P(t,x))=\int_{\mathbb{R}}f(u)\left(\partial_{x_k}\log p(u;t,x)\right)^2 p(u;t,x)\,du \tag{16}$$

for $k=1,2,\ldots,d$.

Obviously, (15) and (16) are generalized Fisher information measures on the space–time variables. Regarding the generalized Fisher information (14), we arrive at the following simple proposition.

Proposition 1.

For an arbitrary positive continuous function $f(\cdot)$, suppose that $X$ and $Y$ are independent continuous random variables and that the generalized Fisher information

$$\mathrm{FI}_f(X):=\int_{\mathbb{R}}f(u)\left(\frac{d}{du}\log p_X(u)\right)^2 p_X(u)\,du \tag{17}$$

is well defined, where $p_X(u)$ represents the probability density. Then, we have the generalized Fisher information inequality

$$\frac{1}{\mathrm{FI}_f(X+Y)}\geq\frac{1}{\mathrm{FI}_f(X)}+\frac{1}{\mathrm{FI}_f(Y)} \tag{18}$$

When $f\equiv 1$, $\mathrm{FI}_1(X)$ represents the Fisher information in the standard case.

Proof. 

Denote $Z=X+Y$, and let $p_X$, $p_Y$, and $p_Z$ represent the densities; by independence,

$$p_Z(z)=\int_{\mathbb{R}}p_X(x)\,p_Y(z-x)\,dx \tag{19}$$

with derivative

$$p_Z'(z)=\int_{\mathbb{R}}p_X'(x)\,p_Y(z-x)\,dx \tag{20}$$

If $p_X$, $p_Y$, and $p_Z$ never vanish,

$$\frac{p_Z'(z)}{p_Z(z)}=\int_{\mathbb{R}}\frac{p_X'(x)\,p_Y(z-x)}{p_Z(z)}\,dx=\int_{\mathbb{R}}\frac{p_X(x)\,p_Y(z-x)}{p_Z(z)}\cdot\frac{p_X'(x)}{p_X(x)}\,dx=\mathbb{E}\left[\frac{p_X'(x)}{p_X(x)}\,\middle|\,Z=z\right] \tag{21}$$

is the conditional expectation of $\frac{p_X'(x)}{p_X(x)}$ given $Z=z$. Similarly, we can obtain

$$\frac{p_Z'(z)}{p_Z(z)}=\mathbb{E}\left[\frac{p_Y'(y)}{p_Y(y)}\,\middle|\,Z=z\right] \tag{22}$$

and, for $\mu,\lambda\in\mathbb{R}$, we also find that

$$\mathbb{E}\left[\mu\frac{p_X'(x)}{p_X(x)}+\lambda\frac{p_Y'(y)}{p_Y(y)}\,\middle|\,Z=z\right]=(\mu+\lambda)\frac{p_Z'(z)}{p_Z(z)} \tag{23}$$

Then, by the conditional Jensen inequality, we have

$$\left[(\mu+\lambda)\frac{p_Z'(z)}{p_Z(z)}\right]^2=\left(\mathbb{E}\left[\mu\frac{p_X'(x)}{p_X(x)}+\lambda\frac{p_Y'(y)}{p_Y(y)}\,\middle|\,Z=z\right]\right)^2\leq\mathbb{E}\left[\left(\mu\frac{p_X'(x)}{p_X(x)}+\lambda\frac{p_Y'(y)}{p_Y(y)}\right)^2\,\middle|\,Z=z\right] \tag{24}$$

with equality only if

$$\mu\frac{p_X'(x)}{p_X(x)}+\lambda\frac{p_Y'(y)}{p_Y(y)}=(\mu+\lambda)\frac{p_Z'(z)}{p_Z(z)} \tag{25}$$

with probability 1 whenever $z=x+y$. Multiplying by $f(z)\geq 0$, we have

$$f(z)\left[(\mu+\lambda)\frac{p_Z'(z)}{p_Z(z)}\right]^2\leq f(z)\,\mathbb{E}\left[\left(\mu\frac{p_X'(x)}{p_X(x)}+\lambda\frac{p_Y'(y)}{p_Y(y)}\right)^2\,\middle|\,Z=z\right] \tag{26}$$

Averaging both sides over the distribution of $z$,

$$\mathbb{E}\left[f(z)\left((\mu+\lambda)\frac{p_Z'(z)}{p_Z(z)}\right)^2\right]\leq\mathbb{E}\left[f(z)\,\mathbb{E}\left[\left(\mu\frac{p_X'(x)}{p_X(x)}+\lambda\frac{p_Y'(y)}{p_Y(y)}\right)^2\middle|\,Z=z\right]\right]=\mu^2\,\mathbb{E}\left[f(z)\,\mathbb{E}\left[\left(\frac{p_X'(x)}{p_X(x)}\right)^2\middle|\,Z=z\right]\right]+\lambda^2\,\mathbb{E}\left[f(z)\,\mathbb{E}\left[\left(\frac{p_Y'(y)}{p_Y(y)}\right)^2\middle|\,Z=z\right]\right] \tag{27}$$

i.e.,

$$(\mu+\lambda)^2\,\mathrm{FI}_f(X+Y)\leq\mu^2\,\mathrm{FI}_f(X)+\lambda^2\,\mathrm{FI}_f(Y) \tag{28}$$

Letting $\mu=\frac{1}{\mathrm{FI}_f(X)}$ and $\lambda=\frac{1}{\mathrm{FI}_f(Y)}$, we obtain

$$\frac{1}{\mathrm{FI}_f(X+Y)}\geq\frac{1}{\mathrm{FI}_f(X)}+\frac{1}{\mathrm{FI}_f(Y)} \tag{29}$$

□
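The inequality (18) with $f\equiv 1$ can be checked numerically. The densities below are illustrative assumptions: a two-component Gaussian mixture for $X$ and a standard Gaussian for $Y$, so that the density of $X+Y$ is again a mixture in closed form:

```python
import numpy as np

# Numerical sketch of the Fisher information inequality (18) with f = 1:
# 1/FI(X+Y) >= 1/FI(X) + 1/FI(Y) for independent X and Y.
# X is a two-component Gaussian mixture, Y ~ N(0,1); then X + Y is again
# a Gaussian mixture, so all three densities have closed forms.

def gauss(u, m, v):
    return np.exp(-(u - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

def integrate(y, x):  # trapezoidal rule
    return np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0

def fisher_info(p, u):
    dp = np.gradient(p, u)
    return integrate(dp ** 2 / p, u)

u = np.linspace(-25, 25, 400001)
p_x = 0.5 * gauss(u, -1.0, 1.0) + 0.5 * gauss(u, 1.0, 1.0)
p_y = gauss(u, 0.0, 1.0)
p_z = 0.5 * gauss(u, -1.0, 2.0) + 0.5 * gauss(u, 1.0, 2.0)  # density of X + Y

fx, fy, fz = fisher_info(p_x, u), fisher_info(p_y, u), fisher_info(p_z, u)
assert 1 / fz >= 1 / fx + 1 / fy - 1e-6
print(1 / fz, 1 / fx + 1 / fy)
```

For two independent Gaussians the inequality holds with equality, which is the standard equality case of the Fisher information inequality.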

According to Definition 3, we can obtain relevant definitions on the generalized Fisher information measure.

Definition 4.

The generalized Cross–Fisher information for space–time random fields $X(t,x)$ and $Y(s,y)$, $(t,x),(s,y)\in\mathbb{R}_+\times\mathbb{R}^d$, with density functions $p(u;t,x)$ and $q(u;s,y)$, defined by a nonnegative function $f(\cdot)$, is defined as

$$\mathrm{CFI}_f(P(t,x),Q(s,y))=\int_{\mathbb{R}}f(u)\left(\partial_u\log q(u;s,y)\right)^2 p(u;t,x)\,du \tag{30}$$

Similar to the concept of cross-entropy, it is easy to verify that (30) is not symmetric in $P$ and $Q$.

Definition 5.

The generalized Fisher divergence for space–time random fields $X(t,x)$ and $Y(s,y)$, for $(t,x),(s,y)\in\mathbb{R}_+\times\mathbb{R}^d$, with density functions $p(u;t,x)$ and $q(u;s,y)$, defined by a nonnegative function $f(\cdot)$, is defined as

$$\mathrm{FD}_f(P(t,x)\|Q(s,y))=\int_{\mathbb{R}}f(u)\left[\partial_u\log p(u;t,x)-\partial_u\log q(u;s,y)\right]^2 p(u;t,x)\,du \tag{31}$$

In particular, when $f\equiv 1$, $\mathrm{FD}_1(P(t,x)\|Q(s,y))$ represents the typical Fisher divergence.
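For two Gaussians, the typical Fisher divergence of (31) has a simple closed form obtained by expanding the squared score difference under $p$; the check below uses illustrative parameter values:

```python
import numpy as np

# The typical Fisher divergence (f = 1) of Definition 5, computed by
# quadrature for two Gaussians and checked against the closed form
# FD_1(N(m1,v1)||N(m2,v2)) = v1 (1/v2 - 1/v1)^2 + (m1 - m2)^2 / v2^2,
# which follows by expanding the squared score difference under p.

def gauss(u, m, v):
    return np.exp(-(u - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

def integrate(y, x):  # trapezoidal rule
    return np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0

m1, v1, m2, v2 = 0.0, 1.0, 1.0, 2.0
u = np.linspace(-20, 20, 400001)
p = gauss(u, m1, v1)
score_diff = -(u - m1) / v1 + (u - m2) / v2   # d/du log p - d/du log q
fd_quad = integrate(score_diff ** 2 * p, u)

fd_closed = v1 * (1 / v2 - 1 / v1) ** 2 + (m1 - m2) ** 2 / v2 ** 2
assert abs(fd_quad - fd_closed) < 1e-8
print(fd_quad, fd_closed)  # ~ 0.5
```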

Obviously, the generalized Fisher divergence between two random fields is not a symmetric measure of information. To achieve symmetry, we extend (31) with a new form.

Definition 6.

The generalized Fisher divergence for space–time random fields $X(t,x)$ and $Y(s,y)$, $(t,x),(s,y)\in\mathbb{R}_+\times\mathbb{R}^d$, with density functions $p(u;t,x)$ and $q(u;s,y)$, defined by nonnegative functions $f(\cdot)$ and $g(\cdot)$, is defined as

$$\mathrm{FD}_{(f,g)}(P(t,x)\|Q(s,y))=\int_{\mathbb{R}}\left[f(u;t,x)\,\partial_u\log p(u;t,x)-g(u;s,y)\,\partial_u\log q(u;s,y)\right]\times\left[\partial_u\log p(u;t,x)-\partial_u\log q(u;s,y)\right]\left[p(u;t,x)+q(u;s,y)\right]du \tag{32}$$

In particular, if $f$ equals $g$, the generalized Fisher divergence for random fields using a single function is denoted as $\mathrm{FD}_{(f,f)}(P(t,x)\|Q(s,y))$.

In general, $\mathrm{FD}_{(f,g)}(P(t,x)\|Q(s,y))$ is asymmetric with respect to $P$ and $Q$, i.e.,

$$\mathrm{FD}_{(f,g)}(P(t,x)\|Q(s,y))\neq\mathrm{FD}_{(f,g)}(Q(s,y)\|P(t,x)) \tag{33}$$

If we suppose that $f$ and $g$ are functions only related to $P$ and $Q$, i.e.,

$$f(u;t,x)=T[p(t,x)](u)\qquad g(u;s,y)=T[q(s,y)](u) \tag{34}$$

where $T$ is an operator, the generalized Fisher divergence $\mathrm{FD}_{(f,g)}(P(t,x)\|Q(s,y))$ can be rewritten as

$$\mathrm{FD}_{(f,g)}(P(t,x)\|Q(s,y))=\int_{\mathbb{R}}\left[T[p(t,x)](u)\,\partial_u\log p(u;t,x)-T[q(s,y)](u)\,\partial_u\log q(u;s,y)\right]\times\left[\partial_u\log p(u;t,x)-\partial_u\log q(u;s,y)\right]\left[p(u;t,x)+q(u;s,y)\right]du \tag{35}$$

and we can easily obtain

$$\mathrm{FD}_{(f,g)}(P(t,x)\|Q(s,y))=\mathrm{FD}_{(g,f)}(Q(s,y)\|P(t,x)) \tag{36}$$

In this case, we call (35) the symmetric Fisher divergence for random fields generated by the operator $T$ and denote it as

$$\mathrm{sFD}_T(P(t,x),Q(s,y))=\int_{\mathbb{R}}\left[T[p(t,x)](u)\,\partial_u\log p(u;t,x)-T[q(s,y)](u)\,\partial_u\log q(u;s,y)\right]\times\left[\partial_u\log p(u;t,x)-\partial_u\log q(u;s,y)\right]\left[p(u;t,x)+q(u;s,y)\right]du \tag{37}$$

Notice that

$$Aa-Bb=\frac{1}{2}\left(2Aa-2Bb\right)=\frac{1}{2}\left[(Aa-Ab)+(Ab-Bb)+(Aa-Ba)+(Ba-Bb)\right]=\frac{1}{2}\left[(A+B)(a-b)+(A-B)(a+b)\right] \tag{38}$$

for $A,B,a,b\in\mathbb{R}$; then, we can rewrite (37) as

$$\begin{aligned}\mathrm{sFD}_T(P(t,x),Q(s,y))&=\frac{1}{2}\int_{\mathbb{R}}\left[T[p(t,x)](u)+T[q(s,y)](u)\right]\left[\partial_u\log p(u;t,x)-\partial_u\log q(u;s,y)\right]^2\left[p(u;t,x)+q(u;s,y)\right]du\\&\quad+\frac{1}{2}\int_{\mathbb{R}}\left[T[p(t,x)](u)-T[q(s,y)](u)\right]\left[\left(\partial_u\log p(u;t,x)\right)^2-\left(\partial_u\log q(u;s,y)\right)^2\right]\left[p(u;t,x)+q(u;s,y)\right]du\\&=\frac{1}{2}\left[\mathrm{FD}_{T[p(t,x)]+T[q(s,y)]}(P(t,x)\|Q(s,y))+\mathrm{FD}_{T[p(t,x)]+T[q(s,y)]}(Q(s,y)\|P(t,x))\right]\\&\quad+\frac{1}{2}\left[\mathrm{FI}_{T[p(t,x)]-T[q(s,y)]}(P(t,x))-\mathrm{FI}_{T[p(t,x)]-T[q(s,y)]}(Q(s,y))\right]\\&\quad+\frac{1}{2}\left[\mathrm{CFI}_{T[p(t,x)]-T[q(s,y)]}(Q(s,y),P(t,x))-\mathrm{CFI}_{T[p(t,x)]-T[q(s,y)]}(P(t,x),Q(s,y))\right]\end{aligned} \tag{39}$$

Lemma 1

(Kramers–Moyal expansion [28,29]). Suppose that the random process $X(t)$ has moments of all orders; then, the probability density function $p(u,t)$ satisfies the Kramers–Moyal expansion

$$\partial_t p(u,t)=\sum_{n=1}^{\infty}\frac{(-1)^n}{n!}\partial_u^n\left[K_n(u,t)\,p(u,t)\right] \tag{40}$$

where

$$K_n(u,t)=\int_{\mathbb{R}}(u'-u)^n\,W(u'\mid u,t)\,du' \tag{41}$$

is the $n$-th order conditional moment and $W(u'\mid u,t)$ is the transition probability rate.

Lemma 2

(Pawula theorem [50,51]). If the limit of the conditional moment of the random process $X(t)$

$$\lim_{\Delta t\to 0}\frac{1}{\Delta t}\,\mathbb{E}\left[\left(X(t+\Delta t)-X(t)\right)^n\,\middle|\,X(t)=x\right] \tag{42}$$

exists for all $n\in\mathbb{N}^+$, and the limit value equals 0 for some even number $n$, then the limit values are 0 for all $n\geq 3$.

The Pawula theorem states that there are only three possible cases in the Kramers–Moyal expansion:

  • (1)

    The Kramers–Moyal expansion is truncated at n=1, meaning that the process is deterministic;

  • (2)

    The Kramers–Moyal expansion stops at n=2, with the resulting equation being the Fokker–Planck equation, and describes diffusion processes;

  • (3)

    The Kramers–Moyal expansion contains all the terms up to $n=\infty$.

In this paper, we only focus on the case of the Fokker–Planck equation.

3. Main Results and Proofs

In this section, we establish the Fokker–Planck equations for continuous space–time random field. Additionally, we present the relationship theorem between Jeffreys divergence and Fisher information, as well as the De Bruijn identities connection between Jeffreys divergence and Fisher divergence.

Theorem 1.

The probability density function $p(u;t,x)$ of the continuous space–time random field $X(t,x)$, $u\in\mathbb{R}$, $(t,x)\in\mathbb{R}_+\times\mathbb{R}^d$, satisfies the following Fokker–Planck equations:

$$\begin{aligned}\partial_t p(u;t,x)&=\frac{1}{2}\partial_u^2\left[b_0(u;t,x)\,p(u;t,x)\right]-\partial_u\left[a_0(u;t,x)\,p(u;t,x)\right]\\\partial_{x_k}p(u;t,x)&=\frac{1}{2}\partial_u^2\left[b_k(u;t,x)\,p(u;t,x)\right]-\partial_u\left[a_k(u;t,x)\,p(u;t,x)\right]\quad k=1,2,\ldots,d\end{aligned} \tag{43}$$

where

$$\begin{aligned}a_0(u;t,x)&=\lim_{\Delta t\to 0}\frac{1}{\Delta t}M_1(u;t,\Delta t,x)&b_0(u;t,x)&=\lim_{\Delta t\to 0}\frac{1}{\Delta t}M_2(u;t,\Delta t,x)\\a_k(u;t,x)&=\lim_{\Delta x_k\to 0}\frac{1}{\Delta x_k}\widetilde{M}_1(u;t,x,\Delta x_k)&b_k(u;t,x)&=\lim_{\Delta x_k\to 0}\frac{1}{\Delta x_k}\widetilde{M}_2(u;t,x,\Delta x_k)\end{aligned}\quad k=1,2,\ldots,d \tag{44}$$

here,

$$\begin{aligned}M_n(u;t,\Delta t,x)&=\mathbb{E}\left[\left(X(t+\Delta t,x)-X(t,x)\right)^n\,\middle|\,X(t,x)=u\right]\\\widetilde{M}_n(u;t,x,\Delta x_k)&=\mathbb{E}\left[\left(X(t,x+\Delta x_k e_k)-X(t,x)\right)^n\,\middle|\,X(t,x)=u\right]\end{aligned} \tag{45}$$

are the $n$-th order conditional moments and $e_k=(0,0,\ldots,1,\ldots,0)\in\mathbb{R}^d$ are the standard orthonormal basis vectors, $k=1,2,\ldots,d$.

Proof. 

For $\Delta t>0$, the Kramers–Moyal expansion gives the difference of the density function in the time variable

$$p(u;t+\Delta t,x)-p(u;t,x)=\sum_{n=1}^{+\infty}\frac{(-1)^n}{n!}\partial_u^n\left[M_n(u;t,\Delta t,x)\,p(u;t,x)\right] \tag{46}$$

where

$$M_n(u;t,\Delta t,x)=\mathbb{E}\left[\left(X(t+\Delta t,x)-X(t,x)\right)^n\,\middle|\,X(t,x)=u\right] \tag{47}$$

is the $n$-th order conditional moment. Then, the partial derivative of the density function with respect to $t$ is

$$\partial_t p(u;t,x)=\lim_{\Delta t\to 0}\frac{1}{\Delta t}\sum_{n=1}^{+\infty}\frac{(-1)^n}{n!}\partial_u^n\left[M_n(u;t,\Delta t,x)\,p(u;t,x)\right] \tag{48}$$

The Pawula theorem implies that if the Kramers–Moyal expansion stops after the second term, we obtain the Fokker–Planck equation in the time variable $t$

$$\partial_t p(u;t,x)=\frac{1}{2}\partial_u^2\left[b_0(u;t,x)\,p(u;t,x)\right]-\partial_u\left[a_0(u;t,x)\,p(u;t,x)\right] \tag{49}$$

where

$$a_0(u;t,x)=\lim_{\Delta t\to 0}\frac{1}{\Delta t}M_1(u;t,\Delta t,x)\qquad b_0(u;t,x)=\lim_{\Delta t\to 0}\frac{1}{\Delta t}M_2(u;t,\Delta t,x) \tag{50}$$

Similarly, we may consider the increment $\Delta x_k$ of the spatial variable $x_k$, and we can obtain the Fokker–Planck equations in $x_k$ as

$$\partial_{x_k}p(u;t,x)=\frac{1}{2}\partial_u^2\left[b_k(u;t,x)\,p(u;t,x)\right]-\partial_u\left[a_k(u;t,x)\,p(u;t,x)\right] \tag{51}$$

where

$$a_k(u;t,x)=\lim_{\Delta x_k\to 0}\frac{1}{\Delta x_k}\widetilde{M}_1(u;t,x,\Delta x_k)\qquad b_k(u;t,x)=\lim_{\Delta x_k\to 0}\frac{1}{\Delta x_k}\widetilde{M}_2(u;t,x,\Delta x_k) \tag{52}$$

here,

$$\widetilde{M}_n(u;t,x,\Delta x_k)=\mathbb{E}\left[\left(X(t,x+\Delta x_k e_k)-X(t,x)\right)^n\,\middle|\,X(t,x)=u\right] \tag{53}$$

and $e_k=(0,0,\ldots,1,\ldots,0)\in\mathbb{R}^d$ are the standard orthonormal basis vectors, $k=1,2,\ldots,d$. □

The Fokker–Planck equations are partial differential equations that describe the probability density function of the space–time random field, similar to the classical Fokker–Planck equation. Solving a system of partial differential equations for general Fokker–Planck equations proves to be challenging. Fortunately, in Section 4 we present three distinct categories of space–time random fields in detail, along with their corresponding Fokker–Planck equations, and deduce their probability density functions.

Next, we examine the relationship between Jeffreys divergence and Fisher information in a single space–time random field when there are different time or spatial variables.

Theorem 2.

Suppose that $p(u;t,x)>0$ is a continuously differentiable density function of the space–time random field $X(t,x)$, that the partial derivatives $\partial_u p(u;t,x)$, $\partial_t p(u;t,x)$, $\partial_{x_k}p(u;t,x)$ are continuous bounded functions, and that the integrals in the proof are well defined, $k=1,2,\ldots,d$, $u\in\mathbb{R}$, $(t,x)\in\mathbb{R}_+\times\mathbb{R}^d$. Then, we have

$$\lim_{t-s\to 0}\frac{\mathrm{JD}(P(t,x),P(s,x))}{|t-s|^2}=\mathrm{FI}_1^{(t)}(P(t,x))\qquad\lim_{x_k-x_k'\to 0}\frac{\mathrm{JD}(P(t,x),P(t,\tilde{x}(k)))}{|x_k-x_k'|^2}=\mathrm{FI}_1^{(x_k)}(P(t,x))\quad k=1,2,\ldots,d \tag{54}$$

Proof. 

For fixed $x\in\mathbb{R}^d$ and $s,t>0$,

$$\mathrm{JD}(P(t,x),P(s,x))=\mathrm{KL}(P(t,x)\|P(s,x))+\mathrm{KL}(P(s,x)\|P(t,x))=\int_{\mathbb{R}}\left[\log p(u;t,x)-\log p(u;s,x)\right]\left[p(u;t,x)-p(u;s,x)\right]du \tag{55}$$

then we can obtain

$$\lim_{t-s\to 0}\frac{\mathrm{JD}(P(t,x),P(s,x))}{|t-s|^2}=\lim_{t-s\to 0}\int_{\mathbb{R}}\frac{\log p(u;t,x)-\log p(u;s,x)}{t-s}\cdot\frac{p(u;t,x)-p(u;s,x)}{t-s}\,du \tag{56}$$

Notice that the limits

$$\lim_{t-s\to 0}\frac{\log p(u;t,x)-\log p(u;s,x)}{t-s}=\partial_t\log p(u;t,x)\qquad\lim_{t-s\to 0}\frac{p(u;t,x)-p(u;s,x)}{t-s}=\partial_t p(u;t,x) \tag{57}$$

exist, and we obtain

$$\lim_{t-s\to 0}\frac{\mathrm{JD}(P(t,x),P(s,x))}{|t-s|^2}=\int_{\mathbb{R}}\partial_t\log p(u;t,x)\cdot\partial_t p(u;t,x)\,du=\int_{\mathbb{R}}\left(\partial_t\log p(u;t,x)\right)^2 p(u;t,x)\,du=\mathrm{FI}_1^{(t)}(P(t,x)) \tag{58}$$

Similarly, for fixed $t$ and $x_k'\to x_k$, we can obtain the identity on Jeffreys divergence and Fisher information for the space coordinates

$$\lim_{x_k-x_k'\to 0}\frac{\mathrm{JD}(P(t,x),P(t,\tilde{x}(k)))}{|x_k-x_k'|^2}=\mathrm{FI}_1^{(x_k)}(P(t,x)) \tag{59}$$

for $k=1,2,\ldots,d$. □

Theorem 2 states that, as the space–time variable difference approaches zero, the Fisher information of the space–time random field is the limit of the ratio of the Jeffreys divergence at different locations to the square of the space–time variable difference. It is noteworthy that Theorem 2 specifically addresses the Jeffreys divergence only in cases where a single space–time random field is situated in distinct space–time positions and where the difference between the space–time variables approaches 0. This ensures that the Jeffreys divergence will not be infinite.
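The limit in Theorem 2 can be sketched numerically in the simplest Gaussian setting, assuming $X(t)\sim N(0,t)$ (Brownian motion, standing in for a field at fixed spatial position), for which both the Jeffreys divergence and the Fisher information have closed forms:

```python
import numpy as np

# Numerical illustration of Theorem 2 in the simplest Gaussian setting:
# for X(t) ~ N(0, t) (Brownian motion, standing in for a field at fixed x),
# JD(P(t), P(s)) = s/(2t) + t/(2s) - 1 and FI_1^(t)(P(t)) = 1/(2 t^2),
# so the quotient JD / |t - s|^2 tends to the Fisher information as s -> t.

def jd(t, s):
    return s / (2 * t) + t / (2 * s) - 1

t = 2.0
fi_t = 1 / (2 * t ** 2)   # = 0.125
for eps in (1e-2, 1e-3, 1e-4):
    print(eps, jd(t, t + eps) / eps ** 2)   # approaches 0.125

assert abs(jd(t, t + 1e-4) / 1e-8 - fi_t) < 1e-4
```

Here the quotient equals $1/(2st)$ exactly, so the convergence to $1/(2t^2)$ is visible already at moderate step sizes.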

Theorem 3.

Suppose that $p(u;t,x)$ and $q(u;t,x)$ are continuously differentiable density functions of the space–time random fields $X(t,x)$ and $Y(t,x)$ such that

$$\begin{aligned}\lim_{|u|\to\infty}\left\{\frac{1}{2}\partial_u\left[b_k^{(1)}(u;t,x)\,p(u;t,x)\right]-a_k^{(1)}(u;t,x)\,p(u;t,x)\right\}\left[\log\frac{p(u;t,x)}{q(u;t,x)}-\frac{q(u;t,x)}{p(u;t,x)}\right]&=0\\\lim_{|u|\to\infty}\left\{\frac{1}{2}\partial_u\left[b_k^{(2)}(u;t,x)\,q(u;t,x)\right]-a_k^{(2)}(u;t,x)\,q(u;t,x)\right\}\left[\log\frac{q(u;t,x)}{p(u;t,x)}-\frac{p(u;t,x)}{q(u;t,x)}\right]&=0\end{aligned} \tag{60}$$

where $a_k^{(1)},b_k^{(1)}$ and $a_k^{(2)},b_k^{(2)}$ are the coefficients of the forms in (44) and (45) for $X(t,x)$ and $Y(t,x)$, respectively, $(t,x)\in\mathbb{R}_+\times\mathbb{R}^d$, $k=0,1,2,\ldots,d$. Then, the Jeffreys divergence $\mathrm{JD}(P(t,x),Q(t,x))$ satisfies the generalized De Bruijn identities

$$\begin{aligned}\partial_t\,\mathrm{JD}(P(t,x),Q(t,x))&=-\frac{1}{2}\mathrm{FD}_{(b_0^{(1)},b_0^{(2)})}(P(t,x)\|Q(t,x))-R_0(P(t,x)\|Q(t,x))\\\partial_{x_k}\mathrm{JD}(P(t,x),Q(t,x))&=-\frac{1}{2}\mathrm{FD}_{(b_k^{(1)},b_k^{(2)})}(P(t,x)\|Q(t,x))-R_k(P(t,x)\|Q(t,x))\quad k=1,2,\ldots,d\end{aligned} \tag{61}$$

where

$$\begin{aligned}R_0(P(t,x)\|Q(t,x))&=\int_{\mathbb{R}}\left[\frac{1}{2}\partial_u\left(b_0^{(1)}-b_0^{(2)}\right)-\left(a_0^{(1)}-a_0^{(2)}\right)\right]\left[\partial_u\log p-\partial_u\log q\right](p+q)\,du\\R_k(P(t,x)\|Q(t,x))&=\int_{\mathbb{R}}\left[\frac{1}{2}\partial_u\left(b_k^{(1)}-b_k^{(2)}\right)-\left(a_k^{(1)}-a_k^{(2)}\right)\right]\left[\partial_u\log p-\partial_u\log q\right](p+q)\,du\quad k=1,2,\ldots,d\end{aligned} \tag{62}$$

here, we omit the argument $(u;t,x)$ in the integrals for convenience.

Proof. 

By Definition 2, we have

JDP(t,x),Q(t,x)=KL(P(t,x)Q(t,x))+KL(Q(t,x)P(t,x))=Rplogpqdu+Rqlogqpdu=Rplogpq+qlogqpdu (63)

where p:=p(u;t,x), q:=q(u;t,x) are density functions of X(t,x) and Y(t,x); here, we omit (u;t,x).

Notice that

$$\partial_u\left(\frac{p}{q}\right)=\frac{1}{q}\partial_u p-\frac{p}{q^2}\partial_u q\qquad\partial_u\left(\frac{q}{p}\right)=\frac{1}{p}\partial_u q-\frac{q}{p^2}\partial_u p \tag{64}$$

i.e.,

$$\frac{p}{q}\partial_u q=\partial_u p-q\,\partial_u\left(\frac{p}{q}\right)\qquad\frac{q}{p}\partial_u p=\partial_u q-p\,\partial_u\left(\frac{q}{p}\right) \tag{65}$$

and

$$\left[\partial_u\log p-\partial_u\log q\right](p+q)=\left[\frac{1}{p}\partial_u p-\frac{1}{q}\partial_u q\right](p+q)=\partial_u p-\partial_u q+\frac{q}{p}\partial_u p-\frac{p}{q}\partial_u q=q\,\partial_u\left(\frac{p}{q}\right)-p\,\partial_u\left(\frac{q}{p}\right) \tag{66}$$

then, using $\int_{\mathbb{R}}\partial_t p\,du=\int_{\mathbb{R}}\partial_t q\,du=0$, the Fokker–Planck equations (43), integration by parts with the boundary conditions (60), and the identities (64)–(66),

$$\begin{aligned}\partial_t\,\mathrm{JD}(P(t,x),Q(t,x))&=\int_{\mathbb{R}}\left[\partial_t p\log\frac{p}{q}+\partial_t p-\frac{p}{q}\partial_t q+\partial_t q\log\frac{q}{p}+\partial_t q-\frac{q}{p}\partial_t p\right]du\\&=\int_{\mathbb{R}}\left[\left(\log\frac{p}{q}-\frac{q}{p}\right)\partial_t p+\left(\log\frac{q}{p}-\frac{p}{q}\right)\partial_t q\right]du\\&=\int_{\mathbb{R}}\left(\log\frac{p}{q}-\frac{q}{p}\right)\left\{\frac{1}{2}\partial_u^2\left[b_0^{(1)}p\right]-\partial_u\left[a_0^{(1)}p\right]\right\}du+\int_{\mathbb{R}}\left(\log\frac{q}{p}-\frac{p}{q}\right)\left\{\frac{1}{2}\partial_u^2\left[b_0^{(2)}q\right]-\partial_u\left[a_0^{(2)}q\right]\right\}du\\&=-\int_{\mathbb{R}}\left\{\frac{1}{2}\partial_u\left[b_0^{(1)}p\right]-a_0^{(1)}p\right\}\partial_u\left(\log\frac{p}{q}-\frac{q}{p}\right)du-\int_{\mathbb{R}}\left\{\frac{1}{2}\partial_u\left[b_0^{(2)}q\right]-a_0^{(2)}q\right\}\partial_u\left(\log\frac{q}{p}-\frac{p}{q}\right)du\\&=-\int_{\mathbb{R}}\left\{\frac{1}{2}\partial_u\left[b_0^{(1)}p\right]-a_0^{(1)}p\right\}\frac{p+q}{p}\left[\partial_u\log p-\partial_u\log q\right]du+\int_{\mathbb{R}}\left\{\frac{1}{2}\partial_u\left[b_0^{(2)}q\right]-a_0^{(2)}q\right\}\frac{p+q}{q}\left[\partial_u\log p-\partial_u\log q\right]du\\&=-\int_{\mathbb{R}}\left[\frac{1}{2}b_0^{(1)}\partial_u\log p-\frac{1}{2}b_0^{(2)}\partial_u\log q\right]\left[\partial_u\log p-\partial_u\log q\right](p+q)\,du\\&\quad-\int_{\mathbb{R}}\left[\frac{1}{2}\partial_u\left(b_0^{(1)}-b_0^{(2)}\right)-\left(a_0^{(1)}-a_0^{(2)}\right)\right]\left[\partial_u\log p-\partial_u\log q\right](p+q)\,du\\&=-\frac{1}{2}\mathrm{FD}_{(b_0^{(1)},b_0^{(2)})}(P(t,x)\|Q(t,x))-R_0(P(t,x)\|Q(t,x))\end{aligned} \tag{67}$$

where

$$\mathrm{FD}_{(b_0^{(1)},b_0^{(2)})}(P(t,x)\|Q(t,x))=\int_{\mathbb{R}}\left[b_0^{(1)}\partial_u\log p-b_0^{(2)}\partial_u\log q\right]\left[\partial_u\log p-\partial_u\log q\right](p+q)\,du \tag{68}$$

and

$$R_0(P(t,x)\|Q(t,x))=\int_{\mathbb{R}}\left[\frac{1}{2}\partial_u\left(b_0^{(1)}-b_0^{(2)}\right)-\left(a_0^{(1)}-a_0^{(2)}\right)\right]\left[\partial_u\log p-\partial_u\log q\right](p+q)\,du \tag{69}$$

Similarly, for $k=1,2,\ldots,d$, we can obtain the generalized De Bruijn identities for the spatial variable $x_k$

$$\partial_{x_k}\mathrm{JD}(P(t,x),Q(t,x))=-\frac{1}{2}\mathrm{FD}_{(b_k^{(1)},b_k^{(2)})}(P(t,x)\|Q(t,x))-R_k(P(t,x)\|Q(t,x)) \tag{70}$$

where

$$R_k(P(t,x)\|Q(t,x))=\int_{\mathbb{R}}\left[\frac{1}{2}\partial_u\left(b_k^{(1)}-b_k^{(2)}\right)-\left(a_k^{(1)}-a_k^{(2)}\right)\right]\left[\partial_u\log p-\partial_u\log q\right](p+q)\,du \tag{71}$$

then we obtain the conclusion. □

Unlike Theorem 2, Theorem 3 focuses on the Jeffreys divergence between two separate space–time random fields X(t,x) and Y(t,x), both at the same position (t,x), and establishes identities connecting the Jeffreys divergence and the Fisher divergence of X(t,x) and Y(t,x); these are known as the De Bruijn identities. To prevent the Jeffreys divergence from becoming infinite, it is necessary for the difference between the probability density functions of X(t,x) and Y(t,x) to be small. In Section 4, we obtain the Jeffreys divergence and Fisher divergence using the same type of Fokker–Planck equations but with different parameters. This allows for the selection of only the appropriate parameters.

4. Three Fokker–Planck Random Fields and Their Corresponding Information Measures

In this section, we present three types of Fokker–Planck equations and derive their corresponding density functions and information measures, namely the Jeffreys divergence, generalized Fisher information, and Fisher divergence. With these quantities, the results corresponding to the applications of Theorems 2 and 3 are obtained. On the one hand, we calculate the ratio of the Jeffreys divergence to the squared space–time variation of the identical Fokker–Planck space–time random field at various space–time points and compare it with the generalized Fisher information. On the other hand, we derive the De Bruijn identities for the Jeffreys divergence and generalized Fisher divergence from Fokker–Planck equations of the same type but with different parameters, on a single space–time random field at the corresponding space–time location.

First, we present a theorem regarding a simple type of Fokker–Planck equations for the random field.

Theorem 4.

Suppose the functions in the Fokker–Planck Equations (43) for the continuous random field $X(t,x)$ are formulated as follows:

$$a_0(u;t,x)=a_0(t,x)\qquad b_0(u;t,x)=b_0(t,x)>0\qquad a_k(u;t,x)=a_k(t,x)\qquad b_k(u;t,x)=b_k(t,x)>0\quad k=1,2,\ldots,d \tag{72}$$

where $a_0$, $a_k$, $b_0$, and $b_k$ are continuously differentiable functions independent of $u$, and two continuously differentiable functions $\alpha(t,x)$ and $\beta(t,x)$ exist such that

$$d\alpha(t,x)=a_0\,dt+a_1\,dx_1+\cdots+a_d\,dx_d\qquad d\beta(t,x)=b_0\,dt+b_1\,dx_1+\cdots+b_d\,dx_d \tag{73}$$

and the initial density function is $p(u;t,x)=\delta\left(u-u_0(x)\right)$ when $\mathrm{prod}(t,x)=tx_1x_2\cdots x_d=0$; then, the density function of $X(t,x)$ is presented as follows:

$$p(u;t,x)=\frac{1}{\sqrt{2\pi\beta(t,x)}}\exp\left\{-\frac{\left[u-u_0(x)-\alpha(t,x)\right]^2}{2\beta(t,x)}\right\} \tag{74}$$

Proof. 

It can be easily inferred that the Fokker–Planck equations are simple parabolic equations, and their solutions can be obtained through the Fourier transform:

$$\begin{aligned}p(u;t,x)&=\frac{1}{\sqrt{2\pi\int_0^t b_0(s,x)\,ds}}\exp\left\{-\frac{\left[u-u_0(x)-\int_0^t a_0(s,x)\,ds\right]^2}{2\int_0^t b_0(s,x)\,ds}\right\}\\p(u;t,x)&=\frac{1}{\sqrt{2\pi\int_0^{x_k}b_k(t,x)\,dx_k}}\exp\left\{-\frac{\left[u-u_0(x)-\int_0^{x_k}a_k(t,x)\,dx_k\right]^2}{2\int_0^{x_k}b_k(t,x)\,dx_k}\right\}\end{aligned} \tag{75}$$

Recall that there are two functions $\alpha(t,x)$ and $\beta(t,x)$ such that

$$d\alpha(t,x)=a_0(t,x)\,dt+a_1(t,x)\,dx_1+\cdots+a_d(t,x)\,dx_d\qquad d\beta(t,x)=b_0(t,x)\,dt+b_1(t,x)\,dx_1+\cdots+b_d(t,x)\,dx_d \tag{76}$$

thus, we can obtain the probability density function

$$p(u;t,x)=\frac{1}{\sqrt{2\pi\beta(t,x)}}\exp\left\{-\frac{\left[u-u_0(x)-\alpha(t,x)\right]^2}{2\beta(t,x)}\right\} \tag{77}$$

□

Actually, numerous examples exist in which the Fokker–Planck equations comply with Theorem 4. Let $B(t,x)$ be the $(1+d,1)$ Brownian sheet [52,53], that is, a centered continuous Gaussian process indexed by $(1+d)$ real, positive parameters and taking its values in $\mathbb{R}$. Its covariance structure is given by

$$\mathbb{E}\left[B(t,x)B(s,y)\right]=(t\wedge s)\prod_{k=1}^{d}(x_k\wedge y_k) \tag{78}$$

for $(t,x_1,x_2,\ldots,x_d),(s,y_1,y_2,\ldots,y_d)\in\mathbb{R}_+\times\mathbb{R}_+^d$, where $\cdot\wedge\cdot$ represents the minimum of two numbers. We can easily obtain

$$\mathbb{E}\left[B^2(t,x)\right]=\mathrm{prod}(t,x) \tag{79}$$

where $\mathrm{prod}(t,x)=tx_1x_2\cdots x_d$ is the coordinate product of $(t,x)$, and the density function is

$$p^{(1)}(u;t,x)=\frac{1}{\sqrt{2\pi\,\mathrm{prod}(t,x)}}\exp\left\{-\frac{u^2}{2\,\mathrm{prod}(t,x)}\right\} \tag{80}$$

Moreover, the Fokker–Planck equations of the Brownian sheet are

$$\partial_t p^{(1)}(u;t,x)=\frac{\mathrm{prod}(x)}{2}\partial_u^2 p^{(1)}(u;t,x)\qquad\partial_{x_k}p^{(1)}(u;t,x)=\frac{\mathrm{prod}(t,x)}{2x_k}\partial_u^2 p^{(1)}(u;t,x)\quad k=1,2,\ldots,d \tag{81}$$

where $\mathrm{prod}(x)=x_1x_2\cdots x_d$, with the initial condition $p^{(1)}(u;t,x)=\delta(u)$ when $\mathrm{prod}(t,x)=0$.

Following the concept of constructing a Brownian bridge from Brownian motion [53], we refer to

$$\widetilde{B}(t,x)=B(t,x)-\mathrm{prod}(t,x)\,B(1,1,\ldots,1) \tag{82}$$

as a Brownian sheet bridge on the cube $(t,x)\in[0,1]\times[0,1]^d$, where $B(t,x)$ represents the Brownian sheet. Obviously, $\widetilde{B}(t,x)$ is Gaussian with $\mathbb{E}[\widetilde{B}(t,x)]=0$, and the covariance structure is

$$\mathbb{E}\left[\widetilde{B}(t,x)\widetilde{B}(s,y)\right]=\mathbb{E}\left[B(t,x)B(s,y)\right]-\mathrm{prod}(t,x)\,\mathrm{prod}(s,y) \tag{83}$$

so we can obtain

$$\mathbb{E}\left[\widetilde{B}^2(t,x)\right]=\mathrm{prod}(t,x)\left[1-\mathrm{prod}(t,x)\right] \tag{84}$$

and the density function of $\widetilde{B}(t,x)$ is

$$p^{(2)}(u;t,x)=\frac{1}{\sqrt{2\pi\,\mathrm{prod}(t,x)\left[1-\mathrm{prod}(t,x)\right]}}\exp\left\{-\frac{u^2}{2\,\mathrm{prod}(t,x)\left[1-\mathrm{prod}(t,x)\right]}\right\} \tag{85}$$

In addition to this, the Fokker–Planck equations of the Brownian sheet bridge are

$$\partial_t p^{(2)}(u;t,x)=\frac{\mathrm{prod}(x)\left[1-2\,\mathrm{prod}(t,x)\right]}{2}\partial_u^2 p^{(2)}(u;t,x)\qquad\partial_{x_k}p^{(2)}(u;t,x)=\frac{\mathrm{prod}(t,x)\left[1-2\,\mathrm{prod}(t,x)\right]}{2x_k}\partial_u^2 p^{(2)}(u;t,x)\quad k=1,2,\ldots,d \tag{86}$$

with the initial condition $p^{(2)}(u;t,x)=\delta(u)$ when $\mathrm{prod}(t,x)=0$, and we obtain the solution (85).
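The Fokker–Planck equation (86) can be checked by finite differences. For $d=1$ the density of the Brownian sheet bridge is $N(0,v(t,x))$ with $v(t,x)=tx(1-tx)$ and $\mathrm{prod}(x)=x$; the test point below is an arbitrary choice:

```python
import numpy as np

# Finite-difference check of the Fokker-Planck equation (86) for the
# Brownian sheet bridge with d = 1: the density is N(0, v(t,x)) with
# v(t,x) = t x (1 - t x), and it should satisfy
#   dp/dt = (1/2) x (1 - 2 t x) d^2 p / du^2.

def p2(u, t, x):
    v = t * x * (1 - t * x)
    return np.exp(-u ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

t, x, u, h = 0.3, 0.5, 0.7, 1e-5
dp_dt = (p2(u, t + h, x) - p2(u, t - h, x)) / (2 * h)
d2p_du2 = (p2(u + h, t, x) - 2 * p2(u, t, x) + p2(u - h, t, x)) / h ** 2
rhs = 0.5 * x * (1 - 2 * t * x) * d2p_du2
assert abs(dp_dt - rhs) < 1e-4
print(dp_dt, rhs)
```

The identity holds exactly here because $\partial_t p=\partial_v p\cdot\partial_t v$ and $\partial_v p=\frac{1}{2}\partial_u^2 p$ for a centered Gaussian in $u$ with variance $v$.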

Combining the two probability density functions (80) and (85) yields their respective Jeffreys divergences and generalized De Bruijn identities. For a density of the form (74), the Jeffreys divergence at two space–time points is

$$\mathrm{JD}(P(t,x),P(s,y))=\frac{\left[\alpha(t,x)-\alpha(s,y)\right]^2+\beta(s,y)}{2\beta(t,x)}+\frac{\left[\alpha(t,x)-\alpha(s,y)\right]^2+\beta(t,x)}{2\beta(s,y)}-1 \tag{87}$$

and, for two fields $P^{(1)}$ and $P^{(2)}$ of the form (74) with parameters $(\alpha_1,\beta_1)$ and $(\alpha_2,\beta_2)$, the Fisher divergence at the identical space–time point is

$$\begin{aligned}\mathrm{FD}_{(b_k^{(1)},b_k^{(2)})}(P^{(1)}(t,x)\|P^{(2)}(t,x))=\frac{1}{\beta_1^2(t,x)\beta_2^2(t,x)}\Big\{&\left[\alpha_1(t,x)-\alpha_2(t,x)\right]^2\left[b_k^{(2)}\beta_1^2(t,x)+b_k^{(1)}\beta_2^2(t,x)\right]\\&+\left[\beta_1(t,x)-\beta_2(t,x)\right]\left[\beta_1(t,x)+\beta_2(t,x)\right]\left[b_k^{(2)}\beta_1(t,x)-b_k^{(1)}\beta_2(t,x)\right]\Big\}\end{aligned} \tag{88}$$

where $k=0,1,\ldots,d$.

Bringing the density function of the Brownian sheet into Equation (87), we can easily obtain the Jeffreys divergence of the Brownian sheet at two space–time points as

$$\mathrm{JD}(P^{(1)}(t,x),P^{(1)}(s,y))=\frac{\mathrm{prod}(s,y)}{2\,\mathrm{prod}(t,x)}+\frac{\mathrm{prod}(t,x)}{2\,\mathrm{prod}(s,y)}-1 \tag{89}$$

and the generalized Fisher information on the space–time variables is as follows:

$$\mathrm{FI}_1^{(t)}(P^{(1)}(t,x))=\frac{1}{2t^2}\qquad\mathrm{FI}_1^{(x_k)}(P^{(1)}(t,x))=\frac{1}{2x_k^2} \tag{90}$$

$k=1,2,\ldots,d$.

Then, we can obtain the quotients of the Jeffreys divergence by the squared difference of the space–time variables

$$\frac{\mathrm{JD}(P^{(1)}(t,x),P^{(1)}(s,x))}{|t-s|^2}=\frac{1}{2st}\qquad\frac{\mathrm{JD}(P^{(1)}(t,x),P^{(1)}(t,\tilde{x}(k)))}{|x_k-x_k'|^2}=\frac{1}{2x_kx_k'} \tag{91}$$

and then we can obtain the relation between the quotients and the generalized Fisher information

$$\frac{\mathrm{JD}(P^{(1)}(t,x),P^{(1)}(s,x))}{|t-s|^2\,\mathrm{FI}_1^{(t)}(P^{(1)}(t,x))}=\frac{t}{s}\qquad\frac{\mathrm{JD}(P^{(1)}(t,x),P^{(1)}(t,\tilde{x}(k)))}{|x_k-x_k'|^2\,\mathrm{FI}_1^{(x_k)}(P^{(1)}(t,x))}=\frac{x_k}{x_k'} \tag{92}$$

for $k=1,2,\ldots,d$. As the space–time points approach each other, i.e., $s\to t$ and $x_k'\to x_k$, the quotients in (92) tend to 1, which agrees with the conclusion of Theorem 2.

Similarly, we can obtain the Jeffreys divergence of the Brownian sheet bridge at two space–time points

$$\mathrm{JD}(P^{(2)}(t,x),P^{(2)}(s,y))=\frac{\mathrm{prod}(s,y)\left[1-\mathrm{prod}(s,y)\right]}{2\,\mathrm{prod}(t,x)\left[1-\mathrm{prod}(t,x)\right]}+\frac{\mathrm{prod}(t,x)\left[1-\mathrm{prod}(t,x)\right]}{2\,\mathrm{prod}(s,y)\left[1-\mathrm{prod}(s,y)\right]}-1 \tag{93}$$

and the generalized Fisher information on the space–time variables

$$\mathrm{FI}_1^{(t)}(P^{(2)}(t,x))=\frac{\left[1-2\,\mathrm{prod}(t,x)\right]^2}{2t^2\left[1-\mathrm{prod}(t,x)\right]^2}\qquad\mathrm{FI}_1^{(x_k)}(P^{(2)}(t,x))=\frac{\left[1-2\,\mathrm{prod}(t,x)\right]^2}{2x_k^2\left[1-\mathrm{prod}(t,x)\right]^2} \tag{94}$$

$k=1,2,\ldots,d$. Further, we can easily obtain the quotients of the Jeffreys divergence by the squared difference of the space–time variables

$$\begin{aligned}\frac{\mathrm{JD}(P^{(2)}(t,x),P^{(2)}(s,x))}{|t-s|^2}&=\frac{\left[1-\mathrm{prod}(x)(s+t)\right]^2}{2st\left[1-\mathrm{prod}(s,x)\right]\left[1-\mathrm{prod}(t,x)\right]}\\\frac{\mathrm{JD}(P^{(2)}(t,x),P^{(2)}(t,\tilde{x}(k)))}{|x_k-x_k'|^2}&=\frac{\left[1-\frac{\mathrm{prod}(t,x)}{x_k}(x_k+x_k')\right]^2}{2x_kx_k'\left[1-\mathrm{prod}(t,x)\right]\left[1-\mathrm{prod}(t,\tilde{x}(k))\right]}\end{aligned} \tag{95}$$

and then we can obtain the relation between the quotients and the generalized Fisher information

$$\begin{aligned}\frac{\mathrm{JD}(P^{(2)}(t,x),P^{(2)}(s,x))}{|t-s|^2\,\mathrm{FI}_1^{(t)}(P^{(2)}(t,x))}&=\frac{t\left[1-\mathrm{prod}(t,x)\right]\left[1-\mathrm{prod}(x)(s+t)\right]^2}{s\left[1-\mathrm{prod}(s,x)\right]\left[1-2\,\mathrm{prod}(t,x)\right]^2}\\\frac{\mathrm{JD}(P^{(2)}(t,x),P^{(2)}(t,\tilde{x}(k)))}{|x_k-x_k'|^2\,\mathrm{FI}_1^{(x_k)}(P^{(2)}(t,x))}&=\frac{x_k\left[1-\mathrm{prod}(t,x)\right]\left[1-\frac{\mathrm{prod}(t,x)}{x_k}(x_k+x_k')\right]^2}{x_k'\left[1-\mathrm{prod}(t,\tilde{x}(k))\right]\left[1-2\,\mathrm{prod}(t,x)\right]^2}\end{aligned} \tag{96}$$

for $k=1,2,\ldots,d$. Letting $s\to t$ and $x_k'\to x_k$, the result (96) also satisfies Theorem 2.

Next, we evaluate the Jeffreys divergence between the density functions (80) and (85) at the same space–time points. It should be noted that the Brownian sheet bridge density function is defined on a bounded domain; therefore, we limit our analysis to the space–time region $(t,x)\in[0,1]\times[0,1]^d$.

The Jeffreys divergence between $P^{(1)}$ and $P^{(2)}$ can be easily obtained as

$$\mathrm{JD}(P^{(1)}(t,x),P^{(2)}(t,x))=\frac{1-\mathrm{prod}(t,x)}{2}+\frac{1}{2\left[1-\mathrm{prod}(t,x)\right]}-1 \tag{97}$$

and the Fisher divergence, as shown in (88), is given by

$$\mathrm{FD}_{(b_0^{(1)},b_0^{(2)})}(P^{(1)}(t,x)\|P^{(2)}(t,x))=\mathrm{prod}(x)-\frac{\mathrm{prod}(x)}{\left[1-\mathrm{prod}(t,x)\right]^2}\qquad\mathrm{FD}_{(b_k^{(1)},b_k^{(2)})}(P^{(1)}(t,x)\|P^{(2)}(t,x))=\frac{\mathrm{prod}(t,x)}{x_k}-\frac{\mathrm{prod}(t,x)}{x_k\left[1-\mathrm{prod}(t,x)\right]^2} \tag{98}$$

with the remainder terms

$$R_0(P^{(1)}(t,x)\|P^{(2)}(t,x))=R_k(P^{(1)}(t,x)\|P^{(2)}(t,x))=0 \tag{99}$$

for $k=1,2,\ldots,d$. Furthermore, we can obtain the generalized De Bruijn identities

$$\partial_t\,\mathrm{JD}(P^{(1)}(t,x),P^{(2)}(t,x))=-\frac{1}{2}\mathrm{FD}_{(b_0^{(1)},b_0^{(2)})}(P^{(1)}(t,x)\|P^{(2)}(t,x))\qquad\partial_{x_k}\mathrm{JD}(P^{(1)}(t,x),P^{(2)}(t,x))=-\frac{1}{2}\mathrm{FD}_{(b_k^{(1)},b_k^{(2)})}(P^{(1)}(t,x)\|P^{(2)}(t,x))\quad k=1,2,\ldots,d \tag{100}$$
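The De Bruijn identity above can be sketched numerically for $d=1$, where $\mathrm{prod}(t,x)=tx$ and $\mathrm{prod}(x)=x$; the test point is an arbitrary choice inside $(0,1)^2$:

```python
# Check of the generalized De Bruijn identity (100) for the Brownian sheet
# vs. the Brownian sheet bridge with d = 1, using the closed forms (97) and
# (98): JD(t,x) = (1 - tx)/2 + 1/(2(1 - tx)) - 1 and
# FD = x - x/(1 - tx)^2, so that dJD/dt = -(1/2) FD.

def jd(t, x):
    v = 1 - t * x
    return v / 2 + 1 / (2 * v) - 1

def fd(t, x):
    return x - x / (1 - t * x) ** 2

t, x, h = 0.4, 0.6, 1e-6
djd_dt = (jd(t + h, x) - jd(t - h, x)) / (2 * h)
assert abs(djd_dt + 0.5 * fd(t, x)) < 1e-9
print(djd_dt, -0.5 * fd(t, x))
```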

Next, we present two categories of significant Fokker–Planck equations and provide pertinent illustrations for computing the Jeffreys divergence, Fisher information, and Fisher divergence.

Theorem 5.

Suppose the functions in the Fokker–Planck Equations (43) for the continuous random field $X(t,x)$ are formulated as follows:

$$a_k(u;t,x)\equiv 0\qquad b_k(u;t,x)=b_k(t,x)\,u^2>0\quad k=0,1,2,\ldots,d \tag{101}$$

where the $b_k$ are continuously differentiable functions independent of $u$, and a continuously differentiable function $\beta(t,x)$ exists such that

$$d\beta(t,x)=b_0(t,x)\,dt+b_1(t,x)\,dx_1+\cdots+b_d(t,x)\,dx_d \tag{102}$$

the initial value is $X(t,x)=1$ and the initial density function is $p(u;t,x)=\delta(u-1)$ when $\mathrm{prod}(t,x)=0$. Then, the density function is

$$p(u;t,x)=\frac{e^{\beta(t,x)}}{\sqrt{2\pi\beta(t,x)}}\exp\left\{-\frac{\left[\log u+\frac{3}{2}\beta(t,x)\right]^2}{2\beta(t,x)}\right\} \tag{103}$$

Proof. 

Depending on the conditions, it is easy to obtain the Fokker–Planck equations as

$$\begin{aligned}\partial_t p(u;t,x)&=\frac{b_0(t,x)}{2}u^2\,\partial_u^2 p(u;t,x)+2b_0(t,x)\,u\,\partial_u p(u;t,x)+b_0(t,x)\,p(u;t,x)\\\partial_{x_k}p(u;t,x)&=\frac{b_k(t,x)}{2}u^2\,\partial_u^2 p(u;t,x)+2b_k(t,x)\,u\,\partial_u p(u;t,x)+b_k(t,x)\,p(u;t,x)\quad k=1,2,\ldots,d\end{aligned} \tag{104}$$

Take the transformation $v=\log u$ (i.e., $u=e^v$) and write $\tilde{p}(v;t,x)=p(u(v);t,x)$; we can obtain

$$\begin{aligned}\partial_t\tilde{p}(v;t,x)&=\frac{b_0(t,x)}{2}\partial_v^2\tilde{p}(v;t,x)+\frac{3b_0(t,x)}{2}\partial_v\tilde{p}(v;t,x)+b_0(t,x)\,\tilde{p}(v;t,x)\\\partial_{x_k}\tilde{p}(v;t,x)&=\frac{b_k(t,x)}{2}\partial_v^2\tilde{p}(v;t,x)+\frac{3b_k(t,x)}{2}\partial_v\tilde{p}(v;t,x)+b_k(t,x)\,\tilde{p}(v;t,x)\quad k=1,2,\ldots,d\end{aligned} \tag{105}$$

with the solutions

$$\begin{aligned}\tilde{p}(v;t,x)&=\frac{e^{\int_0^t b_0(s,x)\,ds}}{\sqrt{2\pi\int_0^t b_0(s,x)\,ds}}\exp\left\{-\frac{\left[v+\frac{3}{2}\int_0^t b_0(s,x)\,ds\right]^2}{2\int_0^t b_0(s,x)\,ds}\right\}\\\tilde{p}(v;t,x)&=\frac{e^{\int_0^{x_k}b_k(t,x)\,dx_k}}{\sqrt{2\pi\int_0^{x_k}b_k(t,x)\,dx_k}}\exp\left\{-\frac{\left[v+\frac{3}{2}\int_0^{x_k}b_k(t,x)\,dx_k\right]^2}{2\int_0^{x_k}b_k(t,x)\,dx_k}\right\}\quad k=1,2,\ldots,d\end{aligned} \tag{106}$$

then,

$$\begin{aligned}p(u;t,x)&=\frac{e^{\int_0^t b_0(s,x)\,ds}}{\sqrt{2\pi\int_0^t b_0(s,x)\,ds}}\exp\left\{-\frac{\left[\log u+\frac{3}{2}\int_0^t b_0(s,x)\,ds\right]^2}{2\int_0^t b_0(s,x)\,ds}\right\}\\p(u;t,x)&=\frac{e^{\int_0^{x_k}b_k(t,x)\,dx_k}}{\sqrt{2\pi\int_0^{x_k}b_k(t,x)\,dx_k}}\exp\left\{-\frac{\left[\log u+\frac{3}{2}\int_0^{x_k}b_k(t,x)\,dx_k\right]^2}{2\int_0^{x_k}b_k(t,x)\,dx_k}\right\}\quad k=1,2,\ldots,d\end{aligned} \tag{107}$$

Recall that a continuously differentiable function $\beta(t,x)$ exists such that

$$d\beta(t,x)=b_0(t,x)\,dt+b_1(t,x)\,dx_1+\cdots+b_d(t,x)\,dx_d \tag{108}$$

this enables the derivation of the probability density

$$p(u;t,x)=\frac{e^{\beta(t,x)}}{\sqrt{2\pi\beta(t,x)}}\exp\left\{-\frac{\left[\log u+\frac{3}{2}\beta(t,x)\right]^2}{2\beta(t,x)}\right\} \tag{109}$$

□
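Although (103) is not written in the usual lognormal form, it does integrate to 1 over $u\in(0,\infty)$, which can be checked by quadrature after the substitution $v=\log u$; the value of $\beta$ below is an arbitrary test choice:

```python
import numpy as np

# Quadrature check that (103) is a probability density in u on (0, inf):
# substituting v = log u, so that p(u) du = p(e^v) e^v dv, the total mass
# should equal 1 for any beta > 0 (beta = 0.7 is an arbitrary test value).

def integrate(y, x):  # trapezoidal rule
    return np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0

beta = 0.7
v = np.linspace(-25.0, 25.0, 2000001)   # v = log u
log_p = beta - 0.5 * np.log(2 * np.pi * beta) - (v + 1.5 * beta) ** 2 / (2 * beta)
total = integrate(np.exp(log_p + v), v)  # p(e^v) * e^v
assert abs(total - 1.0) < 1e-9
print(total)  # ~ 1.0
```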

Remark 1.

In the stochastic process theory, a correlation exists between the Fokker–Planck equation and the Itô process. Specifically, if the Itô process is

dXt=μ(Xt,t)dt+σ(Xt,t)dBt (110)

then the corresponding Fokker–Planck equation can be obtained as

∂p(u,t)/∂t = (1/2) ∂^2/∂u^2 [ σ^2(u,t) p(u,t) ] − ∂/∂u [ μ(u,t) p(u,t) ] (111)

where μ and σ represent the drift and diffusion coefficients and B_t is standard Brownian motion; equivalently,

dX_t/dt = μ(X_t, t) + σ(X_t, t) W_t (112)

where Wt=dBtdt represents the white noise. Actually, if we consider the Itô processes corresponding to Fokker–Planck equations from Theorem 5, we can obtain

∂X(t,x)/∂t = √(b_0(t,x)) X(t,x) W_t
∂X(t,x)/∂x_k = √(b_k(t,x)) X(t,x) W_k (113)

where Wk represents the space white noise with respect to xk, k=1,2,,d. Further, we can also write Equation (113) in vector form

∇X(t,x) = X(t,x) ( γ(t,x) ⊙ W(t,x) ) (114)

where

γ(t,x) = ( √(b_0(t,x)), √(b_1(t,x)), …, √(b_d(t,x)) )
W(t,x) = ( W_t, W_1, …, W_d ) (115)

∇ represents the gradient operator and ⊙ represents element-by-element multiplication. Notice that each equation in Equation (113) is similar in form to geometric Brownian motion in the theory of stochastic processes. Accordingly, we can call the space–time random field that satisfies Equation (113) a geometric Brownian field.
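To connect Remark 1 with Theorem 5 numerically, one can simulate the time direction of Equation (113) at a fixed spatial point with an Euler–Maruyama scheme. A minimal sketch, assuming a constant coefficient b_0 (so that β(t,x) = b_0 t along this slice; all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

b0 = 0.5                      # assumed constant b_0(t, x) at a fixed x
T, n_steps, n_paths = 1.0, 500, 200_000
dt = T / n_steps

# Euler--Maruyama for dX = sqrt(b0) * X * dB, X(0) = 1 (time direction of (113))
X = np.ones(n_paths)
for _ in range(n_steps):
    X *= 1.0 + np.sqrt(b0) * rng.normal(0.0, np.sqrt(dt), size=n_paths)

beta = b0 * T  # beta(T, x) = integral of b0 over [0, T]
# Density (103) predicts log X(T) ~ N(-beta/2, beta), hence E[X(T)] = 1.
mean_X = X.mean()
mean_log, var_log = np.log(X).mean(), np.log(X).var()
```

With this many paths the sample statistics should land close to E[X] = 1, E[log X] = −β/2, and Var(log X) = β, matching the lognormal form of (103).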

If we consider the density function (103) with a parameter β_3(t,x), we obtain a density function p^{(3)}(u;t,x); then, we can obtain the Jeffreys divergence between two space–time points

JD( P^{(3)}(t,x), P^{(3)}(s,y) ) = ( ( β_3(t,x) + β_3(s,y) + 4 ) / ( 8 β_3(t,x) β_3(s,y) ) ) ( β_3(t,x) − β_3(s,y) )^2 (116)

and generalized Fisher information

FI_1^{(t)}( P^{(3)}(t,x) ) = ( ( β_3(t,x) + 2 ) / ( 4 β_3^2(t,x) ) ) ( b_0^{(3)}(t,x) )^2
FI_1^{(x_k)}( P^{(3)}(t,x) ) = ( ( β_3(t,x) + 2 ) / ( 4 β_3^2(t,x) ) ) ( b_k^{(3)}(t,x) )^2 (117)

and then the quotients

JD( P^{(3)}(t,x), P^{(3)}(s,x) ) / |t − s|^2 = ( ( β_3(t,x) + β_3(s,x) + 4 ) / ( 8 β_3(t,x) β_3(s,x) ) ) ( ( β_3(t,x) − β_3(s,x) ) / ( t − s ) )^2
JD( P^{(3)}(t,x), P^{(3)}(t,x̃(k)) ) / |x_k − x_k′|^2 = ( ( β_3(t,x) + β_3(t,x̃(k)) + 4 ) / ( 8 β_3(t,x) β_3(t,x̃(k)) ) ) ( ( β_3(t,x) − β_3(t,x̃(k)) ) / ( x_k − x_k′ ) )^2 (118)

and we can easily obtain the relation

JD( P^{(3)}(t,x), P^{(3)}(s,x) ) / |t − s|^2 − FI_1^{(t)}( P^{(3)}(t,x) ) = ( ( β_3(t,x) + β_3(s,x) + 4 ) / ( 8 β_3(t,x) β_3(s,x) ) ) ( ( β_3(t,x) − β_3(s,x) ) / ( t − s ) )^2 − ( ( β_3(t,x) + 2 ) / ( 4 β_3^2(t,x) ) ) ( b_0^{(3)}(t,x) )^2
JD( P^{(3)}(t,x), P^{(3)}(t,x̃(k)) ) / |x_k − x_k′|^2 − FI_1^{(x_k)}( P^{(3)}(t,x) ) = ( ( β_3(t,x) + β_3(t,x̃(k)) + 4 ) / ( 8 β_3(t,x) β_3(t,x̃(k)) ) ) ( ( β_3(t,x) − β_3(t,x̃(k)) ) / ( x_k − x_k′ ) )^2 − ( ( β_3(t,x) + 2 ) / ( 4 β_3^2(t,x) ) ) ( b_k^{(3)}(t,x) )^2 (119)

for k = 1, 2, …, d; both differences vanish as s → t and x_k′ → x_k. Without loss of generality, the result (119) also corroborates Theorem 2.
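The closed form (116) and its limit relation to (117) can be checked numerically. The sketch below (with illustrative parameter values) integrates the Jeffreys divergence directly and compares the quotient (118) against the generalized Fisher information, assuming β(t,x) = b_0 t along a fixed x:

```python
import numpy as np

def p3(u, beta):
    # density (103) for a given value of beta
    return np.exp(beta) / np.sqrt(2 * np.pi * beta) \
        * np.exp(-(np.log(u) + 1.5 * beta) ** 2 / (2 * beta))

def jd_closed(a, b):
    # closed form (116)
    return (a + b + 4.0) / (8.0 * a * b) * (a - b) ** 2

# Direct integration of JD = int (p - q) log(p/q) du versus (116)
a, b = 0.5, 0.8
u = np.linspace(1e-8, 80.0, 2_000_000)
p, q = p3(u, a), p3(u, b)
f = (p - q) * np.log(p / q)
jd_num = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(u))

# Quotient (118) approaches Fisher information (117) as s -> t
b0, t, s = 0.3, 1.0, 0.999
quotient = jd_closed(b0 * t, b0 * s) / (t - s) ** 2
fisher = (b0 * t + 2.0) / (4.0 * (b0 * t) ** 2) * b0 ** 2
```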

Furthermore, if we consider different parameters β_3(t,x) and β_4(t,x) in the density function (103), we obtain density functions p^{(3)}(u;t,x) and p^{(4)}(u;t,x); the generalized Fisher divergence at the same space–time point is then

FD_{b_k^{(3)}, b_k^{(4)}}( P^{(3)}(t,x) ‖ P^{(4)}(t,x) ) = ( ( β_3(t,x) − β_4(t,x) ) / ( 4 β_3^2(t,x) β_4^2(t,x) ) )
× [ b_k^{(4)}(t,x) β_3(t,x) ( β_3^2(t,x) − 2β_4^2(t,x) ) − b_k^{(3)}(t,x) β_4(t,x) ( β_4^2(t,x) − 2β_3^2(t,x) ) ]
× [ 4β_3(t,x) + 4β_4(t,x) − 3β_3(t,x)β_4(t,x) ] ( b_k^{(4)}(t,x) β_3(t,x) − b_k^{(3)}(t,x) β_4(t,x) ) (120)

with the remainder terms

R_k( P^{(3)}(t,x) ‖ P^{(4)}(t,x) ) = 2( b_k^{(3)}(t,x) − b_k^{(4)}(t,x) ) (121)

k=0,1,2,,d. Then, the generalized De Bruijn identities are as follows:

∂JD( P^{(3)}(t,x), P^{(4)}(t,x) )/∂t = (1/2) FD_{b_0^{(3)}, b_0^{(4)}}( P^{(3)}(t,x) ‖ P^{(4)}(t,x) ) − 2( b_0^{(3)}(t,x) − b_0^{(4)}(t,x) )
∂JD( P^{(3)}(t,x), P^{(4)}(t,x) )/∂x_k = (1/2) FD_{b_k^{(3)}, b_k^{(4)}}( P^{(3)}(t,x) ‖ P^{(4)}(t,x) ) − 2( b_k^{(3)}(t,x) − b_k^{(4)}(t,x) ),  k = 1, 2, …, d (122)

Additionally, we offer below an alternative non-trivial form that uses the implicit function method to express our results. This form differs from the one presented in Theorem 5.

Theorem 6.

Suppose the functions in the Fokker–Planck Equations (43) for the continuous bounded random field X(t,x) ∈ [0,1] are formulated as follows:

a_k(u;t,x) = −(3/2) b_k(t,x) u,  b_k(u;t,x) = b_k(t,x)( 1 − u^2 ),  k = 0, 1, 2, …, d (123)

where bk are continuously differentiable functions independent of u and a continuously differentiable function β(t,x) exists such that

dβ(t,x) = b_0(t,x) dt + b_1(t,x) dx_1 + … + b_d(t,x) dx_d (124)

the initial value is X(t,x) = 0 when ∏(t,x) = 0 (i.e., when any coordinate of (t,x) is zero), and the initial density function is p(u;t,x) = δ(u) when ∏(t,x) = 0. Then, the density function is as follows:

p(u;t,x) = ( e^{β(t,x)/2} / √(2πβ(t,x)) ) e^{−v^2/(2β(t,x))},  u = sin v (125)

Proof. 

Based on these conditions, it is easy to obtain the Fokker–Planck equations as

∂p(u;t,x)/∂t = (b_0(t,x)/2) ∂^2/∂u^2 [ ( 1 − u^2 ) p(u;t,x) ] + (3b_0(t,x)/2) ∂/∂u [ u p(u;t,x) ]
∂p(u;t,x)/∂x_k = (b_k(t,x)/2) ∂^2/∂u^2 [ ( 1 − u^2 ) p(u;t,x) ] + (3b_k(t,x)/2) ∂/∂u [ u p(u;t,x) ],  k = 1, 2, …, d (126)

By applying the transformation u = sin v and defining p̃(v;t,x) = p(u(v);t,x), the equations can be restructured as

∂p̃(v;t,x)/∂t = (b_0(t,x)/2) ∂^2 p̃(v;t,x)/∂v^2 + (b_0(t,x)/2) p̃(v;t,x)
∂p̃(v;t,x)/∂x_k = (b_k(t,x)/2) ∂^2 p̃(v;t,x)/∂v^2 + (b_k(t,x)/2) p̃(v;t,x),  k = 1, 2, …, d (127)

with the solution

p̃(v;t,x) = ( e^{(1/2)∫_0^t b_0(s,x)ds} / √(2π ∫_0^t b_0(s,x)ds) ) exp( −v^2 / ( 2∫_0^t b_0(s,x)ds ) )
p̃(v;t,x) = ( e^{(1/2)∫_0^{x_k} b_k(t,x)dx_k} / √(2π ∫_0^{x_k} b_k(t,x)dx_k) ) exp( −v^2 / ( 2∫_0^{x_k} b_k(t,x)dx_k ) ),  k = 1, 2, …, d (128)

Recall that a continuously differentiable function β(t,x) exists such that

dβ(t,x) = b_0(t,x) dt + b_1(t,x) dx_1 + … + b_d(t,x) dx_d (129)

we can derive the probability density function

p̃(v;t,x) = ( e^{β(t,x)/2} / √(2πβ(t,x)) ) e^{−v^2/(2β(t,x))} (130)

then,

p(u;t,x) = ( e^{β(t,x)/2} / √(2πβ(t,x)) ) e^{−v^2/(2β(t,x))},  u = sin v (131)
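As a consistency check on the proof (done here for a constant coefficient b_0 and a fixed x, so that β(t,x) = b_0 t; these choices are illustrative assumptions), one can verify by finite differences that (130) satisfies the transformed equation (127):

```python
import math

def p_tilde(v, t, b0=1.0):
    # solution (130) with beta = b0 * t (constant b0 along a fixed x)
    beta = b0 * t
    return math.exp(beta / 2.0) / math.sqrt(2.0 * math.pi * beta) \
        * math.exp(-v * v / (2.0 * beta))

b0, h = 1.0, 1e-4
max_res = 0.0
for v in (-1.0, -0.3, 0.0, 0.5, 1.2):
    for t in (0.5, 1.0, 2.0):
        # central differences for the time derivative and the second v-derivative
        d_t = (p_tilde(v, t + h, b0) - p_tilde(v, t - h, b0)) / (2.0 * h)
        d_vv = (p_tilde(v + h, t, b0) - 2.0 * p_tilde(v, t, b0)
                + p_tilde(v - h, t, b0)) / h ** 2
        # residual of (127): d_t p~ - (b0/2) d_vv p~ - (b0/2) p~
        res = d_t - 0.5 * b0 * d_vv - 0.5 * b0 * p_tilde(v, t, b0)
        max_res = max(max_res, abs(res))
```

The residual is zero up to finite-difference truncation error.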

Remark 2.

Similar to the discussion in Remark 1, we can obtain the Itô processes corresponding to the Fokker–Planck equations in Theorem 6

∂X(t,x)/∂t = −(3/2) b_0(t,x) X(t,x) + √( b_0(t,x)( 1 − X^2(t,x) ) ) W_t
∂X(t,x)/∂x_k = −(3/2) b_k(t,x) X(t,x) + √( b_k(t,x)( 1 − X^2(t,x) ) ) W_k (132)

k = 1, 2, …, d. In fact, this random field can be solved with a sinusoidal transformation, and the corresponding probability density function can be obtained. Although the random field (132) has not yet found a concrete application scenario, it suggests ways of constructing different forms of space–time random fields in the future.
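The Itô form (132) can also be explored by simulation. The following sketch (constant b_0, fixed x, with discrete-level clipping at the ±1 boundary as a purely numerical safeguard; parameter values are illustrative) checks the second-moment law E[X^2(t)] = (1 − e^{−4 b_0 t})/4, which follows directly from the drift and diffusion coefficients of (132):

```python
import numpy as np

rng = np.random.default_rng(1)

b0, T, n_steps, n_paths = 1.0, 0.5, 1000, 200_000
dt = T / n_steps

# Euler--Maruyama for dX = -(3/2) b0 X dt + sqrt(b0 (1 - X^2)) dB, X(0) = 0
X = np.zeros(n_paths)
for _ in range(n_steps):
    dB = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    X = X - 1.5 * b0 * X * dt + np.sqrt(b0 * (1.0 - X ** 2)) * dB
    X = np.clip(X, -1.0, 1.0)  # keep the discretized paths inside [-1, 1]

# Moment equation from (132): d/dt E[X^2] = b0 - 4 b0 E[X^2], E[X^2](0) = 0
predicted = (1.0 - np.exp(-4.0 * b0 * T)) / 4.0
empirical = (X ** 2).mean()
```

By symmetry of the drift and X(0) = 0, the sample mean of X also stays near zero.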

From the density function (125), considering a parameter β_5(t,x) and the corresponding density function p^{(5)}(u;t,x), we can obtain the Jeffreys divergence and generalized Fisher information

JD( P^{(5)}(t,x), P^{(5)}(s,y) ) = ( ( 1 − β_5(t,x) − β_5(s,y) ) / ( 2 β_5(t,x) β_5(s,y) ) ) ( β_5(t,x) − β_5(s,y) )^2 (133)

and

FI_1^{(t)}( P^{(5)}(t,x) ) = ( ( 1 − 2β_5(t,x) ) / ( 2 β_5^2(t,x) ) ) ( b_0^{(5)}(t,x) )^2
FI_1^{(x_k)}( P^{(5)}(t,x) ) = ( ( 1 − 2β_5(t,x) ) / ( 2 β_5^2(t,x) ) ) ( b_k^{(5)}(t,x) )^2 (134)

and then the quotients

JD( P^{(5)}(t,x), P^{(5)}(s,x) ) / |t − s|^2 = ( ( 1 − β_5(t,x) − β_5(s,x) ) / ( 2 β_5(t,x) β_5(s,x) ) ) ( ( β_5(t,x) − β_5(s,x) ) / ( t − s ) )^2
JD( P^{(5)}(t,x), P^{(5)}(t,x̃(k)) ) / |x_k − x_k′|^2 = ( ( 1 − β_5(t,x) − β_5(t,x̃(k)) ) / ( 2 β_5(t,x) β_5(t,x̃(k)) ) ) ( ( β_5(t,x) − β_5(t,x̃(k)) ) / ( x_k − x_k′ ) )^2 (135)

Obviously, we can easily obtain

JD( P^{(5)}(t,x), P^{(5)}(s,x) ) / |t − s|^2 − FI_1^{(t)}( P^{(5)}(t,x) ) = ( ( 1 − β_5(t,x) − β_5(s,x) ) / ( 2 β_5(t,x) β_5(s,x) ) ) ( ( β_5(t,x) − β_5(s,x) ) / ( t − s ) )^2 − ( ( 1 − 2β_5(t,x) ) / ( 2 β_5^2(t,x) ) ) ( b_0^{(5)}(t,x) )^2
JD( P^{(5)}(t,x), P^{(5)}(t,x̃(k)) ) / |x_k − x_k′|^2 − FI_1^{(x_k)}( P^{(5)}(t,x) ) = ( ( 1 − β_5(t,x) − β_5(t,x̃(k)) ) / ( 2 β_5(t,x) β_5(t,x̃(k)) ) ) ( ( β_5(t,x) − β_5(t,x̃(k)) ) / ( x_k − x_k′ ) )^2 − ( ( 1 − 2β_5(t,x) ) / ( 2 β_5^2(t,x) ) ) ( b_k^{(5)}(t,x) )^2 (136)

for k = 1, 2, …, d; both differences vanish as s → t and x_k′ → x_k. Without loss of generality, the result (136) corroborates Theorem 2.
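As with the first family, the quotient (135) should approach the generalized Fisher information (134) as s → t. A small numeric check under the assumption β_5(t,x) = b_0 t along a fixed x (illustrative values):

```python
def jd5(a, b):
    # closed form (133) evaluated at two time points with the same x
    return (1.0 - a - b) / (2.0 * a * b) * (a - b) ** 2

def fisher5(beta, b0):
    # generalized Fisher information (134)
    return (1.0 - 2.0 * beta) / (2.0 * beta ** 2) * b0 ** 2

b0, t = 0.4, 1.0          # assume beta_5(t, x) = b0 * t along a fixed x
target = fisher5(b0 * t, b0)
gaps = []
for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    quotient = jd5(b0 * t, b0 * (t - eps)) / eps ** 2
    gaps.append(abs(quotient - target))
```

The gap shrinks roughly linearly in eps, consistent with a first-order approach to the limit.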

Furthermore, if we consider different parameters β_5(t,x) and β_6(t,x) in the density function (125), with the resulting densities denoted p^{(5)}(u;t,x) and p^{(6)}(u;t,x), we can obtain the generalized Fisher divergence at the same space–time point

FD_{b_k^{(5)}, b_k^{(6)}}( P^{(5)}(t,x) ‖ P^{(6)}(t,x) ) = ( ( β_5(t,x) − β_6(t,x) ) ( b_k^{(6)}(t,x) β_5(t,x) − b_k^{(5)}(t,x) β_6(t,x) ) / ( 2 β_5^2(t,x) β_6^2(t,x) ) ) ( β_5(t,x) − β_5^2(t,x) + β_6(t,x) − β_6^2(t,x) ) (137)

with the remainder terms

R_k( P^{(5)}(t,x) ‖ P^{(6)}(t,x) ) = b_k^{(5)}(t,x) − b_k^{(6)}(t,x) (138)

for k=0,1,2,,d. Then, the generalized De Bruijn identities are as follows:

∂JD( P^{(5)}(t,x), P^{(6)}(t,x) )/∂t = (1/2) FD_{b_0^{(5)}, b_0^{(6)}}( P^{(5)}(t,x) ‖ P^{(6)}(t,x) ) − ( b_0^{(5)}(t,x) − b_0^{(6)}(t,x) )
∂JD( P^{(5)}(t,x), P^{(6)}(t,x) )/∂x_k = (1/2) FD_{b_k^{(5)}, b_k^{(6)}}( P^{(5)}(t,x) ‖ P^{(6)}(t,x) ) − ( b_k^{(5)}(t,x) − b_k^{(6)}(t,x) ),  k = 1, 2, …, d (139)

5. Conclusions

In this paper, we generalize the classical definitions of entropy, divergence, and Fisher information and derive these measures for a space–time random field. In addition, the Fokker–Planck Equations (43) for the space–time random field and their density functions are obtained. Moreover, we obtain the Jeffreys divergence of a space–time random field at different space–time positions and show that the ratio of the Jeffreys divergence to the squared space–time coordinate difference approximates the generalized Fisher information (54). Additionally, we apply the Jeffreys divergence to two space–time random fields governed by Fokker–Planck equations of the same type but with different parameters to obtain generalized De Bruijn identities (61). Finally, we give three examples of Fokker–Planck equations, with their solutions, calculate the corresponding Jeffreys divergence, generalized Fisher information, and Fisher divergence, and obtain the De Bruijn identities. These results encourage further research into the entropy divergence of space–time random fields, which advances the pertinent fields of information entropy, Fisher information, and De Bruijn identities.

Acknowledgments

The author would like to thank Pingyi Fan, Zhanjie Song, Ying Li, and Yumeng Song for providing relevant references and helpful discussions on topics related to this work.

Abbreviations

The following abbreviations are used in this manuscript:

KL Kullback–Leibler divergence
FI Fisher information
CFI Cross–Fisher information
FD Fisher divergence
sFD Symmetric Fisher divergence
JD Jeffreys divergence

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The author declares no conflicts of interest.

Funding Statement

This research received no external funding.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

References

  • 1. Risken H. The Fokker–Planck Equation: Methods of Solution and Applications. Springer; Berlin/Heidelberg, Germany: 1984.
  • 2. Stam A.J. Some inequalities satisfied by the quantities of information of Fisher and Shannon. Inf. Control. 1959;2:101–112. doi: 10.1016/S0019-9958(59)90348-1.
  • 3. Barron A.R. Entropy and the central limit theorem. Ann. Probab. 1986;14:336–342. doi: 10.1214/aop/1176992632.
  • 4. Johnson O. Information Theory and the Central Limit Theorem. Imperial College Press; London, UK: 2004.
  • 5. Guo D. Relative entropy and score function: New information estimation relationships through arbitrary additive perturbation. In: Proceedings of the 2009 IEEE International Symposium on Information Theory; Seoul, Republic of Korea, 28 June–3 July 2009; pp. 814–818.
  • 6. Toranzo I.V., Zozor S., Brossier J.-M. Generalization of the De Bruijn Identity to General ϕ-Entropies and ϕ-Fisher Informations. IEEE Trans. Inform. Theory. 2018;64:6743–6758. doi: 10.1109/TIT.2017.2771209.
  • 7. Kharazmi O., Balakrishnan N. Cumulative residual and relative cumulative residual Fisher information and their properties. IEEE Trans. Inform. Theory. 2021;67:6306–6312. doi: 10.1109/TIT.2021.3073789.
  • 8. Kolmogorov A.N. The local structure of turbulence in incompressible viscous fluid for very large Reynolds numbers. Dokl. Akad. Nauk SSSR. 1941;30:299–303.
  • 9. Kolmogorov A.N. On the degeneration of isotropic turbulence in an incompressible viscous fluid. Dokl. Akad. Nauk SSSR. 1941;31:538–542.
  • 10. Kolmogorov A.N. Dissipation of energy in isotropic turbulence. Dokl. Akad. Nauk SSSR. 1941;32:19–21.
  • 11. Yaglom A.M. Some classes of random fields in n-dimensional space, related to stationary random processes. Theory Probab. Its Appl. 1957;2:273–320. doi: 10.1137/1102021.
  • 12. Yaglom A.M. Correlation Theory of Stationary and Related Random Functions. Volume I: Basic Results. Springer; New York, NY, USA: 1987.
  • 13. Yaglom A.M. Correlation Theory of Stationary and Related Random Functions. Volume II: Supplementary Notes and References. Springer; Berlin, Germany: 1987.
  • 14. Bowditch A., Sun R. The two-dimensional continuum random field Ising model. Ann. Probab. 2022;50:419–454. doi: 10.1214/21-AOP1536.
  • 15. Bailleul I., Catellier R., Delarue F. Propagation of chaos for mean field rough differential equations. Ann. Probab. 2021;49:944–996. doi: 10.1214/20-AOP1465.
  • 16. Wu L., Samorodnitsky G. Regularly varying random fields. Stoch. Process Their Appl. 2020;130:4470–4492. doi: 10.1016/j.spa.2020.01.005.
  • 17. Koch E., Dombry C., Robert C.Y. A central limit theorem for functions of stationary max-stable random fields on R^d. Stoch. Process Their Appl. 2020;129:3406–3430. doi: 10.1016/j.spa.2018.09.014.
  • 18. Ye Z. On Entropy and ε-Entropy of Random Fields. Ph.D. Dissertation. Cornell University; Ithaca, NY, USA: 1989.
  • 19. Ye Z., Berger T. A new method to estimate the critical distortion of random fields. IEEE Trans. Inform. Theory. 1992;38:152–157. doi: 10.1109/18.108261.
  • 20. Ye Z., Berger T. Information Measures for Discrete Random Fields. Science Press; Beijing, China; New York, NY, USA: 1998.
  • 21. Ye Z., Yang W. Random Field: Network Information Theory and Game Theory. Science Press; Beijing, China: 2023. (In Chinese)
  • 22. Ma C. Stationary random fields in space and time with rational spectral densities. IEEE Trans. Inform. Theory. 2007;53:1019–1029. doi: 10.1109/TIT.2006.890721.
  • 23. Hairer M. A theory of regularity structures. Invent. Math. 2014;198:269–504. doi: 10.1007/s00222-014-0505-4.
  • 24. Hairer M. Solving the KPZ equation. Ann. Math. 2013;178:559–664. doi: 10.4007/annals.2013.178.2.4.
  • 25. Kremp H., Perkowski N. Multidimensional SDE with distributional drift and Lévy noise. Bernoulli. 2022;28:1757–1783. doi: 10.3150/21-BEJ1394.
  • 26. Beeson R., Namachchivaya N.S., Perkowski N. Approximation of the filter equation for multiple timescale, correlated, nonlinear systems. SIAM J. Math. Anal. 2022;54:3054–3090. doi: 10.1137/20M1379265.
  • 27. Song Z., Zhang J. A note for estimation about average differential entropy of continuous bounded space–time random field. Chin. J. Electron. 2022;31:793–803. doi: 10.1049/cje.2021.00.213.
  • 28. Kramers H.A. Brownian motion in a field of force and the diffusion model of chemical reactions. Physica. 1940;7:284–304. doi: 10.1016/S0031-8914(40)90098-2.
  • 29. Moyal J.E. Stochastic processes and statistical physics. J. R. Stat. Soc. Ser. B Stat. Methodol. 1949;11:150–210. doi: 10.1111/j.2517-6161.1949.tb00030.x.
  • 30. Shannon C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948;27:379–423, 623–656. doi: 10.1002/j.1538-7305.1948.tb01338.x.
  • 31. Neeser F.D., Massey J.L. Proper complex random processes with applications to information theory. IEEE Trans. Inform. Theory. 1993;39:1293–1302. doi: 10.1109/18.243446.
  • 32. Ihara S. Information Theory for Continuous Systems. World Scientific; Singapore: 1993.
  • 33. Gray R.M. Entropy and Information Theory. Springer; Boston, MA, USA: 2011.
  • 34. Bach F. Information Theory With Kernel Methods. IEEE Trans. Inform. Theory. 2023;69:752–775. doi: 10.1109/TIT.2022.3211077.
  • 35. Kullback S., Leibler R.A. On information and sufficiency. Ann. Math. Stat. 1951;22:79–86. doi: 10.1214/aoms/1177729694.
  • 36. Jeffreys H. An invariant form for the prior probability in estimation problems. Proc. R. Soc. Lond. A. 1946;186:453–461. doi: 10.1098/rspa.1946.0056.
  • 37. Fuglede B., Topsøe F. Jensen–Shannon divergence and Hilbert space embedding. In: Proceedings of the IEEE International Symposium on Information Theory (ISIT); Chicago, IL, USA, 27 June–2 July 2004.
  • 38. Rényi A. On measures of entropy and information. In: Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics. University of California Press; Berkeley, CA, USA: 1961; pp. 547–561.
  • 39. She R., Fan P., Liu X.-Y., Wang X. Interpretable Generative Adversarial Networks With Exponential Function. IEEE Trans. Signal Process. 2021;69:3854–3867. doi: 10.1109/TSP.2021.3089285.
  • 40. Liu S., She R., Zhu Z., Fan P. Storage Space Allocation Strategy for Digital Data with Message Importance. Entropy. 2020;22:591. doi: 10.3390/e22050591.
  • 41. She R., Liu S., Fan P. Attention to the Variation of Probabilistic Events: Information Processing with Message Importance Measure. Entropy. 2019;21:439. doi: 10.3390/e21050439.
  • 42. Wan S., Lu J., Fan P., Letaief K.B. Information Theory in Formation Control: An Error Analysis to Multi-Robot Formation. Entropy. 2018;20:618. doi: 10.3390/e20080618.
  • 43. She R., Liu S., Fan P. Recognizing Information Feature Variation: Message Importance Transfer Measure and Its Applications in Big Data. Entropy. 2018;20:401. doi: 10.3390/e20060401.
  • 44. Nielsen F. An Elementary Introduction to Information Geometry. Entropy. 2020;22:1100. doi: 10.3390/e22101100.
  • 45. Nielsen F. On the Jensen–Shannon Symmetrization of Distances Relying on Abstract Means. Entropy. 2019;21:485. doi: 10.3390/e21050485.
  • 46. Nielsen F., Nock R. Generalizing skew Jensen divergences and Bregman divergences with comparative convexity. IEEE Signal Process. Lett. 2017;24:1123–1127. doi: 10.1109/LSP.2017.2712195.
  • 47. Furuichi S., Minculete N. Refined Young Inequality and Its Application to Divergences. Entropy. 2021;23:514. doi: 10.3390/e23050514.
  • 48. Pinele J., Strapasson J.E., Costa S.I. The Fisher–Rao Distance between Multivariate Normal Distributions: Special Cases, Bounds and Applications. Entropy. 2020;22:404. doi: 10.3390/e22040404.
  • 49. Reverter F., Oller J.M. Computing the Rao distance for Gamma distributions. J. Comput. Appl. Math. 2003;157:155–167. doi: 10.1016/S0377-0427(03)00387-X.
  • 50. Pawula R.F. Generalizations and extensions of the Fokker–Planck–Kolmogorov equations. IEEE Trans. Inform. Theory. 1967;13:33–41. doi: 10.1109/TIT.1967.1053955.
  • 51. Pawula R.F. Approximation of the linear Boltzmann equation by the Fokker–Planck equation. Phys. Rev. 1967;162:186–188. doi: 10.1103/PhysRev.162.186.
  • 52. Khoshnevisan D., Shi Z. Brownian Sheet and Capacity. Ann. Probab. 1999;27:1135–1159. doi: 10.1214/aop/1022677442.
  • 53. Revuz D., Yor M. Continuous Martingales and Brownian Motion. 2nd ed. Springer; New York, NY, USA: 1999.



Articles from Entropy are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)
