Beating the curse of dimension with accurate statistics for the Fokker–Planck equation in complex turbulent systems

Nan Chen; Andrew J Majda

doi:10.1073/pnas.1717017114

Proc Natl Acad Sci U S A. 2017 Nov 20;114(49):12864–12869.


PMCID: PMC5724285 PMID: 29158403

Significance

Solving the Fokker–Planck equation for high-dimensional complex dynamical systems is an important issue. Effective strategies are developed and incorporated into efficient statistically accurate algorithms for solving the Fokker–Planck equations associated with a rich class of high-dimensional nonlinear turbulent dynamical systems with strong non-Gaussian features. These effective strategies exploit a judicious block decomposition of high-dimensional conditional covariance matrices and statistical symmetry to facilitate an extremely efficient parallel computation and a significant reduction of sample numbers. The resulting algorithms can efficiently solve the Fokker–Planck equation in much higher dimensions even with orders in the millions and thus beat the curse of dimension. Skillful behavior of the algorithms is illustrated for highly non-Gaussian systems in excitable media and geophysical turbulence.

Keywords: high-dimensional non-Gaussian PDFs, hybrid strategy, block decomposition, statistical symmetry, small sample size

Abstract

Solving the Fokker–Planck equation for high-dimensional complex dynamical systems is an important issue. Recently, the authors developed efficient statistically accurate algorithms for solving the Fokker–Planck equations associated with high-dimensional nonlinear turbulent dynamical systems with conditional Gaussian structures, which contain many strong non-Gaussian features such as intermittency and fat-tailed probability density functions (PDFs). The algorithms involve a hybrid strategy with a small number of samples $L$, where a conditional Gaussian mixture in a high-dimensional subspace via an extremely efficient parametric method is combined with a judicious Gaussian kernel density estimation in the remaining low-dimensional subspace. In this article, two effective strategies are developed and incorporated into these algorithms. The first strategy involves a judicious block decomposition of the conditional covariance matrix such that the evolutions of different blocks have no interactions, which allows an extremely efficient parallel computation due to the small size of each individual block. The second strategy exploits statistical symmetry for a further reduction of $L$. The resulting algorithms can efficiently solve the Fokker–Planck equation with strongly non-Gaussian PDFs in much higher dimensions, even with orders in the millions, and thus beat the curse of dimension. The algorithms are applied to a $1{,}000$-dimensional stochastic coupled FitzHugh–Nagumo model for excitable media. An accurate recovery of both the transient and equilibrium non-Gaussian PDFs requires only $L = 1$ sample! In addition, the block decomposition enables the algorithms to efficiently capture the distinct non-Gaussian features at different locations in a $240$-dimensional two-layer inhomogeneous Lorenz 96 model, using only $L = 500$ samples.

The Fokker–Planck equation is a partial differential equation (PDE) that governs the time evolution of the probability density function (PDF) of a complex system with noise (1, 2). For a general nonlinear dynamical system,

d 𝐮 = 𝐅 (𝐮, t) d t + 𝚺 (𝐮, t) d 𝐖,

[1]

with state variables $𝐮 \in ℝ^{N}$ , noise matrix $𝚺 \in ℝ^{N \times K}$ , and white noise $𝐖 \in ℝ^{K}$ , the associated Fokker–Planck equation is given by

\frac{\partial}{\partial t} p (𝐮, t) = - \nabla_{𝐮} \cdot (𝐅 (𝐮, t) p (𝐮, t)) + \frac{1}{2} \nabla_{𝐮} \cdot \nabla_{𝐮} (𝐐 (𝐮, t) p (𝐮, t)), \quad p (𝐮, t) |_{t = t_{0}} = p_{0} (𝐮),

[2]

with $𝐐 = 𝚺 𝚺^{T}$. In many complex dynamical systems, such as geophysical and engineering turbulence, neuroscience, and excitable media, the solution of the Fokker–Planck equation in Eq. 2 involves strong non-Gaussian features with intermittency and extreme events (3–5). In addition, the dimension of $𝐮$ in these complex systems is typically very large, representing variability across a wide range of temporal and spatial scales (3, 6). Therefore, solving the high-dimensional Fokker–Planck equation for both the steady-state and transient phases with non-Gaussian features is an important issue. However, traditional numerical methods such as finite element and finite difference methods, as well as direct Monte Carlo simulation of Eq. 1, all suffer from the curse of dimension (7, 8).
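To make the direct Monte Carlo baseline concrete, the following minimal Python sketch integrates Eq. 1 for an ensemble of samples with an Euler–Maruyama scheme. The function name `euler_maruyama`, the linear test drift, and the restriction to a noise matrix that is shared by all samples are illustrative assumptions and do not come from the article.

```python
import numpy as np

def euler_maruyama(F, Sigma, u0, t0, t1, n_steps, rng):
    """Integrate du = F(u, t) dt + Sigma(u, t) dW for an ensemble.

    u0 has shape (n_samples, N). For simplicity (an assumption of this
    sketch) Sigma returns a single (N, K) matrix shared by all samples.
    """
    u = np.array(u0, dtype=float)
    dt = (t1 - t0) / n_steps
    t = t0
    for _ in range(n_steps):
        S = Sigma(u, t)                                   # shape (N, K)
        dW = rng.normal(0.0, np.sqrt(dt), size=(u.shape[0], S.shape[1]))
        u = u + F(u, t) * dt + dW @ S.T                   # one Euler-Maruyama step
        t += dt
    return u

# Hypothetical test case: du = -u dt + sqrt(2) dW in N = 2 dimensions,
# whose equilibrium PDF is the standard Gaussian N(0, I).
rng = np.random.default_rng(0)
F = lambda u, t: -u
Sigma = lambda u, t: np.sqrt(2.0) * np.eye(2)
samples = euler_maruyama(F, Sigma, np.zeros((20000, 2)), 0.0, 5.0, 500, rng)
sample_cov = np.cov(samples.T)                            # should be close to I
```

In this toy setting a few thousand samples suffice because $N = 2$; the number of samples needed to resolve a full joint PDF by binning or kernel smoothing grows rapidly with $N$, which is the difficulty the article addresses.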

Recently, the authors developed efficient statistically accurate algorithms for solving the Fokker–Planck equation associated with high-dimensional nonlinear turbulent dynamical systems with conditional Gaussian structures (9). These conditional Gaussian nonlinear dynamical systems capture many strong non-Gaussian features such as intermittency and fat-tailed PDFs (10). Applications of the conditional Gaussian framework include modeling and predicting the highly intermittent time series of the Madden–Julian oscillation and monsoon (11–13), state estimation of the turbulent ocean flows from noisy Lagrangian tracers (14–16), dynamic stochastic superresolution of sparsely observed turbulent systems (17), and stochastic superparameterization for geophysical turbulent flows (18). The efficient statistically accurate algorithms in ref. 9 involve a hybrid strategy that requires only a small number of samples. In these algorithms, a conditional Gaussian mixture in the high-dimensional subspace of $𝐮_{𝐈𝐈}$ via an extremely efficient parametric method is combined with a judicious Gaussian kernel density estimation in the remaining low-dimensional subspace of $𝐮_{𝐈}$. In particular, the parametric method provides closed analytical formulas for determining the conditional Gaussian distributions in the high-dimensional subspace of $𝐮_{𝐈𝐈}$ and is therefore computationally efficient and accurate. It has been shown in a stringent set of numerical tests (9) that with $L \sim O (100)$ samples the mixture distribution has significant skill in capturing both the statistically steady state and the transient behavior with fat tails of non-Gaussian PDFs in up to six dimensions. Rigorous analysis (19) indicates that $L$ does not need to increase exponentially with the dimension of the high-dimensional subspace of $𝐮_{𝐈𝐈}$ to maintain a given level of accuracy, which is fundamentally different from Monte Carlo methods.

In this article, two effective strategies are developed and incorporated into the algorithms in ref. 9 (hereafter, basic algorithms) that enable the expanded algorithms to efficiently solve the Fokker–Planck equation in much higher dimensions, even with orders in the millions. The major computational cost in the basic algorithms for systems with a large dimension of state variables $𝐮_{𝐈𝐈}$ comes from solving the time evolution of the conditional covariance. To overcome this difficulty, an effective strategy involving a decomposition of the state variables into different groups $𝐮 = \cup_{k = 1}^{K} 𝐮_{k}$ with each $𝐮_{k} = (𝐮_{𝐈, k}, 𝐮_{𝐈𝐈, k}) \in ℝ^{N_{𝐈, k} + N_{𝐈𝐈, k}}$ is developed. Then the high-dimensional conditional covariance matrix of $𝐮_{𝐈𝐈}$ conditioned on $𝐮_{𝐈}$ becomes block diagonal, provided that the right-hand side of the equation for $𝐮_{k}$ in Eq. 1 contains arbitrary nonlinear interactions among the components of $𝐮_{𝐈}$ but couples to $𝐮_{𝐈𝐈}$ only through products of $𝐮_{𝐈𝐈, k}$ with nonlinear functions of $𝐮_{𝐈, k}$. Such conditions are not artificial; they are salient features of many complex dynamical systems with multiscale structures (18), multilevel dynamics (20), or state-dependent parameterizations (17). One important characteristic of the resulting covariance matrix is that the evolution of the $k$th block, representing the conditional covariance of $𝐮_{𝐈𝐈, k}$ conditioned on $𝐮_{𝐈}$, has no interactions with that of $𝐮_{𝐈𝐈, k'}$ for all $k' \neq k$. This allows an extremely efficient parallel computation due to the small size of each individual block. On the other hand, the conditional means of different $𝐮_{𝐈𝐈, k}$ are all coupled in their evolutions.

The second effective strategy exploits the statistical symmetry if the dynamical system in Eq. 1 is statistically homogeneous; namely, the statistics of different $𝐮_{k}$ are identical to one another. The statistical symmetry is often satisfied when the underlying dynamics in Eq. 1 represent a discrete approximation of some PDEs in a periodic domain with nonlinear advection, diffusion, and homogeneous external forcing. Examples include a rich class of models in geophysical turbulence and excitable media (4, 21). In light of statistical symmetry, the number of samples $L$ in the algorithms can be greatly reduced. In fact, the effective sample size of each $𝐮_{k}$ becomes $L' = L K$, and therefore a much smaller $L$ is needed to reach the same level of accuracy as in the situation without using statistical symmetry.

These effective strategies are incorporated into the basic algorithms and then applied to two highly non-Gaussian dynamical systems. The first model is a $1{,}000$-dimensional stochastic coupled FitzHugh–Nagumo (FHN) model for excitable media with extreme events (4, 22). It describes activation and deactivation dynamics of spiking neurons and has scale-invariant features. The block decomposition leads to solving the evolution of $500$ individual covariance matrices, and the statistical symmetry allows an accurate estimation of both the transient and the steady-state PDFs using only $L = 1$ sample! The second model is the so-called two-layer Lorenz 96 (L96) model in geophysical turbulence (20, 23, 24), which is widely used as a testbed for data assimilation and parameterization in numerical weather forecasting. Inhomogeneous damping and coupling are adopted in the model with $240$ state variables that mimic the atmospheric motion along a latitude circle with different dissipation over land and sea. Despite the absence of statistical symmetry, the block decomposition enables the algorithms to efficiently capture the distinct non-Gaussian features at different locations, using only $L = 500$ samples.

The remainder of this article includes a detailed description of these effective strategies and their application to the stochastic coupled FHN and two-layer L96 models in solving the highly non-Gaussian PDFs at both the transient and statistical equilibrium phases, using a small number of samples.

Algorithms

Conditional Gaussian Framework.

The general framework of conditional Gaussian models is given as (10, 25)

d 𝐮_{𝐈} = [𝐀_{0} (t, 𝐮_{𝐈}) + 𝐀_{1} (t, 𝐮_{𝐈}) 𝐮_{𝐈𝐈}] d t + 𝚺_{𝐈} (t, 𝐮_{𝐈}) d 𝐖_{𝐈} (t),

[3a]

d 𝐮_{𝐈𝐈} = [𝐚_{0} (t, 𝐮_{𝐈}) + 𝐚_{1} (t, 𝐮_{𝐈}) 𝐮_{𝐈𝐈}] d t + 𝚺_{𝐈𝐈} (t, 𝐮_{𝐈}) d 𝐖_{𝐈𝐈} (t),

[3b]

where $𝐮 = (𝐮_{𝐈}, 𝐮_{𝐈𝐈})$ with both $𝐮_{𝐈} \in ℝ^{N_{𝐈}}$ and $𝐮_{𝐈𝐈} \in ℝ^{N_{𝐈𝐈}}$ being multidimensional state variables. In Eq. 3, $𝐀_{0}, 𝐀_{1}, 𝐚_{0}, 𝐚_{1}, 𝚺_{𝐈}$ and $𝚺_{𝐈𝐈}$ are vectors and matrices that depend only on time $t$ and the state variables $𝐮_{𝐈}$, and $𝐖_{𝐈} (t)$ and $𝐖_{𝐈𝐈} (t)$ are independent Wiener processes. The systems in Eq. 3 are called conditional Gaussian systems because, once $𝐮_{𝐈} (s)$ for $s \leq t$ is given, $𝐮_{𝐈𝐈} (t)$ conditioned on $𝐮_{𝐈} (s)$ becomes a Gaussian process with mean ${\bar{𝐮}}_{𝐈𝐈} (t)$ and covariance $𝐑_{𝐈𝐈} (t)$; i.e.,

p (𝐮_{𝐈𝐈} (t) | 𝐮_{𝐈} (s \leq t)) \sim N ({\bar{𝐮}}_{𝐈𝐈} (t), 𝐑_{𝐈𝐈} (t)) .

[4]

Despite the conditional Gaussianity, the coupled system in Eq. 3 remains highly nonlinear and is able to capture many non-Gaussian features as observed in nature (10). One of the desirable features of Eq. 3 is that the conditional distribution in Eq. 4 has the following closed analytical form (25):

d {\bar{𝐮}}_{𝐈𝐈} (t) = [𝐚_{0} (t, 𝐮_{𝐈}) + 𝐚_{1} (t, 𝐮_{𝐈}) {\bar{𝐮}}_{𝐈𝐈}] d t + (𝐑_{𝐈𝐈} 𝐀_{1}^{*} (t, 𝐮_{𝐈})) \times {(𝚺_{𝐈} 𝚺_{𝐈}^{*})}^{- 1} (t, 𝐮_{𝐈}) [d 𝐮_{𝐈} - (𝐀_{0} (t, 𝐮_{𝐈}) + 𝐀_{1} (t, 𝐮_{𝐈}) {\bar{𝐮}}_{𝐈𝐈}) d t],

[5a]

d 𝐑_{𝐈𝐈} (t) = {𝐚_{1} (t, 𝐮_{𝐈}) 𝐑_{𝐈𝐈} + 𝐑_{𝐈𝐈} 𝐚_{1}^{*} (t, 𝐮_{𝐈}) + (𝚺_{𝐈𝐈} 𝚺_{𝐈𝐈}^{*}) (t, 𝐮_{𝐈}) - 𝐑_{𝐈𝐈} 𝐀_{1}^{*} (t, 𝐮_{𝐈}) {(𝚺_{𝐈} 𝚺_{𝐈}^{*})}^{- 1} (t, 𝐮_{𝐈}) {(𝐑_{𝐈𝐈} 𝐀_{1}^{*} (t, 𝐮_{𝐈}))}^{*}} d t .

[5b]
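As a rough illustration of how Eq. 5 can be advanced in time, the sketch below takes one explicit Euler step of the conditional mean and covariance and applies it to a toy scalar system. The function name `conditional_gaussian_step`, the explicit Euler discretization, and the toy coefficients are assumptions made for this example; the article does not specify a time-stepping scheme here. All quantities are real, so $^{*}$ reduces to the transpose.

```python
import numpy as np

def conditional_gaussian_step(mu, R, du_I, A0, A1, a0, a1, Sig_I, Sig_II, dt):
    """One explicit Euler step of Eqs. 5a and 5b (real-valued case)."""
    S_inv = np.linalg.inv(Sig_I @ Sig_I.T)
    gain = R @ A1.T @ S_inv                      # (R A1^*) (Sigma_I Sigma_I^*)^{-1}
    innovation = du_I - (A0 + A1 @ mu) * dt      # observed increment minus its prediction
    mu_new = mu + (a0 + a1 @ mu) * dt + gain @ innovation
    dR = a1 @ R + R @ a1.T + Sig_II @ Sig_II.T - gain @ (R @ A1.T).T
    return mu_new, R + dR * dt

# Hypothetical toy system with N_I = N_II = 1 (not from the article):
#   du_I  = (-u_I + u_II) dt + 0.5 dW_I,   du_II = -u_II dt + dW_II.
rng = np.random.default_rng(1)
dt, n_steps = 1e-3, 10000
Sig_I, Sig_II = np.array([[0.5]]), np.array([[1.0]])
A1, a1, a0 = np.array([[1.0]]), np.array([[-1.0]]), np.zeros(1)
u_I, u_II = np.zeros(1), np.zeros(1)
mu, R = np.zeros(1), np.eye(1)
for _ in range(n_steps):
    A0 = -u_I                                    # coefficients are evaluated at the current u_I
    du_I = (A0 + A1 @ u_II) * dt + Sig_I @ rng.normal(0.0, np.sqrt(dt), 1)
    du_II = (a0 + a1 @ u_II) * dt + Sig_II @ rng.normal(0.0, np.sqrt(dt), 1)
    mu, R = conditional_gaussian_step(mu, R, du_I, A0, A1, a0, a1, Sig_I, Sig_II, dt)
    u_I, u_II = u_I + du_I, u_II + du_II

# For these constant coefficients Eq. 5b has the steady state -2R + 1 - 4R^2 = 0.
R_steady = (-2.0 + np.sqrt(20.0)) / 8.0
```

Only the trajectory of $𝐮_{𝐈}$ enters the update; the simulated $𝐮_{𝐈𝐈}$ is used solely to generate that trajectory.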

Basic Algorithms.

Here, we summarize the basic efficient statistically accurate algorithms developed in ref. 9. First, we generate $L$ independent trajectories of the variables $𝐮_{𝐈}$ , namely $𝐮_{𝐈}^{1} (s \leq t), \dots, 𝐮_{𝐈}^{L} (s \leq t)$ . Then, different strategies are used to deal with $𝐮_{𝐈}$ and $𝐮_{𝐈𝐈}$ . The PDF of $𝐮_{𝐈𝐈}$ is estimated via a parametric method that exploits the closed form of the conditional Gaussian statistics in Eq. 5,

p (𝐮_{𝐈𝐈} (t)) = \lim_{L \to \infty} \frac{1}{L} \sum_{i = 1}^{L} p (𝐮_{𝐈𝐈} (t) | 𝐮_{𝐈}^{i} (s \leq t)) .

[6]

Note that the limit $L \to \infty$ in Eq. 6 (as well as Eqs. 7 and 8 below) is taken to illustrate the statistical intuition, while the estimator is the nonasymptotic version. On the other hand, a Gaussian kernel density estimation method is used for solving the PDF of the observed variables $𝐮_{𝐈}$ ,

p (𝐮_{𝐈} (t)) = \lim_{L \to \infty} \frac{1}{L} \sum_{i = 1}^{L} K_{𝐇} (𝐮_{𝐈} (t) - 𝐮_{𝐈}^{i} (t)),

[7]

where $K_{𝐇} (\cdot)$ is a Gaussian kernel centered at each sample point $𝐮_{𝐈}^{i} (t)$ with covariance given by the bandwidth matrix $𝐇 (t)$ . The kernel density estimation algorithm here involves a “solve-the-equation plug-in” approach for optimizing the bandwidth (26) that works for any non-Gaussian PDFs. Finally, combining Eqs. 6 and 7, a hybrid method is applied to solve the joint PDF of $𝐮_{𝐈}$ and $𝐮_{𝐈𝐈}$ through a Gaussian mixture,

p (𝐮_{𝐈} (t), 𝐮_{𝐈𝐈} (t)) = \lim_{L \to \infty} \frac{1}{L} \sum_{i = 1}^{L} (K_{𝐇} (𝐮_{𝐈} (t) - 𝐮_{𝐈}^{i} (t)) \cdot p (𝐮_{𝐈𝐈} (t) | 𝐮_{𝐈}^{i} (s \leq t))) .

[8]

In practice, $L \sim O (100)$ is sufficient for the hybrid method to solve the joint PDF with $N_{𝐈} \leq 3$ and $N_{𝐈𝐈} \sim 10$. Since $L$ is small, the trajectories $𝐮_{𝐈}^{1} (s \leq t), \dots, 𝐮_{𝐈}^{L} (s \leq t)$ can be obtained by running a Monte Carlo simulation for the coupled system in Eq. 3, which is computationally affordable. In addition, the closed form of the $L$ conditional distributions in Eq. 6 can be solved in a parallel way due to their independence (9), which further reduces the computational cost.
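The finite-$L$ versions of Eqs. 6–8 can be sketched in a few lines for scalar $𝐮_{𝐈}$ and $𝐮_{𝐈𝐈}$. The helper names, the synthetic inputs, and the use of Silverman's rule of thumb for the bandwidth are assumptions made for illustration; in particular, Silverman's rule is a simple stand-in and is not the solve-the-equation plug-in bandwidth selector of ref. 26.

```python
import numpy as np

def gauss(x, mean, var):
    """Scalar Gaussian density with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def mixture_pdf_uII(x, means, variances):
    """Eq. 6 with finite L: average of the L conditional Gaussians."""
    return gauss(x[:, None], means[None, :], variances[None, :]).mean(axis=1)

def kde_pdf_uI(x, samples, h):
    """Eq. 7 with finite L: Gaussian kernels with scalar bandwidth h."""
    return gauss(x[:, None], samples[None, :], h ** 2).mean(axis=1)

def joint_pdf(xI, xII, samples, h, means, variances):
    """Eq. 8 with finite L on a tensor grid; shape (len(xI), len(xII))."""
    kI = gauss(xI[:, None], samples[None, :], h ** 2)              # (nI, L)
    kII = gauss(xII[:, None], means[None, :], variances[None, :])  # (nII, L)
    return kI @ kII.T / samples.size

# Synthetic stand-ins for u_I^i(t) and the conditional statistics from Eq. 5.
rng = np.random.default_rng(4)
L = 100
samples = rng.normal(0.0, 1.0, L)
means = rng.normal(0.0, 1.0, L)
variances = rng.uniform(0.2, 1.0, L)
h = 1.06 * samples.std() * L ** (-0.2)     # Silverman's rule (assumed stand-in)

xI = np.linspace(-8.0, 8.0, 401)
xII = np.linspace(-8.0, 8.0, 401)
dxI, dxII = xI[1] - xI[0], xII[1] - xII[0]
p_uI = kde_pdf_uI(xI, samples, h)
p_joint = joint_pdf(xI, xII, samples, h, means, variances)
total_mass = p_joint.sum() * dxI * dxII
marginal_gap = np.abs(p_joint.sum(axis=0) * dxI
                      - mixture_pdf_uII(xII, means, variances)).max()
```

Integrating the joint estimate over $𝐮_{𝐈}$ returns the mixture in Eq. 6, which is what `marginal_gap` checks numerically.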

Beating the Curse of Dimension with Block Decomposition.

The basic algorithms succeed in solving the Fokker–Planck equation with $O (10)$ state variables. Now we develop an effective strategy with block decomposition and incorporate it into the basic algorithms. The expanded algorithms can efficiently solve the Fokker–Planck equation in much higher dimensions even with orders in the millions and beat the curse of dimension.

Consider the following decomposition of state variables

𝐮_{k} = (𝐮_{𝐈, k}, 𝐮_{𝐈𝐈, k}) with 𝐮_{𝐈, k} \in ℝ^{N_{𝐈, k}} and 𝐮_{𝐈𝐈, k} \in ℝ^{N_{𝐈𝐈, k}},

where $1 \leq k \leq K$, $N_{𝐈} = \sum_{k = 1}^{K} N_{𝐈, k}$, and $N_{𝐈𝐈} = \sum_{k = 1}^{K} N_{𝐈𝐈, k}$. Correspondingly, the full dynamics in Eq. 3 are also decomposed into $K$ groups, where the variables on the left-hand side of the $k$th group are $𝐮_{k}$. In addition, we assume both $𝚺_{𝐈}$ and $𝚺_{𝐈𝐈}$ are diagonal for notational simplicity.

To develop efficient statistically accurate algorithms that beat the curse of dimension, the following two conditions are imposed on the coupled system:

Condition 1:

In the dynamics of each $𝐮_{k}$ in Eq. 3, the terms $𝐀_{0, k}$ and $𝐚_{0, k}$ can depend on all of the components of $𝐮_{𝐈}$ while the terms $𝐀_{1, k}$ and $𝐚_{1, k}$ are only functions of $𝐮_{𝐈, k}$ ; namely,

\begin{matrix} 𝐀_{0, k} ≔ 𝐀_{0, k} (t, 𝐮_{𝐈}), 𝐚_{0, k} ≔ 𝐚_{0, k} (t, 𝐮_{𝐈}), \\ 𝐀_{1, k} ≔ 𝐀_{1, k} (t, 𝐮_{𝐈, k}), 𝐚_{1, k} ≔ 𝐚_{1, k} (t, 𝐮_{𝐈, k}) . \end{matrix}

[9]

In addition, only $𝐮_{𝐈𝐈, k}$ interacts with $𝐀_{1, k}$ and $𝐚_{1, k}$ on the right-hand side of the dynamics of $𝐮_{k}$ .

Condition 2:

The initial values of $(𝐮_{𝐈, k}, 𝐮_{𝐈𝐈, k})$ and $(𝐮_{𝐈, k'}, 𝐮_{𝐈𝐈, k'})$ with $k \neq k'$ are independent of each other.

Conditions 1 and 2 are not artificial and they are actually the salient features of many complex systems with multiscale structures (18), multilevel dynamics (20), or state-dependent parameterizations (17). Under these two conditions, the conditional covariance matrix becomes block diagonal, which can be easily verified according to Eq. 5b. The evolution of the conditional covariance of $𝐮_{𝐈𝐈, k}$ conditioned on $𝐮_{𝐈}$ is given by

d 𝐑_{𝐈𝐈, k} (t) = {𝐚_{1, k} 𝐑_{𝐈𝐈, k} + 𝐑_{𝐈𝐈, k} 𝐚_{1, k}^{*} + (𝚺_{𝐈𝐈, k} 𝚺_{𝐈𝐈, k}^{*}) - (𝐑_{𝐈𝐈, k} 𝐀_{1, k}^{*}) {(𝚺_{𝐈, k} 𝚺_{𝐈, k}^{*})}^{- 1} {(𝐑_{𝐈𝐈, k} 𝐀_{1, k}^{*})}^{*}} d t,

which has no interaction with that of $𝐑_{𝐈𝐈, k'}$ for all $k' \neq k$ since $𝐀_{0}$ and $𝐚_{0}$ do not enter into the evolution of the conditional covariance. Notably, the evolutions of different $𝐑_{𝐈𝐈, k}$ with $k = 1, \dots, K$ can be solved in parallel, and the computation is extremely efficient due to the small size of each individual block. This enables the algorithms to solve the Fokker–Planck equation efficiently in large dimensions.
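The decoupling of the blocks can be checked numerically. The sketch below, a toy setup under stated assumptions, evolves Eq. 5b once with the full matrices and once block by block, and compares the results. The sizes ($K = 4$, $N_{𝐈, k} = 1$, $N_{𝐈𝐈, k} = 2$), the random constant coefficients, and the explicit Euler scheme are illustrative choices; in an actual model $𝐚_{1, k}$ and $𝐀_{1, k}$ would be re-evaluated at $𝐮_{𝐈, k} (t)$ at every step.

```python
import numpy as np

def riccati_rhs(R, a1, A1, Sig_I, Sig_II):
    """Right-hand side of Eq. 5b (real-valued case, so * is the transpose)."""
    RA = R @ A1.T
    return (a1 @ R + R @ a1.T + Sig_II @ Sig_II.T
            - RA @ np.linalg.inv(Sig_I @ Sig_I.T) @ RA.T)

def block_diag(blocks):
    """Assemble a block-diagonal matrix from a list of 2D arrays."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

rng = np.random.default_rng(5)
K, nI, nII, dt, n_steps = 4, 1, 2, 0.002, 500
a1 = [-2.0 * np.eye(nII) + 0.3 * rng.normal(size=(nII, nII)) for _ in range(K)]
A1 = [rng.normal(size=(nI, nII)) for _ in range(K)]
SI = [np.diag(rng.uniform(0.5, 1.0, nI)) for _ in range(K)]
SII = [np.diag(rng.uniform(0.5, 1.0, nII)) for _ in range(K)]

R_blocks = [np.eye(nII) for _ in range(K)]        # independent initial blocks (Condition 2)
R_full = block_diag(R_blocks)
full = [block_diag(m) for m in (a1, A1, SI, SII)]
for _ in range(n_steps):
    # Each block update is independent of the others and could run in parallel.
    R_blocks = [R + dt * riccati_rhs(R, *coef)
                for R, coef in zip(R_blocks, zip(a1, A1, SI, SII))]
    R_full = R_full + dt * riccati_rhs(R_full, *full)

max_gap = np.abs(R_full - block_diag(R_blocks)).max()
off_block = np.abs(R_full[:nII, nII:]).max()
```

The full update works with an $N_{𝐈𝐈} \times N_{𝐈𝐈}$ matrix, whereas the blockwise update works with $K$ matrices of size $N_{𝐈𝐈, k} \times N_{𝐈𝐈, k}$, which is where the savings come from when $K$ is large.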

Next, the structures of $𝐀_{0, k}$ and $𝐚_{0, k}$ in Eq. 9 allow the coupling among all of the $K$ groups of variables in the conditional mean according to Eq. 5a. The evolution of ${\bar{𝐮}}_{𝐈𝐈, k}$ , namely the conditional mean of $𝐮_{𝐈𝐈, k}$ conditioned on $𝐮_{𝐈}$ , is given by

d {\bar{𝐮}}_{𝐈𝐈, k} (t) = [𝐚_{0, k} + 𝐚_{1, k} {\bar{𝐮}}_{𝐈𝐈, k}] d t + 𝐑_{𝐈𝐈, k} 𝐀_{1, k}^{*} {(𝚺_{𝐈, k} 𝚺_{𝐈, k}^{*})}^{- 1} \times [d 𝐮_{𝐈, k} - (𝐀_{0, k} (t, 𝐮_{𝐈}) + 𝐀_{1, k} {\bar{𝐮}}_{𝐈𝐈, k}) d t] .

Statistical Symmetry.

The computational cost in the algorithms developed above can be further reduced if the coupled system Eq. 3 has statistical symmetry; namely,

p (𝐮_{𝐈, k} (t), 𝐮_{𝐈𝐈, k} (t)) = p (𝐮_{𝐈, k'} (t), 𝐮_{𝐈𝐈, k'} (t)), for all t, k and k',

including the initial conditions. The statistical symmetry is often satisfied when the underlying dynamical system represents a discrete approximation of some PDEs in a periodic domain with nonlinear advection, diffusion, and homogeneous external forcing (21, 27).

With the statistical symmetry, collecting the conditional Gaussian ensembles $N ({\bar{𝐮}}_{𝐈𝐈, k} (t), 𝐑_{𝐈𝐈, k} (t))$ for a specific $k$ in $K$ different simulations is equivalent to collecting that for all $k$ with $1 \leq k \leq K$ in a single simulation. This also applies to $N (𝐮_{𝐈}^{i} (t), 𝐇 (t))$ that are associated with $𝐮_{𝐈}$ . Therefore, the statistical symmetry implies that the effective sample size is $L' = K L$ , where $K$ is the number of the group variables that are statistically symmetric and $L$ is the number of different simulations of the coupled systems via Monte Carlo. If $K$ is large, then a much smaller $L$ is needed to reach the same accuracy as in the situation without using statistical symmetry, which greatly reduces the computational cost. SI Appendix provides the mathematical details of reconstructing the joint PDFs, using statistical symmetry.
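The effect of pooling over the $K$ symmetric groups can be illustrated with synthetic numbers. In the sketch below, the conditional means and variances are random stand-ins rather than model output, and the helper `pooled_mixture_pdf` is a hypothetical name; it only shows how a single simulation ($L = 1$) with $K = 500$ groups supplies $L' = K L = 500$ mixture components for the PDF of one representative $𝐮_{𝐈𝐈, k}$.

```python
import numpy as np

def gauss(x, mean, var):
    """Scalar Gaussian density with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def pooled_mixture_pdf(x, means, variances):
    """Average the conditional Gaussians over all L simulations and K groups.

    means and variances have shape (L, K); x is a 1D grid.
    """
    m, v = means.ravel(), variances.ravel()
    return gauss(x[:, None], m[None, :], v[None, :]).mean(axis=1)

# Synthetic stand-in data: conditional means drawn from N(0, 1) with a common
# conditional variance 0.25, so the exact marginal PDF is N(0, 1.25).
rng = np.random.default_rng(2)
L, K = 1, 500
means = rng.normal(0.0, 1.0, size=(L, K))
variances = np.full((L, K), 0.25)
x = np.linspace(-5.0, 5.0, 201)
exact = gauss(x, 0.0, 1.25)

# Without symmetry, group k = 1 contributes a single component when L = 1.
err_single = np.abs(pooled_mixture_pdf(x, means[:, :1], variances[:, :1]) - exact).max()
# With symmetry, all K groups contribute to the same one-group PDF.
err_pooled = np.abs(pooled_mixture_pdf(x, means, variances) - exact).max()
effective_sample_size = L * K
```

In this synthetic test the components are independent draws; in a real simulation neighboring groups may be correlated, so this example should be read only as an illustration of the bookkeeping behind $L' = K L$.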

A Stochastic Coupled FHN Model

The efficient statistically accurate algorithms developed in this article work for a wide class of models in excitable media (4), including different versions of the famous FHN model that describes activation and deactivation dynamics of spiking neurons. Here, the algorithms are applied to a stochastic coupled FHN model with $N$ elements (4, 22),

ϵ \frac{d u_{i}}{d t} = u_{i} - \frac{1}{3} u_{i}^{3} - v_{i} + \sqrt{ϵ} δ_{1} {\dot{W}}_{u_{i}} + d_{u} (u_{i + 1} + u_{i - 1} - 2 u_{i}), i = 1, \dots, N,

[10a]

\frac{d v_{i}}{d t} = u_{i} + a + δ_{2} {\dot{W}}_{v_{i}},

[10b]

where $u_{i}$ and $v_{i}$ are activator and inhibitor variables and they belong to $𝐮_{𝐈}$ and $𝐮_{𝐈𝐈}$ in the conditional Gaussian framework, respectively, such that the conditions of the block decomposition strategy are satisfied. Periodic boundary conditions are imposed on the $u_{i}$ variables. In Eq. 10, the timescale ratio $ϵ = 0.01 ≪ 1$ leads to a slow–fast structure of the model. The parameter $a = 1.05 > 1$ is chosen such that the system has a global attractor in the absence of noise and diffusion (28). The random noise is able to drive the system above the threshold level of global stability and trigger limit cycles intermittently. Note that with $N = 1$, the model reduces to the classical FHN model with a single neuron and it contains the model families with both coherence resonance and self-induced stochastic resonance (29). With different choices of the noise strengths $δ_{1}, δ_{2}$ and the diffusion coefficient $d_{u}$, the system in Eq. 10 exhibits rich dynamical behaviors. Below, we adopt constant parameters and the initial values are $u_{i} (0) = - 2$, $v_{i} (0) = 0.5$ for all $i$. Therefore, the model satisfies the statistical symmetry.
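A single realization of Eq. 10 can be generated with a short Euler–Maruyama loop, sketched below. The function name `simulate_fhn`, the time step, and the final time are assumptions chosen for numerical stability and speed; the default noise and diffusion values $δ_{1} = 0.2$, $δ_{2} = 0.4$, $d_{u} = 0.5$ correspond to the weakly coherent regime described in the article, and the comments indicate how the terms map onto Eq. 3.

```python
import numpy as np

def simulate_fhn(N=500, eps=0.01, a=1.05, delta1=0.2, delta2=0.4, d_u=0.5,
                 t_end=2.0, dt=2e-4, seed=0):
    """Euler-Maruyama integration of Eq. 10 with periodic boundaries in u."""
    rng = np.random.default_rng(seed)
    u = np.full(N, -2.0)                 # initial values stated in the text
    v = np.full(N, 0.5)
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        lap = np.roll(u, -1) + np.roll(u, 1) - 2.0 * u    # u_{i+1} + u_{i-1} - 2 u_i
        # In the form of Eq. 3a: A0 = (u - u^3/3 + d_u * lap) / eps, A1 = -1/eps.
        drift_u = (u - u ** 3 / 3.0 - v + d_u * lap) / eps
        # In the form of Eq. 3b: a0 = u + a, a1 = 0.
        drift_v = u + a
        dWu = rng.normal(0.0, np.sqrt(dt), N)
        dWv = rng.normal(0.0, np.sqrt(dt), N)
        u, v = (u + drift_u * dt + delta1 / np.sqrt(eps) * dWu,
                v + drift_v * dt + delta2 * dWv)
    return u, v

u_end, v_end = simulate_fhn()
```

Because $𝐀_{1}$ is diagonal and $𝐚_{1}$ vanishes, each pair $(u_{i}, v_{i})$ forms its own group, so the conditional covariance consists of $N$ scalar blocks.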

Model Behavior in Different Dynamical Regimes.

Fig. 1 shows the model behavior in three different regimes with $N = 500$. Here, the noise coefficient $δ_{1} = 0.2$ is fixed while different values of $δ_{2}$ and $d_{u}$ are chosen for the three regimes. In Fig. 1A, the spatial–temporal patterns are highly coherent due to the choice of a weak noise $δ_{2} = 0.1$ and a strong diffusion $d_{u} = 10$. The time series of both $u_{1}$ and $v_{1}$ have nearly regular oscillations with large bursts in $u_{i}$ roughly every $3$ time units. The associated statistical equilibrium PDF of $u_{1}$ is bimodal while that of $v_{1}$ is skewed. With an increase of the noise to $δ_{2} = 0.4$ and a decrease of the diffusion to $d_{u} = 0.5$ (Fig. 1B), the coherent patterns become much weaker and only quasi-regular periods are found in the time series of $u_{1}$. The associated PDF of $v_{1}$ becomes symmetric and slightly sub-Gaussian. With a further increase of the noise to $δ_{2} = 0.8$ (Fig. 1C), the spatial–temporal pattern becomes strongly mixed and the time series is more irregular. Correspondingly, the PDF of $v_{1}$ becomes nearly Gaussian with a large variance.

It is shown in SI Appendix that the FHN model Eq. 10 is scale invariant in all three regimes. The scale-invariant structure means that the spatial–temporal structures in any given scale change little as the number of spatial grid points $N$ increases. Mutual information (30) is used to quantify the dependence of different variables with strongly non-Gaussian features, the advantage of which over pattern correlation is also clearly illustrated in SI Appendix.

Recovering the PDFs at Both Transient and Equilibrium Phases Exploiting Statistical Symmetry.

Now we apply the efficient statistically accurate algorithms to solve the PDFs associated with the stochastic coupled FHN system in Eq. 10. Here we focus on the weakly coherent regime (Fig. 1B). Due to the statistical symmetry, the effective sample size is $L' = N L = 500 L$, where $L$ is the number of repeated simulations of the system. Below, we simply take $L = 1$ in the efficient statistically accurate algorithms, which is extremely cheap. For comparison, we take $L_{C} = 300$ in Monte Carlo simulations and again use statistical symmetry to generate the true PDFs; the effective sample size in the Monte Carlo simulation is therefore $L_{C}' = N L_{C} = 150{,}000$.

The time evolutions of the first four moments associated with $u_{1}$ and $v_{1}$ are shown in Fig. 2, where the black circles mark three transient phases and the statistical equilibrium phase with different non-Gaussian features. Fig. 3 shows the skill of solving the one-point statistics at these phases. First, the PDFs of $u_{1}$ at $t = 1.6, 1.9$, and $25$ with different bimodal features are all accurately recovered. Next, $p (u_{1})$ at $t = 2.7$ has a tiny second peak around $u_{1} = 1$ since only about $1.5\%$ of the events at this time instant have large bursts, and these rare events contribute to a significant kurtosis of $u_{1}$ at $t = 2.7$. Nevertheless, the efficient statistically accurate algorithm succeeds in capturing the appearance of this weak peak and the recovered PDF almost overlaps with the truth. On the other hand, the skewed $p (v_{1})$ at $t = 1.6$ and the sub-Gaussian $p (v_{1})$ at $t = 1.9, 2.7$, and $25$ are all recovered with high accuracy by the algorithm. In addition, the recovered joint PDFs $p (u_{1}, v_{1})$ at different phases all resemble the truth.

Fig. 3. Stochastic coupled FHN model. Shown is a comparison of the truth and recovered PDFs at three transient phases $t = 1.6, 1.9$, and $2.7$ as well as the statistical equilibrium phase $t = 25$. A and B show the truth and recovered 2D PDF $p (u_{1}, v_{1})$. C and D compare the recovered 1D PDFs $p (u_{1})$ and $p (v_{1})$ (blue) with the truth (red). E shows the logarithm plots of D, where the black dotted curves are the Gaussian fits of the truth. *Inset* above the subplot of $p (u_{1})$ at $t = 2.7$ in C is also in logarithm scale, ranging from $5 \times 10^{- 3}$ to $5$.

Fig. 4 illustrates the recovered two-point statistics at $t = 1.6$ . Fig. 4 A and B shows the PDFs $p (u_{1}, u_{2})$ and $p (u_{1}, u_{20})$ and those for $v$ , respectively. The variables $u_{1}$ and $u_{2}$ have a strong correlation while $u_{1}$ and $u_{20}$ are nearly uncorrelated. These features as well as the highly non-Gaussian PDFs with multiple peaks are both captured by the recovered PDFs from the efficient statistically accurate algorithms with only $L = 1$ . Likewise, the skewed and nearly Gaussian joint PDFs $p (v_{1}, v_{2})$ and $p (v_{1}, v_{20})$ are also recovered with high accuracy.

Two-Layer Inhomogeneous L96 Model

The two-layer L96 model is a conceptual model in geophysical turbulence that is widely used as a testbed for data assimilation and parameterization in numerical weather forecasting (20, 23, 24). The model can be regarded as a coarse discretization of atmospheric flow on a latitude circle with complicated wave-like and chaotic behavior. It schematically describes the interaction between small-scale fluctuations with larger-scale motions. In the model presented here, large-scale motions are denoted by variables $u_{i}$ , which are coupled to small-scale variables $v_{i, j}$ ,

\frac{d u_{i}}{d t} = u_{i - 1} (u_{i + 1} - u_{i - 2}) + \sum_{j = 1}^{J} γ_{i, j} u_{i} v_{i, j} - {\bar{d}}_{i} u_{i} + F + σ_{u} {\dot{W}}_{u_{i}}, i = 1, \dots, I,

[11a]

\frac{d v_{i, j}}{d t} = - d_{v_{i, j}} v_{i, j} - γ_{i, j} u_{i}^{2} + σ_{i, j} {\dot{W}}_{v_{i, j}}, j = 1, \dots, J,

[11b]

with periodic boundary conditions in $u_{i}$. One important feature of Eq. 11 is that the nonlinear interaction between $u_{i}$ and $v_{i, j}$ conserves energy, as observed in nature (3). The two-layer L96 model belongs to the conditional Gaussian framework in Eq. 3 with $𝐮_{𝐈} = \{u_{i}\}$ and $𝐮_{𝐈𝐈} = \{v_{i, j}\}$. Below, we take $I = 40$ as in the standard L96 model. Associated with each $u_{i}$, there are $J = 5$ variables $v_{i, j}$ representing different scales of fluctuations. Thus, the total number of state variables is $240$. A constant forcing $F = 8$ is adopted in Eq. 11a while both the damping ${\bar{d}}_{i}$ and the coupling $γ_{i, j}$ are functions of space. Therefore, the model is inhomogeneous. This mimics the situation in which the damping and coupling above the ocean are weaker than those above the land, since the latter usually has stronger friction or dissipation. As a result, the large-scale wave patterns over the ocean are more significant (Fig. 5). The model in Eq. 11 shares many desirable properties with more complicated turbulent systems. In particular, the smaller scales are more intermittent with stronger fat tails in PDFs. See SI Appendix for more details.
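The energy-conservation property of the quadratic terms can be verified numerically. The sketch below evaluates only the nonlinear terms of Eq. 11 (no damping, forcing, or noise) and computes the resulting rate of change of $\frac{1}{2} \sum_{i} u_{i}^{2} + \frac{1}{2} \sum_{i, j} v_{i, j}^{2}$. It assumes that the coupling coefficient in Eq. 11b is the same $γ_{i, j}$ that appears in Eq. 11a, which is what exact cancellation requires; the function name and the random test state are hypothetical, and the coupling profile is taken as printed in the Fig. 5 caption.

```python
import numpy as np

def nonlinear_terms(u, v, gamma):
    """Quadratic terms of Eq. 11; u has shape (I,), v and gamma have shape (I, J)."""
    # np.roll(u, 1)[i] = u[i-1], np.roll(u, -1)[i] = u[i+1], np.roll(u, 2)[i] = u[i-2]
    adv = np.roll(u, 1) * (np.roll(u, -1) - np.roll(u, 2))
    du = adv + u * (gamma * v).sum(axis=1)       # advection plus coupling to v_{i,j}
    dv = -gamma * u[:, None] ** 2                # feedback of u_i^2 on v_{i,j}
    return du, dv

rng = np.random.default_rng(3)
I, J = 40, 5
i = np.arange(1, I + 1)
gamma_i = 0.1 + 0.025 * np.cos(2.0 * np.pi * i / J)   # profile as printed in the caption
gamma = np.repeat(gamma_i[:, None], J, axis=1)        # gamma_{i,j} = gamma_i
u = rng.normal(size=I)
v = rng.normal(size=(I, J))

du, dv = nonlinear_terms(u, v, gamma)
# d/dt of the total energy due to the nonlinear terms only.
energy_rate = u @ du + (v * dv).sum()
```

The advection term cancels after a periodic index shift, and each coupling term $γ_{i, j} u_{i}^{2} v_{i, j}$ in the $u_{i}$ budget is matched by $- γ_{i, j} u_{i}^{2} v_{i, j}$ in the $v_{i, j}$ budget, so `energy_rate` vanishes up to roundoff.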

Fig. 5. Spatial–temporal evolution of the observed variables $u_{i}, i = 1, \dots, 40$ in the two-layer L96 model Eq. 11 with $F = 8$. The profiles of the damping ${\bar{d}}_{i}$ and coupling coefficients $γ_{i}$ as functions of $i$ are shown below the spatial–temporal evolution of $u_{i}$ in solid blue curves, and the black dashed lines represent the spatially averaged values. Here, ${\bar{d}}_{i} = 1 + 0.7 \cos (2 π i / J)$ and $γ_{i, j} = γ_{i} = 0.1 + 0.025 \cos (2 π i / J)$.

Although the two-layer inhomogeneous L96 model in Eq. 11 has no statistical symmetry, the model structure nevertheless allows the effective block decomposition. Below, $L = 500$ trajectories of each variable $u_{i}$ are simulated from Eq. 11 to implement the efficient statistically accurate algorithms. As a comparison, a direct Monte Carlo method requires $L_{C}$ = 150,000 samples for each of the $240$ variables for an accurate estimation of at least the one-point statistics. This means the total number of samples is around $4 \times 10^{7}$ ! For an efficient calculation of the truth, we focus only on the statistical equilibrium state here but the algorithms are not restricted to the equilibrium statistics. The true PDFs are calculated using the Monte Carlo samples over a long time series in light of the ergodicity while the recovered PDFs from the efficient statistically accurate algorithms are computed at $t = 25$ .

Fig. 6 shows the one-point statistics as a function of $i$ regarding the first four moments of $u_{i}$ and each $v_{i, j}$ . The mean and variance resulting from the efficient statistically accurate algorithms of all variables are highly consistent with the truth. For the skewness and kurtosis, despite small fluctuations in the recovered statistics, the pattern correlations between the curves associated with the truth and the recovered statistics are significant unless both are nearly flat (SI Appendix). These results indicate the success of the efficient statistically accurate algorithms in capturing the inhomogeneous behavior of the model. Fig. 7 demonstrates the skill of recovering different 2D joint PDFs. Fig. 7 A and B shows the PDFs $p (u_{i}, v_{i, 5})$ at $i = 11$ and $21$ . These highly non-Gaussian PDFs with distinct features are recovered accurately by the algorithms. On the other hand, Fig. 7C shows the joint PDFs of the two smallest-scale fluctuation variables $v_{11,5}$ and $v_{13,5}$ , which are highly correlated and have strong non-Gaussian features. Fig. 7D shows the joint PDFs $p (u_{11}, u_{13})$ , where the two components also have a strong correlation. The efficient statistically accurate algorithms succeed in solving all these joint PDFs with only $L = 500$ samples.

Fig. 7. Two-layer inhomogeneous L96 model. (*A–D*) Comparison of 2D joint PDFs at a fixed grid point (A and B) and between two different grid points (C and D).

Concluding Discussion

Effective strategies involving block decomposition of the conditional covariance matrix and statistical symmetry are developed and incorporated into the efficient statistically accurate algorithms in ref. 9. The resulting expanded algorithms are able to efficiently solve the Fokker–Planck equation in much higher dimensions even with orders in the millions and thus beat the curse of dimension. Applications of these effective strategies to both the stochastic coupled FHN and the two-layer inhomogeneous L96 models illustrate the efficiency and accuracy.

It is worth pointing out that although only the recovered one-point and two-point statistics are shown in this article for illustration purposes, the algorithms can provide an accurate estimation of the full joint PDF of $𝐮_{𝐈𝐈}$, using a small number of samples. This is because the sample size in these algorithms does not grow exponentially with the dimension of $𝐮_{𝐈𝐈}$, which is fundamentally different from Monte Carlo methods. See ref. 19 for a theoretical justification. The algorithms developed here are extremely useful in understanding the causality as well as improving the parameterizations and predictions of high-dimensional complex turbulent dynamical systems with non-Gaussian features.

Supplementary Material

Supplementary File

pnas.1717017114.sapp.pdf (3.1 MB, PDF)

Acknowledgments

The research of A.J.M. is partially supported by the Office of Naval Research (ONR) Multidisciplinary University Research Initiative (MURI) Grant N0001416-1-2161 and the New York University Abu Dhabi Research Institute. N.C. is supported as a postdoctoral fellow through A.J.M.’s ONR MURI Grant.

Footnotes

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1717017114/-/DCSupplemental.

References

1. Gardiner CW. Stochastic Methods. Springer; Berlin: 1985.
2. Risken H. The Fokker-Planck Equation: Methods of Solution and Applications. Springer Series in Synergetics, Vol 18, ed Haken H. Springer; Berlin: 1989.
3. Majda A. Introduction to Turbulent Dynamical Systems in Complex Systems. Frontiers in Applied Dynamical Systems: Reviews and Tutorials 5. Springer; Cham, Switzerland: 2016.
4. Lindner B, García-Ojalvo J, Neiman A, Schimansky-Geier L. Effects of noise in excitable systems. Phys Rep. 2004;392:321–424.
5. Cousins W, Sapsis TP. Reduced-order precursors of rare events in unidirectional nonlinear water waves. J Fluid Mech. 2016;790:368–388.
6. Vallis GK. Atmospheric and Oceanic Fluid Dynamics. Cambridge Univ Press; Cambridge, UK: 2017.
7. Pichler L, Masud A, Bergman LA. Numerical solution of the Fokker–Planck equation by finite difference and finite element methods: A comparative study. In: Papadrakakis M, Stefanou G, Papadopoulos V, editors. Computational Methods in Stochastic Dynamics. Springer; Dordrecht, The Netherlands: 2013. pp. 69–85.
8. Robert CP. Monte Carlo Methods. Wiley; Hoboken, NJ: 2004.
9. Chen N, Majda AJ. Efficient statistically accurate algorithms for solving Fokker-Planck equations in large dimensions. J Comput Phys. October 23, 2017. doi: 10.1016/j.jcp.2017.10.022.
10. Chen N, Majda AJ. Filtering nonlinear turbulent dynamical systems through conditional Gaussian statistics. Mon Weather Rev. 2016;144:4885–4917.
11. Chen N, Majda AJ, Giannakis D. Predicting the cloud patterns of the Madden-Julian oscillation through a low-order nonlinear stochastic model. Geophys Res Lett. 2014;41:5612–5619.
12. Chen N, Majda AJ. Predicting the real-time multivariate Madden–Julian oscillation index through a low-order nonlinear stochastic model. Mon Weather Rev. 2015;143:2148–2169.
13. Chen N, Majda AJ. Predicting the cloud patterns for the boreal summer intraseasonal oscillation through a low-order stochastic model. Math Clim Weather Forecast. 2015;1:1–20.
14. Chen N, Majda AJ, Tong XT. Information barriers for noisy Lagrangian tracers in filtering random incompressible flows. Nonlinearity. 2014;27:2133–2163.
15. Chen N, Majda AJ, Tong XT. Noisy Lagrangian tracers for filtering random rotating compressible flows. J Nonlinear Sci. 2015;25:451–488.
16. Chen N, Majda AJ. Model error in filtering random compressible flows utilizing noisy Lagrangian tracers. Mon Weather Rev. 2016;144:4037–4061.
17. Branicki M, Majda AJ. Dynamic stochastic superresolution of sparsely observed turbulent systems. J Comput Phys. 2013;241:333–363.
18. Majda AJ, Grooms I. New perspectives on superparameterization for geophysical turbulence. J Comput Phys. 2014;271:60–77.
19. Chen N, Majda AJ, Tong XT. Rigorous analysis for efficient statistically accurate algorithms for solving Fokker-Planck equations in large dimensions. SIAM/ASA J Uncertain Quantif. 2017, in press.
20. Wilks DS. Effects of stochastic parametrizations in the Lorenz ’96 system. Q J R Meteorol Soc. 2005;131:389–407.
21. Majda A, Wang X. Nonlinear Dynamics and Statistical Theories for Basic Geophysical Flows. Cambridge Univ Press; Cambridge, UK: 2006.
22. Muratov CB, Vanden-Eijnden E, Weinan E. Noise can play an organizing role for the recurrent dynamics in excitable media. Proc Natl Acad Sci USA. 2007;104:702–707. doi: 10.1073/pnas.0607433104.
23. Lee Y, Majda AJ. Multiscale data assimilation and prediction using clustered particle filters. J Comput Phys. 2017, in press.
24. Arnold H, Moroz I, Palmer T. Stochastic parametrizations and model uncertainty in the Lorenz ’96 system. Philos Trans R Soc A. 2013;371:20110479. doi: 10.1098/rsta.2011.0479.
25. Liptser RS, Shiryaev AN. Statistics of Random Processes II: Applications. Vol 2. Springer; Berlin: 2001.
26. Botev ZI, et al. Kernel density estimation via diffusion. Ann Stat. 2010;38:2916–2957.
27. Majda AJ, Harlim J. Filtering Complex Turbulent Systems. Cambridge Univ Press; Cambridge, UK: 2012.
28. Guckenheimer J, Holmes PJ. Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields. Vol 42. Springer; New York: 2013.
29. DeVille RL, Vanden-Eijnden E, Muratov CB. Two distinct mechanisms of coherence in randomly perturbed dynamical systems. Phys Rev E. 2005;72:031105. doi: 10.1103/PhysRevE.72.031105.
30. Branicki M, Majda A. Quantifying Bayesian filter performance for turbulent dynamical systems through information theory. Commun Math Sci. 2014;12:901–978.
