Data assimilation in operator algebras

David Freeman; Dimitrios Giannakis; Brian Mintz; Abbas Ourmazd; Joanna Slawinska

doi:10.1073/pnas.2211115120

2023 Feb 17;120(8):e2211115120. doi: 10.1073/pnas.2211115120

PMCID: PMC9974492 PMID: 36800390

Significance

Data assimilation is an essential component of numerical models for forecasting and uncertainty quantification of dynamical systems given incomplete knowledge of the state and governing equations. Here, we develop theory and computational methods for data assimilation through a combination of ideas from operator algebras, quantum information, and ergodic theory. Our approach leverages properties of noncommutative operator spaces to design computational schemes that i) preserve the sign of positive quantities such as mass in ways that are not possible with classical commutative methods and ii) are amenable to consistent data-driven approximation using machine learning. Furthermore, our framework provides a route for implementing data assimilation algorithms on quantum computers. We present applications to multiscale chaotic systems and the El Niño Southern Oscillation.

Keywords: data assimilation, operator algebras, quantum information, Koopman operators, kernel methods

Abstract

We develop an algebraic framework for sequential data assimilation of partially observed dynamical systems. In this framework, Bayesian data assimilation is embedded in a nonabelian operator algebra, which provides a representation of observables by multiplication operators and probability densities by density operators (quantum states). In the algebraic approach, the forecast step of data assimilation is represented by a quantum operation induced by the Koopman operator of the dynamical system. Moreover, the analysis step is described by a quantum effect, which generalizes the Bayesian observational update rule. Projecting this formulation to finite-dimensional matrix algebras leads to computational schemes that are i) automatically positivity-preserving and ii) amenable to consistent data-driven approximation using kernel methods for machine learning. Moreover, these methods are natural candidates for implementation on quantum computers. Applications to the Lorenz 96 multiscale system and the El Niño Southern Oscillation in a climate model show promising results in terms of forecast skill and uncertainty quantification.

Since its inception in weather forecasting (1) and object tracking problems (2), sequential data assimilation, also known as filtering, has evolved into an indispensable tool in forecasting and uncertainty quantification of dynamical systems (3, 4). In its essence, data assimilation is a Bayesian inference framework: Knowledge about the state of the system at time t is described by a probability distribution p_t. The system dynamics acts on probability distributions, carrying along p_t to a time-dependent family p_{t, τ}, which can be used to forecast observables of the system at time t + τ, τ ≥ 0. When an observation is made, at time t + Δt, the forecast distribution p_{t, Δt} is updated in an analysis step using Bayes’ rule to a posterior distribution p_{t + Δt}, and the cycle is repeated.

In real-world applications, the Bayesian theoretical “gold standard” is seldom feasible to employ due to a variety of challenges, including high-dimensional nonlinear dynamics, nonlinear observation modalities, and model error. Weather and climate dynamics (5) represent a classical application domain where these challenges are prevalent due to the extremely large number of active degrees of freedom (which necessitates making dynamical approximations such as subgrid-scale parameterization) and nonlinear equations of motion and observation functions (which prevent direct application of Bayes’ rule). Addressing these issues has stimulated the creation of a broad range of data assimilation techniques, including variational (6), ensemble (7), and particle (8) methods.

In this paper, we examine Bayesian data assimilation and its representation through finite-dimensional computational methods from an algebraic perspective. Our formulation employs different levels of description, depicted schematically in Fig. 1. We begin by assigning to a measure-preserving dynamical flow Φ^t: X → X, $t \in ℝ$ , an algebra of observables (complex-valued functions of the state) $A = L^{\infty} (X, μ)$ , where X is the state space and μ the invariant measure. This algebra is a commutative, or abelian, von Neumann algebra (9) under pointwise function multiplication. The state space of $A$ , denoted as $S (A)$ , is the set of continuous linear functionals $ω : A \to C$ , satisfying the positivity condition ω(f^*f)≥0 for all $f \in A$ and the normalization condition ω1 = 1. Here, ^* denotes the complex conjugation of functions, and 1 is the unit of $A$ , 1(x)=1 for all x ∈ X. Every probability density $p \in L^{1} (X, μ) \equiv A_{*}$ induces a state $ω_{p} \in S (A)$ that acts on $A$ as an expectation functional, ω_pf = ∫_Xfp dμ. Such states ω_p constitute the set of normal states of $A$ , denoted as $S_{*} (A)$ .

Fig. 1. Schematic representation of the abelian and nonabelian formulations of sequential data assimilation (DA), showing a forecast–analysis cycle. The top row of the diagram shows the dynamical flow Φ^t : X → X. The second row shows the observation map h : X → Y used to update the state of the DA system in the analysis step. The three labeled rows show the abelian, infinite-dimensional nonabelian (quantum mechanical), and finite-dimensional nonabelian (matrix mechanical) DA systems, respectively. In the abelian system, the forecast step is carried out by the transfer operator $P^{t} : S_{*} (A) \to S_{*} (A)$ acting on states of the abelian algebra $A$. The analysis step (green dot) is represented by an effect-valued map $F : Y \to A$ that updates the state given observations in Y. In the quantum mechanical system, the forecast step is carried out by the transfer operator $\mathcal{P}^{t} : S_{*} (B) \to S_{*} (B)$ acting on states of the nonabelian operator algebra $B$. The analysis step (red dot) is carried out by an effect-valued map $\mathcal{F} : Y \to B$ given by the composition of F with the regular representation π of $A$ into $B$ (red arrow). The state space $S_{*} (A)$ is embedded into $S_{*} (B)$ by means of a map Γ, which is compatible with both forecast and analysis (Eqs. 4 and 9). This compatibility is represented by the commutative loops between the abelian and quantum mechanical rows having Γ as a vertical arrow. To arrive at the matrix mechanical DA system, we project $B$ into an L²-dimensional operator algebra $B_{L}$ using a positivity-preserving projection $Π_{L}$. The composition of this projection with ℱ leads to an effect-valued map $\mathcal{F}_{L} : Y \to B_{L}$ employed in the analysis step (purple arrow and dot). Moreover, $Π_{L}$ induces a state space projection $Π'_{L} : S_{*} (B) \to S_{*} (B_{L})$ and a projected transfer operator $\mathcal{P}_{L}^{(t)} : S_{*} (B_{L}) \to S_{*} (B_{L})$ employed in the forecast step. Vertical dotted arrows indicate asymptotically commutative relationships that hold as L → ∞.

Elements of $A$ evolve under the Koopman operator $U^{t} : A \to A$, which is the composition operator by the dynamical flow, U^tf = f ° Φ^t. Moreover, algebra states evolve under the transfer operator $P^{t} : S (A) \to S (A)$, which is the adjoint of the Koopman operator, P^tω = ω ⚬ U^t (10–12). The space of normal states $S_{*} (A)$ is invariant under P^t, and we have P^tω_p = ω_{U^{−t}p} for every probability density $p \in A_{*}$. In this picture, the evolution p_t ↦ p_{t, τ} of the forecast density is represented by dynamics on $S_{*} (A)$ under the transfer operator, ω_{p_{t, τ}} = P^τω_{p_t}. Moreover, Bayesian analysis, p_{t, Δt} ↦ p_{t + Δt}, is represented by projective conditioning of the state. Together, these two steps encapsulate classical data assimilation within an abelian algebraic setting, shown as the abelian row of Fig. 1.

The next level of our framework, shown as the quantum mechanical row of Fig. 1, generalizes data assimilation by embedding it in a nonabelian operator algebra. Operator algebras form the mathematical backbone of quantum mechanics (13), one of the most successful theories in physics. Quantum information theory and quantum probability provide a unified mathematical framework to characterize the properties of information transfer in both abelian and nonabelian systems through maps (quantum operations) acting on elements of the algebra and the corresponding states (14–16). Our nonabelian formulation is based on the von Neumann algebra $B \equiv B (H)$ of bounded linear operators on the Hilbert space H = L²(X, μ), equipped with operator composition as the algebraic product. The state space $S (B)$ is defined as the space of continuous, positive, normalized functionals analogously to $S (A)$; that is, every state $ω \in S (B)$ satisfies ω(A^*A) ≥ 0 and ωI = 1, where ^* denotes the operator adjoint on B(H) and I is the identity operator. Analogously to $A$, $S (B)$ has a subset of normal states, $S_{*} (B)$, induced in this case by trace-class operators. Specifically, letting $B_{*} \equiv B_{1} (H)$ denote the space of trace-class operators in B(H), every positive operator $ρ \in B_{*}$ of unit trace induces a state $ω_{ρ} \in S_{*} (B)$ such that ω_ρA = tr(ρA). Such operators ρ are called density operators and can be thought of as nonabelian analogs of probability densities $p \in A_{*}$. As we will see below, analogs of the transfer operator and Bayesian update described above for $S_{*} (A)$ are naturally defined for $S_{*} (B)$.

To arrive at practical data assimilation algorithms, we project (discretize) the infinite-dimensional system on $B$ to a system on a finite-dimensional subalgebra $B_{L} \subset B$, which is concretely represented by an L × L matrix algebra (the matrix mechanical row of Fig. 1). We show that this approach leads to computational techniques that are well suited for assimilation of high-dimensional observables, while enjoying structure-preservation properties that cannot be obtained from orthogonal projections of abelian function spaces. Moreover, by virtue of being rooted in linear operator theory, these methods are amenable to consistent data-driven approximation using kernel methods for machine learning.

Previous Work.

Recently, an operator-theoretic framework for data assimilation, called quantum mechanical data assimilation (QMDA) (17), was developed using ideas from Koopman operator theory (10, 12) in conjunction with the Dirac–von Neumann axioms of quantum dynamics and measurement (18). In QMDA, the state of the data assimilation system is a density operator ρ_t acting on H = L²(X, μ) (rather than a classical probability density p_t ∈ L¹(X, μ)), and the assimilated observables are multiplication operators in B(H) (rather than functions in L^∞(X, μ)). Between observations, ρ_t evolves under the induced action of the transfer operator, and the forecast distribution of observables is obtained as a quantum mechanical expectation with respect to ρ_t. During observations, the density operator ρ_t is updated projectively as a von Neumann measurement, which is the quantum analog of Bayes’ rule. QMDA has a data-driven formulation based on kernel methods for Koopman operator approximation (19–21), which was shown to perform well in low-dimensional applications. Meanwhile, the paper (22) has shown that Koopman operators of systems with pure point spectra can be approximated on quantum computers using shallow quantum circuits, offering an exponential computational advantage over classical deterministic algorithms for Koopman operator approximation.

Contributions.

We provide a general algebraic framework that encompasses classical data assimilation and QMDA as particular instances (abelian and nonabelian, respectively). The principal distinguishing aspects of this framework are as follows:

1. Dynamical consistency. We employ a dynamically consistent embedding of abelian data assimilation into the nonabelian framework. As in ref. (17), observables in $A$ are mapped into multiplication operators in $B$, but here, we also employ an embedding $Γ : S_{*} (A) \to S_{*} (B)$ that is compatible with the transfer operator (see the commutative loops between the abelian and quantum mechanical rows in Fig. 1). This allows us to study QMDA in relation to the underlying classical theory and establish the consistency between the two approaches.
2. Effect system. In both the abelian and nonabelian settings, the analysis step, given observations in a space Y acquired through an observation map h : X → Y, is carried out using quantum effects (loosely speaking, algebra-valued logical predicates) (14). In the abelian case, the effect-valued map $F : Y \to A$ is induced by a kernel feature map. In the nonabelian setting, F is promoted to an operator-valued feature map $\mathcal{F} : Y \to B$; see the column in the schematic of Fig. 1 labeled “Analysis.” Our use of feature maps enables assimilation of data of arbitrarily large dimension, overcoming an important limitation of the original QMDA scheme (17) (which becomes prohibitively expensive for high-dimensional observation maps).
3. Positivity-preserving discretization. The discretization procedure leading to the finite-dimensional scheme has the important property that positive elements of $B$ are mapped into positive elements of $B_{L}$. Moreover, the transfer operator on $S_{*} (B)$ is mapped into a completely positive, trace-nonincreasing map, so the finite-dimensional data assimilation system is a quantum operation. We call this system “matrix mechanical.” By virtue of these properties, the sign of sign-definite observables, i.e., observables which are either positive or negative, is preserved. Relevant examples include positive physical quantities such as mass and temperature but also statistical quantities such as probability density or standard deviation, which are useful for uncertainty quantification. We emphasize that the approach of first embedding classical data assimilation on $A$ into the nonabelian operator setting of $B$ and then projecting to the finite-dimensional system on $B_{L}$ is important for positivity preservation.
4. Data-driven formulation. The matrix mechanical system on $B_{L}$ admits a data-driven approximation in which all operators are represented in a kernel eigenbasis learned from time-ordered training data, without requiring a priori knowledge of the equations of motion. In the limit of large training data, predictions made by the data-driven assimilation system converge to those from the matrix mechanical system, which in turn converge to those from the infinite-dimensional system on $B$ as L → ∞.
5. Route to quantum computing. QMDA is well suited for implementation on quantum computers, as we demonstrate here with simulated quantum circuit experiments. Our approach provides a route to quantum algorithms that sequentially alternate between unitary evolution and projective measurement to perform inference and prediction of classical dynamics with quantum computers.

To place this work in the context of our previous work (17, 22), we note that ref. (17) proposed QMDA as an ad hoc data assimilation scheme, without attempting to connect it to the underlying classical Bayesian framework. Moreover, ref. (17) did not investigate the positivity-preserving aspects of QMDA, nor did it employ an operator-valued feature map in the analysis step. Ref. (22) developed a scheme for quantum computational simulation without addressing data assimilation. There, the focus was on representing classical dynamics via quantum circuits of low complexity (depth); to that end, the scheme was limited to systems with pure point spectra, which do not include systems with mixing (chaotic) dynamics that lie in the scope of QMDA.

Embedding Data Assimilation in Operator Algebras

Consider a dynamical flow Φ^t : X → X, $t \in ℝ$, on a completely metrizable, separable space X with an ergodic, invariant, Borel probability measure μ. The flow induces Koopman operators U^t : f ↦ f ° Φ^t, which are isomorphisms of the L^p(X, μ) spaces with p ∈ [1, ∞]. The flow also induces transfer operators P^t : L^p(X, μ)^* → L^p(X, μ)^* on the dual spaces L^p(X, μ)^*, given by the adjoint of the Koopman operator, P^tα = α ⚬ U^t. Under the canonical identification of L^p(X, μ)^*, p ∈ [1, ∞), with finite, complex Borel measures with densities in L^q(X, μ), $\frac{1}{p} + \frac{1}{q} = 1$, the transfer operator is identified with the inverse Koopman operator; that is, for α ∈ L^p(X, μ)^* with density $ϱ = \frac{d α}{d μ} \in L^{q} (X, μ)$, α_t := P^tα has density $ϱ_{t} = \frac{d α_{t}}{d μ} \in L^{q} (X, μ)$ with ϱ_t = U^{−t}ϱ. In what follows, ∥f∥_{L^p(X, μ)} = (∫_X|f|^p dμ)^{1/p} for p ∈ [1, ∞) and ∥f∥_{L^∞(X, μ)} = lim_{p → ∞}∥f∥_{L^p(X, μ)} denote the standard L^p(X, μ) norms.

Among the L^p(X, μ) spaces, H := L²(X, μ) is a Hilbert space, and $A : = L^{\infty} (X, μ)$ is an abelian von Neumann algebra with respect to function multiplication and complex conjugation. In particular, for any two elements $f, g \in A$ , we have

\begin{matrix} {‖ f g ‖}_{A} \leq {‖ f ‖}_{A} {‖ g ‖}_{A}, \quad {‖ f^{*} f ‖}_{A} = {‖ f ‖}_{A}^{2}, \end{matrix}

[1]

making $A$ a C^*-algebra, and moreover, $A$ has a predual $A_{*} : = L^{1} (X, μ)$ (i.e., a Banach space whose dual is $A$), making it a von Neumann algebra. We let ⟨f, g⟩ = ∫_Xf^*g dμ denote the inner product on H. On H, the Koopman operator is unitary, U^{t*} = U^{−t}.
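As a concrete illustration of these definitions, the following minimal sketch builds the Koopman and transfer operators for a toy system. The finite state space, the cyclic-shift flow `phi`, and the uniform measure `mu` are assumptions made purely for illustration and do not appear in the original text.

```python
import numpy as np

# Toy setting (an assumption for illustration): X = {0, ..., n-1} with the
# uniform invariant measure mu and a cyclic shift as one step of the flow.
n = 8
mu = np.full(n, 1.0 / n)
phi = (np.arange(n) + 3) % n          # x -> x + 3 (mod n) preserves mu

# Koopman matrix acting on vectors of function values: (U f)(x) = f(phi(x)).
U = np.zeros((n, n))
U[np.arange(n), phi] = 1.0

def inner(a, b):
    """Inner product <a, b> = int conj(a) b dmu on H = L^2(X, mu)."""
    return np.sum(np.conj(a) * b * mu)

def sup_norm(a):
    """Norm of the algebra L^infinity(X, mu) on a finite state space."""
    return np.max(np.abs(a))

rng = np.random.default_rng(0)
f = rng.normal(size=n) + 1j * rng.normal(size=n)
g = rng.normal(size=n) + 1j * rng.normal(size=n)

# Composition property and unitarity on H (U preserves inner products).
composition_ok = np.allclose(U @ f, f[phi])
unitary_ok = np.isclose(inner(U @ f, U @ g), inner(f, g))

# Transfer operator on densities: p -> U^{-1} p = p o phi^{-1}.
p = rng.random(n)
p /= np.sum(p * mu)                    # normalize so that int p dmu = 1
p_next = U.T @ p                       # U^{-1} = U^T for a permutation matrix
duality_ok = np.isclose(np.sum((U @ f) * p * mu), np.sum(f * p_next * mu))

# The two relations of Eq. 1 in the sup norm.
cstar_ok = (sup_norm(f * g) <= sup_norm(f) * sup_norm(g) + 1e-12
            and np.isclose(sup_norm(np.conj(f) * f), sup_norm(f) ** 2))
```

In this finite setting the Koopman operator is a permutation matrix, so its inverse is its transpose, which is why the transfer operator on densities reduces to `U.T @ p`.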

Embedding Observables.

Let $B : = B (H)$ be the space of bounded operators on H, equipped with the operator norm, ${‖ A ‖}_{B} = \sup_{f \in H \setminus \{ 0 \}} \frac{{‖ A f ‖}_{H}}{{‖ f ‖}_{H}}$. This space is a nonabelian von Neumann algebra with respect to operator composition and adjoint. That is, for any two operators $S, T \in B$, we have

\begin{matrix} {‖ S T ‖}_{B} \leq {‖ S ‖}_{B} {‖ T ‖}_{B}, \quad {‖ S^{*} S ‖}_{B} = {‖ S ‖}_{B}^{2}, \end{matrix}

which is the nonabelian analog of Eq. 1, making $B$ a C^*-algebra. Moreover, $B$ has a predual, $B_{*} : = B_{1} (H)$, making it a von Neumann algebra. Here, B₁(H) ⊆ B(H) is the space of trace-class operators, equipped with the norm ${‖ A ‖}_{1} : = tr \sqrt{A^{*} A}$; this space can be thought of as a nonabelian analog of L¹(X, μ). The unitary group of Koopman operators U^t on H induces a unitary group $\mathcal{U}^{t} : B \to B$ (i.e., a group of linear maps mapping unitary operators to unitary operators), which acts by conjugation, i.e.,

\begin{matrix} \mathcal{U}^{t} A = U^{t} A U^{t *} . \end{matrix}

[2]

The abelian algebra $A$ embeds isometrically into $B$ through the map $π : A \to B$ , such that πf is the multiplication operator by f, (πf)g = fg. This map is injective, and satisfies π(fg) = (πf)(πg), π(f^*)=(πf)^* for all $f, g \in A$ . Thus, π is a ^*-representation, preserving the von Neumann algebra structure of $A$ . The representation π is also compatible with Koopman evolution, in the sense that 𝒰^t ⚬ π = π ⚬ U^t holds for all $t \in R$ . Equivalently, we have the commutative diagram

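\begin{matrix} A & \xrightarrow{\; U^{t} \;} & A \\ {\scriptstyle π} \downarrow & & \downarrow {\scriptstyle π} \\ B & \xrightarrow{\; \mathcal{U}^{t} \;} & B \end{matrix}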

which shows that π provides a dynamically consistent representation of observables of the dynamical system in $A$ as elements of the nonabelian operator algebra $B$ . In Fig. 1, we refer to the level of description involving $B$ as quantum mechanical due to the central role that operator algebras play in the algebraic formulation of quantum mechanics (13). In particular, Eq. 2 is mathematically equivalent to the Heisenberg picture for the unitary evolution of quantum observables (here, under the Koopman operator).
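The properties of the regular representation can be checked numerically in a finite-dimensional toy setting. The sketch below assumes a finite state space with a cyclic-shift flow (hypothetical choices made for illustration), so that π f becomes a diagonal matrix and the action of Eq. 2 becomes matrix conjugation; the helper names `pi` and `heisenberg` are likewise illustrative.

```python
import numpy as np

# Toy setting (assumed): n states, uniform invariant measure, cyclic-shift flow.
n = 8
phi = (np.arange(n) + 3) % n
U = np.zeros((n, n))
U[np.arange(n), phi] = 1.0             # Koopman matrix, (U f)(x) = f(phi(x))

def pi(f):
    """Regular representation: multiplication operator by f, (pi f) g = f g."""
    return np.diag(f)

def heisenberg(A):
    """Induced action of Eq. 2 on operators: A -> U A U*."""
    return U @ A @ U.conj().T

rng = np.random.default_rng(0)
f = rng.normal(size=n) + 1j * rng.normal(size=n)
g = rng.normal(size=n) + 1j * rng.normal(size=n)

homomorphism_ok = np.allclose(pi(f * g), pi(f) @ pi(g))
star_ok = np.allclose(pi(np.conj(f)), pi(f).conj().T)
isometry_ok = np.isclose(np.linalg.norm(pi(f), 2), np.max(np.abs(f)))
intertwining_ok = np.allclose(heisenberg(pi(f)), pi(U @ f))

# The ambient algebra is nonabelian: U does not commute with pi f in general.
commutator_norm = np.linalg.norm(U @ pi(f) - pi(f) @ U)
```

The intertwining check is the matrix form of the commutative diagram: evolving the function and then representing it gives the same operator as representing it and then conjugating by the Koopman operator.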

Embedding States.

A dual construction to the representation $π : A \to B$ of observables can be carried out for states. Let $ω_{p} \in S_{*} (A)$ be a normal state induced by a probability density $p \in A_{*}$ . Since p is a positive function with ${‖ p ‖}_{A_{*}} = 1$ , we have that $\sqrt{p}$ is a real unit vector in H, and thus $ρ = ⟨ \sqrt{p}, \cdot ⟩ \sqrt{p}$ is a rank-1 orthogonal projection. Every such projection is a density operator inducing a normal state $ω_{ρ} \in S_{*} (B)$ , where

\begin{matrix} ω_{ρ} A = tr (ρ A) = ⟨ \sqrt{p}, A \sqrt{p} ⟩, \forall A \in B . \end{matrix}

[3]

Such states ω_ρ induced by unit vectors in H are called vector states. In fact, ω_ρ is a pure state, i.e., it is an extremal point of the state space $S (B)$ (which is a convex set). Defining the map $Γ : S_{*} (A) \to S_{*} (B)$ such that Γ(ω_p)=ω_ρ, one can readily verify that Γ is compatible with the regular representation π; i.e., for every observable $f \in A$ and probability density $p \in A_{*}$ ,

\begin{matrix} ω_{p} f = Γ (ω_{p}) (π f) . \end{matrix}

[4]
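Eq. 4 can be verified directly in a small example. The following sketch assumes a finite state space with uniform measure `mu` (an illustrative choice) and represents operators as matrices acting on vectors of function values; the helper name `gamma` is hypothetical.

```python
import numpy as np

# Toy setting (assumed): n states with uniform probability measure mu.
n = 8
mu = np.full(n, 1.0 / n)
rng = np.random.default_rng(1)

p = rng.random(n)
p /= np.sum(p * mu)                    # probability density w.r.t. mu
f = rng.normal(size=n)                 # real-valued observable

def gamma(p):
    """Matrix of rho = <sqrt(p), .> sqrt(p) acting on function-value vectors.

    (rho g)(x) = sqrt(p)(x) * sum_y sqrt(p)(y) g(y) mu(y).
    """
    s = np.sqrt(p)
    return np.outer(s, s * mu)

rho = gamma(p)

abelian_expectation = np.sum(f * p * mu)              # omega_p f
nonabelian_expectation = np.trace(rho @ np.diag(f))   # tr(rho pi f)

unit_trace_ok = np.isclose(np.trace(rho), 1.0)
projection_ok = np.allclose(rho @ rho, rho)           # rank-1 projection
```

Because ρ is a rank-1 projection, the trace against a multiplication operator collapses to the classical expectation of f with respect to p.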

Next, analogously to the transfer operator $P^{t} : S (A) \to S (A)$, we define $\mathcal{P}^{t} : S (B) \to S (B)$ as the adjoint of $\mathcal{U}^{t} : B \to B$ from Eq. 2, 𝒫^tω = ω ⚬ 𝒰^t. Note that 𝒰^t and 𝒫^t form a dual pair, i.e., (𝒫^tω)A = ω(𝒰^tA) for every state $ω \in S (B)$ and element $A \in B$. Moreover, if $ω_{ρ} \in S_{*} (B)$ is induced by a density operator $ρ \in B_{*}$, then 𝒫^tω_ρ = ω_{ρ_t}, where ρ_t is the density operator given by ρ_t = 𝒰^{−t}ρ = U^{t*}ρU^t.

In quantum mechanics, the evolution ρ ↦ ρ_t is known as the Schrödinger picture, and it is the dual of the Heisenberg picture from Eq. 2. In the particular case that ρ = ⟨ξ, ⋅⟩ξ is a vector state induced by ξ ∈ H (which would be called a wavefunction in quantum mechanical language), we have ρ_t = ⟨ξ_t, ⋅⟩ξ_t, where ξ_t = U^{t*}ξ. Using this fact and Eq. 3, it follows that Γ is compatible with the evolution on $S_{*} (A)$ and $S_{*} (B)$ under the transfer operator; that is, 𝒫^t ° Γ = Γ ° P^t. This relation is represented by the commutative diagram

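\begin{matrix} S_{*} (A) & \xrightarrow{\; P^{t} \;} & S_{*} (A) \\ {\scriptstyle Γ} \downarrow & & \downarrow {\scriptstyle Γ} \\ S_{*} (B) & \xrightarrow{\; \mathcal{P}^{t} \;} & S_{*} (B) \end{matrix}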

[5]

which also captures the correspondence between the abelian and quantum mechanical forecast steps in Fig. 1.
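The commutation relation between Γ and the two transfer operators can be illustrated with a short numerical check. As before, the finite state space, uniform measure, and cyclic-shift flow are assumptions made for this sketch, and `gamma` is a hypothetical helper name.

```python
import numpy as np

# Toy setting (assumed): n states, uniform invariant measure, cyclic-shift flow.
n = 8
mu = np.full(n, 1.0 / n)
phi = (np.arange(n) + 3) % n
U = np.zeros((n, n))
U[np.arange(n), phi] = 1.0             # Koopman matrix (a permutation)

def gamma(p):
    """Embed a density p as the rank-1 density operator <sqrt(p), .> sqrt(p)."""
    s = np.sqrt(p)
    return np.outer(s, s * mu)

rng = np.random.default_rng(3)
p = rng.random(n)
p /= np.sum(p * mu)

# Path 1: abelian forecast p -> U^{-1} p, followed by the embedding Gamma.
rho_via_abelian = gamma(U.T @ p)

# Path 2: embedding Gamma, followed by the nonabelian forecast rho -> U* rho U.
rho_via_nonabelian = U.T @ gamma(p) @ U

diagram_commutes = np.allclose(rho_via_abelian, rho_via_nonabelian)

# Duality between Schroedinger and Heisenberg pictures for an arbitrary operator A.
A = rng.normal(size=(n, n))
schrodinger = np.trace(rho_via_nonabelian @ A)
heisenberg = np.trace(gamma(p) @ (U @ A @ U.T))
```

Both paths around the diagram yield the same density operator, and the two pictures give the same expectation value for any operator A.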

Probabilistic Forecasting.

In both abelian and nonabelian data assimilation, we can describe probabilistic forecasting of observables of the dynamical system using the formalism of positive operator-valued measures (POVMs) (23). First, we recall that an element a of a C^*-algebra $W$ is i) self-adjoint if a^* = a; ii) positive (denoted as a ≥ 0) if a = b^*b for some $b \in W$; and iii) a projection if a^* = a = a². Supposing that $W$ is also a von Neumann algebra, a map $E : Σ \to W$ on the σ-algebra of a measurable space (Ω, Σ) is said to be a POVM if i) for every set S ∈ Σ, E(S) ≥ 0; ii) E(Ω) = I, where I is the unit of $W$; and iii) for every countable collection S₁, S₂, … of disjoint sets in Σ, E(⋃_iS_i) = ∑_iE(S_i), where the sum converges in the weak-^* topology of $W$ (i.e., for every $γ \in W_{*}$, $E (⋃_{i} S_{i}) γ = \lim_{n \to \infty} \sum_{i = 1}^{n} E (S_{i}) γ$). These properties imply that for every $γ \in W_{*}$, the map $\mathbb{P}_{E, γ} : Σ \to ℂ$ given by

\begin{matrix} \mathbb{P}_{E, γ} (S) = E (S) γ, \end{matrix}

[6]

is a complex normalized measure. In particular, if γ induces a normal state $ω_{γ} \in S_{*} (W)$, then $\mathbb{P}_{E, γ}$ is a probability measure on Ω. We say that the POVM E is a projection-valued measure (PVM) if E(S) is a projection for every S ∈ Σ.

In quantum mechanics, a triple (Ω, Σ, E) where E is a POVM is referred to as an observable. We alert the reader to the fact that in dynamical systems theory, an observable is generally understood as a function f : X → V on state space X taking values in a vector space, V. Thus, in situations where the space of dynamical observables forms an algebra (e.g., the $A = L^{\infty} (X, μ)$ algebra corresponding to $V = C$ ), the term “observable” is overloaded, and its meaning must be understood from the context.

Given a POVM (Ω, Σ, E) as above, and a bounded, measurable function $u : Ω \to C$ , we define the integral ∫_Ωu(ω) dE(ω) as the unique element a of $W$ such that for every $γ \in W_{*}$ , $a γ = \int_{Ω} u (ω) d P_{E, γ} (ω)$ . If a is a self-adjoint element of $W$ , i.e., a^* = a, the spectral theorem states that there exists a unique PVM E : ℬ(ℝ)→𝔚 on the Borel σ-algebra ℬ(ℝ) of the real line such that a = ∫_ℝω dE(ω).

In abelian data assimilation, the self-adjoint elements are the real-valued functions f in the von Neumann algebra $A$, and every such f has an associated PVM $E_{f} : \mathcal{B} (ℝ) \to A$. Explicitly, we have E_f(S) = χ_{f⁻¹(S)}, where $χ_{f^{- 1} (S)} : X \to ℝ$ is the characteristic function of the set f⁻¹(S) ⊆ X. If, at time t, the data assimilation system is in a state $ω_{p_{t}} \in S_{*} (A)$ induced by a probability density $p_{t} \in A_{*}$, then the forecast distribution for f at lead time τ ≥ 0 is $\mathbb{P}_{f, t, τ} \equiv \mathbb{P}_{E_{f}, p_{t, τ}}$, where p_{t, τ} is the probability density associated with the state ω_{p_{t, τ}} = P^τω_{p_t}.

The forecast distribution $\mathbb{P}_{f, t, τ}$ is equivalent to the distribution obtained via classical probability theory. That is, given an observable f ∈ L^∞(X, μ), the density p_{t, τ} ∈ L¹(X, μ) induces a probability measure on ℝ such that Prob(S) = ∫_{f⁻¹(S)} p_{t, τ} dμ is the probability that f takes values in a set $S \in \mathcal{B} (ℝ)$. It follows by definition of $\mathbb{P}_{f, t, τ}$ that $Prob (S) = \mathbb{P}_{f, t, τ} (S)$.

In the nonabelian setting of $B$, the spectral theorem states that for every self-adjoint operator $A \in B$, there exists a unique PVM $E_{A} : \mathcal{B} (ℝ) \to B$ such that $A = \int_{ℝ} y \, d E_{A} (y)$. If, at time t, the nonabelian data assimilation system is in a normal state ω_{ρ_t} induced by a density operator $ρ_{t} \in B_{*}$, then the forecast distribution for A at lead time τ ≥ 0 is given by ℙ_{A, t, τ} ≡ ℙ_{E_A, ρ_{t, τ}}, where ρ_{t, τ} is the density operator associated with $ω_{ρ_{t, τ}} = \mathcal{P}^{τ} ω_{ρ_{t}}$. This distribution is compatible with the embeddings of states $Γ : S_{*} (A) \to S_{*} (B)$ and observables $π : A \to B$ introduced above. That is, for every real-valued observable $f \in A$, probability density $p_{t, τ} \in A_{*}$, and Borel set $S \in \mathcal{B} (ℝ)$, we have $\mathbb{P}_{f, p_{t, τ}} (S) = \mathbb{P}_{π f, ρ_{t, τ}} (S)$, where Γ(ω_{p_{t, τ}}) = ω_{ρ_{t, τ}}.
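The consistency between the abelian and nonabelian forecast distributions can be illustrated with a finite example. In this sketch, which assumes a finite state space with uniform measure and a hypothetical three-valued observable, the abelian probability is computed from the characteristic function of f⁻¹(S), while the nonabelian probability is computed from spectral projections of π f obtained by numerical eigendecomposition.

```python
import numpy as np

# Toy setting (assumed): n states with uniform probability measure mu.
n = 9
mu = np.full(n, 1.0 / n)
rng = np.random.default_rng(2)
p = rng.random(n)
p /= np.sum(p * mu)                          # forecast density
f = (np.arange(n) % 3).astype(float)         # real observable with values 0, 1, 2

s = np.sqrt(p)
rho = np.outer(s, s * mu)                    # rho = Gamma(p), a rank-1 density operator

def in_set(values, S):
    """Boolean mask of the entries of `values` lying in the finite set S."""
    S = np.asarray(S, dtype=float)
    return np.any(np.isclose(values[:, None], S[None, :]), axis=1)

def prob_abelian(S):
    """Integral of p over f^{-1}(S), i.e., the state applied to chi_{f^{-1}(S)}."""
    mask = in_set(f, S)
    return np.sum(p[mask] * mu[mask])

# Spectral decomposition of the multiplication operator pi f.
evals, evecs = np.linalg.eigh(np.diag(f))

def prob_nonabelian(S):
    """tr(rho E(S)), with E(S) the spectral projection of pi f onto S."""
    V = evecs[:, in_set(evals, S)]
    E_S = V @ V.T
    return np.trace(rho @ E_S)

S = [0.0, 2.0]
probabilities = (prob_abelian(S), prob_nonabelian(S))
```

For a multiplication operator the spectral projections are themselves multiplication operators by characteristic functions, which is why the two probabilities agree.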

Representing Observations by Effects.

For a unital C^*-algebra $W$, an effect is an element $e \in W$ satisfying 0 ≤ e ≤ I. Intuitively, one can think of effects as generalizations of logical truth values, used to model outcomes of measurements or observations (24, 25). In Boolean logic, truth values lie in the set {0, 1}. In fuzzy logic, truth values are real numbers in the interval [0, 1]. In unital C^*-algebras, the analogs of truth values are elements e satisfying 0 ≤ e ≤ I (26). We denote the set of effects in a C^*-algebra $W$ as $E (W)$. It can be shown that $E (W)$ is a convex space, whose extremal points are projections. Given a state $ω \in S (W)$ and an effect $e \in E (W)$, the number ωe ∈ [0, 1] is called the validity of e. Note that every effect $e \in E (W)$ induces a binary POVM E on the two-point set {0, 1} such that E({1}) = e and E({0}) = I − e.

Suppose now that $W$ is a von Neumann algebra, let $ω_{ρ} \in S_{*} (W)$ be a normal state induced by an element $ρ \in W_{*}$, and let $e \in E (W)$ be an effect. If the validity ω_ρe is nonzero, we can define the conditional state $ω_{ρ} |_{e} \in S_{*} (W)$ as the normal state induced by ${ρ |}_{e} \in W_{*}$, where

\begin{matrix} {ρ |}_{e} = \frac{\sqrt{e} ρ \sqrt{e}}{ω_{ρ} e} . \end{matrix}

[7]

The map ω_ρ ↦ ω_ρ|_e generalizes the Bayesian conditioning rule employed in the analysis step of classical data assimilation.
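A minimal sketch of the conditioning rule in a matrix algebra is given below. The random density matrix, the random effect, and the helper names `psd_sqrt` and `condition` are assumptions introduced for illustration; the code simply evaluates √e ρ √e divided by the validity tr(ρ e) and checks that the result is again a density matrix.

```python
import numpy as np

def psd_sqrt(e):
    """Square root of a Hermitian positive semidefinite matrix via eigendecomposition."""
    w, v = np.linalg.eigh(e)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def condition(rho, e):
    """Conditional density operator sqrt(e) rho sqrt(e) / tr(rho e)."""
    validity = np.trace(rho @ e).real
    if validity <= 0.0:
        raise ValueError("effect has zero validity; conditional state undefined")
    root = psd_sqrt(e)
    return root @ rho @ root / validity

rng = np.random.default_rng(4)
L = 4

# Random density matrix: positive semidefinite with unit trace.
G = rng.normal(size=(L, L)) + 1j * rng.normal(size=(L, L))
rho = G @ G.conj().T
rho /= np.trace(rho).real

# Random effect: Hermitian with eigenvalues rescaled into [0, 0.8].
M = rng.normal(size=(L, L)) + 1j * rng.normal(size=(L, L))
H = M @ M.conj().T
e = H / (1.25 * np.linalg.eigvalsh(H).max())

rho_post = condition(rho, e)
```

The symmetric placement of √e on both sides of ρ is what keeps the updated operator positive when e and ρ do not commute.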

As an example, let $W = A$ and χ_S : X → {0, 1} be the characteristic function of a measurable set (event) $S \in B (X)$. According to Bayes’ theorem, if $p \in A_{*}$ is a probability density and ∫_Sp dμ > 0, the conditional density of p given S is

\begin{matrix} q = \frac{p χ_{S}}{\int_{X} p χ_{S} d μ} = \frac{\sqrt{χ_{S}} p \sqrt{χ_{S}}}{\int_{X} p χ_{S} d μ} . \end{matrix}

[8]

Since χ_S(x)∈{0, 1} for every x ∈ X, it follows that χ_S is an effect in $A$ , and since ∫_Xpχ_S dμ = ω_pχ_S, the Bayesian formula above is a special case of Eq. 7 with p|_{χ_S} = q. Note that to obtain the second equality in Eq. 8, we made use of the commutativity of function multiplication, which does not hold in a nonabelian algebra.

An important compatibility result between effects in the abelian algebra $A$ and effects in the nonabelian algebra $B$ is as follows: The regular representation $π : A \to B$ maps the effect space $E (A)$ into the effect space $E (B)$ . As a result, and by virtue of Eq. 4, for every normal state $ω_{p} \in S_{*} (A)$ and effect $e \in E (A)$ , the conditioned state ω_p|_e satisfies

\begin{matrix} Γ (ω_{p} |_{e}) = (Γ ω_{p}) |_{π e} . \end{matrix}

[9]

This means that conditioning by effects in $E (A)$ consistently embeds to conditioning by effects in $E (B)$ .
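Both the Bayesian special case of Eq. 8 and the compatibility relation of Eq. 9 can be checked numerically. The sketch below assumes a finite state space with uniform measure; the event, the fuzzy effect, and all helper names are hypothetical choices made for illustration.

```python
import numpy as np

# Toy setting (assumed): n states with uniform probability measure mu.
n = 8
mu = np.full(n, 1.0 / n)
rng = np.random.default_rng(5)
p = rng.random(n)
p /= np.sum(p * mu)

def gamma(p):
    """Embed a density p as the rank-1 density operator <sqrt(p), .> sqrt(p)."""
    s = np.sqrt(p)
    return np.outer(s, s * mu)

def condition_density(p, e):
    """Abelian form of Eq. 7: sqrt(e) p sqrt(e) divided by the validity."""
    validity = np.sum(p * e * mu)
    return np.sqrt(e) * p * np.sqrt(e) / validity

def condition_operator(rho, E):
    """Eq. 7 for a diagonal effect E, whose square root is taken entrywise."""
    root = np.diag(np.sqrt(np.diag(E)))
    return root @ rho @ root / np.trace(rho @ E)

chi_S = (np.arange(n) < 3).astype(float)     # characteristic function of an event S
e_fuzzy = rng.random(n)                      # a generic effect with values in [0, 1]

# Eq. 8: conditioning on chi_S reproduces Bayes' rule.
bayes = p * chi_S / np.sum(p * chi_S * mu)
eq8_ok = np.allclose(condition_density(p, chi_S), bayes)

# Eq. 9: conditioning commutes with the embedding Gamma.
eq9_ok = all(
    np.allclose(gamma(condition_density(p, e)),
                condition_operator(gamma(p), np.diag(e)))
    for e in (chi_S, e_fuzzy)
)
```

The same check passes for the sharp event and for the fuzzy effect, reflecting the fact that Eq. 9 holds for every effect in the abelian algebra and not only for characteristic functions.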

Next, let Y be a set. In first-order logic, a predicate is a map F : Y → {0, 1} such that F(y) = 1 means that the proposition F(y) is true, and F(y) = 0 means that it is false. In fuzzy logic, predicates are generalized to maps F : Y → [0, 1]. In quantum logic, predicates are represented by effect-valued maps $F : Y \to E (W)$. Applying Eq. 7 for e = F(y) leads to the update rule ρ ↦ ρ|_{F(y)}, which represents the conditioning of the normal state associated with ρ by the truth value of the proposition F(y) associated with y ∈ Y.

In our algebraic data assimilation framework, we use an effect-valued map to carry out the analysis step given observations of the system in a space Y (Fig. 1). Specifically, let h : X → Y be a measurable observation map, such that y = h(x) corresponds to the assimilated data given that the system is in state x ∈ X. Let ψ : Y × Y → [0, 1] be a measurable kernel function on Y, taking values in the unit interval. Every such kernel induces an effect-valued map $F : Y \to E (A)$ given by F(y)=ψ(y, h(⋅)). Possible choices for ψ include bump kernels—in such cases, F(y) can be viewed as a relaxation of a characteristic function χ_S of a set S containing h⁻¹({y}) (SI Appendix, Eq. S25).

If, immediately prior to an observation at time t + Δt, the abelian data assimilation system has state $ω_{p_{t, Δ t}} \in S_{*} (A)$ (recall that p_{t, Δt} is the forecast density for lead time Δt initialized at time t), and F(y) has nonzero validity with respect to ω_{p_{t, Δt}}, our analysis step updates ω_{p_{t, Δt}} to the conditional state ω_{p_{t, Δt}}|_{F(y)} ≡ ω_{p_{t + Δt}} using Eq. 7. In the nonabelian setting, we promote F to the operator-valued function $\mathcal{F} : Y \to E (B)$ with $\mathcal{F} = π ⚬ F$ and again use Eq. 7 to update the prior state $ω_{ρ_{t, Δ t}} \in S_{*} (B)$ to $ω_{ρ_{t, Δ t}} |_{\mathcal{F} (y)} \equiv ω_{ρ_{t + Δ t}}$; see the Analysis column of the schematic in Fig. 1. By Eq. 9, the abelian and nonabelian analysis steps are mutually consistent, in the sense that if ω_{ρ_{t, Δt}} = Γ(ω_{p_{t, Δt}}), then for every observable $f \in A$, we have ω_{ρ_{t + Δt}}(πf) = ω_{p_{t + Δt}}f.
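The forecast–analysis cycle in the abelian setting can be sketched in a few lines. Everything specific in this example is an assumption made for illustration: the state space is a discretized circle, the flow is a rotation, the observation map is h(x) = cos(θ_x), and ψ is a Gaussian kernel with a hypothetical bandwidth `eps` (the text above mentions bump kernels as one possible choice).

```python
import numpy as np

# Toy setting (assumed): n points on a circle, uniform invariant measure, rotation flow.
n = 64
mu = np.full(n, 1.0 / n)
shift = 5                                   # one time step: x -> x + shift (mod n)
theta = 2 * np.pi * np.arange(n) / n
h = np.cos(theta)                           # observation map h : X -> Y = [-1, 1]

def psi(y, y_prime, eps=0.1):
    """Gaussian kernel on Y with values in (0, 1] (an assumed choice of psi)."""
    return np.exp(-((y - y_prime) / eps) ** 2)

def forecast(p):
    """Transfer operator over one step: p -> p o phi^{-1}."""
    return np.roll(p, shift)

def analysis(p, y):
    """Condition p on the effect F(y) = psi(y, h(.)) using the abelian form of Eq. 7."""
    effect = psi(y, h)
    validity = np.sum(p * effect * mu)
    return p * effect / validity

x_true = 11                                 # hidden true state (index on the circle)
p = np.ones(n)                              # uninformative initial density
for _ in range(20):
    x_true = (x_true + shift) % n           # true dynamics
    p = forecast(p)                         # forecast step
    p = analysis(p, h[x_true])              # analysis step given y = h(x_true)

x_map = int(np.argmax(p))                   # most probable state after 20 cycles
```

Although h is not injective (two states share each observed value), repeated cycles disambiguate the state because the aliased state does not follow the same trajectory under the rotation.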

We should note that the effect-based analysis step introduced above can naturally handle data spaces Y of arbitrarily large dimension, overcoming an important limitation of the QMDA framework proposed in ref. (17). It is also worthwhile pointing out connections between effect-valued maps and feature maps from reproducing kernel Hilbert space (RKHS) theory (27): If ψ is positive-definite, there is an associated RKHS $\mathcal{H}$ of complex-valued functions on X with w(x, x′) := ψ(h(x), h(x′)) as its reproducing kernel. The map F then takes values in the space $E (A) \cap \mathcal{H}$ and is thus an instance of a feature map. In the nonabelian case, one can think of $\mathcal{F}$ as an operator-valued feature map.

Positivity-Preserving Discretization.

The abelian and nonabelian formulations of data assimilation described thus far employ the infinite-dimensional algebras $A$ and $B$ , respectively. To arrive at practical computational algorithms, these algebras must be projected to finite dimensions, carrying along the associated dynamical and observation operators to finite-rank operators. We refer to this process as discretization.

To motivate our approach, we recall the definitions of quantum operations and channels (15): A linear map $T : W_{2} \to W_{1}$ between two von Neumann algebras $W_{1}$ and $W_{2}$ is said to be a quantum operation if i) 𝒯 is completely positive, i.e., for every $n \in N$ , the tensor product map $T \otimes {Id}_{n} : M_{n} (W_{2}) \to M_{n} (W_{1})$ is positive, where $M_{n} (W_{1})$ and $M_{n} (W_{2})$ are the von Neumann algebras of n × n matrices over $W_{1}$ and $W_{2}$ , respectively; ii) $T$ is the adjoint of a map $T_{*} : W_{1 *} \to W_{2 *}$ such that $ω_{T_{*} ρ} 1 \leq 1$ for every normal state $ω_{ρ} \in S_{*} (W_{1})$ . If, in addition, $ω_{T_{*} ρ} 1 = 1$ , $T$ is said to be a quantum channel.

In quantum theory, operations and channels characterize the transfer of information in open and closed systems, respectively. Here, the requirement of complete positivity of $T : W_{1} \to W_{2}$ (as opposed to mere positivity) ensures that $T$ is extensible to a state-preserving map between any two systems that include $W_{1}$ and $W_{2}$ as subsystems. If $W_{1}$ is abelian, then positivity and complete positivity of $T$ are equivalent notions. If $W_{2} = B (H_{2})$ for a Hilbert space H₂, Stinespring’s theorem (28) states that $T$ is completely positive if and only if there is a Hilbert space H₁, a representation $ϖ : W_{1} \to B (H_{1})$ , and a bounded linear map V : H₂ → H₁ such that $T a = V^{*} ϖ (a) V$ .

It follows from these considerations that the Koopman operator $U^{t} : A \to A$ is a quantum operation (since U^t is positive, the transfer operator preserves normal states, and $A$ is abelian), and so is $U^{t} : B \to B$ (by Stinespring’s theorem). In fact, U^t and $U^{t}$ are both quantum channels. It is therefore natural to require that the discretization procedure leads to a quantum operation in both of the abelian and nonabelian cases. A second key requirement is that the discretization procedure is positivity preserving; that is, positive elements of the infinite-dimensional algebra are mapped into positive elements of the finite-dimensional algebra associated with the projected system. This requirement is particularly important when modeling physical systems, where failure to preserve signs of sign-definite quantities may result in loss of physical interpretability and lead to numerical instabilities (29). Our third requirement is that the finite-dimensional approximations converge in an appropriate sense to the original system as the dimension increases. One of the main perspectives put forward in this paper is that the construction of discretization schemes meeting these requirements is considerably facilitated by working in the nonabelian setting of $B$ rather than the abelian setting of $A$ .

First, as an illustration of the fact that a “naive” projection will fail to meet our requirements, consider the Koopman operator U^t : H → H. Fix an orthonormal basis {ϕ₀, ϕ₁, …} of H with $ϕ_{l} \in A$ , and let Π_L : H → H be the orthogonal projection that maps into the L-dimensional subspace H_L := span{ϕ₀, …, ϕ_{L − 1}}. A common approach to Koopman and transfer operator approximation (30, 31) is to orthogonally project elements of H to elements of H_L, f ↦ f_L := Π_Lf and similarly approximate U^t by the finite-rank operator U_L^(t) := Π_LU^tΠ_L. The rank of U_L^(t) is at most L, and it is represented in the {ϕ_l} basis by an L × L matrix U with elements U_ij = ⟨ϕ_i, U^tϕ_j⟩. Note the inclusions $H_{L} \subset A \subset H$ and that H_L and $A$ are invariant subspaces of H under U_L^(t). Moreover, U_L^(t) maps f ∈ H to g = U_L^(t)f ∈ H_L such that $g = \sum_{i, j = 0}^{L - 1} ϕ_{i} U_{ij} {\hat{f}}_{j}$ , where ${\hat{f}}_{j} = ⟨ ϕ_{j}, f ⟩$ . Letting $f = {({\hat{f}}_{0}, \dots, {\hat{f}}_{L - 1})}^{⊤}$ and $g = {({\hat{g}}_{0}, \dots, {\hat{g}}_{L - 1})}^{⊤}$ with ${\hat{g}}_{l} = ⟨ ϕ_{l}, g ⟩$ be the L-dimensional column vectors giving the representation of Π_Lf and g in the {ϕ_l} basis of H, respectively, we can express the action of U_L^(t) on f as the matrix–vector product g = Uf.

Unfortunately, such methods are not positivity preserving; that is, if f is a positive function in $A$ , Π_Lf need not be positive. A classical example is a tophat function on the real line, which develops oscillations to negative values upon Fourier filtering (the Gibbs phenomenon). Even if f is a positive function in the finite-dimensional subspace H_L (so that Π_Lf = f), the function g = U_L^(t)f need not be positive. Thus, standard discretization approaches based on orthogonal projections fail to meet the requirements laid out above.
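
The loss of positivity under orthogonal projection can be reproduced in a few lines. The sketch below assumes a periodic domain discretized on a uniform grid and a Fourier basis truncated at a hypothetical cutoff of 16 wavenumbers; it is an illustration of the Gibbs phenomenon rather than part of the assimilation scheme.

```python
import numpy as np

# Uniform grid on the periodic domain [0, 2*pi).
n_grid = 1024
x = 2 * np.pi * np.arange(n_grid) / n_grid

# Positive "tophat" function: 1 on [pi/2, 3*pi/2), 0 elsewhere.
f = ((x >= np.pi / 2) & (x < 3 * np.pi / 2)).astype(float)

def fourier_project(f, n_modes):
    """Orthogonal projection onto Fourier modes with |wavenumber| <= n_modes."""
    f_hat = np.fft.fft(f)
    k = np.fft.fftfreq(f.size, d=1.0 / f.size)  # integer wavenumbers
    f_hat[np.abs(k) > n_modes] = 0.0
    return np.fft.ifft(f_hat).real

f_L = fourier_project(f, n_modes=16)  # subspace dimension L = 2 * 16 + 1
print(f.min(), f_L.min())             # the projected function dips below zero
```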

Next, we turn to positivity-preserving discretizations utilizing the abelian algebra $A$ , as opposed to the Hilbert space H. Recalling that the projections in $A$ are the characteristic functions of measurable sets, let S be a measurable subset of X, and consider the multiplication operator $M_{S} : A \to A$ such that M_Sf = χ_Sf. The map M_S is positive, and the projected Koopman operator, M_SU^tM_S is a quantum operation. However, in order for M_S to be a discretization map, we must have that its range is a finite-dimensional algebra. This is equivalent to asking that the restriction of μ to S is supported on a finite number of atoms, i.e., measurable sets that have no measurable subsets of positive measure. This is a highly restrictive condition that fails to hold for broad classes of dynamical systems (e.g., volume-preserving flows on manifolds), so the abelian algebra $A$ does not provide an appropriate environment to perform discretizations meeting our requirements.

We now come to discretizations based on the operator algebra $B$ . Working with $B$ allows us to use both Hilbert space techniques to construct finite-rank operators by orthogonal projection and algebraic techniques to ensure that these projections are positivity preserving. With H_L as above, consider the von Neumann algebra $B_{L} : = B (H_{L})$ . This algebra has dimension L² and is isomorphic to the algebra $M_{L} \equiv M_{L} (C)$ of L × L complex matrices. In particular, each element $A \in B_{L}$ is represented by a matrix $A \in M_{L}$ with elements A_ij = ⟨ϕ_i, Aϕ_j⟩. Correspondingly, we refer to data assimilation based on $B_{L}$ as matrix mechanical; see Fig. 1.

Next, note that $B_{L}$ can be canonically identified with the subalgebra of $B$ consisting of all operators A satisfying kerA ⊇ H_L^⊥ and ranA ⊆ H_L. Thus, we can view the projection $Π_{L} : B \to B$ with $Π_{L} A = Π_{L} A Π_{L}$ as an operator from $B$ to $B_{L}$ . By Stinespring’s theorem, $Π_{L}$ is completely positive. As a result, i) the projection $A \in B \mapsto Π_{L} A \in B_{L}$ is positivity preserving, and thus, so is the projected representation $π_{L} : A \to B_{L}$ with $π_{L} = Π_{L} ⚬ π$ ; and ii) the projected Koopman operator $U_{L}^{(t)} : B_{L} \to B_{L}$ with $U_{L}^{(t)} A = U_{L}^{(t)} A U_{L}^{(t) *}$ is a quantum operation. Moreover, since {ϕ_l} is an orthonormal basis of H, for any f ∈ H, we have lim_{L → ∞}Π_Lf = f. This implies that for every $A \in B$ , the operators $A_{L} = Π_{L} A \in B_{L}$ converge strongly to A, i.e., lim_{L → ∞}A_Lg = Ag, for all g ∈ H. In particular, π_Lf with $f \in A$ converges strongly to πf. Further details on these approximations can be found in SI Appendix, sections 2.D–2.H. Note that, in general, π_Lf is not a multiplication operator. That is, the act of embedding $A$ in the nonabelian algebra $B$ using $π : A \to B$ and then projecting to the finite-dimensional subalgebra $B_{L}$ using $Π_{L} : B \to B_{L}$ is not equivalent to projecting $A$ into H_L using Π_L and then embedding H_L into $B$ using π.
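
To make the contrast with the naive projection concrete, the following sketch builds the matrix A_ij = ⟨ϕ_i, fϕ_j⟩ representing π_Lf for a positive tophat function and checks that its spectrum is nonnegative. The periodic grid, the Fourier basis, and the cutoff m = 16 are assumptions chosen for illustration.

```python
import numpy as np

n_grid, m = 1024, 16
x = 2 * np.pi * np.arange(n_grid) / n_grid
f = ((x >= np.pi / 2) & (x < 3 * np.pi / 2)).astype(float)  # positive tophat

# Orthonormal Fourier basis phi_k(x) = exp(i k x), k = -m..m, of H_L (L = 2m + 1),
# with the inner product <u, v> approximated by the grid average of conj(u) * v.
k = np.arange(-m, m + 1)
Phi = np.exp(1j * np.outer(x, k))             # shape (n_grid, L)

# Matrix of pi_L f = Pi_L (pi f) Pi_L:  A_ij = <phi_i, f phi_j>.
A = Phi.conj().T @ (f[:, None] * Phi) / n_grid

eigvals = np.linalg.eigvalsh(A)               # A is Hermitian
print(eigvals.min(), eigvals.max())           # spectrum lies in [min f, max f] = [0, 1]
```

Although the projected function Π_Lf takes negative values for this f, the projected operator π_Lf remains a positive element of the matrix algebra, since ⟨v, (π_Lf)v⟩ is an average of f|v|² for every v ∈ H_L.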

Consider now a normal state $ω_{p} \in S_{*} (A)$ induced by a probability density $p \in A_{*}$ , and let ω_ρ = Γ(ω_p) be the associated normal state on $B$ obtained via Eq. 3. For L sufficiently large, $C_{L} (ρ) : = tr (Π_{L} ρ)$ is nonzero, and thus, $ρ_{L} = Π_{L} ρ / C_{L} (ρ)$ is a density operator in $B_{L}$ inducing a state $ω_{ρ_{L}} \in S_{*} (B_{L})$ , which extends to $S_{*} (B)$ . In Fig. 1, we denote the map ω_ρ ↦ ω_{ρ_L} as $Π'_{L}$ . By construction, the state ω_{ρ_L} satisfies $ω_{ρ_{L}} A = ω_{ρ} (Π_{L} A) / C_{L} (ρ)$ for all $A \in B$ . Setting, in particular, A = πf with $f \in A$ , it follows from Eq. 4 and the strong convergence of π_Lf to πf that

\begin{matrix} lim_{L \to \infty} ω_{ρ_{L}} (π_{L} f) = ω_{ρ} (π f) = ω_{p} f ; \end{matrix}

[10]

SI Appendix, section 2.E. It should be kept in mind that, aside from special cases, ω_{ρ_L} is not the image of a state $ω_{p_{L}} \in S_{*} (A)$ under Γ for a probability density $p_{L} \in A_{*}$ ; that is, in general, ω_{ρ_L} is a “nonclassical” state. Note also that ω_{ρ_L} is a vector state (Eq. 3) induced by the unit vector $ξ_{L} = Π_{L} \sqrt{p} / {‖ Π_{L} \sqrt{p} ‖}_{H}$ , which, as just mentioned, is generally not the square root of a probability density.
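
The convergence in Eq. 10 can be checked numerically in a simple setting. The sketch below assumes a periodic domain with normalized Lebesgue measure, a Fourier basis, and a hypothetical smooth density p and observable f; it evaluates ω_{ρ_L}(π_Lf) through the vector ξ_L and compares the result with ω_pf.

```python
import numpy as np

n_grid = 2048
x = 2 * np.pi * np.arange(n_grid) / n_grid

# Hypothetical probability density p relative to the normalized Lebesgue measure mu
# (so that the grid average of p equals 1), and a bounded observable f.
p = np.exp(2.0 * np.cos(x))
p /= p.mean()
f = np.cos(x) ** 2

def projected_expectation(m):
    """Return omega_{rho_L}(pi_L f) = <xi_L, f xi_L> for L = 2m + 1 Fourier modes."""
    k = np.arange(-m, m + 1)
    Phi = np.exp(1j * np.outer(x, k))               # orthonormal basis of H_L
    coeffs = Phi.conj().T @ np.sqrt(p) / n_grid     # <phi_k, sqrt(p)>
    xi = Phi @ coeffs                               # Pi_L sqrt(p) on the grid
    xi /= np.sqrt(np.mean(np.abs(xi) ** 2))         # normalize to a unit vector in H
    return np.mean(f * np.abs(xi) ** 2)

exact = np.mean(f * p)                              # omega_p f = integral of f p dmu
errors = {m: abs(projected_expectation(m) - exact) for m in (1, 2, 4, 8)}
print(errors)
```

For this smooth density, the error decays rapidly with L; slower decay should be expected for densities with less regularity.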

Let now $P_{L}^{(t)} : S (B_{L}) \to S (B_{L})$ with $P_{L}^{(t)} ω = ω ⚬ U_{L}^{(t)}$ be the projected transfer operator on $S (B_{L})$ . Unless H_L is a U^t-invariant subspace, $P_{L}^{(t)} ⚬ Π'_{L}$ is not equal to $Π'_{L} ⚬ P^{t}$ ; see the dashed arrow in the third column of the schematic in Fig. 1. Nevertheless, we have the asymptotic consistency $lim_{L \to \infty} ((P_{L}^{(t)} ⚬ Π'_{L}) ω_{ρ}) A_{L} = (P^{t} ω_{ρ}) A$ , which holds for all $ω_{ρ} \in S_{*} (B)$ and $A \in B$ (SI Appendix, section 2.I). Applying this result for A = πf and ω_ρ = Γ(ω_p), with $f \in A$ and $ω_{p} \in S_{*} (A)$ , it follows that

\begin{matrix} lim_{L \to \infty} ((P_{L}^{(t)} ⚬ Π_{L}^{'}) ω_{ρ}) (π_{L} f) = (P^{t} ω_{ρ}) (π f) = (P^{t} ω_{p}) f . \end{matrix}

[11]

Eq. 11 implies that the matrix mechanical data assimilation scheme consistently recovers forecasts from data assimilation in the abelian algebra $A$ in the limit of infinite dimension L.

In SI Appendix, section 2.F.2, we describe how for any self-adjoint element $A \in B$ , the spectral measures of $Π_{L} A$ converge to the spectral measure of A. Since $π f \in B$ is self-adjoint whenever $f \in A$ is real, the spectral convergence of π_Lf to πf implies that the forecast distributions $P_{π_{L} f, t, τ}$ induced by $ω_{ρ_{t, τ, L}} = (P_{L}^{(τ)} ⚬ Π'_{L}) ω_{ρ_{t, L}}$ consistently recover the forecast distributions $P_{π f, t, τ}$ and $P_{f, t, τ}$ from the infinite-dimensional quantum mechanical and abelian systems, respectively.

With a similar approach (SI Appendix, section 2.J), one can deduce that the analysis step is also consistently recovered: Defining the effect-valued map $F_{L} : Y \to E (B_{L})$ with $F_{L} = Π_{L} ⚬ F$ , it follows from Eq. 9 and Eq. 10 that for every $f \in A$ and $ω_{p} \in S_{*} (A)$ ,

\begin{matrix} \lim_{L \to \infty} ω_{ρ_{L}} |_{F_{L} (y)} (π f) = ω_{ρ} |_{F (y)} (π f) = ω_{p} |_{F (y)} f, \end{matrix}

[12]

where ω_ρ = Γ(ω_p) and $ω_{ρ_{L}} = Π'_{L} ω_{ρ}$ , so the matrix mechanical analysis step is asymptotically consistent with the infinite-dimensional quantum mechanical and abelian analyses.

On the basis of Eqs. 11 and 12, we conclude that as the dimension L increases, the matrix mechanical data assimilation scheme is consistent with the abelian formulation of sequential data assimilation. Moreover, the discretization leading to this scheme is positivity preserving, and the projected Koopman operator $U_{L}^{(t)}$ is a quantum operation. Thus, matrix mechanical data assimilation provides a nonabelian, finite-dimensional framework that simultaneously meets all of the requirements listed in the beginning of this subsection.

Data-Driven Approximation.

The matrix mechanical data assimilation scheme described above admits a consistent data-driven approximation using kernel methods for machine learning (17, 20, 22). The data-driven scheme employs three, possibly related, types of training data, all acquired along a dynamical trajectory X_N = {x₀, x₁, …, x_{N − 1}} ⊂ X with x_n = Φ^{n Δt}(x₀), where Δt > 0 is a sampling interval: i) samples y_n = h(x_n) from the observation map h : X → Y; ii) samples f_n = f(x_n) from the forecast observable $f \in A$ ; iii) samples z_n = z(x_n) from a map z : X → Z, used as proxies of the dynamical states x_n. If the x_n are known, we set Z = X and z = Id. Otherwise, we set Z = Y^{2Q + 1} for a parameter $Q \in N$ and define z as the delay-coordinate map z(x) = (h(Φ^{−Q Δt}(x)), h(Φ^{(−Q + 1) Δt}(x)), …, h(Φ^{Q Δt}(x))), giving z_n = (y_{n − Q}, y_{n − Q + 1}, …, y_{n + Q}). By delay-embedding theory (32), for sufficiently large Q and typical observation maps h and sampling intervals Δt, z is an injective map.
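
The delay-coordinate construction can be sketched as follows, assuming the observations are stored as rows of a NumPy array. The function name and the treatment of the trajectory ends (windows that would extend past the available data are dropped) are choices made here for illustration.

```python
import numpy as np

def delay_embed(y, Q):
    """Stack centered delay windows z_n = (y_{n-Q}, ..., y_{n+Q}).

    y has shape (N, d); the result has shape (N - 2Q, (2Q + 1) * d), with row i
    corresponding to the window centered at n = i + Q.
    """
    N = y.shape[0]
    windows = [y[q : N - 2 * Q + q] for q in range(2 * Q + 1)]
    return np.concatenate(windows, axis=1)

# Hypothetical scalar observations y_n = n, to make the windows easy to read.
y = np.arange(10, dtype=float)[:, None]
z = delay_embed(y, Q=2)
print(z.shape)   # (6, 5)
print(z[0])      # [0. 1. 2. 3. 4.]
```

With Q = 0 the function returns the observations unchanged, which corresponds to the choice z = Id on Y.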

The dynamical trajectory x_n has an associated sampling measure $μ_{N} : = \sum_{n = 0}^{N - 1} δ_{x_{n}} / N$ and a finite-dimensional Hilbert space ${\hat{H}}_{N} : = L^{2} (X, μ_{N})$ . By ergodicity, as N increases, the measures μ_N converge to the invariant measure μ in the weak-* sense, so we can interpret ${\hat{H}}_{N}$ as a data-driven analog of the infinite-dimensional Hilbert space H (SI Appendix, section 1). Given the training data z₀, z₁, …, z_{N − 1}, and without requiring explicit knowledge of the underlying states x_n, we use kernel integral operators to build an orthonormal basis {ϕ_{0, N}, …, ϕ_{L − 1, N}} of an L-dimensional subspace $H_{L, N} \subseteq {\hat{H}}_{N}$ that plays the role of a data-driven counterpart of H_L. More specifically, the basis elements ϕ_{l, N} are eigenvectors of a kernel integral operator $K_{N} : {\hat{H}}_{N} \to {\hat{H}}_{N}$ induced by a kernel function $κ : Z \times Z \to R$ . The operator K_N is represented by an N × N kernel matrix K_N constructed from the training data z_n; SI Appendix, section 2.A. We let $B_{L, N} = B (H_{L, N})$ be the L²-dimensional algebra of linear maps on H_{L, N}, which, as in the case of $B_{L}$ , is isomorphic to the matrix algebra $M_{L}$ .

Every operator employed in the matrix mechanical scheme described in the previous section has a data-driven counterpart, represented as an L × L matrix with respect to the {ϕ_{l, N}} basis. Specifically, the projected Koopman operator U_L^(t) at time t = q Δt, $q \in Z$ , is replaced by an operator $U_{L, N}^{(q)} \in B_{L, N}$ induced by the shift map on the trajectory x_n (30), with a corresponding quantum operation $U_{L, N}^{(q)} : B_{L, N} \to B_{L, N}$ . Moreover, the projected multiplication operator π_Lf is replaced by $π_{L, N} {\hat{f}}_{N} \in B_{L, N}$ , and the effect-valued map $F_{L}$ by a map $F_{L, N} : Y \to E (B_{L, N})$ . Here, ${\hat{f}}_{N}$ is the restriction of f to the training trajectory X_N. Further details are provided in SI Appendix, sections 2.D–2.J.

The data-driven scheme is positivity preserving and constitutes a quantum operation analogously to the matrix mechanical scheme. Moreover, by results on spectral approximation of kernel integral operators (33) and ergodicity of the dynamics, the kernel matrices K_N exhibit spectral convergence in the large-data limit, N → ∞, to a kernel integral operator K : H → H in a suitable sense (SI Appendix, Theorem 1). Correspondingly, all matrix representations of operators, and thus all predictions made by the data-driven scheme, converge to the predictions of the matrix mechanical scheme in Fig. 1. Overall, we obtain a data-driven, positivity-preserving, and asymptotically consistent data assimilation scheme. The data requirements and computational complexity of this scheme are comparable to those of standard kernel methods for supervised machine learning (SI Appendix, section 2.K).
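
The two main ingredients of the data-driven scheme, a kernel eigenbasis and a shift-based Koopman matrix, can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a fixed-bandwidth, unnormalized Gaussian kernel in place of the variable-bandwidth bistochastic kernels employed in this work, and the function names, bandwidth, and test trajectory (a rotation on the circle) are hypothetical.

```python
import numpy as np

def kernel_basis(z, L, bandwidth):
    """Leading L eigenvectors of a Gaussian kernel matrix built from samples z_n.

    The fixed-bandwidth, unnormalized kernel is a simplification made here.
    Columns are orthonormal in the inner product <u, v>_N = sum_n u_n v_n / N.
    """
    N = z.shape[0]
    sq_dists = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq_dists / bandwidth**2)
    eigvals, eigvecs = np.linalg.eigh(K)            # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:L]           # keep the L largest
    return np.sqrt(N) * eigvecs[:, order]

def shift_koopman(Phi, q):
    """Matrix U_ij ~ <phi_i, U^q phi_j>_N, using the shift x_n -> x_{n+q}."""
    N = Phi.shape[0]
    return Phi[: N - q].T @ Phi[q:] / N

# Hypothetical training data: an ergodic rotation on the unit circle in R^2.
N, L = 400, 9
theta = np.sqrt(2.0) * np.arange(N)
z = np.column_stack([np.cos(theta), np.sin(theta)])

Phi = kernel_basis(z, L, bandwidth=0.5)
U = shift_koopman(Phi, q=1)
print(U.shape, np.linalg.norm(U, 2))                # operator norm does not exceed 1
```

Because the last q samples have no shifted counterpart, the sum defining U_ij runs over N − q terms; this boundary effect vanishes as N → ∞ at fixed q.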

Lorenz 96 Multiscale System

As our first numerical example, we apply QMDA to assimilate and predict the slow variables of the Lorenz 96 (L96) multiscale system (34). This system was introduced by Lorenz in 1996 as a low-order model of atmospheric circulation at a constant-latitude circle. The dynamical degrees of freedom include K slow variables x₁, …, x_K, representing the zonal (west to east) component of the large-scale atmospheric velocity field at K zonally equispaced locations. Each slow variable x_k is coupled to J fast variables y_{1, k}, …, y_{J, k}, representing small-scale processes such as atmospheric convection. The dynamical state space is thus X = ℝ^{K(J + 1)} with $x = {(x_{k}, y_{j, k})}_{j, k = 1, 1}^{J, K} \in X$ .

The governing equations are

\begin{matrix} {\dot{x}}_{k} & = - x_{k - 1} (x_{k - 2} - x_{k + 1}) - x_{k} + F + \frac{h_{x}}{J} \sum_{j = 1}^{J} y_{j, k}, \\ {\dot{y}}_{j, k} & = \frac{1}{ε} (- y_{j + 1, k} (y_{j + 2, k} - y_{j - 1, k}) - y_{j, k} + h_{y} x_{k}), \\ x_{k + K} & = x_{k}, y_{j, k + K} = y_{j, k}, y_{j + J, k} = y_{j, k + 1}, \end{matrix}

[13]

where the parameter F represents large-scale forcing (e.g., solar heating), h_x and h_y control the coupling between the slow and fast variables, and ε is a parameter that controls the timescale separation between the fast and slow variables. The governing equations for x_k feature large-scale forcing, F, a quadratic nonlinearity, −x_{k − 1}(x_{k − 2} − x_{k + 1}), representing advection, a linear damping term, −x_k, representing surface drag, and a flux term, $h_{x} \sum_{j = 1}^{J} y_{j, k} / J$ , representing forcing from the fast variables. The terms in the y_{j, k} equations have similar physical interpretations. In general, the dynamics becomes more turbulent/chaotic as F increases.

Here, we focus on the chaotic dynamical regime studied in refs. (35) and (36) with K = 9, J = 8, ε = 1/128, F = 10, h_x = −0.8, and h_y = 1. In this regime, ε is sufficiently small so that the dynamics of the (x₁, …, x_K) variables is approximately Markovian. We consider that the observation map h : X → Y projects the state vector x ∈ X to the slow variables, i.e., $Y = R^{K}$ and h(x)=y := (x₁, …, x_K). Our forecast observable $f \in A$ is the first slow variable, f(x)=x₁.
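
For concreteness, the right-hand side of Eq. 13 with the parameter values above can be coded as follows. The state layout (slow variables first, then fast variables flattened with j varying fastest) and the function name are choices made for this sketch; time integration is not included.

```python
import numpy as np

# Parameter values quoted in the text for the chaotic regime.
K, J = 9, 8
EPS, F, H_X, H_Y = 1.0 / 128, 10.0, -0.8, 1.0

def l96_rhs(state):
    """Right-hand side of Eq. 13 for state = (x_1..x_K, y_{1,1}..y_{J,1}, ..., y_{J,K})."""
    x = state[:K]
    # Fast variables flattened with j fastest, so that the boundary condition
    # y_{j+J,k} = y_{j,k+1} turns them into a single periodic ring of length J*K.
    y = state[K:]
    y_mean = y.reshape(K, J).mean(axis=1)           # (1/J) sum_j y_{j,k}
    dx = (-np.roll(x, 1) * (np.roll(x, 2) - np.roll(x, -1))
          - x + F + H_X * y_mean)
    x_rep = np.repeat(x, J)                         # x_k copied to its J fast variables
    dy = (-np.roll(y, -1) * (np.roll(y, -2) - np.roll(y, 1))
          - y + H_Y * x_rep) / EPS
    return np.concatenate([dx, dy])

# Sanity check on a simple state: x_k = 1 and y_{j,k} = 0.
state = np.concatenate([np.ones(K), np.zeros(J * K)])
tendency = l96_rhs(state)
print(tendency[:K])   # each entry equals F - 1 = 9
```

Since ε = 1/128, the fast equations are stiff relative to the slow ones, so a correspondingly small time step or a stiff integrator would be needed in practice.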

Training.

We employ a training dataset consisting of N = 40,000 samples y₀, …, y_{N − 1} ∈ Y and $f_{0}, \dots, f_{N - 1} \in R$ with y_n = h(x_n), f_n = f(x_n), and x_n = Φ^{n Δt}(x₀), taken at a sampling interval Δt = 0.05. To assess forecast skill, we use $\hat{N} = 7,000$ samples ${\hat{y}}_{0}, \dots, {\hat{y}}_{\hat{N} - 1}$ with ${\hat{y}}_{n} = h ({\hat{x}}_{n})$ and ${\hat{x}}_{n} = Φ^{n Δ t} ({\hat{x}}_{0})$ , taken on an independent dynamical trajectory from the training data. The data z₀, z₁, …, z_{N − 1} ∈ Z for computation of the data-driven basis {ϕ_{l, N}} of H_{L, N} consist of snapshots of the slow variables, z_n = y_n. That is, we have $Z = R^{(2 Q + 1) K} = Y$ with Q = 0 delays and z = Id. This choice is motivated by the fact that the evolution of y_n is approximately Markovian for ε ≪ 1, and the forecast observable f(x)=x₁ depends on x ∈ X only through y = h(x). Following ref. (21), we compute the ϕ_{l, N} using a variable-bandwidth Gaussian kernel (37) with a bistochastic normalization (38). Further details on this kernel and the L96 data are provided in SI Appendix, sections 2.B and 5.A, respectively.

Using the basis vectors, we compute L × L matrix representations of the projected Koopman operators $U_{L, N}^{(τ_{j})}$ for lead times τ_j = j Δt with j ∈ {0, 1, …, J_f}, J_f = 150 (SI Appendix, Algorithm S9). Moreover, using the ϕ_{l, N} and the training samples f_n, we compute the L × L matrix representation A_{L, N} of the operator A_{L, N} := π_{L, N}f associated with the forecast observable. To evaluate forecast distributions for f, we compute the PVM E_{A_{L, N}} of A_{L, N}, which amounts to computing an eigendecomposition of A_{L, N} (SI Appendix, Algorithm S7). To report forecast probabilities, we evaluate E_{A_{L, N}} on a collection of bins $S_{1}, \dots, S_{M} \subset R$ of equal probability mass in the equilibrium distribution of f. As our observation kernel ψ : Y × Y → [0, 1], we use a variable-bandwidth bump function. The corresponding effect-valued map $F_{L, N} : Y \to E (B_{L, N})$ is represented by a matrix-valued function; further details are provided in SI Appendix, section 2.J. In Figs. 2 and 3, we show results for Hilbert space dimension L = 2,000, though forecast skill does not change appreciably for values of L in the range 500 to 2,000 (SI Appendix, Fig. S1).
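
Bins of equal probability mass can be obtained from empirical quantiles of the training samples of f. The following sketch assumes this quantile-based construction, which may differ in detail from the procedure in SI Appendix, and uses synthetic Gaussian samples as a stand-in for f₀, …, f_{N − 1}.

```python
import numpy as np

def equal_mass_bins(f_samples, M):
    """Edges of M bins carrying equal empirical probability mass."""
    return np.quantile(f_samples, np.linspace(0.0, 1.0, M + 1))

rng = np.random.default_rng(0)
f_samples = rng.normal(size=10_000)        # hypothetical stand-in for the samples f_n
edges = equal_mass_bins(f_samples, M=8)
counts, _ = np.histogram(f_samples, bins=edges)
print(counts)                              # approximately 1,250 samples per bin
```

The bins are narrow where the equilibrium distribution of f is concentrated and wide in the tails, which is why forecast probabilities are divided by the bin length when reported as densities.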

Fig. 2. — Running QMDA forecasts of the x₁ variable of the L96 multiscale system in a chaotic regime. The panels show the true x₁ evolution (black lines), the logarithm of the discrete forecast probability density 𝜚_{n, j} (colors), and the corresponding forecast mean (red lines) as a function of verification time for lead times in the range 0 to 5 model time units (Top to Bottom). The assimilated observable is the K-dimensional vector (x₁, …, x_K) of the L96 slow variables.

Fig. 3. — NRMSE (A) and AC score (B) of the L96 forecasts from Fig. 2.

Data Assimilation.

We perform data assimilation experiments initialized with the pure state $ω_{0} \equiv ω_{ρ_{0}} \in S (B_{L, N})$ induced by the density operator $ρ_{0} = {⟨ 1_{X}, \cdot ⟩}_{{\hat{H}}_{N}} 1_{X} \in B_{L, N}$ . We interpret this state as an uninformative equilibrium state, in the sense that i) $ω_{0} A_{L, N} = tr (ρ_{0} A_{L, N}) = {\bar{f}}_{N}$ , where ${\bar{f}}_{N} = \sum_{n = 0}^{N - 1} f_{n} / N$ is the empirical mean of f, and ii) ω_ρ₀ is invariant under the action of the transfer operator, i.e., $P_{L, N}^{(t)} ω_{0} : = ω_{0} ⚬ U_{L, N}^{(t)} = ω_{0}$ .

Starting from ω₀, QMDA produces a sequence of states $ω_{0}, ω_{1}, \dots, ω_{\hat{N} - 1}$ by repeated application of the forecast–analysis steps, as depicted schematically in Fig. 1 and in pseudocode form in SI Appendix, Algorithm S1. Specifically, for $n \in \{1, \dots, \hat{N} - 1\}$ , we compute ω_n by first using the transfer operator to compute the state $ω_{n - 1, 1} : = P_{L, N}^{(Δ t)} ω_{n - 1}$ (which is analogous to the prior in classical data assimilation) and then applying the effect map to observation ${\hat{y}}_{n}$ to yield $ω_{n} = ω_{n - 1, 1} |_{F_{L, N} ({\hat{y}}_{n})}$ (which is analogous to the classical posterior). For each $n \in \{0, \dots, \hat{N} - 1\}$ , we also compute forecast states $ω_{n, j} = P_{L, N}^{(τ_{j})} ω_{n}$ and associated forecast distributions $P_{n, j}$ for the observable f. We evaluate $P_{n, j}$ on the bins S_m and normalize the result by the corresponding bin size, s_m := length(S_m), to produce discrete probability densities 𝜚_{n, j} = (𝜚_{n, j, 1}, …, 𝜚_{n, j, M}) with $ϱ_{n, j, m} : = P_{n, j} (S_{m}) / s_{m}$ . We also compute the forecast mean and standard deviation, ${\bar{f}}_{n, j} = ω_{ρ_{n, j}} A_{L, N}$ and $σ_{n, j} = {(ω_{ρ_{n, j}} {(A_{L, N})}^{2} - {\bar{f}}_{n, j}^{2})}^{1 / 2}$ , respectively. We assess forecast skill through the normalized root mean square error (NRMSE) and anomaly correlation (AC) scores, computed for each lead time τ_j by averaging over the $\hat{N}$ samples in the verification dataset (SI Appendix, section S4).
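
Given the matrix of the forecast observable and a density matrix for the forecast state, the binned density, mean, and standard deviation described above can be evaluated as in the following sketch. The function name and the toy 3 × 3 example are hypothetical; the forecast and analysis updates that produce the density matrix are not shown.

```python
import numpy as np

def forecast_statistics(A, rho, edges):
    """Binned forecast density, mean, and standard deviation of A in state rho.

    A: Hermitian (L, L) matrix; rho: density matrix (L, L); edges: increasing
    bin edges. Eigenvalues of A outside [edges[0], edges[-1]] are not counted.
    """
    a, u = np.linalg.eigh(A)                                  # eigenvalues a_l, eigenvectors u_l
    probs = np.real(np.einsum("il,ij,jl->l", u.conj(), rho, u))  # <u_l, rho u_l>
    mass, _ = np.histogram(a, bins=edges, weights=probs)      # probability of each bin S_m
    density = mass / np.diff(edges)                           # divide by bin length s_m
    mean = np.real(np.trace(rho @ A))
    std = np.sqrt(max(np.real(np.trace(rho @ A @ A)) - mean**2, 0.0))
    return density, mean, std

# Toy example: diagonal observable and state.
A = np.diag([0.0, 1.0, 2.0])
rho = np.diag([0.5, 0.25, 0.25])
density, mean, std = forecast_statistics(A, rho, edges=np.array([-0.5, 0.5, 2.5]))
print(density, mean, std)   # density [0.5, 0.25], mean 0.75
```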

Fig. 2 shows the forecast probability densities 𝜚_{n, j} (colors), forecast means ${\bar{f}}_{n, j}$ (black lines), and true signal ${\hat{f}}_{n + j}$ (red lines), plotted as a function of verification time t_{n + j} over intervals spanning 20 time units for representative lead times τ_j in the range 0 to 5 time units. The corresponding NRMSE and AC scores are displayed in Fig. 3. Given the turbulent nature of the dynamics, we intuitively expect the forecast densities 𝜚_{n, j} to start from being highly concentrated around the true signal for small τ_j and progressively broaden as τ_j increases (i.e., going down the panels of Fig. 2), indicating that the forecast uncertainty increases. Correspondingly, we expect ${\bar{f}}_{n, j}$ to accurately track the true signal for small τ_j and progressively relax toward the equilibrium mean ∫_Xf dμ.

The results in Figs. 2 and 3 are broadly consistent with this behavior: The forecast starts at τ_j = 0 from a highly concentrated density around the true signal (note that Fig. 2 shows logarithms of 𝜚_{n, j}), which is manifested by low NRMSE and large AC values in Fig. 3 of approximately 0.24 and 0.98, respectively. As τ_j increases, the forecast distribution broadens, and the NRMSE (AC) scores exhibit a near-monotonic increase (decrease). In Fig. 3A, the estimated error based on the forecast standard deviation σ_{n, j} is seen to track the NRMSE score well, which indicates that the forecast distribution 𝜚_{n, j} represents the true forecast uncertainty well. It should be noted that errors are present even at time τ_j = 0, particularly for periods of time where the true signal takes extreme positive or negative values. Such reconstruction errors are expected for a fully data-driven method applied to a system with a high-dimensional attractor. Overall, the skill scores in Fig. 3 are comparable with the results obtained in ref. (36) using the kernel analog forecasting (KAF) technique (39).
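
For reference, the two skill scores discussed above can be computed along the following lines. The definitions used here (root mean square error normalized by the standard deviation of the true signal, and the correlation coefficient between mean-removed forecast and true signals) are common conventions adopted as assumptions; the precise definitions used for Fig. 3 are those given in SI Appendix.

```python
import numpy as np

def nrmse(forecast, truth):
    """Root mean square error normalized by the standard deviation of the truth."""
    return np.sqrt(np.mean((forecast - truth) ** 2)) / np.std(truth)

def anomaly_correlation(forecast, truth):
    """Correlation coefficient between forecast and true anomalies (sample means removed)."""
    fa = forecast - forecast.mean()
    ta = truth - truth.mean()
    return np.sum(fa * ta) / np.sqrt(np.sum(fa**2) * np.sum(ta**2))

# Hypothetical verification signal and two reference forecasts.
truth = np.sin(np.linspace(0.0, 20.0, 500))
perfect = truth.copy()
climatology = np.full_like(truth, truth.mean())

print(nrmse(perfect, truth), anomaly_correlation(perfect, truth))  # 0 and 1 (up to rounding)
print(nrmse(climatology, truth))                                   # 1 (up to rounding)
```

Under these conventions, an NRMSE near 1 means that the forecast mean is no more accurate than the equilibrium mean, which is the expected long-lead limit.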

El Niño Southern Oscillation

The El Niño Southern Oscillation (ENSO) (40) is the dominant mode of interannual (3- to 5-y) variability of the Earth’s climate system. Its primary manifestation is an oscillation between positive sea surface temperature (SST) anomalies over the eastern tropical Pacific Ocean, known as El Niño events, and episodes of negative anomalies known as La Niñas (41). Through atmospheric teleconnections, ENSO drives seasonal weather patterns throughout the globe, affecting the occurrence of extremes such as floods and droughts, among other natural and societal impacts (42). Here, we demonstrate that QMDA successfully predicts ENSO within a comprehensive climate model by assimilating high-dimensional SST data.

Our experimental setup closely follows that of ref. (43), which performed data-driven ENSO forecasts using KAF. As training and test data, we use a control integration of the Community Climate System Model Version 4 (CCSM4) (44), conducted with fixed preindustrial greenhouse gas forcings. The simulation spans 1,300 y, sampled at an interval Δt = 1 month. Abstractly, the dynamical state space X consists of all degrees of freedom of CCSM4, whose number is of order 10⁷ and which include variables such as density, velocity, and temperature for the atmosphere, ocean, and sea ice, sampled on discretization meshes over the globe. Since this simulation has no climate change, there is an implicit invariant measure μ sampled by the data, and we can formally define the algebras $A$ and $B$ associated with the invariant measure as described above.

In our experiments, the observation map h : X → Y returns monthly averaged SST fields on an Indo-Pacific domain; that is, we have $Y = R^{d}$ , where d is the number of surface ocean gridpoints within the domain. We have d = 44,414, so these experiments test the ability of QMDA to assimilate high-dimensional data. However, note that h is a highly noninvertible map since Indo-Pacific SST comprises only a small subset of CCSM4’s dynamical degrees of freedom. As our forecast observable $f \in A$ we choose the Niño 3.4 index—a commonly used index for ENSO monitoring defined as the average SST anomaly over a domain in the tropical Pacific Ocean. Large positive (negative) values of Niño 3.4 represent El Niño (La Niña) conditions, whereas values near zero represent neutral conditions. Additional information on the CCSM4 data is included in SI Appendix, section 5.B.

Following ref. (43), we use the SST and Niño 3.4 samples from the first 1,100 y of the simulation as training data and the corresponding samples for the last 200 y as test data. Thus, with the notation of the previously described L96 experiments, our training data are y_n = h(x_n) (Indo-Pacific SST) and f_n = f(x_n) (Niño 3.4) for n ∈ {0, …, N − 1} and N = 1,100 × 12 = 13,200, and our test data are ${\hat{y}}_{n} = h (x_{n + N})$ and ${\hat{f}}_{n} = f (x_{n + N})$ for $n \in \{0, \dots, \hat{N} - 1\}$ and $\hat{N} = 200 \times 12 = 2,400$ . Here, x_n = Φ^{n Δt}(x₀) ∈ X is the (unknown) dynamical trajectory of the CCSM4 model underlying our training and test data. Using the SST samples y_n, we build the training data z_n using delay-coordinate maps with parameter Q = 5; i.e., the data z_n used for building the basis of H_{L, N} are SST “videos” that span a total of 2Q + 1 = 11 months and have dimension 11d ≃ 4.9 × 10⁵. We compute the basis {ϕ_{l, N}} using a kernel κ that depends on the pairwise Euclidean distances between points in Z as well as Niño 3.4 trajectories evaluated on these points (SI Appendix, Eq. S5). This approach improves the ability of the basis vectors to capture covariability between Indo-Pacific SST fields and the Niño 3.4 index, leading to a modest improvement of short-term forecast skill and a more significant improvement of uncertainty quantification over kernels that depend only on SST. Aside from the different kernel κ, the procedure for initializing and running QMDA is identical to the L96 experiments.

Fig. 4 shows the forecast probability density (𝜚_{n, j}; colors), forecast mean ( ${\bar{f}}_{n, j}$ ; black lines), and true signal ( ${\hat{f}}_{n + j}$ ; red lines) for the Niño 3.4 index as a function of verification time t_{n + j} over 20-y portions of the test dataset for lead times τ_j in the range 0 to 12 mo, obtained for Hilbert space dimension L = 1,000. The corresponding NRMSE and AC scores are displayed in Fig. 5. The skill scores do not vary significantly for values of L in the interval 500 to 2,000 (SI Appendix, Fig. S2).

Fig. 5. — NRMSE (A) and AC score (B) of the Niño 3.4 forecasts from Fig. 4.

Qualitatively, the forecast density 𝜚_{n, j} behaves similarly to the L96 experiments; that is, it is concentrated around the true signal on short lead times (τ_j ≲ 3 months) and gradually broadens as forecast uncertainty grows with increasing lead time τ_j due to chaotic climate dynamics. In Fig. 5A, the estimated forecast error based on the forecast standard deviation σ_{n, j} agrees reasonably well with the actual NRMSE evolution. Adopting AC = 0.6 as a commonly used threshold for ENSO predictability, we see from the AC results in Fig. 5B that QMDA produces useful forecasts out to τ_j ≃ 12 months. The performance of QMDA in terms of the NRMSE and AC metrics is comparable to that found for KAF in ref. (43), but QMDA has the advantage of producing full forecast probability distributions instead of point estimates. Compared to KAF, QMDA also has the advantage of being positivity preserving. While this property may not be critical for sign-indefinite ENSO indices, there are many climatic variables where sign preservation is particularly important.

Quantum Circuit Implementation

As a demonstration of the potential of QMDA for implementation on quantum computing platforms, we present forecasting results for the L96 multiscale system obtained from quantum circuit simulations performed using the Qiskit Aer Python library (45). Our quantum circuit architecture consists of initialization, Koopman evolution, eigenbasis rotation, and measurement stages, depicted in Fig. 6 for a 4-qubit setup. The initialization and Koopman stages implement the QMDA analysis and forecast steps, respectively, via unitary operations acting on the qubits. The eigenbasis rotation implements a unitary induced by the eigenvectors of the quantum mechanical forecast observable A_{L, N} so that measurement at the output of the circuit samples the desired probability distribution $P_{n, j}$ at the given initialization time t_n and forecast lead time τ_j.

Fig. 6. — Four-qubit circuit implementation of an analysis–forecast QMDA cycle.

In more detail, associated with a quantum computational system of $n$ qubits is a tensor product Hilbert space $B_{n} = B^{\otimes n}$ of dimension $2^{n}$ , where $B = span \{|0〉, |1〉\}$ is the two-dimensional Hilbert space generated by the |0⟩ (“up”) and |1⟩ (“down”) vectors (46). The standard basis of $B_{n}$ , known as the quantum computational basis, has the tensor product form ${\{|b〉 = |b_{1}〉 \otimes \dots \otimes |b_{n}〉\}}_{b \in \{0, 1\}^{n}}$ , indexed by binary strings $b = (b_{1}, \dots, b_{n})$ of length $n$ . A (noise-free) quantum computer is represented as a quantum channel $T : M_{n} \to M_{n}$ on the $2^{2 n}$ -dimensional von Neumann algebra $M_{n} : = B (B_{n})$ , which is isomorphic to the algebra $M_{2^{n}}$ of $2^{n} \times 2^{n}$ matrices. Typically, the channel $T$ has the form $T A = T^{*} A T$ , where $T : B_{n} \to B_{n}$ is a unitary map. In the circuit shown in Fig. 6, we express T as the composition T = T_rot ⚬ T_K(j) ⚬ T_init(ξ, y), where T_init(ξ, y) represents the initialization (analysis) step given a prior state vector ξ ∈ H_{L, N} and an observation y ∈ Y, T_K(j) represents Koopman evolution over $j \in N$ forecast timesteps, and T_rot is the eigenbasis rotation stage of the circuit.

In an actual quantum computational environment, T_init, T_K, and T_rot would be implemented by combining quantum logic gates through operations such as compositions and tensor products, the goal being to implement T by a circuit of low depth (longest path from input to output). For instance, circuits whose depth scales as a polynomial in $n$ allow simulation of quantum states and observables on the exponentially large-dimensional Hilbert space $B_{n}$ in a polynomial running time. Here, we do not address the important questions of how to implement T_init, T_K, and T_rot efficiently and robustly on an actual quantum computer, so the results presented in this section should be viewed as a proof of concept.

Suppose that the Hilbert space dimension of the matrix mechanical data assimilation system is $L = 2^{n}$ for some $n \in \mathbb{N}$. Given the data-driven basis $\{ ϕ_{l, N} \}$ of $H_{L, N}$ indexed by integers $l \in \{0, \dots, L - 1\}$, we define a unitary $W_{L} : H_{L, N} \to B_{n}$ such that $W_{L} ϕ_{l, N} = |b\rangle$, where $b = (b_{1}, \dots, b_{n})$ is the binary representation of $l$, i.e., $l = \sum_{i = 1}^{n} b_{i} 2^{n - i}$. Using $W_{L}$, we can encode any $A \in B_{L, N}$ into an operator $B = \mathcal{W}_{L} A := W_{L} A W_{L}^{*} \in M_{n}$. In particular, if $ω_{ρ} \in S (B_{L, N})$ is a state induced by a density operator $ρ \in B_{L, N}$, then the transformed state $ω_{σ} \in S (M_{n})$ with $σ = \mathcal{W}_{L} ρ$ satisfies $ω_{ρ}(A) = ω_{σ}(B)$, so predictions from the matrix mechanical and quantum computational systems are equivalent. If $ω_{ρ}$ is a vector state induced by a unit vector $ξ \in H_{L, N}$, then $ω_{σ}$ is a vector state induced by $ζ = W_{L} ξ \in B_{n}$.
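The index-to-bitstring correspondence $l = \sum_{i = 1}^{n} b_{i} 2^{n - i}$ used to define $W_{L}$ can be sketched in a few lines of Python. The function names `index_to_bits` and `bits_to_index` are hypothetical, and the sketch assumes the convention stated above in which $b_{1}$ is the most significant bit.

```python
def index_to_bits(l, n):
    """Return (b_1, ..., b_n) with l = sum_i b_i * 2**(n - i); b_1 is most significant."""
    if not 0 <= l < 2 ** n:
        raise ValueError("basis index must satisfy 0 <= l < 2**n")
    return tuple((l >> (n - i)) & 1 for i in range(1, n + 1))

def bits_to_index(bits):
    """Inverse map: recover the basis index l from the binary string (b_1, ..., b_n)."""
    n = len(bits)
    return sum(b * 2 ** (n - i) for i, b in enumerate(bits, start=1))

# Basis vector number 5 of a 16-dimensional space maps to the 4-qubit string 0101.
example_bits = index_to_bits(5, 4)
example_index = bits_to_index(example_bits)
```

Qubit-ordering conventions differ between quantum software libraries, so the bit order assumed here should be checked against the library in use before comparing measured bitstrings with basis indices.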

Given a prior state $ω_{n - 1, 1} \in S (B_{L, N})$ at time $t_{n}$ with corresponding state vector $ξ_{n - 1, 1} \in H_{L, N}$, an observation ${\hat{y}}_{n} \in Y$, and a forecast lead time $τ_{j}$, the circuit in Fig. 6 operates by acting on the computational basis vector $|0\rangle$ (which is the “default” state vector at the start of a quantum computation) by the transformations $T_{\text{init}} (ξ_{n - 1, 1}, {\hat{y}}_{n})$, $T_{K}(j)$, and $T_{\text{rot}}$, leading to the state vector $ζ_{n, j} = T_{\text{rot}} \circ T_{K}(j) \circ T_{\text{init}} (ξ_{n - 1, 1}, {\hat{y}}_{n}) |0\rangle$. At the end of the computation, a measurement in the quantum computational basis yields a random binary string $b$ with probability $P_{n, j} (\{b\}) = | \langle b | ζ_{n, j} \rangle |^{2}$. It can be shown that $a_{l} \in \mathbb{R}$, where $l \in \{0, \dots, L - 1\}$ is the integer with binary representation $b$ and $a_{l}$ is the $l$-th eigenvalue in the spectrum of $A_{L, N}$ (ordered in increasing order), is a sample, drawn with probability $P_{n, j} (\{a_{l}\})$, from the distribution induced from the spectral measure $E_{A_{L, N}}$ and the quantum state $ω_{n, j}$. Repeating this procedure over $M$ identically prepared circuits leads to an ensemble of measurements (or “shots”) $\{a_{l_{1}}, \dots, a_{l_{M}}\}$, which provides a Monte Carlo approximation of the theoretical forecast distribution $P_{n, j}$. Further details are provided in SI Appendix, section 3.
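The measurement stage can be emulated classically by sampling basis indices with probability equal to the squared modulus of the corresponding amplitude. The sketch below uses Python's `random` module in place of a quantum simulator; the function name `sample_counts`, the four-dimensional state vector, and the eigenvalues are toy values chosen for illustration only and are not results from the paper.

```python
import math
import random
from collections import Counter

def sample_counts(zeta, shots, seed=0):
    """Sample `shots` basis indices l with probability |zeta[l]|**2 and count them."""
    probs = [abs(z) ** 2 for z in zeta]
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("state vector must have unit norm")
    rng = random.Random(seed)
    indices = rng.choices(range(len(zeta)), weights=probs, k=shots)
    return Counter(indices)

# Toy state vector on 2 qubits (L = 4) with outcome probabilities 0.1, 0.2, 0.3, 0.4.
zeta = [math.sqrt(0.1), 1j * math.sqrt(0.2), -math.sqrt(0.3), math.sqrt(0.4)]
# Toy eigenvalues a_0 < a_1 < a_2 < a_3 of the forecast observable.
eigenvalues = [-2.0, -1.0, 0.5, 2.0]

shots = 100_000
counts = sample_counts(zeta, shots)

# Monte Carlo estimate of the forecast mean versus the exact expectation.
mc_mean = sum(eigenvalues[l] * c for l, c in counts.items()) / shots
exact_mean = sum(a * abs(z) ** 2 for a, z in zip(eigenvalues, zeta))
```

With this many shots the sampled mean agrees with the exact expectation to within sampling error, which decreases as the inverse square root of the number of shots.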

Fig. 7 shows representative forecast distributions of the x₁ variable of the L96 multiscale system obtained via this approach using $n = 10$ qubits (i.e., L = 2¹⁰ = 1024). In these experiments, the verification time is held fixed, so there is a fixed true value x₁ ≈ −1.86 and the lead time τ_j = j Δt varies from 0 to 0.75 in increments of 0.25. The panels show histograms of the normalized counts ${\tilde{ϱ}}_{n, j} (l) = M_{njl} / (s_{l} M)$ centered on x₁ = a_l, where M_njl is the number of occurrences of eigenvalue a_l of the multiplication operator A_{L, N} in the experiment with lead time τ_j, M = 10⁶ is the number of shots, and s_l = (a_{l + 1} − a_{l − 1})/2 is an effective bin size. Also shown are the empirical mean ${\tilde{f}}_{n, j} = \sum_{l = 0}^{L - 1} a_{l} M_{njl} / M$ that approximates the forecast expectation ${\bar{f}}_{n, j}$ and the true value of x₁ (yellow and red lines, respectively).
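The normalized counts and the empirical mean can be computed directly from the shot counts, as in the following sketch. The source defines the effective bin size $s_{l} = (a_{l + 1} - a_{l - 1})/2$ only for interior indices; the one-sided spacing used here at $l = 0$ and $l = L - 1$ is an assumption, as are the function name `normalized_histogram` and the toy inputs.

```python
def normalized_histogram(eigenvalues, counts):
    """Return (density, mean) from eigenvalues a_l (increasing) and shot counts M_l.

    density[l] = M_l / (s_l * M), with s_l = (a_{l+1} - a_{l-1}) / 2 in the interior.
    At the two endpoints a one-sided spacing is used (assumption, not from the source).
    """
    L = len(eigenvalues)
    if L < 2 or len(counts) != L:
        raise ValueError("need at least two eigenvalues and one count per eigenvalue")
    M = sum(counts)
    density = []
    for l in range(L):
        if 0 < l < L - 1:
            s = (eigenvalues[l + 1] - eigenvalues[l - 1]) / 2
        elif l == 0:
            s = eigenvalues[1] - eigenvalues[0]
        else:
            s = eigenvalues[L - 1] - eigenvalues[L - 2]
        density.append(counts[l] / (s * M))
    # Empirical mean: sum_l a_l * M_l / M
    mean = sum(a * c for a, c in zip(eigenvalues, counts)) / M
    return density, mean

# Toy example with M = 100 shots over L = 4 eigenvalues.
density, mean = normalized_histogram([-2.0, -1.0, 0.0, 2.0], [10, 40, 30, 20])
```

Dividing by $s_{l}$ converts relative frequencies into an approximate probability density, so histograms built on unevenly spaced eigenvalues remain comparable across bins.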

The time evolution of the histograms illustrates the increase of forecast uncertainty due to chaotic dynamics in conjunction with finite-rank operator approximation. At τ_j = 0, the empirical density is strongly concentrated around the truth, and the empirical mean ${\tilde{f}}_{n, j} \approx - 1.99$ has an approximately 7% error relative to the true value x₁ ≈ −1.86. As τ_j increases, the probability density spreads predominantly to larger values of x₁, causing the mean forecast to increasingly deviate from the truth. Intriguingly, the peak of the histograms remains collocated with the truth, which suggests that there may be opportunities to increase skill using an estimator based on the mode of the distribution rather than the mean. Overall, the quantum circuit simulation results are qualitatively consistent with the L96 results from deterministic computation presented earlier, which demonstrates the suitability of QMDA for implementation on quantum computers.
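A mode-based estimator of the kind suggested above could be as simple as returning the eigenvalue at which the normalized histogram peaks. The paper does not specify such an estimator, so the sketch below is only one possible reading; the function name `mode_estimate` and the toy density (on unit-spaced eigenvalues, skewed toward larger values) are assumptions for illustration.

```python
def mode_estimate(eigenvalues, density):
    """Return the eigenvalue a_l at which the empirical density is largest."""
    l_star = max(range(len(density)), key=density.__getitem__)
    return eigenvalues[l_star]

# Toy density on unit-spaced eigenvalues with a tail toward larger values.
eigenvalues = [-3.0, -2.0, -1.0, 0.0, 1.0]
density = [0.05, 0.40, 0.25, 0.20, 0.10]

mode_value = mode_estimate(eigenvalues, density)
# With unit bin width, the mean is the density-weighted sum of the eigenvalues.
mean_value = sum(a * p for a, p in zip(eigenvalues, density))
```

In this toy case the tail pulls the mean above the peak location, which mirrors the qualitative behavior described for the histograms. Whether a mode-based estimator improves skill in practice would need to be tested on the actual forecast distributions.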

Concluding Remarks

We have developed theory and methods for sequential data assimilation of partially observed dynamical systems using techniques from operator algebra, quantum information, and ergodic theory. At the core of this framework, called quantum mechanical data assimilation (QMDA), is the nonabelian algebraic structure of spaces of operators. One of the main advantages that this structure provides is that it naturally enables finite-dimensional discretization schemes that preserve the sign of sign-definite observables in ways that are not possible with classical projection-based approaches.

We build these schemes starting from a generalization of Bayesian data assimilation based on a dynamically consistent embedding into an infinite-dimensional operator algebra acting on the L² space associated with an invariant measure of the system. Under this embedding, forecasting is represented by a quantum operation induced by the Koopman operator of the dynamical system, and Bayesian analysis is represented by quantum effects. In addition to providing a useful starting point for discretizing data assimilation, this construction draws connections between statistical inference methods for classical dynamical systems and quantum information and quantum probability, which should be of independent interest.

QMDA leverages properties of operator algebras to project the infinite-dimensional framework to the level of a matrix algebra in such a manner that positive operators are represented by positive matrices and the finite-dimensional system is a quantum operation. QMDA also has a data-driven formulation based on kernel methods for machine learning with consistent asymptotic behavior as the amount of training data increases. We have demonstrated the efficacy of QMDA with forecasting experiments of the slow variables of the Lorenz 96 multiscale system in a chaotic regime and the El Niño Southern Oscillation in a climate model. QMDA was shown to perform well in terms of point forecasts from quantum mechanical expectations, while also providing uncertainty quantification by representing entire forecast distributions via quantum states.

This work motivates further application and development of algebraic approaches and quantum information to building models and performing inference of complex dynamical systems. In particular, as we enter the quantum computing era, there is a clear need to lay out the methodological and algorithmic foundations for quantum simulation of complex classical systems. Being firmly rooted in quantum information and operator theory, the QMDA framework presented in this paper is a natural candidate for implementation in quantum computers, which we have demonstrated here by means of simulated quantum circuit experiments. As noted in the opening section of the paper, efforts to simulate classical dynamical systems on quantum computers are being actively pursued (22, 47, 48). Porting data assimilation algorithms such as QMDA to a physical quantum computational environment presents new challenges as the iterative nature of the forecast–analysis cycle will require repeated interaction between the quantum computer and the assimilated classical system, possibly using quantum sensors (49). We believe that addressing these challenges is a fruitful area for future research with both theoretical and applied dimensions.

Supplementary Material

Appendix 01 (PDF)


Acknowledgments

We thank Philipp Pfeffer, Travis Russell, and Jörg Schumacher for stimulating discussions. D.G. acknowledges support from the US National Science Foundation under grants 1842538 and DMS-1854383, the US Office of Naval Research under MURI grant N00014-19-1-242, and the US Department of Defense, Basic Research Office under Vannevar Bush Faculty Fellowship grant N00014-21-1-2946. D.C.F. is supported as a PhD student under the last grant. A.O. was supported by the US Department of Energy, Office of Science, Basic Energy Sciences under award DE-SC0002164 (underlying dynamical techniques), and by the US National Science Foundation under awards STC-1231306 (underlying data analytical techniques) and DBI-2029533 (underlying analytical models). J.S. acknowledges support from NSF EAGER grant 1551489.

Author contributions

D.G., A.O., and J.S. designed research; D.F., D.G., B.M., A.O., and J.S. performed research; D.F., D.G., B.M., and J.S. contributed new reagents/analytic tools; D.G. and J.S. analyzed data; and D.G. wrote the paper.

Competing interests

The authors declare no competing interest.

Footnotes

This article is a PNAS Direct Submission.

Data, Materials, and Software Availability

The CCSM4 data analyzed in this study are available at the Earth System Grid repository under accession code https://www.earthsystemgrid.org/dataset/ucar.cgd.ccsm4.joc.b40.1850.track1.1deg.006.html (accessed January 2023). MATLAB code reproducing the ENSO and L96 results in Figs. 2–5 is available in the repository https://doi.org/10.5281/zenodo.7554628 under directory /pubs/FreemanEtAl23_PNAS. This directory also contains a Python Jupyter notebook that reproduces the quantum circuit simulation results in Fig. 7.

References

1. Cressman G. P., An operational objective analysis system. Mon. Wea. Rev. 87, 367–374 (1959).
2. Kalman R. E., A new approach to linear filtering and prediction problems. J. Basic Eng. 82, 35–45 (1960).
3. Majda A. J., Harlim J., Filtering Complex Turbulent Systems (Cambridge University Press, Cambridge, 2012).
4. Law K., Stuart A., Zygalakis K., Data Assimilation: A Mathematical Introduction, Texts in Applied Mathematics (Springer, New York, 2015), vol. 62.
5. Kalnay E., Atmospheric Modeling, Data Assimilation, and Predictability (Cambridge University Press, Cambridge, 2003).
6. Bannister R. N., A review of operational methods of variational and ensemble-variational data assimilation. Quart. J. Roy. Meteorol. Soc. 143, 607–633 (2016).
7. Karspeck A. R., et al., A global coupled ensemble data assimilation system using the Community Earth System Model and the Data Assimilation Research Testbed. Quart. J. Roy. Meteorol. Soc. 144, 2404–2430 (2018).
8. van Leeuwen P. J., Künsch H. R., Nerger L., Potthast R., Reich S., Particle filters for high-dimensional geoscience applications: A review. Quart. J. Roy. Meteorol. Soc. (2019).
9. Takesaki M., Theory of Operator Algebras I, Encyclopaedia of Mathematical Sciences (Springer, Berlin, 2001), vol. 124.
10. Koopman B. O., Hamiltonian systems and transformation in Hilbert space. Proc. Natl. Acad. Sci. U.S.A. 17, 315–318 (1931).
11. Baladi V., Positive Transfer Operators and Decay of Correlations, Advanced Series in Nonlinear Dynamics (World Scientific, Singapore, 2000), vol. 16.
12. Eisner T., Farkas B., Haase M., Nagel R., Operator Theoretic Aspects of Ergodic Theory, Graduate Texts in Mathematics (Springer, Cham, 2015), vol. 272.
13. Emch G. G., Algebraic Methods in Statistical Mechanics and Quantum Field Theory (Dover Publications, Mineola, 2009).
14. Alber G., et al., Quantum Information: An Introduction to Basic Theoretical Concepts and Experiments, Springer Tracts in Modern Physics (Springer-Verlag, Berlin, 2001), vol. 173.
15. Holevo A. S., Statistical Structure of Quantum Theory, Lecture Notes in Physics Monographs (Springer, Berlin, 2001), vol. 67.
16. Wilde M. M., Quantum Information Theory (Cambridge University Press, Cambridge, 2013).
17. Giannakis D., Quantum mechanics and data assimilation. Phys. Rev. E 100, 032207 (2019).
18. Takhtajan L. A., Quantum Mechanics for Mathematicians, Graduate Studies in Mathematics (American Mathematical Society, Providence, 2008), vol. 95.
19. Giannakis D., Slawinska J., Zhao Z., “Spatiotemporal feature extraction with data-driven Koopman operators” in Proceedings of the 1st International Workshop on Feature Extraction: Modern Questions and Challenges at NIPS 2015, Proceedings of Machine Learning Research, Storcheus D., Rostamizadeh A., Kumar S., Eds. (PMLR, Montreal, Canada, 2015), vol. 44, pp. 103–115.
20. Giannakis D., Data-driven spectral decomposition and forecasting of ergodic dynamical systems. Appl. Comput. Harmon. Anal. 47, 338–396 (2019).
21. Das S., Giannakis D., Slawinska J., Reproducing kernel Hilbert space compactification of unitary evolution groups. Appl. Comput. Harmon. Anal. 54, 75–136 (2021).
22. Giannakis D., Ourmazd A., Schumacher J., Slawinska J., Embedding classical dynamics in a quantum computer. Phys. Rev. A 105, 052404 (2022).
23. Davies E. B., Lewis J. T., An operational approach to quantum probability. Commun. Math. Phys. 17, 239–260 (1970).
24. Leifer M. S., Spekkens R. W., Towards a formulation of quantum theory as a causally neutral theory of Bayesian inference. Phys. Rev. A 88, 052130 (2013).
25. Jacobs B., Zanasi F., A predicate/state transformer semantics for Bayesian learning. Electron. Notes Theor. Comput. Sci. 325, 185–200 (2016).
26. Gudder S., “Quantum probability” in Handbook of Quantum Logic and Quantum Structures, Engesser K., Gabbay D. M., Lehmann D., Eds. (Elsevier, Amsterdam, 2007), pp. 121–146.
27. Paulsen V. I., Raghupathi M., An Introduction to the Theory of Reproducing Kernel Hilbert Spaces, Cambridge Studies in Advanced Mathematics (Cambridge University Press, Cambridge, 2016), vol. 152.
28. Stinespring W. F., Positive functions on C^*-algebras. Proc. Amer. Math. Soc. 6, 211–216 (1955).
29. Yuval J., O’Gorman P. A., Stable machine-learning parameterization of subgrid processes for climate modeling at a range of resolutions. Nat. Commun. 11, 3295 (2020).
30. Berry T., Giannakis D., Harlim J., Nonparametric forecasting of low-dimensional dynamical systems. Phys. Rev. E 91, 032915 (2015).
31. Klus S., Koltai P., Schütte C., On the numerical approximation of the Perron-Frobenius and Koopman operator. J. Comput. Dyn. 3, 51–79 (2016).
32. Sauer T., Yorke J. A., Casdagli M., Embedology. J. Stat. Phys. 65, 579–616 (1991).
33. von Luxburg U., Belkin M., Bousquet O., Consistency of spectral clustering. Ann. Stat. 36, 555–586 (2008).
34. Lorenz E. N., “Predictability of weather and climate” in Predictability of Weather and Climate, Palmer T., Hagedorn R., Eds. (Cambridge University Press, Cambridge, 1996), pp. 40–58.
35. Fatkullin I., Vanden-Eijnden E., A computational strategy for multiscale systems with applications to Lorenz 96 model. J. Comput. Phys. 200, 605–638 (2004).
36. Burov D., Giannakis D., Manohar K., Stuart A., Kernel analog forecasting: Multiscale test problems. Multiscale Model. Simul. 19, 1011–1040 (2021).
37. Berry T., Harlim J., Variable bandwidth diffusion kernels. Appl. Comput. Harmon. Anal. 40, 68–96 (2016).
38. Coifman R., Hirn M., Bi-stochastic kernels via asymmetric affinity functions. Appl. Comput. Harmon. Anal. 35, 177–180 (2013).
39. Alexander R., Giannakis D., Operator-theoretic framework for forecasting nonlinear time series with kernel analog techniques. Phys. D 409, 132520 (2020).
40. Bjerknes J., Atmospheric teleconnections from the equatorial Pacific. Mon. Wea. Rev. 97, 163–172 (1969).
41. Wang C., et al., “El Niño and Southern Oscillation (ENSO): A review” in Coral Reefs of the Eastern Tropical Pacific: Persistence and Loss in a Dynamic Environment, Coral Reefs of the World, Glynn P. W., Manzello D. P., Enoch I. C., Eds. (Springer Netherlands, Dordrecht, 2017), vol. 8, pp. 85–106.
42. McPhaden M. J., Zebiak S. E., Glantz M. H., ENSO as an integrating concept in earth system science. Science 314, 1740–1745 (2006).
43. Wang X., Slawinska J., Giannakis D., Extended-range statistical ENSO prediction through operator-theoretic techniques for nonlinear dynamics. Sci. Rep. 10, 2636 (2020).
44. Gent P. R., et al., The community climate system model version 4. J. Climate 24, 4973–4991 (2011).
45. Anis M. S., et al., Qiskit: An open-source framework for quantum computing (2021).
46. Nielsen M. A., Chuang I. L., Quantum Computation and Quantum Information (Cambridge University Press, Cambridge, 2010).
47. Joseph I., Koopman-von Neumann approach to quantum simulation of nonlinear classical dynamics. Phys. Rev. Res. 2, 043102 (2020).
48. Liu J. P., Kolden H. Ø., Krovi H. K., Childs A. M., Efficient quantum algorithm for dissipative nonlinear differential equations. Proc. Natl. Acad. Sci. U.S.A. 118, e2026805118 (2021).
49. Huang H. Y., et al., Quantum advantage in learning from experiments. Science 376, 1182–1186 (2022).
