Published in final edited form as: Inverse Probl. 2019 Aug 21;35(9):095007. doi: 10.1088/1361-6420/ab1c09

Ensemble Kalman Methods With Constraints

David J Albers 1,2, Paul-Adrien Blancquart 3, Matthew E Levine 4, Elnaz Esmaeilzadeh Seylabi 5, Andrew Stuart 4
PMCID: PMC7677878  NIHMSID: NIHMS1039154  PMID: 33223593

Abstract

Ensemble Kalman methods constitute an increasingly important tool in both state and parameter estimation problems. Their popularity stems from the derivative-free nature of the methodology which may be readily applied when computer code is available for the underlying state-space dynamics (for state estimation) or for the parameter-to-observable map (for parameter estimation). There are many applications in which it is desirable to enforce prior information in the form of equality or inequality constraints on the state or parameter. This paper establishes a general framework for doing so, describing a widely applicable methodology, a theory which justifies the methodology, and a set of numerical experiments exemplifying it.

Keywords: ensemble Kalman methods, equality and inequality constraints, derivative-free optimization, convex optimization

1. Introduction

1.1. Overview

Kalman filter based methods have been enormously successful in both state and parameter estimation problems. However, a major disadvantage of such methods is that they do not naturally take constraints into account. The ability to constrain a system has a number of advantages that can play an important role in state and parameter estimation: constraints can be used to enforce physicality of modeled systems (non-negativity of physical quantities, for example); relatedly, they can be used to ensure that computational models are employed only within state and parameter regimes where the model is well-posed; and finally the application of constraints may provide robustness to outlier data. Resulting improvements in algorithmic efficiency and performance, by means of enforcing constraints, have been demonstrated in the recent literature in a diverse set of fields, including process control [1], biomechanics [2], cell energy metabolism [3], medical imaging [4], engine health estimation [5], weather forecasting [6], chemical engineering [7], and hydrology [8]. Within the Kalman filtering literature the need to incorporate constraints is widely recognized and has been addressed in a systematic fashion by viewing Kalman filtering from the perspective of optimization. Indeed this optimization perspective leads naturally to many extensions, and to the incorporation of constraints in particular. Including constraints in Kalman filtering, via optimization, lends itself to an elegant mathematical framework and a practical computational framework, and has potential in numerous applications. Surveys of the work may be found in the papers of Aravkin, Burke and co-workers [9, 10]; our work in this paper may be viewed as generalizing their perspective to the ensemble setting.

In the probabilistic view of filtering methods, constraints may be introduced by moving beyond the Gaussian assumptions that underpin Kalman methods and imposing constraints through the prior distributions on states and/or parameters. This, however, can create significant computational burden as the resulting distributions cannot be represented in closed form, through a finite number of parameters, in the way that Gaussian distributions can be. Here we circumvent this issue by taking the viewpoint that ensemble Kalman methods constitute a form of derivative-free optimization methodology, eschewing the probabilistic interpretation. The ensemble is used to calculate surrogates for derivatives. With this optimization perspective, constraints may be included in a natural way. Standard ensemble Kalman methods employ a quadratic optimization problem encapsulating relative strengths of belief in the predictions of the model and the data; these optimization problems have explicit analytic solutions. To impose constraints the optimization problem is solved only within the constraint set; when the constraints form a non-empty closed convex set, this constrained optimization problem has a unique solution.

In this introductory section, we give a literature review of existing work in this setting, we describe the contributions in this paper, and we outline notation used throughout.

1.2. Literature Review

Overviews of state estimation using Kalman based methods may be found in [11, 12, 13, 14]. The focus of this article is on ensemble based Kalman methods, introduced by Evensen in [15] and further developed in [16, 11]. The extension of the ensemble Kalman methodology to parameter estimation and inverse problems is overviewed in [17], especially for oil reservoir applications, and in an application-neutral formulation in [18]. Equipping Kalman-based methods with constraints can be desirable for a variety of inter-linked reasons described in the previous subsection: to enforce known physical boundaries in order to improve estimation accuracy; to operationalize filtering of a model which is ill-posed in subsets of its state or parameter space; and to provide robustness to noisy data and outlier events.

In extending the Kalman filter to non-Gaussian settings, a number of methods may be considered. Particle filters provide the natural methodology if propagation of probability distributions is required for state [19] or parameter [20] estimation. In the optimization setting, there are three primary methodologies: the extended Kalman filter, the unscented Kalman filter and the ensemble Kalman filter. The extended Kalman filter is based on linearization of the nonlinear system and therefore requires the computation of derivatives for propagation of the state covariance; this makes it unattractive in high dimensional problems. Unscented and ensemble Kalman filters, on the other hand, can be considered as particle-based methods which are derivative-free. In the unscented Kalman filter, the particles (sigma points) are chosen deterministically and are propagated through the nonlinear system to approximate the covariance, which is then corrected using the Kalman gain to compute the new sigma points. In the ensemble Kalman filter, the particles (ensemble members) are chosen randomly from the initial ensemble and are propagated through the dynamical system and corrected using the Kalman gain, without the need to maintain the covariance.

In [21], and more recently in [22], overviews of different ways to impose constraints in linear and nonlinear state estimation are presented. To ensure that the estimates satisfy the constraints, moving horizon based estimators that solve a constrained optimization problem have been proposed [23, 24]. The paper [25] proposed a recursive nonlinear dynamic data reconciliation (RNDDR) approach based on extended Kalman filtering to ensure that state and parameter estimates satisfy the imposed bounds and constraints. The updated state estimates in this method are obtained by solving an optimization problem instead of using the Kalman gain. The resulting covariance calculations are, however, still similar to the Kalman filter: that is, unconstrained propagation and correction involving the Kalman gain, which can affect the accuracy of the estimates. To eliminate this deficiency, [26] proposed a Kullback-Leibler based method to update states and error covariances by solving a convex optimization problem involving conic constraints.

On the other hand, the paper [27] combined the concept of the unscented transformation [28] with the RNDDR formulation. In the prediction step, they propose step sizes to scale sigma points asymmetrically, to better approximate the covariance information in the presence of lower and upper bounds. Then, for the update of each sigma point, they solve a constrained optimization problem. One disadvantage of this procedure is that the chosen step sizes for scaling the sigma points can only ensure satisfaction of the bound constraints. The paper [1] also tested various algorithms based on constrained optimization, projection [29] and truncation [5] to enforce bound constraints in unscented Kalman filtering. The paper [30] developed a class of estimators, named constrained unscented recursive estimators, to address the limitations of the unscented RNDDR method, using optimization-based projection algorithms for obtaining sigma points in the presence of convex, non-convex and bound constraints.

As mentioned earlier, since the corrected covariance is used to compute the sigma points, unscented formulations always require enforcing constraints in both the propagation and correction/update steps. In contrast, ensemble-based methods only require constraints to be enforced in the update step. In this context, the paper [8] tested projection and accept/reject methods to constrain ensemble members in a post-processing step, after application of the unconstrained ensemble Kalman filter. In the former, they project the updated ensemble members onto the feasible space if they violate the constraints; in the latter, they force the updated ensemble members to obey the constraints by resampling the dynamic and/or data model errors. On the other hand, [31, 32] proposed updating the state estimates in ensemble Kalman filtering by solving a constrained optimization problem while truncating the Gaussian distribution of the initial ensemble. The paper [6] demonstrated how to enforce a physics-based conservation law in an ensemble Kalman filtering based state estimation problem by formulating the filter update as a set of quadratic programming problems arising from a linear data acquisition model subject to linear constraints. Here we develop this body of work on constraining ensemble Kalman techniques, providing a unifying framework with an underpinning theoretical basis.

1.3. Our Contribution

The preceding literature review demonstrates that the imposition of constraints on state and parameter estimation procedures is highly desirable. It also indicates that ensemble Kalman methods offer the most natural context in which to attempt to do this, as extended Kalman methods do not scale well to high dimensional state or parameter space, whilst the unscented filter does not lend itself as naturally to the incorporation of constraints.

In this paper we build on the application-specific papers [8, 6] which demonstrate how to impose a number of particular constraints on ensemble based parameter and state estimation problems respectively. We formulate a very general methodology which is application-neutral and widely applicable, thereby making the ideas in [8, 6] accessible to a wide community of researchers working in inverse problems and state estimation. We also describe a straightforward mathematical analysis which demonstrates that the resulting algorithms are well-defined since they involve the solution of quadratic minimization problems subject to convex constraints at each step of the algorithm; these optimization problems have a unique solution. And finally we showcase the methodology on two applications, one from biomedicine and one from seismology. All of the algorithms discussed are clearly stated in pseudo-code.

Section 2 outlines the ensemble Kalman (EnKF) methodology for state estimation, with and without constraints. In section 3 the same program is carried out for ensemble Kalman inversion (EKI). Section 4 describes the numerical experiments which illustrate the foregoing ideas.

1.4. Notation

Throughout the paper we use $\mathbb{N}$ to denote the positive integers $\{1, 2, 3, \cdots\}$ and $\mathbb{Z}^+$ to denote the non-negative integers $\mathbb{N} \cup \{0\} = \{0, 1, 2, 3, \cdots\}$. The matrix $I_M$ denotes the identity on $\mathbb{R}^M$. We use $|\cdot|$ to denote the Euclidean norm, and the corresponding inner-product is denoted $\langle\cdot,\cdot\rangle$. A symmetric, square matrix $A$ is positive definite (resp. positive semi-definite) if the quadratic form $\langle u, Au\rangle$ is positive (resp. non-negative) for all $u \neq 0$. By $|\cdot|_B$ we denote the weighted norm defined by $|v|_B^2 = v^* B^{-1} v$ for any positive-definite $B$. The corresponding weighted Euclidean inner-product is given by $\langle\cdot,\cdot\rangle_B := \langle\cdot, B^{-1}\cdot\rangle$. We use $\otimes$ to denote the outer product between two vectors: $(a \otimes b)c = \langle b, c\rangle a$.
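For concreteness in the numerical sketches that follow, the weighted norm $|v|_B^2 = \langle v, B^{-1}v\rangle$ can be evaluated via a linear solve rather than an explicit inverse; a minimal Python helper (an illustrative convention, not code from the paper):

```python
import numpy as np

def weighted_norm_sq(v, B):
    """Weighted norm |v|_B^2 = <v, B^{-1} v> for positive-definite B,
    computed with a linear solve instead of forming B^{-1}."""
    return float(v @ np.linalg.solve(B, v))
```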

2. Ensemble Kalman State Estimation

2.1. Filtering Problem

Consider the discrete-time dynamical system with noisy state transitions and noisy observations in the form:

Dynamics Model: $v_{j+1} = \Psi(v_j) + \xi_j$, $j \in \mathbb{Z}^+$
Data Model: $y_{j+1} = H v_{j+1} + \eta_{j+1}$, $j \in \mathbb{Z}^+$
Probabilistic Structure: $v_0 \sim N(m_0, C_0)$, $\xi_j \sim N(0, \Sigma)$, $\eta_j \sim N(0, \Gamma)$
Probabilistic Structure: $v_0$, $\{\xi_j\}$, $\{\eta_j\}$ independent

We assume that $\mathcal{H}_1$, $\mathcal{H}_2$ are finite dimensional Hilbert spaces. Then $v_j \in \mathcal{H}_1$, and $\Psi : \mathcal{H}_1 \to \mathcal{H}_1$ is the state-transition operator. The operator $H : \mathcal{H}_1 \to \mathcal{H}_2$ is the linear observation operator and $y_j \in \mathcal{H}_2$. The covariance operators $C_0$, $\Sigma$ are assumed to be invertible. The objective of filtering is to estimate the state $v_j$ of the dynamical system at time $j$, given the data $\{y_l\}_{l=1}^{j}$.

Remark 2.1.

  • We may extend the methodology in this paper to the setting where $\mathcal{H}_1$, $\mathcal{H}_2$ are separable infinite dimensional Hilbert spaces. The covariance operators $C_0$, $\Sigma$ are assumed trace-class on $\mathcal{H}_1$, and $\Gamma$ on $\mathcal{H}_2$, to ensure that the initial condition $v_0$ and the noises $\xi_j$ and $\eta_j$ live in $\mathcal{H}_1$, $\mathcal{H}_1$ and $\mathcal{H}_2$ (respectively) with probability one. The update formulae we derive require operator composition and inversion, together with minimization of quadratic functionals on $\mathcal{H}_1$ subject to convex constraints. Provided all of these operations can be carried out, the methods derived here are well-defined in the general Hilbert space setting. This fact is important because it means that the methods derived have a robustness to mesh refinement and similar procedures arising when the problem of interest is specified via a partial differential equation, or other infinite dimensional problem.

  • We restrict attention to linear observation operators H because this leads to solvable quadratic optimization problems within the context of Kalman-based methods. In principle, a non-linear observation operator could be used, but the optimization problems defining the algorithms arising in this paper might not have a unique solution in this setting.

2.2. Ensemble Kalman Filter

The ensemble Kalman filter is a particle-based sequential optimization approach to the state estimation problem. The particles are denoted by $\{v_j^{(n)}\}_{n=1}^N$ and represent a collection of $N$ candidate state estimates at time $j$. The method proceeds as follows. The states of all the particles at time $j+1$ are predicted using the dynamics model to give $\{\hat v_{j+1}^{(n)}\}_{n=1}^N$. The resulting empirical covariance of the particles is then used to define the objective function $I_{\mathrm{filter},j,n}(v)$, which encapsulates the model-data compromise. This is minimized in order to obtain the updates $\{v_{j+1}^{(n)}\}_{n=1}^N$. To understand the origin of this optimization perspective on ensemble Kalman methods we argue as follows. In equation (4.10) of [33], it is shown that the data incorporation step of the Kalman filter may be written as a quadratic optimization problem for the state. In equation (4.15) of [33], the ensemble Kalman filter is written by using this quadratic minimization principle with a covariance computed empirically from the ensemble.

The prediction step is

$\hat v_{j+1}^{(n)} = \Psi(v_j^{(n)}) + \xi_j^{(n)}, \quad n = 1, \ldots, N$ (1a)
$\hat m_{j+1} = \frac{1}{N}\sum_{n=1}^N \hat v_{j+1}^{(n)}$ (1b)
$\hat C_{j+1} = \frac{1}{N}\sum_{n=1}^N \bigl(\hat v_{j+1}^{(n)} - \hat m_{j+1}\bigr)\bigl(\hat v_{j+1}^{(n)} - \hat m_{j+1}\bigr)^T.$ (1c)

Here we have $\xi_j^{(n)} \sim N(0, \Sigma)$ i.i.d. Because the empirical covariance contains only $N - 1$ independent pieces of information, (1c) is sometimes scaled by $N - 1$ and not $N$; making this change would lead to no changes in the statements and proofs of all the theorems, and would only affect the definition of covariance within the algorithms.
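To make the prediction step concrete, here is a minimal NumPy sketch of (1); the ensemble layout (one particle per row) and the names Psi and Sigma are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def enkf_predict(ensemble, Psi, Sigma, rng):
    """Prediction step (1): propagate each particle through the dynamics,
    add model noise, and form the empirical mean and covariance.

    ensemble : (N, d) array of particles v_j^(n), one per row
    Psi      : state-transition map acting on a single d-vector
    Sigma    : (d, d) model-noise covariance
    """
    N, d = ensemble.shape
    noise = rng.multivariate_normal(np.zeros(d), Sigma, size=N)
    v_hat = np.array([Psi(v) for v in ensemble]) + noise          # (1a)
    m_hat = v_hat.mean(axis=0)                                    # (1b)
    E = v_hat - m_hat
    C_hat = E.T @ E / N                                           # (1c); divide by N-1 for the scaled variant
    return v_hat, m_hat, C_hat
```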

Let $\mathcal{R}(\hat C_{j+1})$ denote the range of $\hat C_{j+1}$. The update step is then

$v_{j+1}^{(n)} = \operatorname*{arg\,min}_v I_{\mathrm{filter},j,n}(v)$ (2)

where

$I_{\mathrm{filter},j,n}(v) := \begin{cases} \frac{1}{2}\bigl|y_{j+1}^{(n)} - Hv\bigr|_\Gamma^2 + \frac{1}{2}\bigl|v - \hat v_{j+1}^{(n)}\bigr|_{\hat C_{j+1}}^2 & \text{if } v - \hat v_{j+1}^{(n)} \in \mathcal{R}(\hat C_{j+1}), \\ +\infty & \text{otherwise.} \end{cases}$ (3)

It can be useful to rewrite the objective function for the optimization problem in an equivalent and more standard form for input to software:

$\begin{cases} \frac{1}{2}v^T\bigl(H^T\Gamma^{-1}H + \hat C_{j+1}^{-1}\bigr)v - \bigl(\hat C_{j+1}^{-1}\hat v_{j+1}^{(n)} + H^T\Gamma^{-1}y_{j+1}^{(n)}\bigr)^T v & \text{if } v - \hat v_{j+1}^{(n)} \in \mathcal{R}(\hat C_{j+1}), \\ +\infty & \text{otherwise.} \end{cases}$

The $y_{j+1}^{(n)}$ are either identical to the data $y_{j+1}$, or found by perturbing it randomly.

Note that $\hat C_{j+1}$ is an operator of rank at most $N - 1$, and thus can only be invertible when $N - 1$ is at least the dimension of $\mathcal{H}_1$. For moderate- and high-dimensional systems, it is often impractical to satisfy this condition. However, the minimizing solution can be found by regularizing $\hat C_{j+1}$ by addition of $\epsilon I$ for $\epsilon > 0$, deriving the update equations and then letting $\epsilon \to 0$. We give the resulting formulae, and then justify them immediately afterwards, in the following subsubsection. Alternatively it is possible to directly seek a solution in $\mathcal{R}(\hat C_{j+1})$, which is a subspace of dimension at most $N - 1$; this is done in the subsequent subsubsection.

2.2.1. Formulation In The Original Variables

The well-known Kalman update formulae arising from solution of the minimization problem (2), (3) are as follows:

$S_{j+1} = H\hat C_{j+1}H^T + \Gamma$ (4a)
$K_{j+1} = \hat C_{j+1}H^T S_{j+1}^{-1}$ (Kalman Gain) (4b)
$y_{j+1}^{(n)} = y_{j+1} + s\,\eta_{j+1}^{(n)}, \quad n = 1, \ldots, N$ (4c)
$v_{j+1}^{(n)} = (I - K_{j+1}H)\hat v_{j+1}^{(n)} + K_{j+1}y_{j+1}^{(n)}, \quad n = 1, \ldots, N$ (4d)

Here $\eta_{j+1}^{(n)} \sim N(0, \Gamma)$ i.i.d. and the constant $s$ takes value 0 or 1. When $s = 1$ the $y_{j+1}^{(n)}$ are referred to as perturbed observations. The choice $s = 1$ is made to ensure the correct statistics of the updates in the linear Gaussian setting when a probabilistic viewpoint is taken, and more generally to introduce diversity into the ensemble procedure when an optimization viewpoint is taken.

Derivation of the formulae may be found in [33]. In brief the formulae arise from completing the square in the objective function I filter,j,n(·) and then applying the Sherman–Morrison formula to rewrite the updates in the data space rather than state space; the latter is advantageous in many applications where H2 has dimension much smaller than H1.
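The formulae (4) translate directly into a short analysis routine; the following NumPy sketch (with the perturbed-observation choice $s = 1$ as the default) is an illustrative rendering, not the implementation used for the experiments in this paper.

```python
import numpy as np

def enkf_update(v_hat, C_hat, H, Gamma, y, s=1, rng=None):
    """Unconstrained update step (4) with optional perturbed observations.

    v_hat : (N, d) forecast particles; C_hat : (d, d) forecast covariance
    H     : (k, d) linear observation operator; Gamma : (k, k) noise covariance
    y     : (k,) observed data
    """
    rng = rng or np.random.default_rng()
    N = v_hat.shape[0]
    S = H @ C_hat @ H.T + Gamma                                   # (4a)
    K = C_hat @ H.T @ np.linalg.inv(S)                            # (4b) Kalman gain
    eta = rng.multivariate_normal(np.zeros(len(y)), Gamma, size=N)
    y_pert = y + s * eta                                          # (4c)
    return v_hat + (y_pert - v_hat @ H.T) @ K.T                   # (4d)
```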

We summarize with the following pseudo-code:

[Algorithm 1 pseudo-code: the ensemble Kalman filter (EnKF) in the original variables.]

An equivalent formulation of the minimization problem is now given by means of a penalized Lagrangian approach to incorporate the property that the solution of the optimization problem lies in the range of the empirical covariance. The perspective is particularly useful when further constraints are imposed on the solution of the optimization problem.

Theorem 2.2. Suppose that the dimensions of $\mathcal{H}_1$ and $\mathcal{H}_2$ are finite. Let $j \in \mathbb{Z}^+$ and $1 \le n \le N$. Define $y' = y_{j+1}^{(n)} - H\hat v_{j+1}^{(n)}$. Then the update formulae (4), which follow from the minimization problem (2), (3), may be given alternatively as

$v_{j+1}^{(n)} = \hat v_{j+1}^{(n)} + \operatorname*{arg\,min}_{(a, v') \in \mathcal{A}}\Bigl(\frac{1}{2}|y' - Hv'|_\Gamma^2 + \frac{1}{2}\langle a, v'\rangle\Bigr)$ (5)

where $\mathcal{A} = \{(a, v') \in \mathcal{H}_1 \times \mathcal{H}_1 : \hat C_{j+1}a = v'\}$ and the argmin is projected from the pair $(a, v')$ onto the $v'$ coordinate only. Moreover $v_{j+1}^{(n)} = \lim_{\epsilon \to 0} v_\epsilon$ with

$v_\epsilon = \operatorname*{arg\,min}_{v \in \mathcal{H}_1}\Bigl(\frac{1}{2}\bigl|y_{j+1}^{(n)} - Hv\bigr|_\Gamma^2 + \frac{1}{2}\bigl|\hat v_{j+1}^{(n)} - v\bigr|_{\hat C_\epsilon}^2\Bigr)$ (6)

and $\hat C_\epsilon = \hat C_{j+1} + \epsilon I$.

Proof. For notational convenience denote $\hat C = \hat C_{j+1}$. The objective function $I_{\mathrm{filter},j,n}(v)$ appearing in (3) is finite if and only if $v - \hat v_{j+1}^{(n)}$ lies in the range of $\hat C$. Thus, writing $v = \hat v_{j+1}^{(n)} + v'$, we may confine the minimization to the set of $(a, v') \in \mathcal{A}$. Note that $\hat C$ is in general not invertible, as it has rank at most $N - 1$, which may be less than the dimension of $\mathcal{H}_1$. The set $\mathcal{A}$ thus comprises all $v'$ in the range of $\hat C$ (a convex set) and, for each such $v'$, the set of $a$ solving $\hat C a = v'$; such an $a$ is unique up to translations in the null-space of $\hat C$. Thus $\mathcal{A}$ is a convex set. Notice that, for such pairs $(a, v') \in \mathcal{A}$, $\langle a, v'\rangle = |v'|_{\hat C}^2$, with $v'$ lying in the range of the operator $\hat C$. Although the element $a$ is uniquely defined only up to translations in the nullspace of $\hat C$, such translations do not change the value of the inner product $\langle a, v'\rangle$. The restriction of $\hat C$ to its range is positive definite, which means that the quadratic objective function, now depending only on $v'$, is strongly convex. Therefore the problem has a unique solution and its Lagrangian is written as:

$L(v', a, \lambda) = \frac{1}{2}|y' - Hv'|_\Gamma^2 + \frac{1}{2}\langle a, v'\rangle + \langle\lambda, \hat C a - v'\rangle$

To express the optimality conditions, compute the derivatives of $L$ with respect to $v'$, $a$ and $\lambda$ and set them to zero:

$-H^T\Gamma^{-1}(y' - Hv') + \frac{1}{2}a - \lambda = 0, \qquad \frac{1}{2}v' + \hat C\lambda = 0, \qquad v' - \hat C a = 0.$

The last two equations imply that $\hat C(2\lambda + a) = 0$. Thus we set $\lambda = -\frac{1}{2}a$ and drop the second equation, replacing the first by

$-H^T\Gamma^{-1}(y' - H\hat C a) + a = 0.$

Solving the resulting equation for a gives

$a = \bigl(H^T\Gamma^{-1}H\hat C + I\bigr)^{-1}H^T\Gamma^{-1}y'.$

From this formula it follows that

$v_{j+1}^{(n)} = \hat v_{j+1}^{(n)} + v' = \hat v_{j+1}^{(n)} + \hat C a = \hat v_{j+1}^{(n)} + \hat C\bigl(H^T\Gamma^{-1}H\hat C + I\bigr)^{-1}H^T\Gamma^{-1}y' = \hat v_{j+1}^{(n)} + \hat C\bigl(H^T\Gamma^{-1}H\hat C + I\bigr)^{-1}H^T\Gamma^{-1}\bigl(y_{j+1}^{(n)} - H\hat v_{j+1}^{(n)}\bigr).$

If we define

$K = \hat C\bigl(H^T\Gamma^{-1}H\hat C + I\bigr)^{-1}H^T\Gamma^{-1}$ (8)

then we see that

$v_{j+1}^{(n)} = (I - KH)\hat v_{j+1}^{(n)} + K y_{j+1}^{(n)}.$ (9)

This is precisely the form of the ensemble Kalman update, and to complete the proof of the first part of the theorem it remains to show that this definition of $K$ agrees with the formulae given in (4); this amounts to verifying the identity

$\bigl(H^T\Gamma^{-1}H\hat C + I\bigr)^{-1}H^T\Gamma^{-1} = H^T\bigl(H\hat C H^T + \Gamma\bigr)^{-1}.$ (10)

To verify this we start from the matrix identity

$\bigl(H^T\Gamma^{-1}H\hat C + I\bigr)^{-1}\bigl(H^T\Gamma^{-1}H\hat C H^T + H^T\bigr) = H^T$

noting that it may be factored to write

$\bigl(H^T\Gamma^{-1}H\hat C + I\bigr)^{-1}\bigl(H^T\Gamma^{-1}\bigr)\bigl(H\hat C H^T + \Gamma\bigr) = H^T.$

Inverting $\bigl(H\hat C H^T + \Gamma\bigr)$ on the right gives the desired identity (10).

We now study the alternative representation of the minimization problem (2), (3), by (6). We first note that $H^T\Gamma^{-1}H + \hat C_\epsilon^{-1}$ is strictly positive definite and hence the related quadratic function is strongly convex. As a consequence we have existence and uniqueness of the solution, and the optimality condition becomes

$\bigl(H^T\Gamma^{-1}H + \hat C_\epsilon^{-1}\bigr)v_\epsilon = H^T\Gamma^{-1}y_{j+1}^{(n)} + \hat C_\epsilon^{-1}\hat v_{j+1}^{(n)}.$

Applying the Woodbury matrix identity, we obtain

$v_\epsilon = \bigl(\hat C_\epsilon - \hat C_\epsilon H^T(H\hat C_\epsilon H^T + \Gamma)^{-1}H\hat C_\epsilon\bigr)\bigl(H^T\Gamma^{-1}y_{j+1}^{(n)} + \hat C_\epsilon^{-1}\hat v_{j+1}^{(n)}\bigr).$

Note that the matrix multiplying $\hat v_{j+1}^{(n)}$ is

$\bigl(\hat C_\epsilon - \hat C_\epsilon H^T(H\hat C_\epsilon H^T + \Gamma)^{-1}H\hat C_\epsilon\bigr)\hat C_\epsilon^{-1} = I - \hat C_\epsilon H^T(H\hat C_\epsilon H^T + \Gamma)^{-1}H$

and that the matrix multiplying $y_{j+1}^{(n)}$ is

$\bigl(\hat C_\epsilon - \hat C_\epsilon H^T(H\hat C_\epsilon H^T + \Gamma)^{-1}H\hat C_\epsilon\bigr)H^T\Gamma^{-1} = \hat C_\epsilon H^T\bigl(I - (H\hat C_\epsilon H^T + \Gamma)^{-1}H\hat C_\epsilon H^T\bigr)\Gamma^{-1} = \hat C_\epsilon H^T(H\hat C_\epsilon H^T + \Gamma)^{-1}\Gamma\Gamma^{-1} = \hat C_\epsilon H^T(H\hat C_\epsilon H^T + \Gamma)^{-1}$

so that

$v_\epsilon = \bigl(I - \hat C_\epsilon H^T(H\hat C_\epsilon H^T + \Gamma)^{-1}H\bigr)\hat v_{j+1}^{(n)} + \hat C_\epsilon H^T(H\hat C_\epsilon H^T + \Gamma)^{-1}y_{j+1}^{(n)}.$

Finally, as $A \mapsto A^{-1}$ is continuous over the set of invertible matrices, letting $\epsilon \to 0$ gives:

$\lim_{\epsilon \to 0} v_\epsilon = (I - K_{j+1}H)\hat v_{j+1}^{(n)} + K_{j+1}y_{j+1}^{(n)}$

which concludes the proof.
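As a numerical sanity check on the algebra above, the identity (10) is easy to verify with random matrices standing in for $H$, $\hat C$ and $\Gamma$; a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 5, 3
H = rng.standard_normal((k, d))
A = rng.standard_normal((d, d)); C = A @ A.T                   # stand-in for C-hat (symmetric PSD)
B = rng.standard_normal((k, k)); Gamma = B @ B.T + np.eye(k)   # stand-in for Gamma (positive definite)

Gi = np.linalg.inv(Gamma)
lhs = np.linalg.inv(H.T @ Gi @ H @ C + np.eye(d)) @ H.T @ Gi
rhs = H.T @ np.linalg.inv(H @ C @ H.T + Gamma)
assert np.allclose(lhs, rhs)                                   # identity (10)
```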

2.2.2. Formulation In Range Of The Covariance

The minimization problem for each individual particle has a solution which, when suitably shifted, lies in the range of the empirical covariance. This allows us to seek the solution of the minimization problem as a linear combination of a given set of vectors, and to minimize over the scalars which define this linear combination. This reformulation of the optimization problem is widely employed in a variety of applications, such as weather forecasting, where the number of ensemble members N is much smaller than the dimension of the data space; this is because the inversion of S to form the Kalman gain K takes place in the data space.

In order to implement the minimization in the $N$ dimensional subspace we note that $I_{\mathrm{filter},j,n}(v)$ is infinite unless

$v - \hat v_{j+1}^{(n)} = \hat C_{j+1}a$

for some $a \in \mathcal{H}_1$. From the structure of $\hat C_{j+1}$ given in (1c) it follows that

$v = \hat v_{j+1}^{(n)} + \frac{1}{N}\sum_{m=1}^N b_m e^{(m)}, \qquad e^{(m)} := \hat v_{j+1}^{(m)} - \hat m_{j+1}.$ (11)

Here each $b_m$ is an unknown scalar parameter, and $b := \{b_m\}_{m=1}^N$ is the unknown vector to be determined. This form for $v$ follows from the fact that

$\hat C_{j+1} = \frac{1}{N}\sum_{m=1}^N e^{(m)} \otimes e^{(m)}$ (12)

which in turn implies that

$\hat C_{j+1}a = \frac{1}{N}\sum_{m=1}^N b_m e^{(m)}.$ (13)

Note that the unknown vector b depends on n as we need to solve the constrained minimization problem for each of the particles, indexed by n = 1,…, N; we have suppressed the dependence of b on n for notational simplicity.

The expression (11) for $v$ in terms of the $e^{(m)}$ can be substituted into (3) to obtain a functional $J_{\mathrm{filter},j,n}(b)$ to be minimized over $b \in \mathbb{R}^N$, because $v$ is an affine function of $b$. Equation (11) may be written in compact form as

$v = \hat v_{j+1}^{(n)} + Bb$ (14)

where $B$ is the linear mapping from $\mathbb{R}^N$ into $\mathcal{H}_1$ defined by

$Bb := \frac{1}{N}\sum_{m=1}^N b_m e^{(m)}.$

We now identify $J_{\mathrm{filter},j,n}(b)$. We note that (13) is solved by taking

$b_m = \langle e^{(m)}, a\rangle.$

Although $a$ is not unique, the non-uniqueness stems only from translations in the nullspace of $\hat C_{j+1}$. Translations do affect the values taken by the $b_m$, but do not affect the vector $v$ given by (14), because they result in changes to $b$ which are in the null-space of $B$. Furthermore, for any such solution, independently of which $a$ is chosen,

$\frac{1}{2}\bigl|v - \hat v_{j+1}^{(n)}\bigr|_{\hat C_{j+1}}^2 = \frac{1}{2}\langle a, \hat C_{j+1}a\rangle = \frac{1}{2N}\sum_{m=1}^N b_m^2.$

Using this and (14) in the definition of $I_{\mathrm{filter},j,n}(v)$ we obtain

$J_{\mathrm{filter},j,n}(b) = I_{\mathrm{filter},j,n}\bigl(\hat v_{j+1}^{(n)} + Bb\bigr)$

and hence, from (3),

$J_{\mathrm{filter},j,n}(b) := \frac{1}{2}\bigl|y_{j+1}^{(n)} - H\hat v_{j+1}^{(n)} - HBb\bigr|_\Gamma^2 + \frac{1}{2N}|b|^2$ (15a)
$= \frac{1}{2}b^T\Bigl(B^TH^T\Gamma^{-1}HB + \frac{1}{N}I\Bigr)b - \Bigl(B^TH^T\Gamma^{-1}\bigl(y_{j+1}^{(n)} - H\hat v_{j+1}^{(n)}\bigr)\Bigr)^T b + \mathrm{const.}$ (15b)

Once b is determined it may be substituted back into (14) to obtain the solution to the minimization problem.
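Since (15b) is a strictly convex quadratic in $b$, the unconstrained minimizer is obtained from an $N \times N$ linear system; a minimal sketch, under the illustrative convention that $B$ is stored as a matrix whose columns are $e^{(m)}/N$:

```python
import numpy as np

def solve_b_unconstrained(B, H, Gamma, y_n, v_hat_n):
    """Minimize the quadratic (15) over b and recover v via (14).

    B : (d, N) matrix realizing Bb = (1/N) * sum_m b_m e^(m)
        (columns are the centred forecast particles divided by N)
    """
    N = B.shape[1]
    Gi = np.linalg.inv(Gamma)
    A = B.T @ H.T @ Gi @ H @ B + np.eye(N) / N        # Hessian of (15b)
    r = B.T @ H.T @ Gi @ (y_n - H @ v_hat_n)          # negative of the linear coefficient
    b = np.linalg.solve(A, r)                         # gradient of (15b) set to zero
    return v_hat_n + B @ b                            # (14)
```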

The preceding considerations also yield the following result, concerning the unconstrained Kalman minimization problem; its proof is a corollary of the more general Theorem 2.4 from the next subsection, which includes constraints in the minimization problem.

Corollary 2.3. Suppose that the dimensions of $\mathcal{H}_1$ and $\mathcal{H}_2$ are finite. Given the prediction (1a), the unconstrained Kalman update formulae may be found by minimizing $J_{\mathrm{filter},j,n}(b)$ from (15) with respect to $b$ and substituting into (14).

We summarize the ensemble Kalman state estimation algorithm, using minimization over the vector $b$, in the following pseudo-code:

[Algorithm 2 pseudo-code: the EnKF formulated in the range of the covariance.]

2.3. Constrained Ensemble Kalman Filter

In this subsection we introduce linear equality and inequality constraints on the state variable into the ensemble Kalman filter. We make predictions according to (1), and then incorporate data by solving the minimization problem (3) subject to the additional constraints


$Fv = f,$ (16a)
$Gv \preceq g.$ (16b)

Here $F$ and $G$ are linear mappings which, respectively, map the state $v$ into spaces whose dimensions equal the number of equality and inequality constraints; the notation $\preceq$ denotes componentwise inequality.

2.3.1. Formulation In The Original Variables

The preceding considerations lead to the following algorithm for ensemble Kalman filtering subject to constraints. The existence of a solution to the constrained minimization follows from Theorem 2.4 below.

[Algorithm 3 pseudo-code: the constrained EnKF in the original variables.]

2.3.2. Formulation In Range Of The Covariance

The linear constraints (16) can be rewritten in terms of the vector b, by means of (14), as follows:

$FBb = f - F\hat v_{j+1}^{(n)},$ (17a)
$GBb \preceq g - G\hat v_{j+1}^{(n)}.$ (17b)

We may thus predict and then optimize the objective function $J_{\mathrm{filter},j,n}(b)$, given by (15), subject to the constraints (17). Implementation of this leads to the following algorithm for ensemble Kalman filtering subject to constraints:

[Algorithm 4 pseudo-code: the constrained EnKF formulated in the range of the covariance.]
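A sketch of the constrained minimization of (15) subject to (17), using a general-purpose SciPy solver; the experiments in this paper used MATLAB's quadprog instead, and any quadratic programming solver may be substituted.

```python
import numpy as np
from scipy.optimize import minimize, LinearConstraint

def solve_b_constrained(B, H, Gamma, y_n, v_hat_n, F=None, f=None, G=None, g=None):
    """Minimize (15) over b subject to the linear constraints (17),
    then recover the updated particle via (14). Illustrative sketch."""
    N = B.shape[1]
    Gi = np.linalg.inv(Gamma)
    A = B.T @ H.T @ Gi @ H @ B + np.eye(N) / N
    r = B.T @ H.T @ Gi @ (y_n - H @ v_hat_n)
    J = lambda b: 0.5 * b @ A @ b - r @ b             # objective (15b), up to a constant
    grad = lambda b: A @ b - r
    cons = []
    if F is not None:  # equality constraints (17a): FBb = f - F v_hat
        rhs = f - F @ v_hat_n
        cons.append(LinearConstraint(F @ B, rhs, rhs))
    if G is not None:  # inequality constraints (17b): GBb <= g - G v_hat
        cons.append(LinearConstraint(G @ B, -np.inf, g - G @ v_hat_n))
    res = minimize(J, np.zeros(N), jac=grad, constraints=cons, method="trust-constr")
    return v_hat_n + B @ res.x
```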

Justification for the use of this algorithm, working in the constrained space parameterized by b, is a consequence of the following:

Theorem 2.4. Suppose that the dimensions of $\mathcal{H}_1$ and $\mathcal{H}_2$ are finite. The problem of finding $v_{j+1}^{(n)}$ as the minimizer of $I_{\mathrm{filter},j,n}(v)$ subject to the constraints (16) is equivalent to finding $b$ to minimize $J_{\mathrm{filter},j,n}(b)$ subject to the constraints (17) and then using (14) to find $v_{j+1}^{(n)}$ from $b$. Furthermore, both of these constrained minimization problems have a unique solution provided that the constraint sets are non-empty.

Proof. For notational convenience set $\hat v = \hat v_{j+1}^{(n)}$, $y = y_{j+1}^{(n)}$, $y' = y - H\hat v$, $\hat C = \hat C_{j+1}$ and $\hat C_\epsilon = \hat C_{j+1} + \epsilon I$.

Denote

$v^* = \operatorname*{arg\,min}_{v'} \frac{1}{2}|y' - Hv'|_\Gamma^2 + \frac{1}{2}\langle a, v'\rangle \quad \text{subject to } \hat C a = v', \; Fv' = f - F\hat v, \; Gv' \preceq g - G\hat v$ (18)

$v_\epsilon = \operatorname*{arg\,min}_{v} \frac{1}{2}|y - Hv|_\Gamma^2 + \frac{1}{2}|v - \hat v|_{\hat C_\epsilon}^2 \quad \text{subject to } Fv = f, \; Gv \preceq g$ (19)

and

$J(v) = \frac{1}{2}|y - Hv|_\Gamma^2 + \frac{1}{2}|v - \hat v|_{\hat C}^2$
$J_\epsilon(v) = \frac{1}{2}|y - Hv|_\Gamma^2 + \frac{1}{2}|v - \hat v|_{\hat C_\epsilon}^2$

The part of the statement of Theorem 2.4 concerning existence of a minimizer is a consequence of Lemma 2.5, stated and proved below. The second part, concerning the equivalence of minimization over $b$ and over $v$ (or $v'$), was shown in equations (11)–(15). This concludes the proof.

Notice that in the following lemma, the variables $v_\epsilon$, $\hat v$, and $v^*$ are analogous to $v_{j+1}^{(n)}$, $\hat v_{j+1}^{(n)}$, and the minimizer over $v'$ from (5), respectively.

Lemma 2.5. Suppose that the constraint sets of (18) and (19) are non-empty. Then $v^*$ exists and is unique, and for all $\epsilon > 0$, $v_\epsilon$ exists and is unique. Furthermore $\lim_{\epsilon \to 0} v_\epsilon = \hat v + v^*$.

Proof. The proof is broken into two parts. In the first we prove existence and uniqueness of a solution by using the idea that the constraint relating $a$ and $v'$ renders the problem convex. In the second part of the proof we study the $\epsilon \to 0$ limit of the regularized solution, extracting convergent subsequences through compactness, and demonstrating that they converge to the desired limit. To prove existence and uniqueness of the solution of (18), notice that it can be reformulated as

$\operatorname*{arg\,min}_{v'} J(\hat v + v') \quad \text{subject to } \hat C a = v', \; Fv' = f - F\hat v, \; Gv' \preceq g - G\hat v$

and that the restriction of $\hat C$ to its range is strictly positive definite. Hence $J$ is a strongly convex function being minimized over a non-empty closed convex set. From standard theory $v^*$ exists and is unique. Then, as $\hat C_\epsilon$ is strictly positive definite, the same type of argument provides existence and uniqueness of $v_\epsilon$.

Now we prove the second part of the lemma. We note that $\hat v + v^*$ satisfies the constraints of (19). It follows that for all $\epsilon > 0$, $J_\epsilon(v_\epsilon) \le J_\epsilon(\hat v + v^*)$. Let us prove that $J_\epsilon(\hat v + v^*) \to J(\hat v + v^*)$ as $\epsilon \to 0$. First denote by $\lambda_1 \le \cdots \le \lambda_{N-1}$ the strictly positive eigenvalues of $\hat C$ (recall that $\hat C$ is symmetric positive semidefinite with rank $N - 1$ almost surely). Hence $\hat C_\epsilon^{-1} = \sum_{k=1}^{N-1}\frac{1}{\lambda_k + \epsilon}a_k a_k^T + \sum_{k=N}^{\dim(\mathcal{H}_1)}\frac{1}{\epsilon}a_k a_k^T$, where the $a_k$'s are the eigenvectors of $\hat C$ (the first and second sums respectively gather the eigenvectors spanning the range and the nullspace of $\hat C$). As $v^*$ lies in the range of $\hat C$, it holds that $|v^* + \hat v - \hat v|_{\hat C_\epsilon}^2 = |v^*|_{\hat C_\epsilon}^2 = \sum_{k=1}^{N-1}\frac{1}{\lambda_k + \epsilon}(a_k^T v^*)^2$. Now, as the $a_k$'s do not depend on $\epsilon$, letting $\epsilon$ tend to zero, this quantity tends to

$\sum_{k=1}^{N-1}\frac{1}{\lambda_k}(a_k^T v^*)^2 = |v^*|_{\hat C}^2 = |v^* + \hat v - \hat v|_{\hat C}^2.$

Therefore $J_\epsilon(\hat v + v^*) \to J(\hat v + v^*)$ as $\epsilon \to 0$. From this we deduce that there exists $\delta > 0$ such that for all $0 < \epsilon < \delta$, $J_\epsilon(v_\epsilon) \le J(\hat v + v^*) + 1$. Then set $w_\epsilon = v_\epsilon - \hat v = w_\epsilon^0 + w_\epsilon^1$, where $w_\epsilon^0$ lies in the nullspace of $\hat C$ and $w_\epsilon^1$ in its range (recall that for a symmetric matrix nullspace and range are orthogonal), and see that $J_\epsilon(v_\epsilon) = \frac{1}{2}|y' - Hw_\epsilon|_\Gamma^2 + \frac{1}{2}|w_\epsilon|_{\hat C_\epsilon}^2$. It holds that $\frac{1}{2}|w_\epsilon|_{\hat C_\epsilon}^2 \le J_\epsilon(v_\epsilon) \le J(\hat v + v^*) + 1$ for $\epsilon$ sufficiently small. Furthermore $|w_\epsilon|_{\hat C_\epsilon}^2 = |w_\epsilon^0|_{\hat C_\epsilon}^2 + |w_\epsilon^1|_{\hat C_\epsilon}^2 = \frac{1}{\epsilon}|w_\epsilon^0|^2 + |w_\epsilon^1|_{\hat C_\epsilon}^2$, and since this quantity is bounded from above we deduce that $w_\epsilon^0 \to 0$ as $\epsilon \to 0$ and that $w_\epsilon^1$ is bounded. Let $(\epsilon_m)_m$ be a sequence of positive real numbers such that $\epsilon_m \to 0$, and from the preceding extract a converging subsequence (denoted $(\epsilon_m)_m$ for simplicity) such that $(w_{\epsilon_m}^1)_m$ converges to a limit denoted $w^*$. As $w_{\epsilon_m}^1$ lies in $\mathcal{R}(\hat C)$, we can use the eigenvalue decomposition of $\hat C$ to show that $|w_{\epsilon_m}^1|_{\hat C_{\epsilon_m}}^2 \to |w^*|_{\hat C}^2$. This limiting identity, and the fact that $w_\epsilon^0$ has limit 0, may be used to establish the first equality within the following chain of equalities and inequalities:

$J(\hat v + w^*) = \lim_{m \to \infty}\Bigl(\frac{1}{2}|y' - Hw_{\epsilon_m}|_\Gamma^2 + \frac{1}{2}|w_{\epsilon_m}^1|_{\hat C_{\epsilon_m}}^2\Bigr) \le \lim_{m \to \infty} J_{\epsilon_m}(v_{\epsilon_m}) \le \lim_{m \to \infty} J_{\epsilon_m}(\hat v + v^*) = J(\hat v + v^*).$

Now note that $w^*$ satisfies all the constraints of (18). Indeed $w_{\epsilon_m}^1$ lies in the range of $\hat C$, which is a closed space; also $v_{\epsilon_m} - \hat v = w_{\epsilon_m}^0 + w_{\epsilon_m}^1 \to w^*$. It is clear that $v_{\epsilon_m} - \hat v$ satisfies the equality and inequality constraints of (18) for all $m$, and hence, passing to the limit, $w^*$ satisfies the equalities and inequalities.

From the uniqueness of the minimizer of (18) we have that $w^*$ is equal to $v^*$. In particular this means that $v^*$ is the unique cluster point of the original sequence $(w_{\epsilon_m}^1)_m$. Since the original sequence was arbitrarily chosen, we conclude that $\lim_{\epsilon \to 0} v_\epsilon = \hat v + v^*$.

Remark 2.6. Notice that the proof remains true if we take general convex inequalities. We simply need the constraint sets to be closed and convex; however we have restricted to linear equality and inequality constraints for simplicity and because these arise most often in practice.

3. Ensemble Kalman Inversion

3.1. Inverse Problem

In this section we show how a generic inverse problem may be formulated as a partially observed dynamical system. This enables the machinery from the preceding section 2 to be used to solve inverse problems.

We are interested in the inverse problem of finding uH1 from yH2 where

$y = \mathcal{G}(u) + \eta, \qquad \eta \sim N(0, \Gamma).$

Time does not appear (explicitly) in this equation (although $\mathcal{G}$ may involve solution of a time-dependent differential equation, for example). In order to use the ideas from the previous section, we introduce a new variable $w = \mathcal{G}(u)$ and rewrite the equation as

$w = \mathcal{G}(u), \qquad y = w + \eta.$

The key point about writing the equation this way is that the data $y$ is now linearly related to the variable $v = (u, w)^T$, and we may apply the ideas of the previous section to the model by introducing the following dynamical system, taking $y_{j+1} = y$ as the given data:

$u_{j+1} = u_j, \qquad w_{j+1} = \mathcal{G}(u_j), \qquad y_{j+1} = w_{j+1} + \eta_{j+1}.$

If we introduce the new variables

$v = (u, w)^T, \qquad \Psi(v) = (u, \mathcal{G}(u))^T$ (20a)
$H = [0, I], \qquad H^\perp = [I, 0],$ (20b)

and write $v_j = (u_j, w_j)^T$, we may write the dynamical system in the form

$v_{j+1} = \Psi(v_j)$ (21a)
$y_{j+1} = Hv_{j+1} + \eta_{j+1},$ (21b)

which is exactly in the same form as in the previous section. We note that

$Hv = w, \qquad H^\perp v = u.$

3.2. Ensemble Kalman Inversion

The prediction step is defined as in (1), the Kalman gain as in (4), and the solution of the optimization problem (2), (3) is given by (4). We now simplify these formulae using the specific structure of $\Psi$, $v$, $H$ arising in the inverse problem and given in (20); this results in block form vectors and matrices. First we note that

$\hat C_{j+1} = \begin{bmatrix} C_{j+1}^{uu} & C_{j+1}^{uw} \\ (C_{j+1}^{uw})^T & C_{j+1}^{ww} \end{bmatrix}, \qquad \bar v_{j+1} = \begin{pmatrix} \bar u_{j+1} \\ \bar w_{j+1} \end{pmatrix}.$

Here

$\bar u_{j+1} = \frac{1}{N}\sum_{n=1}^N u_j^{(n)}, \qquad \bar w_{j+1} = \frac{1}{N}\sum_{n=1}^N \mathcal{G}\bigl(u_j^{(n)}\bigr) := \bar{\mathcal{G}}_j$

and

$C_{j+1}^{uw} = \frac{1}{N}\sum_{n=1}^N \bigl(u_j^{(n)} - \bar u_{j+1}\bigr) \otimes \bigl(\mathcal{G}(u_j^{(n)}) - \bar{\mathcal{G}}_j\bigr), \qquad C_{j+1}^{ww} = \frac{1}{N}\sum_{n=1}^N \bigl(\mathcal{G}(u_j^{(n)}) - \bar{\mathcal{G}}_j\bigr) \otimes \bigl(\mathcal{G}(u_j^{(n)}) - \bar{\mathcal{G}}_j\bigr), \qquad C_{j+1}^{uu} = \frac{1}{N}\sum_{n=1}^N \bigl(u_j^{(n)} - \bar u_{j+1}\bigr) \otimes \bigl(u_j^{(n)} - \bar u_{j+1}\bigr).$

The covariance $C_{j+1}^{ww}$ denotes the empirical covariance of the ensemble in data space, $C_{j+1}^{uu}$ denotes the empirical covariance of the ensemble in the space of the unknown $u$, and $C_{j+1}^{uw}$ denotes the empirical cross-covariance from data space to the space of the unknown.

Noting that $S_{j+1} = C_{j+1}^{ww} + \Gamma$, we obtain

$K_{j+1} = \begin{pmatrix} C_{j+1}^{uw}\bigl(C_{j+1}^{ww} + \Gamma\bigr)^{-1} \\ C_{j+1}^{ww}\bigl(C_{j+1}^{ww} + \Gamma\bigr)^{-1} \end{pmatrix}.$ (22)

Combining equation (22) with the update equation within (4), it follows that the update map

$\{v_j^{(n)}\}_{n=1}^N \mapsto \{v_{j+1}^{(n)}\}_{n=1}^N$

induces the map

$\{H^\perp v_j^{(n)}\}_{n=1}^N \mapsto \{H^\perp v_{j+1}^{(n)}\}_{n=1}^N$

and hence that

$u_{j+1}^{(n)} = H^\perp v_{j+1}^{(n)} = u_j^{(n)} + C_{j+1}^{uw}\bigl(C_{j+1}^{ww} + \Gamma\bigr)^{-1}\bigl(y_{j+1}^{(n)} - \mathcal{G}(u_j^{(n)})\bigr).$

Thus we have derived the EKI update formula:

$u_{j+1}^{(n)} = u_j^{(n)} + C_{j+1}^{uw}\bigl(C_{j+1}^{ww} + \Gamma\bigr)^{-1}\bigl(y_{j+1}^{(n)} - \mathcal{G}(u_j^{(n)})\bigr).$ (23)

We note also that

$w_{j+1}^{(n)} = \mathcal{G}(u_j^{(n)}) + C_{j+1}^{ww}\bigl(C_{j+1}^{ww} + \Gamma\bigr)^{-1}\bigl(y_{j+1}^{(n)} - \mathcal{G}(u_j^{(n)})\bigr).$ (24)

However $w_{j+1}^{(n)}$ is not needed to update the state and so plays no role in this unconstrained EKI algorithm. (It may be used, however, to impose constraints on observation space, as discussed in the next subsection.)
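The update (23) is only a few lines of NumPy, and makes the derivative-free character of EKI plain: only evaluations of $\mathcal{G}$ are required. The following sketch (one particle per row; illustrative names) is not the authors' implementation.

```python
import numpy as np

def eki_update(u_ens, forward, Gamma, y, rng):
    """One unconstrained EKI step (23) for the parameter ensemble.

    u_ens   : (N, p) parameter ensemble u_j^(n)
    forward : map u -> G(u) returning a (k,) vector
    """
    N = u_ens.shape[0]
    w = np.array([forward(u) for u in u_ens])         # G(u_j^(n)), shape (N, k)
    du = u_ens - u_ens.mean(axis=0)
    dw = w - w.mean(axis=0)
    Cuw = du.T @ dw / N                               # empirical cross-covariance
    Cww = dw.T @ dw / N                               # empirical covariance in data space
    y_pert = y + rng.multivariate_normal(np.zeros(len(y)), Gamma, size=N)
    K = Cuw @ np.linalg.inv(Cww + Gamma)
    return u_ens + (y_pert - w) @ K.T                 # (23)
```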

In summary we have derived the following algorithm for solution of the unconstrained inverse problem:

[Algorithm 5 pseudo-code: unconstrained ensemble Kalman inversion (EKI).]

3.3. Ensemble Kalman Inversion With Constraints

3.3.1. Formulation In The Original Variables

We now consider imposing constraints on the optimization step arising in ensemble Kalman inversion. As in the unconstrained case we do this by formulating the problem as a special case of the partially observed dynamical system, subject to constraints, from the previous section.

To this end we formulate the constraints in the space of the unknown and the data as follows:

$F_u u = f_u,$ (25a)
$F_w w = f_w,$ (25b)
$G_u u \preceq g_u,$ (25c)
$G_w w \preceq g_w.$ (25d)

The algorithm proceeds by predicting according to equation (1), and then optimizing (3), all using the specific structure (20), and with the optimization subject to the constraints (25), written in the notation of the general Kalman updating formulae in (27), detailed below; in particular the rewrite (27) of the constraints expresses everything in terms of the variable v. We may summarize the constraints as follows, to allow direct application of the ideas of the previous section. To this end define

$F = \begin{pmatrix} F_u H^\perp \\ F_w H \end{pmatrix} = \begin{pmatrix} F_u & 0 \\ 0 & F_w \end{pmatrix}$ (26a)
$G = \begin{pmatrix} G_u H^\perp \\ G_w H \end{pmatrix} = \begin{pmatrix} G_u & 0 \\ 0 & G_w \end{pmatrix}$ (26b)
$f = \begin{pmatrix} f_u \\ f_w \end{pmatrix}, \qquad g = \begin{pmatrix} g_u \\ g_w \end{pmatrix}.$ (26c)

Then the constraints (25) may be written as

$Fv = f,$ (27a)
$Gv \preceq g.$ (27b)

See Algorithm 6 for the resulting pseudo-code.

[Algorithm 6 pseudo-code: the constrained EKI in the original variables.]
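The block structure (26) is mechanical to assemble from the separate parameter-space and data-space constraint data (25); a minimal array-based sketch (names are illustrative):

```python
import numpy as np

def assemble_constraints(Fu, fu, Fw, fw, Gu, gu, Gw, gw):
    """Build F, G, f, g of (26) acting on v = (u, w)^T from the
    constraints (25) posed separately on u and on w = G(u)."""
    F = np.block([[Fu, np.zeros((Fu.shape[0], Fw.shape[1]))],
                  [np.zeros((Fw.shape[0], Fu.shape[1])), Fw]])
    G = np.block([[Gu, np.zeros((Gu.shape[0], Gw.shape[1]))],
                  [np.zeros((Gw.shape[0], Gu.shape[1])), Gw]])
    f = np.concatenate([fu, fw])
    g = np.concatenate([gu, gw])
    return F, G, f, g
```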

3.3.2. Formulation In Range Of The Covariance

We describe an alternative way to approach the derivation of the EKI update formulae. We apply Theorem 2.4 with the specific structure (20), (27) arising from the dynamical system used in EKI. To this end we define

$J_{\mathrm{filter},j,n}(b) := \frac{1}{2}\bigl|y_{j+1}^{(n)} - \mathcal{G}(u_j^{(n)}) - B^w b\bigr|_\Gamma^2 + \frac{1}{2N}|b|^2$ (28a)
$= \frac{1}{2}b^T\Bigl((B^w)^T\Gamma^{-1}B^w + \frac{1}{N}I\Bigr)b - \Bigl((B^w)^T\Gamma^{-1}\bigl(y_{j+1}^{(n)} - \mathcal{G}(u_j^{(n)})\bigr)\Bigr)^T b + \mathrm{const.}$ (28b)

where b is the vector of N scalar weights bm and

$B^u b = \frac{1}{N}\sum_{m=1}^N b_m\bigl(u_j^{(m)} - \bar u_j\bigr),$ (29a)
$B^w b = \frac{1}{N}\sum_{m=1}^N b_m\bigl(\mathcal{G}(u_j^{(m)}) - \bar{\mathcal{G}}_j\bigr),$ (29b)
$Bb = \begin{pmatrix} B^u b \\ B^w b \end{pmatrix}.$ (29c)

Once this quadratic form has been minimized with respect to b then the update formula (11) gives

$u_{j+1}^{(n)} = u_j^{(n)} + \frac{1}{N}\sum_{m=1}^N b_m\bigl(u_j^{(m)} - \bar u_j\bigr),$ (30a)
$w_{j+1}^{(n)} = \mathcal{G}(u_j^{(n)}) + \frac{1}{N}\sum_{m=1}^N b_m\bigl(\mathcal{G}(u_j^{(m)}) - \bar{\mathcal{G}}_j\bigr).$ (30b)

Note that the vector $\{b_m\}$ depends on the particle label $n$; as in the previous section, we have suppressed this dependence for notational convenience. We may now impose linear equality and inequality constraints on both $u$ and $w = \mathcal{G}(u)$ (i.e. in parameter and data spaces) and minimize (28) subject to these constraints. To be more specific, we impose the constraints (27) expressed in the variable $b$:

$FBb = f - F\hat v_{j+1}^{(n)},$ (31a)
$GBb \preceq g - G\hat v_{j+1}^{(n)}.$ (31b)

Here F, G, f and g are given by (26), B is defined by (29) and

$\hat v_{j+1}^{(n)} = \begin{pmatrix} u_j^{(n)} \\ \mathcal{G}(u_j^{(n)}) \end{pmatrix}.$

See Algorithm 7 for the resulting pseudo-code.

[Algorithm 7 pseudo-code: the constrained EKI formulated in the range of the covariance.]

Remark 3.1. As in the previous section, the result holds true for general convex inequality constraints; the linear case is considered for simplicity of exposition, and because it arises most frequently in practice.

Remark 3.2. The EKI algorithm, with or without constraints, has the following invariant subspace property: define $\mathcal{A} = \mathrm{span}\{u_0^{(n)}\}_{n=1}^N$; then for all $j \in \{0, \ldots, J\}$ and for all $n \in \{1, \ldots, N\}$, the $u_j^{(n)}$ defined by the three algorithms in this section all lie in $\mathcal{A}$. This is a direct consequence of writing the update formulae in terms of $b$ and noting (30).

We can now state a result analogous to Theorem 2.4, and with proof that is a straightforward corollary of that result, using the specific structure (20):

Theorem 3.3. Suppose that the dimensions of $\mathcal{H}_1$ and $\mathcal{H}_2$ are finite. Suppose also that the specific structure (20) is applied. The problem of finding $u_{j+1}^{(n)}$ from the minimizer of $I_{\mathrm{filter},j,n}(v)$, defined in (3) and subject to the constraints (25), is equivalent to finding $b$ that minimizes (28), subject to (31), and then using (30) to find $u_{j+1}^{(n)}$ from $b$. Furthermore, both of these constrained minimization problems have a unique solution provided that the constraint sets are non-empty.

4. Numerical Results

This section contains numerical results which demonstrate the benefits of imposing constraints on ensemble Kalman methods. Subsection 4.1 concerns an application of state estimation (using EnKF) in biomedicine, using real patient data, whilst subsection 4.2 concerns an application of inversion (using EKI) in seismology and employs simulated data. When comparing results from the two experiments, recall that iterations of EKI correspond to an algorithmic dynamics intended to converge to a single distribution (over ensemble members) on the parameters for which we invert, whereas iterations of EnKF correspond to the incorporation of new data at every physical measurement time, and thus the distribution (over ensemble members) is not necessarily expected to converge as the iteration progresses. In both applications, minimizations were performed in MATLAB using the default interior-point method (fmincon) through the general quadratic programming function, quadprog.

4.1. State Estimation

Here we present an application of the constrained EnKF to the tracking and forecasting of human blood glucose levels. We use self-monitoring data collected by an individual with Type 2 Diabetes. We use the “P1” data set described by Albers et al. in [34]; this dataset includes measurements of blood glucose and consumed nutrition, and is publicly available on physionet.org. For more information on the data, and on an unconstrained data assimilation approach using the unscented Kalman filter, see [34]. We model the glucose-insulin system with the ultradian model proposed by [35]. The primary state variables are the glucose concentration, G, the plasma insulin concentration, Ip, and the interstitial insulin concentration, Ii; these three state variables are augmented with a three stage delay (h1, h2, h3) which encodes a non-linear delayed hepatic glucose response to plasma insulin levels. The resulting ordinary differential equations have the form:

$\frac{dI_p}{dt} = f_1(G) - E\Bigl(\frac{I_p}{V_p} - \frac{I_i}{V_i}\Bigr) - \frac{I_p}{t_p}$ (32a)
$\frac{dI_i}{dt} = E\Bigl(\frac{I_p}{V_p} - \frac{I_i}{V_i}\Bigr) - \frac{I_i}{t_i}$ (32b)
$\frac{dG}{dt} = f_4(h_3) + m_G(t) - f_2(G) - f_3(I_i)G$ (32c)
$\frac{dh_1}{dt} = \frac{1}{t_d}(I_p - h_1)$ (32d)
$\frac{dh_2}{dt} = \frac{1}{t_d}(h_1 - h_2)$ (32e)
$\frac{dh_3}{dt} = \frac{1}{t_d}(h_2 - h_3)$ (32f)

Here $m_G(t)$ represents a known rate of ingested carbohydrates, $f_1(G)$ represents the rate of glucose-dependent insulin production, $f_2(G)$ represents insulin-independent glucose utilization, $f_3(I_i)G$ represents insulin-dependent glucose utilization and $f_4(h_3)$ represents delayed insulin-dependent hepatic glucose production; the functional forms of these parameterized processes can be found in the appendix, along with a description of model parameters.

In the EnKF setting, we write $u = [I_p, I_i, G, h_1, h_2, h_3]$, and use (32) to define $F$ such that

$\frac{du}{dt} = F(u, t, \theta),$

where $\theta$ contains model parameters. We then extend the state vector in order to perform joint parameter estimation: $v = [u, R_g]^T$.

For the purposes of this paper, the function $m_G(t)$ may be viewed as known; it is determined from data describing meals consumed by the patient. Since insulin ($I_p$ and $I_i$) and delay variables ($h_1$, $h_2$, and $h_3$) are not measured, whilst glucose is measured, we define the measurement operator to be $H = [0, 0, 1, 0, 0, 0, 0]$. The discrete time forward model is obtained by integrating the deterministic model in (32) between consecutive measurement time-points and applying an identity map to $R_g$. Because these time-points may not be equally spaced, and because the time-dependent forcings (meals) differ in different time-intervals, this leads to a map of the form

$v_{j+1} = \Psi_j(v_j).$

This is a slight departure from the methodology outlined in section 2, where $\Psi$ does not depend on $j$ (autonomous dynamics), but is a straightforward extension which the reader can easily provide.
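A sketch of how such a time-indexed forward map $\Psi_j$ might be built with an off-the-shelf ODE integrator; the right-hand side F, the parameter container theta (assumed to be a dict here), and the state ordering (six model states followed by $R_g$) are illustrative assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp

def make_Psi_j(F, theta, t_j, t_jp1):
    """Return the map v_{j+1} = Psi_j(v_j): integrate the ultradian ODEs (32)
    between consecutive measurement times, and carry the augmented
    parameter R_g forward by the identity map."""
    def Psi_j(v):
        u, Rg = v[:6], v[6]
        theta_n = dict(theta, Rg=Rg)      # the estimated parameter enters the dynamics
        sol = solve_ivp(lambda t, u: F(u, t, theta_n), (t_j, t_jp1), u,
                        rtol=1e-6, atol=1e-8)
        return np.concatenate([sol.y[:, -1], [Rg]])
    return Psi_j
```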

We present EnKF results from a single patient's data when run with and without constraints (Algorithms 3 and 1, respectively). We performed joint state-parameter estimation, augmenting the state with the parameter $R_g$ (see Appendix for details of where this parameter appears) and adding identity-map dynamics for $R_g$. The following constraints were imposed:

$(0.01, 0.01, 2000, 0.01, 0.01, 0.01, 0)^T \preceq v \preceq (10000, 10000, 40000, 10000, 10000, 10000, 1000000)^T$ (33)
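Componentwise bounds of this form fit the inequality template (16b) by stacking the two one-sided constraints; a minimal sketch:

```python
import numpy as np

def bounds_to_Gg(lower, upper):
    """Encode lower <= v <= upper in the form G v <= g of (16b)."""
    d = len(lower)
    G = np.vstack([-np.eye(d), np.eye(d)])
    g = np.concatenate([-np.asarray(lower), np.asarray(upper)])
    return G, g
```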

Figure 1 compares the overall distribution of updated state means over time when running EnKF with and without these state constraints. While individual particles in this experiment often violated the constraints, the overall updated means did not. Nevertheless, enforcement of lower-bound constraints shifts up the state distribution slightly. Note that upper bound constraints were never violated in this experiment.

Figure 1. The distribution of mean state updates when running EnKF with and without inequality constraints. Black vertical lines denote lower bound state constraints.

Figure 2 shows a two-dimensional state projection of updated particles at a given time step before and after applying the constrained optimization. Note that particles may additionally violate constraints in unplotted dimensions—this explains why one particle whose unconstrained update appears to live within the constraints is in fact differently updated under the constrained optimization. Time step 126 was selected for illustrative purposes, and was the measurement event in which particles most often violated the constraints.

Figure 2. Particle updates at a given time-step (here, measurement 126) are shown using a traditional Kalman gain versus using the constrained optimization. The black lines denote lower bound constraints on the states $h_1$ and $h_3$.

Figure 3 depicts the overall frequency of constraint violations. We observe that the measured state (blood glucose) never violated a constraint, nor did the inferred parameter $R_g$. However, other model states did often violate constraints, and up to 30% (4/13) of particles simultaneously violated the constraints at a single time-step.

Figure 3. Percentage map of the constraint violations, where each lower-bound constraint is represented by a row. At each iteration, the percentage of particles that violated a constraint is color-coded, with yellow representing the largest proportion of constraint violations.

By adding constraints, we ensure that all the simulations which constitute the ensemble method are biologically plausible.

4.2. Inverse Problem

Here we present an application of the constrained EKI in seismology. We study near-surface site characterization, in which we invert for the shear wave velocity profile of the geomaterials in the earth's shallow crust, using downhole array data. For forward modeling, we consider a semi-discrete form of the following wave equation in a horizontally stratified heterogeneous soil layer:

$\frac{\partial}{\partial z}\Bigl[c_s^2(z)\frac{\partial d(z, t)}{\partial z}\Bigr] - \frac{\partial^2 d(z, t)}{\partial t^2} = 0.$

Here $d(z, t)$ is the displacement field of the wave response as a function of the spatial variable $z \in (0, H)$ and time variable $t \in (0, T]$. The function $c_s(z)$ is the shear wave velocity function. We impose the following boundary and initial conditions:

$d(H, t) = d_0(t), \qquad \partial d(0, t)/\partial z = 0, \qquad d(z, 0) = 0, \qquad \partial d(z, 0)/\partial t = 0$

where $d_0(t)$ is the prescribed displacement at depth $z = H$. Generally, the shear wave velocity varies as a piecewise constant function of depth. If the layering information, i.e., the total number of layers and their thicknesses, is not available or is poorly characterized, it is desirable to use a generic function for site characterization, such as this:

$c_s(z) = \begin{cases} c_{s0} & 0 \le z \le z_0 \\ c_{s0}\bigl(1 + k(z - z_0)\bigr)^n & z_0 \le z \le z_1 \\ \alpha c_{s0}\bigl(1 + k(z_1 - z_0)\bigr)^n & z_1 \le z \le H. \end{cases}$

See, for example, [36]. In the constrained EKI setting, $u = (c_{s0}, k, z_0, n, z_1, \alpha)$ and

$\mathcal{G}(u) = \partial^2 d(0, t)/\partial t^2.$

For the numerical example studied here, $G_u$ and $g_u$ are determined by enforcing the constraints $0 \le c_{s0} \le 1000$, $0 \le k \le 100$, $0 \le z_0 \le z_1$, $0 \le n \le 1$, $z_0 \le z_1 \le H$, and $1 \le \alpha \le 10$. We generate the initial ensemble by drawing samples from uniform distributions and discarding members that violate the enforced constraints. In order to avoid very large velocities at $z = z_1$, we also discard members with $c_s(z_1) > 5000$ m/s. If we perform parameter learning using the unconstrained EKI, the experiment fails at $j = 1$ because of the inability of the dynamic model to propagate unphysical values of the shear wave velocity $c_s$.
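A sketch of the generic velocity profile and of the rejection sampling used to build a feasible initial ensemble; the bound handling and names are illustrative.

```python
import numpy as np

def cs_profile(u, z):
    """Generic shear wave velocity profile; u = (cs0, k, z0, n, z1, alpha)."""
    cs0, k, z0, n, z1, alpha = u
    if z <= z0:
        return cs0
    if z <= z1:
        return cs0 * (1.0 + k * (z - z0)) ** n
    return alpha * cs0 * (1.0 + k * (z1 - z0)) ** n

def draw_initial_ensemble(N, lo, hi, H_depth, rng):
    """Rejection-sample N members from componentwise uniform bounds (lo, hi),
    discarding draws with z0 > z1, z1 > H, or cs(z1) > 5000 m/s."""
    members = []
    while len(members) < N:
        u = rng.uniform(lo, hi)
        if u[2] <= u[4] <= H_depth and cs_profile(u, u[4]) <= 5000.0:
            members.append(u)
    return np.array(members)
```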

All results shown use Algorithm 7. Figure 4 shows the ensemble distribution of $u$ at $j = 2$ before and after enforcing constraints, whilst Figure 5 shows the evolution of the updated ensemble. Note that the parameter $k$ saturates with an ensemble close to the upper bound of 100 imposed through constraints on this parameter; however, experiments in which we imposed different upper bounds on this parameter led to different estimates for $k$, with little change to the estimated velocity profile, and we conclude that this parameter suffers from identifiability issues. (Note that Figure 4 displays the updated ensemble distribution at a single step in the sequence of ensemble updates, comparing the effect of imposing constraints with neglecting them; in contrast Figure 1 shows the distribution over all measurement time-points of the ensemble means. The figures thus illustrate different phenomena.)

Figure 4. The distribution of parameters before and after enforcing constraints in Algorithm 7 at iteration $j = 2$. Black vertical lines denote the lower and upper bound constraints.

Figure 5. Evolution of the updated ensemble with iteration. Black horizontal lines denote the lower and upper bound constraints.

Moreover, Figure 6a shows the map of violations for the different constraints enforced on the parameters, whilst Figure 6b shows the estimated generic $c_s$ profile after 40 iterations, compared to the true profile and the initial estimate. Figure 6a shows the key role played by the enforcement of constraints. In this case the addition of constraints ensures that all the simulations which constitute the ensemble method are physically meaningful, and also that the forward model remains well-posed.

Figure 6. (a) The percentage map of the constraint violations for the first 20 iterations; (b) the estimated velocity profile ($\bar u_{j=40}$) compared to the true profile ($u^\dagger$) and the initial estimate ($\bar u_{j=0}$).

5. Conclusions

Constraints arise naturally in many state and parameter estimation problems. We have shown how convex constraints may be incorporated into ensemble Kalman based state or parameter estimation algorithms with relatively few changes to existing code: the standard algorithm is applied and for any ensemble member which violates a constraint, a quadratic optimization problem subject to convex constraints is solved instead. We have written the resulting algorithms in easily digested pseudo-code, we have developed an underpinning theory and we have given illustrative numerical examples.

Two primary directions suggest themselves in this area. The first is the use of these methods in applications. As indicated in the introduction, our general formulation is inspired by the two papers [8, 6] from the geosciences, and we have demonstrated applicability to problems from biomedicine and seismology; but many other potential application domains are ripe for application of ensemble Kalman methodology, because of its black-box and derivative-free formulation, and the ability to impose constraints in a straightforward fashion will help to extend this methodology. The second is the theoretical analysis of these methods: can the inclusion of constraints be used to deduce improved accuracy of state or parameter estimates; or can the inclusion of constraints be used to demonstrate improved performance as measured, for example, by the proportion of model runs which are physically (or biologically, etc.) plausible? Furthermore, although the imposition of constraints is reasonable, it is not guaranteed to be free of pathologies in algorithmic performance; ruling out, or understanding, the occurrence of such pathological behaviour may be important.

Acknowledgments

This work was funded by NIH-NLM grant R01 LM012734. AMS was also funded by AFOSR Grant FA9550-17-1-0185 and by ONR grant N00014-17-1-2079.

Appendix

We give the details of the ultradian model of glucose-insulin dynamics used as the forward model in subsection 4.1. An example of the induced dynamics is given in Figure 7.

$\frac{dI_p}{dt} = f_1(G) - E\Bigl(\frac{I_p}{V_p} - \frac{I_i}{V_i}\Bigr) - \frac{I_p}{t_p}$ (34)
$\frac{dI_i}{dt} = E\Bigl(\frac{I_p}{V_p} - \frac{I_i}{V_i}\Bigr) - \frac{I_i}{t_i}$ (35)
$\frac{dG}{dt} = f_4(h_3) + m_G(t) - f_2(G) - f_3(I_i)G$ (36)
$\frac{dh_1}{dt} = \frac{1}{t_d}(I_p - h_1)$ (37)
$\frac{dh_2}{dt} = \frac{1}{t_d}(h_1 - h_2)$ (38)
$\frac{dh_3}{dt} = \frac{1}{t_d}(h_2 - h_3)$ (39)

where, for $N$ meals at times $\{t_j\}_{j=1}^N$ with carbohydrate compositions $\{m_j\}_{j=1}^N$,

$m_G(t) = \sum_{j=1}^{N}\frac{m_j k}{60}\exp\bigl(k(t_j - t)\bigr), \qquad N = \#\{t_j < t\}$ (40)

and

$f_1(G) = \frac{R_m}{1 + \exp\bigl(-\frac{G}{V_g c_1} + a_1\bigr)}$: the rate of insulin production (41)
$f_2(G) = U_b\Bigl(1 - \exp\Bigl(-\frac{G}{C_2 V_g}\Bigr)\Bigr)$: insulin-independent glucose utilization (42)
$f_3(I_i) = \frac{1}{C_3 V_g}\Bigl(U_0 + \frac{U_m - U_0}{1 + (\kappa I_i)^{-\beta}}\Bigr)$, $f_3(I_i)G$: insulin-dependent glucose utilization (43)
$f_4(h_3) = \frac{R_g}{1 + \exp\bigl(\alpha\bigl(\frac{h_3}{C_5 V_p} - 1\bigr)\bigr)}$: delayed insulin-dependent glucose utilization (44)
$\kappa = \frac{1}{C_4}\Bigl(\frac{1}{V_i} - \frac{1}{E t_i}\Bigr)$ (45)

Figure 7. The oscillating dynamics of the glucose-insulin response in the ultradian model, driven by an exponentially decaying nutritional driver $m_G$.

References

  • [1] Teixeira BO, Tôrres LA, Aguirre LA and Bernstein DS 2010 Journal of Process Control 20 45–57
  • [2] Bonnet V, Dumas R, Cappozzo A, Joukov V, Daune G, Kulić D, Fraisse P, Andary S and Venture G 2017 Journal of Biomechanics 62 140–147
  • [3] Goffaux G, Perrier M and Cloutier M 2011 Cell energy metabolism: a constrained ensemble Kalman filter Proceedings of the 18th IFAC World Congress, Milano, Italy (International Federation of Automatic Control) pp 8391–8396
  • [4] Lei J, Liu S and Wang X 2012 IET Science, Measurement & Technology 6 63–77
  • [5] Simon D and Simon DL 2010 International Journal of Systems Science 41 159–171
  • [6] Janjić T, McLaughlin D, Cohn SE and Verlaan M 2014 Monthly Weather Review 142 755–773
  • [7] Yang X, Huang B and Prasad V 2014 Chemical Engineering Science 106 211–221
  • [8] Wang D, Chen Y and Cai X 2009 Water Resources Research 45
  • [9] Aravkin AY, Burke JV and Pillonetto G 2014 Optimization viewpoint on Kalman smoothing with applications to robust and sparse estimation Compressed Sensing & Sparse Filtering (Springer) pp 237–280
  • [10] Aravkin A, Burke JV, Ljung L, Lozano A and Pillonetto G 2017 Automatica 86 63–86
  • [11] Evensen G 2009 Data Assimilation: The Ensemble Kalman Filter (Springer Science & Business Media)
  • [12] Reich S and Cotter C 2015 Probabilistic Forecasting and Bayesian Data Assimilation (Cambridge University Press)
  • [13] Law K, Stuart A and Zygalakis K 2015 Data Assimilation (Springer)
  • [14] Carrassi A, Bocquet M, Bertino L and Evensen G 2018 Data assimilation in the geosciences: an overview of methods, issues, and perspectives Wiley Interdisciplinary Reviews: Climate Change 9
  • [15] Evensen G 1994 Journal of Geophysical Research: Oceans 99 10143–10162
  • [16] Burgers G, Jan van Leeuwen P and Evensen G 1998 Monthly Weather Review 126 1719–1724
  • [17] Oliver DS, Reynolds AC and Liu N 2008 Inverse Theory for Petroleum Reservoir Characterization and History Matching (Cambridge University Press)
  • [18] Iglesias MA, Law KJ and Stuart AM 2013 Inverse Problems 29 045001
  • [19] Doucet A, De Freitas N and Gordon N 2001 An introduction to sequential Monte Carlo methods Sequential Monte Carlo Methods in Practice (Springer) pp 3–14
  • [20] Del Moral P, Doucet A and Jasra A 2006 Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68 411–436
  • [21] Simon D 2010 IET Control Theory & Applications 4 1303–1318
  • [22] Amor N, Rasool G and Bouaynaya NC 2018 Constrained state estimation — a review (Preprint arXiv:1807.03463)
  • [23] Robertson DG, Lee JH and Rawlings JB 1996 AIChE Journal 42 2209–2224
  • [24] Rao CV, Rawlings JB and Mayne DQ 2003 IEEE Transactions on Automatic Control 48 246–258
  • [25] Vachhani P, Rengaswamy R, Gangwal V and Narasimhan S 2005 AIChE Journal 51 946–959
  • [26] Li R, Jan NM, Prasad V and Huang B 2018 Constrained extended Kalman filter based on Kullback-Leibler (KL) divergence 2018 European Control Conference (ECC) (IEEE) pp 831–836
  • [27] Vachhani P, Narasimhan S and Rengaswamy R 2006 Journal of Process Control 16 1075–1086
  • [28] Julier S, Uhlmann J and Durrant-Whyte HF 2000 IEEE Transactions on Automatic Control 45 477–482
  • [29] Simon D and Simon DL 2006 IEE Proceedings - Control Theory and Applications 153 371–378
  • [30] Mandela R, Kuppuraj V, Rengaswamy R and Narasimhan S 2012 Journal of Process Control 22 718–728
  • [31] Prakash J, Patwardhan SC and Shah SL 2008 2008 American Control Conference pp 3542–3547
  • [32] Prakash J, Patwardhan SC and Shah SL 2010 Industrial & Engineering Chemistry Research 49 2242–2253
  • [33] Stuart A and Zygalakis K 2015 Data assimilation: a mathematical introduction Tech. Rep. Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
  • [34] Albers DJ, Levine M, Gluckman B, Ginsberg H, Hripcsak G and Mamykina L 2017 PLoS Computational Biology 13 e1005232
  • [35] Sturis J, Polonsky KS, Mosekilde E and Van Cauter E 1991 American Journal of Physiology - Endocrinology and Metabolism 260 E801–E809
  • [36] Shi J and Asimaki D 2018 Seismological Research Letters 89 1397–1409
