Estimation of the Distribution of Random Parameters in Discrete Time Abstract Parabolic Systems with Unbounded Input and Output: Approximation and Convergence

Melike Sirlanci; Susan E Luczak; I G Rosen

doi:10.12732/caa.v23i2.4

Author manuscript; available in PMC: 2020 Jan 18.

Published in final edited form as: Commun Appl Anal. 2019 Jan 18;23(2):287–329. doi: 10.12732/caa.v23i2.4

PMCID: PMC6904110 NIHMSID: NIHMS1008733 PMID: 31824131

Abstract

A finite dimensional abstract approximation and convergence theory is developed for estimation of the distribution of random parameters in infinite dimensional discrete time linear systems with dynamics described by regularly dissipative operators and involving, in general, unbounded input and output operators. By taking expectations, the system is re-cast as an equivalent abstract parabolic system in a Gelfand triple of Bochner spaces wherein the random parameters become new space-like variables. Estimating their distribution is now analogous to estimating a spatially varying coefficient in a standard deterministic parabolic system. The estimation problems are approximated by a sequence of finite dimensional problems. Convergence is established using a state space-varying version of the Trotter-Kato semigroup approximation theorem. Numerical results for a number of examples involving the estimation of exponential families of densities for random parameters in a diffusion equation with boundary input and output are presented and discussed.

Keywords: Distribution estimation, Random parameters, Distributed parameter systems, Abstract parabolic systems, Regularly dissipative operators

1. Introduction

The work we report on here was motivated by a compound inverse or blind deconvolution problem involving the interpretation of data from a transdermal alcohol biosensor. The observation (dating back to the 1930s [25, 35, 36, 37, 38]) that ethanol is highly miscible and finds its way into all the water in the body, and in particular, sweat, has, in the past two decades, led to the development of technology to measure the amount of ethanol excreted from the body transdermally (i.e., through the skin) through perspiration and to then use it to quantitatively assess intoxication level. The basis for the measurement is an oxidation-reduction (redox) reaction that produces four electrons for each ethanol molecule oxidized. This results in a continuous current whose level is proportional to the amount of ethanol evaporating from the surface of the skin beneath the sensor. Now while these devices have been available and in use, both experimentally and commercially, for a number of years, they have been used primarily as abstinence monitors because transdermal alcohol level or concentration (TAC) data cannot consistently be converted to breath and blood alcohol concentrations (BrAC/BAC) across individuals, devices, and environmental conditions. (BAC and BrAC are currently, and historically have been, the standard measures of intoxication among alcohol researchers and clinicians, as well as in the courts.) Indeed, unlike a breath analyzer, which relies on a relatively simple model from basic chemistry (i.e., Henry's Law) for the exchange of gases between circulating pulmonary blood and alveolar air (see, for example, [22]) that has been found to be reasonably robust across the population, the transport and filtering of alcohol by the skin is physiologically more complex and is affected by a number of factors that differ across individuals (e.g., skin layer thickness, porosity and tortuosity, etc.) and even drinking episodes within individuals (e.g., body and ambient temperature, skin hydration, vasodilation). The challenge in making these devices practicable is to develop a means to reliably convert biosensor measured TAC into BAC or BrAC.

In our earlier work ([14, 19, 28]) we have taken a strictly deterministic approach to converting TAC to either BAC or BrAC. We fit first principles physics-based models in the form of a distributed parameter (diffusion) system with unbounded input and output, and used individual calibration data to capture the dynamics of the forward process - the propagation of alcohol from the blood, through the skin, and its measurement by the sensor (i.e. the forward model) by estimating the parameters (diffusivity, input/output gain, propagation inertia, etc.) that appear in the model via nonlinear least squares. Then in a second phase of processing, we use the fit model to deconvolve BAC or BrAC from the TAC signal measured by the biosensor in the field. However, because of the challenges described above, this approach was not entirely satisfying. Indeed, while it was possible to fit the models quite well to any particular drinking episode, we observed significant variance in the values of the parameters across different individuals and across different drinking episodes for the same individual. Consequently, the fit models did not yield the desired level of accuracy when they were used to deconvolve BAC or BrAC from TAC for a drinking episode that they were not specifically trained on.

To deal with this problem we have been looking at the idea of fitting a population forward model (having BAC or BrAC as input and TAC as output) in the form of a random partial differential equation, to data from multiple drinking episodes and multiple individuals and then using the population model to solve the deconvolution problem. Fitting a population model of this form implies that rather than estimate particular values for the parameters, we treat the parameters as random variables and estimate their distributions. In this way, it will become possible to produce not only an estimate for the BAC or BrAC, but also some form of credible bands to go along with it providing a quantitative estimate of the level of uncertainty in the estimate.

The basic underlying assumption in such an approach is that our first principles physics/physiological based model, in essence, describes the dynamics common to the entire population (population interpreted broadly here to include not only all individuals, but also all devices, environmental conditions, and in effect, all ethanol molecules), and that all unmodeled sources of uncertainty (primarily due to variations in physiology, hardware, and the environment) observed in individual data can then be attributed to random effects. Moreover, we assume that what we observe in any individual data set is the combination or average of these random effects. Thus, this approach is realized by letting the parameters in the PDE model be random variables, the distributions of which are to be estimated based on aggregate population data.

In this paper, we develop an abstract approximation framework and convergence theory for formulating and solving just such an estimation problem. In addition to the theory, we have also included a number of examples and numerical results. However, we do not discuss here the application of these ideas to either the alcohol biosensor problem discussed above or even the deconvolution problem. Those results are presented elsewhere ([31, 32, 33]). In our treatment here, we are strictly concerned with the problem of estimating the distributions of random parameters in a forward model from a particular class of abstract linear infinite dimensional systems for which the input is known and observations of the output for a sampling of members of the target population are available. That is, we are referring to the problem of fitting the population model.

The class of systems we consider here are those governed by abstract parabolic or hyperbolic operators with damping formulated in a Gelfand triple setting together with input and observations on the boundary of the domain. These types of operators are sometimes referred to as being regularly dissipative, and can typically be shown to generate holomorphic or analytic semigroups. We formulate the estimation problem in much the same way as it is in standard linear regression. That is, each data point is assumed to be an observation of the mean population behavior plus random error. We then formulate the estimation problem as an optimization problem over the space of feasible distributions for the random parameters. The objective of the optimization problem is to minimize prediction error in the form of the difference between the observed output signal and the expectation of the output of the model. We then consider a sequence of approximating estimation problems in each of which the infinite dimensional system is replaced by a finite dimensional approximating system. We then demonstrate that under appropriate (and readily verifiable) assumptions, the solutions to the approximating estimation problems converge to a solution to the original estimation problem with the infinite dimensional state. These convergence results are formulated in a functional analytic or operator theoretic setting and are based on ideas and results from linear semigroup theory.

Our general approach relies heavily on three relatively recent papers: 1) Banks and Thompson’s [7] framework for the estimation of probability measures in random abstract evolution equations and the convergence of finite dimensional approximations in the Prohorov metric, 2) a more recent and enhanced version of the previous paper, [2], and 3) Gittelson, Andreev, and Schwab’s [20] theory for random abstract parabolic partial differential equations with dynamics defined in terms of coercive sesquilinear forms. While our effort here is similar in spirit and takes its cue from the treatment in [2] and [7], it is somewhat different in that we are forced to assume that the probability measures that describe the distribution of our random parameters can be defined in terms of a joint density function; that is, that the random parameters are jointly absolutely continuous.

The approach in [20] is novel in the way that it treats the random parameters in the PDE as another space-like independent variable. This is done by appropriately defining corresponding Bochner spaces in which the weak formulation of the problem is stated and shown to be well-posed. In fact, it turns out that the random parameter dependent regularly dissipative operators that determine the underlying PDE are regularly dissipative when embedded in these Bochner spaces. Consequently, we are able to use linear semigroup theory to develop our approximation framework in much the same way as we have in our earlier deterministic treatments. In this way, finite dimensional approximation is handled in much the same way that it is for the standard deterministic space variables, and the estimation of the distribution of the random parameters effectively becomes analogous to the problem of estimating a variable coefficient in a deterministic PDE, a problem which has been studied extensively over the last thirty years ([4] and [6]).

We use the framework in [20] together with generation and approximation results from linear semigroup theory (i.e., the Hille-Yosida-Phillips theorem and a version of the Trotter-Kato approximation theorem) to establish that the sufficient conditions for a Banks-Thompson-like convergence result are satisfied. These theoretical results allow us to develop rigorously established convergent computational algorithms that yield numerical approximations to the desired distributions. Moreover, the solutions in the Bochner spaces and their finite dimensional approximations directly capture the explicit dependence of the state and output (and eventually the deconvolved input) on the random parameters. Using this together with the estimated distributions for the random parameters, it becomes straightforward to directly identify credible intervals for the output without having to re-solve the PDE many times, as would be required if these credible intervals were identified by naive sampling.

An outline of the remainder of the paper is as follows. In Section (2) we formally develop the estimation problem, reformulate it as a nonlinear least squares optimization problem and establish the existence of solutions. In Section (3) we discuss infinite dimensional systems described by regularly dissipative operators involving unbounded input and output (this is typically the case for a PDE with input and output on the boundary). In Section (4) we discuss the framework in [20] for treating systems of the form discussed in Section (3) but now involving random parameters. Our approximation and convergence results are presented in Section (5) and a discussion of examples and our numerical results are in Section (6). Section (7) has a few concluding remarks regarding where we plan to go next with this line of research.

In our discussions to follow we will on occasion use the notation E[X∥f], $E [X ‖ F]$ , or E[X∥π] to denote the expectation of the random variable X with respect to the probability density function f, the cumulative distribution function F, or the probability measure π. We use the “double bar” as opposed to a “single bar” to distinguish this notation from conditional expectation.

2. Estimation of Random Discrete Time Dynamical Systems

We consider the family of discrete or sampled time initial value problems that are set in an, in general, infinite dimensional Hilbert state space, $H$ , given by

x_{j + 1, i} = g (t_{j}, x_{j, i}, u_{i}; q), j = 0, \dots, n_{i}, i = 1, 2, \dots, m,

(2.1)

x_{0, i} = x_{0, i} (q), i = 1, 2, \dots, m,

(2.2)

where $g : R^{+} \times H \times \prod_{j = 0}^{n_{i}} R^{μ} \times Q \to H$ and for j = 0, …, n_i and i = 1, 2, …, m, u_i = {u_i,j} is an external input or control with $u_{i, j} \in R^{μ}$ , and t_j = jτ, with τ > 0 the length of the sampling interval, describing the dynamics of a process common to the entire population. In addition, we assume that we can observe some function of the solutions of (2.1)-(2.2), x_j,i, as given by the output equation

y_{j, i} = y (t_{j}, x_{0, i}, u_{i}; q) = C (x_{j, i}, x_{0, i}, u_{i}; q), j = 0, \dots, n_{i}, i = 1, 2, \dots, m,

(2.3)

where $C : H \times H \times \prod_{j = 0}^{n_{i}} R^{μ} \times Q \to R^{ν}$ .

In equations (2.1)-(2.3), we assume q ∈ Q, where Q is the set of admissible parameters (a subset of Euclidean space endowed with Lebesgue measure), and the values of the parameters are specific to each individual in the population. Therefore, assuming that the parameters, q, are samples from a random vector $q$ , the objective is to estimate their (joint) distribution based on the aggregate data sampled from the population. For this purpose, we assume that the distribution of these random vectors is described by the joint pdf $f_{0} \in F (Q)$ , where $F (Q)$ represents a set of feasible pdfs with support in Q.

There are a number of ways to formulate the statistical model that will be used as the basis for the estimation of the distribution of the random parameters. One approach is to treat (2.1)- (2.3) as an, in general, nonlinear mixed effects model (see, for example, [16, 17, 18, 32]) wherein randomness in the parameters, q, are used to quantify uncertainty between subjects, and randomness in the output or measurements, y_j,i given in (2.3) is intended to capture uncertainty within individual subjects. In this case we assume that the observed data points are of the form

V_{j, i} = y_{j, i} + ε_{j, i}, j = 0, \dots, n_{i}, i = 1, \dots, m,

where ε_j,i, j = 0, …, n_i, i = 1, …, m, representing measurement noise, are assumed to be independent across subjects (i.e., with respect to i), conditionally independent with respect to $q$ within subjects (i.e., with respect to j), identically distributed with mean 0 and known common variance σ², and with ε_j,i ~ φ, j = 0, …, n_i, i = 1, …, m. In this case, for example, using conditional probability and the total probability formula, a likelihood function could be defined formally as

L (f_{0}; {V_{j, i}}) = \prod_{i = 1}^{m} \int_{Q} L_{i} (q; {V_{j, i}}) f_{0} (q) d q = \prod_{i = 1}^{m} \int_{Q} \prod_{j = 0}^{n_{i}} φ (V_{j, i} - C (x_{j, i}, x_{0, i}, u_{i}; q)) f_{0} (q) d q .

Once one deals with a number of computational issues, specifically, the discretization or parameterization of f₀, finite dimensional approximation of the in general infinite dimensional state equation (2.1), the efficient evaluation of a potentially high dimensional integral, the loss of precision and underflow issues due to the fact that the evaluation of $L$ requires the computation of products of small numbers, etc., one could then seek a maximum likelihood estimator for f₀ by maximizing $L$ or, more typically, an expression involving $\log L (f_{0}; {V_{j, i}})$ to avoid having to deal with the products. Under appropriate regularity assumptions on φ, f₀, and the system (2.1)-(2.3), one way to do this might be via a gradient based search. Another might be via stochastic optimization. One could also treat direct observations of $q$ as missing data and then use the iterative E-M algorithm to find the MLE (see, for example, [12]).
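As a minimal illustration of the underflow issue noted above, the following Python sketch evaluates $\log L$ for a discretized density by working entirely with logarithms and a log-sum-exp reduction over quadrature nodes. It is not the method used in this paper; the Gaussian choice for φ, the quadrature rule (`nodes`, `weights`), and the names `model`, `density`, and `log_likelihood` are all assumptions made for illustration.

```python
import math


def log_normal_pdf(r, sigma):
    """Log of a zero-mean Gaussian density (assumed form of phi) at residual r."""
    return -0.5 * (r / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def log_sum_exp(values):
    """Stable evaluation of log(sum(exp(v))) by factoring out the largest term."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(data, model, density, nodes, weights, sigma):
    """log L for data[i][j] = V_{j,i}, with the integral over Q replaced by quadrature.

    model(j, q) is a hypothetical stand-in for the output C(x_{j,i}, x_{0,i}, u_i; q);
    density(q) is a candidate pdf, assumed strictly positive at the nodes.
    """
    total = 0.0
    for series in data:  # product over subjects i becomes a sum of logs
        terms = []
        for q, w in zip(nodes, weights):
            # Log of the inner product over j, accumulated as a sum.
            log_li = sum(log_normal_pdf(v - model(j, q), sigma)
                         for j, v in enumerate(series))
            terms.append(log_li + math.log(w * density(q)))
        total += log_sum_exp(terms)  # log of the quadrature approximation of the integral
    return total
```

With a few hundred observations per subject, the direct product of the φ values can underflow to zero in double precision, while the logarithmic form remains finite.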

Alternatively, one could use the likelihood function defined above and take a Bayesian approach (see, for example, [8, 9, 10, 15, 33, 34]). One way of doing this would be to assume f₀ = f₀(·; ρ) has been parameterized by a parameter vector $ρ \in R$ , where $R$ denotes a parameter set. Then assume a prior $P$ on ρ and apply Bayes to obtain the posterior $\hat{P}$ as

\hat{P} (ρ) = \hat{P} (ρ ∣ {V_{j, i}}) = \frac{1}{Z} \hat{L} (ρ; {V_{j, i}}) P (ρ) = \frac{1}{Z} L (f_{0} (\cdot; ρ); {V_{j, i}}) P (ρ),

where Z is the normalizing constant given by

Z = \int_{R} \hat{L} (ρ; {V_{j, i}}) P (ρ) d ρ = \int_{R} \prod_{i = 1}^{m} \int_{Q} \prod_{j = 0}^{n_{i}} φ (V_{j, i} - C (x_{j, i}, x_{0, i}, u_{i}; q)) f_{0} (q; ρ) d q P (ρ) d ρ .

Still another Bayesian approach could be used to estimate the distribution of $q \sim f_{0}$ directly where now the posterior for $q, \hat{P} = \hat{P} (q)$ serves as the estimator for f₀. In this case we assume that ε_j,i, j = 0, …, n_i, i = 1, …, m are simply independent both across and within subjects, identically distributed with mean 0 and known common variance σ², and with ε_j,i ~ φ, j = 0, …, n_i, i = 1, …, m. If we now let $P$ denote the prior for $q$ , then Bayes yields

\hat{P} (q) = \hat{P} (q ∣ {V_{j, i}}) = \frac{1}{Z} \prod_{i = 1}^{m} L_{i} (q; {V_{j, i}}) P (q) = \frac{1}{Z} \prod_{i = 1}^{m} \prod_{j = 0}^{n_{i}} φ (V_{j, i} - C (x_{j, i}, x_{0, i}, u_{i}; q)) P (q),

where the normalizing constant Z is now given by

Z = \int_{Q} \prod_{i = 1}^{m} L_{i} (q; {V_{j, i}}) P (q) d q = \int_{Q} \prod_{i = 1}^{m} \prod_{j = 0}^{n_{i}} φ (V_{j, i} - C (x_{j, i}, x_{0, i}, u_{i}; q)) P (q) d q .

Both of these Bayesian approaches also have some of the same computational issues as the MLE approach when some sort of MCMC technique such as Metropolis-Hastings or the Gibbs Sampler is used to sample the posterior distribution.
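To make the sampling step concrete, the following is a minimal random-walk Metropolis sketch for the second (direct) Bayesian formulation with a scalar parameter. It assumes Gaussian noise φ and a uniform prior on an interval, neither of which is prescribed above; `model`, `log_posterior`, and `metropolis` are hypothetical names. The normalizing constant Z never needs to be computed because it cancels in the acceptance ratio.

```python
import math
import random


def log_posterior(q, data, model, sigma, q_min, q_max):
    """Unnormalized log posterior of q under a uniform prior on [q_min, q_max]."""
    if not (q_min <= q <= q_max):
        return -math.inf
    # Gaussian noise assumed; constants independent of q are dropped.
    return sum(-0.5 * ((v - model(j, q)) / sigma) ** 2
               for series in data for j, v in enumerate(series))


def metropolis(data, model, sigma, q_min, q_max, q_init, step, n_iter, seed=0):
    """Random-walk Metropolis chain; q_init must lie inside [q_min, q_max]."""
    rng = random.Random(seed)
    q = q_init
    lp = log_posterior(q, data, model, sigma, q_min, q_max)
    chain = []
    for _ in range(n_iter):
        cand = q + rng.gauss(0.0, step)  # symmetric proposal
        lp_cand = log_posterior(cand, data, model, sigma, q_min, q_max)
        # Accept with probability min(1, posterior ratio); Z cancels in the ratio.
        if math.log(1.0 - rng.random()) < lp_cand - lp:
            q, lp = cand, lp_cand
        chain.append(q)
    return chain
```

Each iteration requires one evaluation of the model for every data point, which is where the cost of solving the state equation enters.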

In our study here, however, we take a statistically somewhat less sophisticated approach. We consider the naive pooled data estimator. We do this for a number of reasons: 1) our primary focus here is the finite dimensional approximation of the infinite dimensional state equation and the convergence of the corresponding estimators, and the computational challenges described above would only serve to confound our findings; 2) the naive pooled estimator meshes especially well with the approach we take in dealing with the randomness in the family of PDEs (i.e., abstract parabolic, and eventually, damped hyperbolic) of particular interest to us here in the context of the alcohol biosensor problem described earlier; and 3) a reasonable argument could be made that the data we observe is best described as pooled or averaged. We note that it in fact turns out that the approximation and convergence results we present here are highly relevant to the MLE and Bayesian approaches described in the previous paragraphs; we are currently investigating that and we will report on our findings and results in those cases elsewhere. Finally, it is interesting to note that in the Bayesian approach, if the prior f₀ and the distribution of the measurement noise process, ε_j,i, as described by the density φ are both assumed to be normal, then the naive pooled data estimator we find here is in fact the Maximum A-Posteriori, or MAP, estimator.

In light of this, our statistical model assumes that the observed data points can be represented by the mean output of the model plus random error. Thus, we assume that we have random observations of the process given by a random array with components

V_{j, i} = E [y_{j, i} ‖ f_{0}] + ε_{j, i}, j = 0, \dots, n_{i}, i = 1, \dots, m,

(2.4)

where in (2.4), ε_j,i, j = 0, …, n_i, i = 1, …, m, represent measurement noise and are assumed to be independent and identically distributed with mean 0 and known common variance σ². For $f \in F (Q)$ , define

v_{i} (t_{j}; f) = E [y (t_{j}, x_{0, i}, u_{i}; q) ‖ f] = \int_{Q} C (x_{j, i}, x_{0, i}, u_{i}; q) f (q) d q,

(2.5)

the mean behavior at time t_j, j = 0, …, n_i, if $q \sim f$ .

The estimation problem is to estimate the pdf, f₀, using a least squares approach

\hat{f} = \underset{f \in F (Q)}{\arg \min} J (f; V) = \underset{f \in F (Q)}{\arg \min} \sum_{i = 1}^{m} \sum_{j = 0}^{n_{i}} {(V_{j, i} - v_{i} (t_{j}; f))}^{2} .

(2.6)

where the v_i(t_j; f) are as given in (2.5).

Solving the optimization problem given in (2.6) will typically require finite dimensional approximation of the dynamical system given in (2.1)-(2.2), and the parameterization of the feasible set of pdfs, $F (Q)$ . Indeed, in our treatment here, we assume that the set of pdfs, $F (Q)$ , is parameterized by a vector of parameters θ ∈ Θ, where $Θ \subseteq R^{r}$ is a set of feasible parameters. In this case, we denote the set of pdfs by $F_{Θ} (Q)$ .
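The pooled least squares problem (2.6) over a parameterized family can be sketched in a few lines for a toy problem. The example below is an illustration under stated assumptions, not the scheme used in this paper: the state is scalar (so no finite dimensional approximation of the state is needed), all subjects share one input sequence, the family f(q; θ) ∝ e^{−θq} on Q is hypothetical, the integral in (2.5) is replaced by a fixed quadrature rule, and the minimization is a plain grid search over a compact Θ.

```python
import math


def simulate(q, inputs, tau, x0=0.0):
    """Outputs y_j = x_j of the toy scalar system x' = -q x + u with zero-order hold."""
    a_hat = math.exp(-q * tau)
    b_hat = (1.0 - a_hat) / q
    x, ys = x0, []
    for u in inputs:
        ys.append(x)
        x = a_hat * x + b_hat * u
    return ys


def density(q, theta, nodes, weights):
    """Hypothetical family f(q; theta) proportional to exp(-theta q), normalized on Q."""
    z = sum(w * math.exp(-theta * p) for p, w in zip(nodes, weights))
    return math.exp(-theta * q) / z


def mean_output(theta, inputs, tau, nodes, weights):
    """v(t_j; f) as in (2.5): quadrature of y_j(q) f(q; theta) over Q."""
    runs = [simulate(q, inputs, tau) for q in nodes]
    fs = [density(q, theta, nodes, weights) for q in nodes]
    return [sum(w * f * run[j] for w, f, run in zip(weights, fs, runs))
            for j in range(len(inputs))]


def cost(theta, observations, inputs, tau, nodes, weights):
    """J(f; V) as in (2.6), summed over subjects and sample times."""
    v = mean_output(theta, inputs, tau, nodes, weights)
    return sum((obs - vj) ** 2
               for series in observations for obs, vj in zip(series, v))


def estimate(observations, inputs, tau, nodes, weights, theta_grid):
    """Grid search for the minimizing theta over a finite subset of Theta."""
    return min(theta_grid,
               key=lambda th: cost(th, observations, inputs, tau, nodes, weights))
```

Because the model output for each quadrature node does not depend on θ, a more careful implementation would compute the runs once and reuse them across all candidate values of θ.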

We approximate the estimation problem given in (2.6) by a sequence of finite dimensional estimation problems by replacing v_i(t_j; f) with a finite dimensional approximation $v_{i}^{N} (t_{j}; f)$ . We obtain

{\hat{f}}^{N} = \underset{f \in F_{Θ} (Q)}{\arg \min} J^{N} (f; V) = \underset{f \in F_{Θ} (Q)}{\arg \min} \sum_{i = 1}^{m} \sum_{j = 0}^{n_{i}} {(V_{j, i} - v_{i}^{N} (t_{j}; f))}^{2} .

(2.7)

We note that ultimately, we will want to dispense with the assumption that $F (Q)$ has been parametrized by the finite dimensional parameter θ ∈ Θ and actually estimate the shape of f directly. In this case, $F (Q)$ will also have to be approximated or discretized with the level, or dimension of the parameterization having to grow in order to establish convergence. We are currently studying this extension to the results presented here and will discuss our findings elsewhere. Analogous to Theorem 5.1 in [7], we have the following convergence result for the ${\hat{f}}^{N}$'s.

Theorem 2.1. Let $Θ \subseteq R^{r}$ be compact. If

A. The maps on Θ, θ ↦ f (q; θ), for almost every q ∈ Q, and θ ↦ J^N (f (·; θ); V), for all N and $f \in F_{Θ} (Q)$ are continuous,

B. For any sequence of densities $f_{N} \in F_{Θ} (Q)$ with lim_N→∞ f_N(q) = f (q), a.e. q ∈ Q, for some $f \in F_{Θ} (Q)$ , we have $v_{i}^{N} (t_{j}; f_{N})$ converging to v_i(t_j; f) for all i ∈ {1, …, m} and j ∈ {0, …, n_i} as N → ∞, and

C. The v_i(t_j; f) and $v_{i}^{N} (t_{j}; f)$ are uniformly bounded for all j ∈ {0, …, n_i}, i ∈ {1, …, m} and $f \in F_{Θ} (Q)$ ,

then it will follow that there exist solutions ${\hat{f}}^{N}$ to the estimation problems over $F_{Θ} (Q)$ , given in (2.7), and there exists a subsequence of the ${\hat{f}}^{N}$'s that converges to a solution $\hat{f}$ of the estimation problem over $F_{Θ} (Q)$ given in (2.6).

Proof. Finding the solution to the problem in (2.7) is equivalent to finding the parameters θ ∈ Θ such that J^N (f; V) is minimized. Since Θ is a compact set and the map θ → J^N (f(·; θ); V) is continuous for all N by (A), a solution ${\hat{f}}^{N}$ to the estimation problem (2.7) over $F_{Θ} (Q)$ exists.

Next, let ${f_{N}} \subseteq F_{Θ} (Q)$ be any sequence with lim_N→∞ f_N(q) = f (q), a.e. q ∈ Q for some $f \in F_{Θ} (Q)$ and consider that

∣ J^{N} (f_{N}; V) - J (f; V) ∣ = ∣ \sum_{i = 1}^{m} \sum_{j = 0}^{n_{i}} {(V_{j, i} - v_{i}^{N} (t_{j}; f_{N}))}^{2} - \sum_{i = 1}^{m} \sum_{j = 0}^{n_{i}} {(V_{j, i} - v_{i} (t_{j}; f))}^{2} ∣ \leq \sum_{i = 1}^{m} \sum_{j = 0}^{n_{i}} ∣ 2 V_{j, i} - (v_{i} (t_{j}; f) + v_{i}^{N} (t_{j}; f_{N})) ∣ \cdot ∣ v_{i} (t_{j}; f) - v_{i}^{N} (t_{j}; f_{N}) ∣ \leq M \sum_{i = 1}^{m} \sum_{j = 0}^{n_{i}} ∣ v_{i} (t_{j}; f) - v_{i}^{N} (t_{j}; f_{N}) ∣,

for some M > 0, since v_i(t_j; f) and $v_{i}^{N} (t_{j}; f)$ are uniformly bounded for all i ∈ {1, …, m} and j ∈ {0, …, n_i} (by assumption (C)), and $f \in F_{Θ} (Q)$ . Then, by (B), we obtain

J^{N} (f_{N}; V) \to J (f; V),

(2.8)

as N → ∞. On the other hand, since ${\hat{f}}^{N} = \hat{f} (\cdot; {\hat{θ}}^{N})$ , where ${\hat{θ}}^{N} \in Θ$ , is the minimizer of J^N(f; V), we have

J^{N} ({\hat{f}}^{N}; V) \leq J^{N} (f; V),

(2.9)

for all $f = f (\cdot; θ) \in F_{Θ} (Q)$ and N = 1, 2, …. Since ${{\hat{θ}}^{N}} \subset Θ$ , compact, there exists a subsequence ${\hat{θ}}^{N_{k}}$ with ${\hat{θ}}^{N_{k}} \to \hat{θ}$ as k → ∞. Thus, taking the limit as k → ∞ in (2.9) with N replaced by N_k, and using (2.8) (with $f_{N_{k}} = f$ , all k = 1, 2, …, when the limit is taken on the right hand side of (2.9)), we obtain

J (\hat{f}; V) \leq J (f; V),

(2.10)

for all $f \in F_{Θ} (Q)$ , where $\hat{f} = \hat{f} (\cdot; \hat{θ})$ . Thus, (2.10) implies that $\hat{f}$ is a solution of estimation problem given in (2.6) over $F_{Θ} (Q)$ . □
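The first inequality in the proof rests on the elementary factorization of a difference of squares. Writing $V = V_{j, i}$ , $v = v_{i} (t_{j}; f)$ and $v^{N} = v_{i}^{N} (t_{j}; f_{N})$ for brevity, the step can be spelled out as follows, where K denotes a uniform bound of the kind provided by assumption (C) (the symbol K is introduced here only for illustration):

```latex
\begin{aligned}
(V - v^{N})^{2} - (V - v)^{2}
  &= \bigl[(V - v^{N}) - (V - v)\bigr]\,\bigl[(V - v^{N}) + (V - v)\bigr] \\
  &= (v - v^{N})\,\bigl(2V - (v + v^{N})\bigr), \\
\bigl| 2V - (v + v^{N}) \bigr|
  &\leq 2 \max_{i, j} |V_{j, i}| + 2K =: M .
\end{aligned}
```

Taking absolute values, summing over i and j, and applying the triangle inequality gives the two bounds used in the proof.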

3. Abstract Parabolic Systems with Unbounded Input and Output

Let V and H be in general complex (but in many instances, real would suffice) Hilbert spaces with V ↪ H, i.e. V is continuously and densely embedded in H. By identifying H with its dual H*, we obtain the Gelfand triple V ↪ H ↪ V*. Let < ·, · >_H denote the H inner product and ∣·∣_H, ∥·∥_V denote norms on H and V, respectively, and assume that (Q, d_Q) is a compact metric space contained in Euclidean space endowed with Lebesgue measure. In what follows all multi-dimensional vectors, whether in Euclidean or some abstract space, are assumed to be column vectors, unless explicitly stated otherwise. For q ∈ Q, let $a (q; \cdot, \cdot) : V \times V \to C$ be a sesquilinear form that has the following properties

(i) Boundedness There exists a constant α₀ > 0 such that ∣a(q; ψ₁, ψ₂)∣ ≤ α₀∥ψ₁∥_V∥ψ₂∥_V, ψ₁, ψ₂ ∈ V, q ∈ Q,
(ii) Coercivity There exist constants $λ_{0} \in R$ and μ₀ > 0 such that $a (q; ψ, ψ) + λ_{0} ∣ ψ ∣_{H}^{2} \geq μ_{0} ‖ ψ ‖_{V}^{2}$ , ψ ∈ V, q ∈ Q,
(iii) Measurability For all ψ₁, ψ₂ ∈ V, the map q ↦ a(q; ψ₁, ψ₂) is measurable on Q with respect to all measures defined in terms of the densities in $F_{Θ} (Q)$ , where $Θ \subseteq R^{r}$ is the set of feasible parameters.

Assume further that b(q), c(q) are respectively μ and ν dimensional row vectors in V* with the maps q ↦< b(q), ψ >_V*,V and q ↦< c(q), ψ >_{V*, V} measurable on Q for ψ ∈ V, where < ·, · >_V*,V denotes the duality pairing between V and V*. We consider the system which is written in weak form as

{〈 \dot{x}, ψ 〉}_{V^{*}, V} + a (q; x, ψ) = {〈 b (q), ψ 〉}_{V^{*}, V} u, ψ \in V, x (0) = x_{0} \in H, y (t) = \int_{0}^{T} {〈 c (q), x_{(t)} (s) 〉}_{V^{*}, V} d s,

(3.1)

where T > 0, and φ_(t)(s) = φ(t − s)χ_{[0, T]}(s), s ∈ [0, T]. For $u \in L_{2} ([0, T], R^{μ})$ , it can be shown that (3.1) has a unique solution (see [24, 39]) $x \in W (0, T) ≔ {ψ : ψ \in L_{2} ([0, T], V), \dot{ψ} \in L_{2} ([0, T], V^{*})} \subseteq C ([0, T], H)$ which depends continuously on $u \in L_{2} ([0, T], R^{μ})$ . It follows that $y \in L_{2} ([0, T], R^{ν})$ .

For q ∈ Q, under the assumptions (i),(ii), the sesquilinear form a(q; ·, ·) defines a bounded linear operator A(q) : V → V* by < A(q)ψ₁, ψ₂ >_V*,V = −a(q; ψ₁, ψ₂) where ψ₁, ψ₂ ∈ V. It can be shown further that (see [3, 5, 39]) A(q) restricted to the set Dom(A(q)) = {ϕ ∈ V : A(q)ϕ ∈ H} is the infinitesimal generator of a holomorphic or analytic semigroup of bounded linear operators on H. Moreover, this semigroup can be restricted to be a holomorphic semigroup on V and extended to a holomorphic semigroup on V* by appropriately restricting or extending the domain, Dom(A(q)), of the operator A(q) (see, for example, [3] and [39]).

For q ∈ Q, define the operators $B (q) : R^{μ} \to V^{*} by 〈 B (q) u, φ 〉_{V^{*}, V} = 〈 b (q), φ 〉_{V^{*}, V} u$ and $C (q) : L_{2} ([0, T], V) \to R^{ν}$ by $C (q) ψ = \int_{0}^{T} 〈 c (q), ψ (s) 〉_{V^{*}, V} d s$ , for $u \in R^{μ}$ , φ ∈ V, and ψ ∈ L₂([0, T], V), and rewrite the system in (3.1) as

\dot{x} (t) = A (q) x (t) + B (q) u (t), x (0) = x_{0}, y (t) = C (q) x_{(t)}, t > 0 .

(3.2)

The mild solution of (3.2) is given by the variation of constants formula as

x (t; q) = e^{A (q) t} x_{0} + \int_{0}^{t} e^{A (q) (t - s)} B (q) u (s) d s, t \geq 0 .

(3.3)

Moreover, since the semigroup {e^A(q)t : t ≥ 0} is analytic it follows that

y (t; q) = C (q) x_{(t)} (q) = \int_{0}^{T} {〈 c (q), x_{(t)} (s; q) 〉}_{V^{*}, V} d s, t \geq 0 .

(3.4)

is well defined.

3.1. The Discrete Time Formulation

Now let τ > 0 be a sampling time and consider zero-order hold inputs of the form u(t) = u_j, t ∈ [jτ, (j + 1)τ), j = 0, 1, 2, …. Setting x_j = x(jτ), for j = 0, 1, 2, …, (3.3) and (3.4) yield that

x_{j + 1} = \hat{A} (q) x_{j} + \hat{B} (q) u_{j}, y_{j} = \hat{C} (q) x_{(j)}, j = 0, 1, 2, \dots

(3.5)

where now we let x₀ ∈ V. Here, again by the properties of the analytic semigroup {e^A(q)t : t ≥ 0} (see [26, 39]), we have x_j ∈ V, $\hat{A} (q) = e^{A (q) τ} \in L (V, V)$ and $\hat{B} (q) = \int_{0}^{τ} e^{A (q) s} B (q) d s \in L (R^{μ}, H)$ . The operator $\hat{C} (q)$ appearing in (3.5) is defined by recalling (3.4). We set

\hat{C} (q) x_{(j)} = C (q) x_{(j)},

(3.6)

where x_(j) in (3.6) denotes the function in L₂(0, T, V) given by

x_{(j)} = \sum_{i = 1}^{j} x_{i} χ_{[(j - 1) τ, j τ]} .

(3.7)

Now, in light of the coercivity assumption, Assumption (ii), by making the change of variables z(t) = e^{−λ₀t}x(t) and v(t) = e^{−λ₀t}u(t), without loss of generality we may assume that the operator A(q) is invertible with bounded inverse. Thus we have that $\hat{B} (q) = \int_{0}^{τ} e^{A (q) s} B (q) d s = A (q)^{- 1} e^{A (q) s} B (q) \big|_{0}^{τ} = (\hat{A} (q) - I) A (q)^{- 1} B (q) \in L (R^{μ}, V)$ . It follows that the recurrence given in (3.5) is a recurrence in V with $\hat{A} (q) \in L (V, V)$ and $\hat{B} (q) \in L (R^{μ}, V)$ . Thus it now becomes possible to allow the discrete time output operator $\hat{C} (q) \in L (V, R^{ν})$ defined in (3.6) and (3.7), if so desired, to take on the much simpler form $\hat{C} (q) x = 〈 c (q), x 〉_{V^{*}, V}$ . In what follows we shall assume that the output operator takes this simpler form.
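The closed form for $\hat{B} (q)$ is easy to check numerically when A(q) is diagonal, as it is in modal coordinates for many one dimensional diffusion problems. The sketch below is an illustration only: the eigenvalues `eigs`, the input coefficients `b`, and the function names are hypothetical, and the eigenvalues are assumed nonzero (consistent with the invertibility of A(q) assumed above).

```python
import math


def discretize(eigs, b, tau):
    """Zero-order-hold discretization of the diagonal system x' = diag(eigs) x + b u.

    Returns the diagonal of A_hat = exp(A tau) and B_hat = (A_hat - I) A^{-1} B.
    """
    a_hat = [math.exp(lam * tau) for lam in eigs]
    b_hat = [(ah - 1.0) / lam * bk for ah, lam, bk in zip(a_hat, eigs, b)]
    return a_hat, b_hat


def step(a_hat, b_hat, x, u):
    """One step of the recurrence x_{j+1} = A_hat x_j + B_hat u_j for a scalar input."""
    return [ah * xk + bh * u for ah, bh, xk in zip(a_hat, b_hat, x)]


def b_hat_by_quadrature(lam, bk, tau, n=2000):
    """Midpoint-rule approximation of the integral of exp(lam s) bk over [0, tau]."""
    h = tau / n
    return sum(math.exp(lam * (i + 0.5) * h) for i in range(n)) * h * bk
```

Comparing `discretize` with `b_hat_by_quadrature` mode by mode confirms that the integral and the closed form agree to within quadrature error.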

3.2. Systems with Boundary Input

Of primary interest to us here are systems of the form (3.1) or (3.2) where the input u is on the boundary of the spatial domain. The theory developed in [13] and [27] tells us how in this case to define the input operator B(q) and the notion of a mild solution upon which our approach is based. Let W be a Hilbert space which is densely and continuously embedded in H. Let $Δ (q) \in L (W, H)$ and $Γ (q) \in L (W, R^{μ})$ and assume that $D o m (A (q)) \subseteq N (Γ (q)) \subseteq W$ , Γ(q) is surjective and Δ(q) = A(q) on Dom(A(q)). We then consider the system with input on the boundary given by

\dot{x} (t) = Δ (q) x (t), t > 0, Γ (q) x (t) = u (t), t > 0, y (t) = C (q) x_{(t)}, t > 0, x (0) = x_{0} .

(3.8)

In [13], Curtain and Salamon define a solution to the system (3.8) for the case where $u \in C ([0, T]; R^{μ})$ and x₀ ∈ W with Γ(q)x₀ = u(0), to be a function x ∈ C([0, T]; W) ∩ C¹([0, T]; H) that satisfies (3.8) at every t ∈ (0, T). Since the operator A(q) is densely defined, it has an adjoint operator A(q)* : Dom(A(q)*) ⊆ H → H which is also densely defined and closed. Defining Z* to be the Hilbert space Dom(A(q)*) endowed with the graph Hilbert space norm associated with A(q)*, Z* will be continuously and densely embedded in H. So, the Gelfand triple Z* ↪ H ↪ Z is obtained where Z = Z** represents the dual space of Z*. By definition $A (q)^{*} \in L (Z^{*}, H)$ and consequently, $A (q) \in L (H, Z)$ . It follows that the semigroup {e^A(q)t : t ≥ 0} can be uniquely extended to a holomorphic semigroup on Z with infinitesimal generator A(q) : H ⊆ Z → Z, the extension of A(q) to H being defined via the duality pairing < A(q)ψ, ϕ >_Z,Z* = < ψ, A(q)* ϕ >_H, for ψ ∈ H, and ϕ ∈ Z* = Dom(A(q)*).

For each q ∈ Q, let $Γ^{+} (q) \in L (R^{μ}, W)$ be any right inverse of $Γ (q) \in L (W, R^{μ})$ , and define the operator $B (q) \in L (R^{μ}, Z)$ by B(q) = (Δ(q) − A(q))Γ⁺(q). It is not difficult to show that B(q) is well defined (i.e. that it does not depend on the particular choice of the right inverse Γ⁺(q)). Then for any x₀ ∈ H and $u \in L_{2} ([0, T]; R^{μ})$ , the mild solution, x ∈ C([0, T]; Z), of the initial boundary value problem in (3.8) is the Z-valued function given by

x (t) = e^{A (q) t} x_{0} + \int_{0}^{t} e^{A (q) (t - s)} B (q) u (s) d s, t \geq 0 .

(3.9)

It is shown in [13] that if (3.8) has a solution, then it is given by (3.9), where x ∈ C([0, T]; H) ∩ H¹((0, T); Z), and moreover that the estimate $∣ \int_{0}^{t} e^{A (q) (t - s)} B (q) u (s) d s ∣_{H} \leq k ‖ u ‖_{L_{2} ([0, T]; R^{μ})}$ holds.

We note that if in fact we have that W ⊂ V, which is often the case (for example, in a one dimensional diffusion equation with either Neumann or Robin boundary input (see our examples in Section (6) below), but may not be the case if, for example, the boundary input is Dirichlet), then in the above formulation we may take Z* = V and Z = V*. In this case it will follow that $B (q) = (Δ (q) - A (q)) Γ^{+} (q) \in L (R^{μ}, V^{*})$ and consequently that the theory presented at the beginning of Section (3), and in particular, the discrete time theory presented in Section (3.1), applies. For ease of exposition, we will assume that this is indeed the case for what follows below. We note that all the results continue to follow in the more general case where Z* = Dom(A(q)*). It then follows that $\hat{A} (q) = e^{A (q) τ} \in L (V, V)$ and that $\hat{B} (q) = \int_{0}^{τ} e^{A (q) s} d s B (q) \in L (R^{μ}, V)$ and therefore that

\hat{B} (q) = \int_{0}^{τ} e^{A (q) s} B (q) d s = A (q)^{- 1} e^{A (q) s} B (q) ∣_{0}^{τ} = (\hat{A} (q) - I) A (q)^{- 1} B (q),

and $\hat{C} (q) = C (q) \in L (V, R^{ν})$ . Note that now we have

\hat{B} (q) = (I - \hat{A} (q)) Γ^{+} (q) + \int_{0}^{τ} e^{A (q) s} d s Δ (q) Γ^{+} (q) \in L (R^{μ}, V),

(3.10)

and if Γ⁺(q) can be chosen so that $R (Γ^{+} (q)) \subseteq N (Δ (q))$ , then the expression in (3.10) becomes $\hat{B} (q) = (I - \hat{A} (q)) Γ^{+} (q)$ . Then, if x₀ = 0 ∈ H, y_i is given by

y_{i} = \sum_{j = 0}^{i - 1} C (q) \hat{A} {(q)}^{i - j - 1} \hat{B} (q) u_{j} = \sum_{j = 0}^{i - 1} K_{i, j} u_{j}, i = 1, 2, \dots,

(3.11)

where the operator $K_{i, j} = C (q) \hat{A} (q)^{i - j - 1} (I - \hat{A} (q)) Γ^{+} (q)$ appearing in (3.11) is the gain that represents the contribution of the j^th input channel to the i^th output channel.
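The discrete convolution in (3.11) can be checked against the underlying recurrence with zero initial state. The sketch below is illustrative only: the matrices `A_hat`, `B_hat`, `C` and the input sequence `u` are randomly generated stand-ins (assumptions, not taken from the paper), and the kernel is formed with a generic `B_hat` rather than the special form $(I - \hat{A}) Γ^{+}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, steps = 3, 6
# Hypothetical discrete time operators and input samples.
A_hat = 0.5 * np.eye(n) + 0.1 * rng.standard_normal((n, n))
B_hat = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))
u = rng.standard_normal(steps)

# Recurrence x_{j+1} = A_hat x_j + B_hat u_j, y_j = C x_j, with x_0 = 0.
x = np.zeros((n, 1))
y_rec = []
for j in range(steps):
    x = A_hat @ x + B_hat * u[j]
    y_rec.append((C @ x).item())          # this is y_{j+1}

def K(i, j):
    """Kernel K_{i,j} = C A_hat^{i-j-1} B_hat, for 0 <= j <= i - 1."""
    return (C @ np.linalg.matrix_power(A_hat, i - j - 1) @ B_hat).item()

# Convolution form: y_i = sum_{j=0}^{i-1} K_{i,j} u_j, i = 1, ..., steps.
y_conv = [sum(K(i, j) * u[j] for j in range(i)) for i in range(1, steps + 1)]

print(np.max(np.abs(np.array(y_rec) - np.array(y_conv))))
```

Both lists hold y_1 through y_6; they should coincide up to round-off, since the convolution is just the unrolled recurrence.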

4. Random Regularly Dissipative Operators and Their Associated Semigroups

In this section, we summarize the key ideas from the framework developed in [20] and [30] which are central to our approach. We assume that $q$ is a p-dimensional random vector whose support is in $\prod_{i = 1}^{p} [a_{i}, b_{i}]$ where $- \infty < \overset{‒}{α} < a_{i} < b_{i} < \overset{‒}{β} < \infty$ for all i = 1, 2, …, p. Let $\vec{a} = [a_{i}]_{i = 1}^{p}$ and $\vec{b} = [b_{i}]_{i = 1}^{p}$ , and let $Θ \subset R^{r}$ , for some r, be closed and bounded. We assume that the distribution of $q$ can be represented by an absolutely continuous cumulative distribution function $F (q; \vec{a}, \vec{b}, \vec{θ})$ , or equivalently, by a (push forward) measure $π = π (\vec{a}, \vec{b}, \vec{θ})$ , where $\vec{θ} \in Θ$ . Let a(·; ·, ·) be a sesquilinear form satisfying (i)-(iii) given in Section (3), where the assumed measurability is with respect to all of the measures $π = π (\vec{a}, \vec{b}, \vec{θ})$ .

Define the Bochner spaces $V = L_{π}^{2} (Q; V)$ and $H = L_{π}^{2} (Q; H)$ . The assumptions from Section (3) on the spaces V and H guarantee that the spaces $V$ , $H$ and $V^{*}$ form the Gelfand triple $V ↪ H ↪ V^{*}$ (see [20]) where $H$ is identified with its dual $H^{*}$ and $V^{*}$ is identified with $L_{π}^{2} (Q; V^{*})$ .

For $\vec{a} = [a_{i}]_{i = 1}^{p}$ , $\vec{b} = [b_{i}]_{i = 1}^{p}$ satisfying $- \infty < \overset{‒}{α} < a_{i} < b_{i} < \overset{‒}{β} < \infty$ for all i = 1, 2, …, p, and $\vec{θ} \in Θ$ , set $ρ = (\vec{a}, \vec{b}, \vec{θ})$ . Then we define the π(ρ)-averaged sesquilinear forms $a (ρ; \cdot, \cdot) : V \times V \to C$ (note that the spaces $H$ , $V$ , and $V^{*}$ now of course depend on ρ, but our notation will not explicitly show this dependence unless clarity demands it) by

a (ρ; φ, ψ) = \int_{Q} a (q; φ (q), ψ (q)) d π (q; ρ) = E [a (q; φ (q), ψ (q)) ∣ ∣ π (ρ)],

(4.1)

where $φ, ψ \in V$ and $ρ = (\vec{a}, \vec{b}, \vec{θ})$ . It is not difficult to show that Assumptions (i)-(iii) imply that $a (ρ; \cdot, \cdot)$ is a bounded and coercive sesquilinear form on $V \times V$ . Consequently, this sesquilinear form defines a bounded linear map $A (ρ) : V \to V^{*}$ by $< A (ρ) φ, ψ >_{V^{*}, V} = - a (ρ; φ, ψ)$ which, when appropriately restricted or extended, is the infinitesimal generator of analytic semigroups of bounded linear operators $\{e^{A (ρ) t} : t \geq 0\}$ on $V$ , $H$ and $V^{*}$ (see [3, 5, 39]). We assume that the maps q ↦ 〈b(q), ψ(q)〉_{V*,V} and q ↦ 〈c(q), ψ(q)〉_{V*,V} are π(ρ)-measurable for any $ψ \in V$ , and that ∥b(q)∥_{V*}, ∥c(q)∥_{V*} are uniformly bounded for a.e. q ∈ Q. We then define $B (ρ) : R^{μ} \to V^{*}$ and $C (ρ) : V \to R^{ν}$ by

< B (ρ) u, ψ >_{V^{*}, V} = \int_{Q} {〈 b (q), ψ (q) 〉}_{V^{*}, V} d π (q; ρ) u = E [{〈 b (q), ψ (q) 〉}_{V^{*}, V} ∣ ∣ π (ρ)] u,

(4.2)

C (ρ) ψ = \int_{Q} {〈 c (q), ψ (q) 〉}_{V^{*}, V} d π (q; ρ) = E [{〈 c (q), ψ (q) 〉}_{V^{*}, V} ∣ ∣ π (ρ)],

(4.3)

for $u \in R^{μ}$ and $ψ \in V$ .

With the definitions (4.1) - (4.3) of the operators $A$ , $B$ , and $C$ , consider the abstract evolution system given by

\dot{x} (t) = A (ρ) x (t) + B (ρ) u (t), x (0) = x_{0} \in H, Y (t) = C (ρ) x (t), t > 0,

(4.4)

whose mild solution is given by

x (t) = T (t; ρ) x_{0} + \int_{0}^{t} T (t - s; ρ) B (ρ) u (s) d s, t \geq 0,

(4.5)

where $\{T (t; ρ) = e^{A (ρ) t} : t \geq 0\}$ is the analytic semigroup generated by the operator $A (ρ)$ . From (4.4) and (4.5), it follows that

Y (t) = \int_{0}^{t} C (ρ) T (t - s; ρ) B (ρ) u (s) d s, t \geq 0 .

(4.6)

As in Section (3), we obtain a discrete or sampled time version of (4.4). Now let x₀ ∈ V, let τ > 0 be the sampling time, and consider zero-order hold inputs of the form u(t) = u_j, t ∈ [jτ, (j + 1)τ), j = 0, 1, 2, …. Setting $x_{j} = x (j τ)$ and $Y_{j} = Y (j τ)$ , j = 0, 1, 2, …, (4.5) and (4.6) yield

x_{j + 1} = \hat{A} (ρ) x_{j} + \hat{B} (ρ) u_{j}, Y_{j} = \hat{C} (ρ) x_{j}, j = 0, 1, 2, \dots,

(4.7)

with $x_{0} \in V$ and $\hat{A} (ρ) = T (τ; ρ) \in L (V, V)$ , $\hat{B} (ρ) = \int_{0}^{τ} T (s; ρ) B (ρ) d s \in L (R^{μ}, V)$ , and $\hat{C} (ρ) = C (ρ) \in L (V, R^{ν})$ . Note that the operators $\hat{A} (ρ)$ and $\hat{B} (ρ)$ are bounded since $\{T (t; ρ) : t \geq 0\}$ is an analytic semigroup on $V$ , $H$ , and $V^{*}$ (see [3, 5, 24, 39]). If $A (ρ) : Dom (A (ρ)) \subseteq V^{*} \to V^{*}$ has bounded inverse, then $\hat{B} (ρ) = \int_{0}^{τ} T (s; ρ) B (ρ) d s = A {(ρ)}^{- 1} T (s; ρ) B (ρ) ∣_{0}^{τ} = (\hat{A} (ρ) - I) A {(ρ)}^{- 1} B (ρ) \in L (R^{μ}, V)$ .

It is shown in [20] and [30] that the solutions of systems (4.4) and (3.2), and likewise those of (4.7) and (3.5), agree for π-a.e. q ∈ Q. It follows that

Y (t) = C (ρ) x (t) = E [y (t; q) ∣ ∣ π (ρ)] = E [C (q) x (t; q) ∣ ∣ π (ρ)], \forall t \geq 0,

(4.8)

and hence, from (4.8), that

Y_{j} = \hat{C} (ρ) x_{j} = E [y_{j} (q) ∣ ∣ π (ρ)] = E [\hat{C} (q) x_{j} (q) ∣ ∣ π (ρ)],

(4.9)

where in (4.8) and (4.9) $E [\cdot ‖ π]$ denotes expectation with respect to the measure π.
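To make the identity (4.9) concrete, consider a hypothetical scalar instance (an assumption made here purely for illustration): the state equation is ẋ = −q x + u with output y = x, and q is uniformly distributed on [a, b]. The population output Y_j is then the expectation of the individual outputs y_j(q), which the sketch below approximates by Gauss-Legendre quadrature in q. The function name `mean_output` and all parameter values are illustrative.

```python
import numpy as np

def mean_output(a, b, u, tau, x0=0.0, n_quad=40):
    """Y_j = E[y_j(q)], j = 0..len(u), for the scalar system x' = -q x + u,
    y = x, with q ~ Uniform[a, b] and zero-order hold inputs u_j."""
    nodes, weights = np.polynomial.legendre.leggauss(n_quad)
    q = 0.5 * (b - a) * nodes + 0.5 * (a + b)   # quadrature points in [a, b]
    pw = 0.5 * weights                          # weights times the density 1/(b-a)
    a_hat = np.exp(-q * tau)                    # A_hat(q) = e^{A(q) tau}
    b_hat = (1.0 - a_hat) / q                   # B_hat(q) = (A_hat - I) A^{-1} B
    x = np.full_like(q, x0)                     # x_0(q), constant in q
    Y = [pw @ x]
    for uj in u:
        x = a_hat * x + b_hat * uj              # recurrence, all q at once
        Y.append(pw @ x)                        # expectation over q
    return np.array(Y)

# Step response of the population-averaged output.
print(mean_output(0.5, 2.0, np.ones(5), tau=0.25))
```

With zero input and x₀ = 1 the individual outputs are e^{−q jτ}, whose mean over a uniform distribution is available in closed form, so the quadrature can be validated directly in that special case.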

5. Approximation and Convergence

In this section, we can now formally state our estimation problem and the sequence of finite dimensional approximating problems. We will also state and prove a convergence theorem.

5.1. The Estimation Problem

Assume that data of the form $\{(\{{\tilde{u}}_{i, j}\}_{j = 0}^{n_{i} - 1}, \{{\tilde{y}}_{i, j}\}_{j = 0}^{n_{i}})\}_{i = 1}^{m}$ has been given. Determine $ρ^{*} = ({\vec{a}}^{*}, {\vec{b}}^{*}, {\vec{θ}}^{*}) \in Ξ$ , Ξ a compact subset of $R^{2 p} \times Θ \subset R^{2 p + r}$ , ${\vec{a}}^{*} = {[a_{i}^{*}]}_{i = 1}^{p}$ , ${\vec{b}}^{*} = {[b_{i}^{*}]}_{i = 1}^{p}$ , which minimizes

J (ρ) = \sum_{i = 1}^{m} J_{i} (ρ) = \sum_{i = 1}^{m} \sum_{j = 0}^{n_{i}} {∣ Y_{i, j} (\{{\tilde{u}}_{i, k}\}_{k = 0}^{n_{i} - 1}, ρ) - {\tilde{y}}_{i, j} ∣}^{2}

(5.1)

where for i = 1, 2, …, m, $Y_{i, j} (\{{\tilde{u}}_{i, k}\}_{k = 0}^{n_{i} - 1}, ρ)$ is given by (4.7) with $u_{j} = {\tilde{u}}_{i, j}$ , j = 0, …, n_i, i = 1, 2, …, m, and (4.9).
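The least squares functional (5.1) can be sketched for the same kind of hypothetical scalar model (ẋ = −q x + u, y = x, q uniform on [a, b], so that ρ = (a, b) with no additional shape parameters θ). The data below are synthetic and noise free, and the coarse grid search merely stands in for a proper optimizer; all names and values are assumptions made for illustration.

```python
import numpy as np

NODES, WEIGHTS = np.polynomial.legendre.leggauss(40)

def model_output(rho, u, tau):
    """Population output Y_j(rho), j = 0..len(u), with zero initial state."""
    a, b = rho
    q = 0.5 * (b - a) * NODES + 0.5 * (a + b)
    a_hat = np.exp(-q * tau)
    b_hat = (1.0 - a_hat) / q
    x = np.zeros_like(q)
    Y = [0.0]
    for uj in u:
        x = a_hat * x + b_hat * uj
        Y.append(0.5 * WEIGHTS @ x)        # mean over q ~ Uniform[a, b]
    return np.array(Y)

def cost(rho, data, tau):
    """J(rho): sum over data sets i and samples j of |Y_{i,j}(rho) - y_{i,j}|^2."""
    return sum(np.sum((model_output(rho, u, tau) - y) ** 2) for u, y in data)

tau = 0.25
rho_true = (0.5, 2.0)
inputs = [np.ones(8), np.sin(np.arange(8.0))]                  # m = 2 input sequences
data = [(u, model_output(rho_true, u, tau)) for u in inputs]   # synthetic observations

# Coarse grid over a compact feasible set, standing in for the minimization.
grid = [(a, b) for a in np.linspace(0.25, 1.0, 4) for b in np.linspace(1.5, 3.0, 4)]
best = min(grid, key=lambda r: cost(r, data, tau))
print(best, cost(best, data, tau))
```

Because the data are generated by the model itself, the functional vanishes at the generating parameters; with noisy data one would expect a positive minimum value instead.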

Recalling the assumption that for i ∈ {1, 2, …, p}, $- \infty < \overset{‒}{α} \leq a_{i} < b_{i} \leq \overset{‒}{β} < \infty$ , let $\overset{‒}{Q} = \prod_{i = 1}^{p} [\overset{‒}{α}, \overset{‒}{β}]$ . Let $\overset{‒}{ρ} = ({[\overset{‒}{α}]}_{i = 1}^{p}, {[\overset{‒}{β}]}_{i = 1}^{p}, \vec{θ}) \in Ξ$ , $\overset{‒}{H} = L_{π (\overset{‒}{ρ})}^{2} (\overset{‒}{Q}; H)$ and $\overset{‒}{V} = L_{π (\overset{‒}{ρ})}^{2} (\overset{‒}{Q}; V)$ . Then, for N = 1, 2, …, let ${\vec{a}}^{N} = {[a_{i}^{N}]}_{i = 1}^{p}$ , ${\vec{b}}^{N} = {[b_{i}^{N}]}_{i = 1}^{p}$ be such that $- \infty < \overset{‒}{α} \leq a_{i}^{N} < b_{i}^{N} \leq \overset{‒}{β} < \infty$ , and let $ρ^{N} = ({\vec{a}}^{N}, {\vec{b}}^{N}, \vec{θ}) \in Ξ$ . Set $Q^{N} = \prod_{i = 1}^{p} [a_{i}^{N}, b_{i}^{N}]$ , $H^{N} = L_{π (ρ^{N})}^{2} (Q^{N}; H)$ , $V^{N} = L_{π (ρ^{N})}^{2} (Q^{N}; V)$ and let $U^{N}$ be a finite dimensional subspace of $V^{N}$ . Let $I^{N} : \overset{‒}{H} \to H^{N}$ be the linear map defined by $I^{N} (ψ) = ψ ∣_{Q^{N}}$ for any $ψ \in \overset{‒}{H}$ , let $P^{N} : H^{N} \to U^{N}$ denote the orthogonal projection of $H^{N}$ onto $U^{N}$ , and define $J^{N} : \overset{‒}{H} \to U^{N}$ by $J^{N} = P^{N} \circ I^{N}$ .

In addition, recall that we have assumed that for ρ ∈ Ξ, the probability distributions described by π(ρ) are all absolutely continuous; that is, π(ρ) ~ f(ρ), where f(ρ) = f(·; ρ) is a joint density for the random vector $q$ .

Noting that in this formulation, $U^{N}$ is neither a subspace of $\overset{‒}{H}$ nor $\overset{‒}{V}$ , we define the operators $A^{N} (ρ)$ on $U^{N}$ to be what are essentially the restrictions of $A (ρ)$ to the spaces $U^{N}$ . More precisely, we set

〈 A^{N} (ρ) v^{N}, w^{N} 〉 = - a (ρ; v^{N}, w^{N}) = - \int_{Q} a (q; v^{N} (q), w^{N} (q)) d π (q, ρ) = - \int_{Q} a (q; v^{N} (q), w^{N} (q)) f (q; ρ) d q = - E [a (q; v^{N} (q), w^{N} (q)) ∣ ∣ π (ρ)],

(5.2)

where $v^{N}$ , $w^{N} \in U^{N}$ .

Define the operators $B^{N} (ρ) : R^{μ} \to U^{N}$ and $C^{N} (ρ) : U^{N} \to R^{ν}$ by

< B^{N} (ρ) u, v^{N} >_{V^{*}, V} = \int_{Q} {〈 b (q), v^{N} (q) 〉}_{V^{*}, V} d π (q; ρ) u = E [{〈 b (q), v^{N} (q) 〉}_{V^{*}, V} ∣ ∣ π (ρ)] u,

(5.3)

C^{N} (ρ) v^{N} = \int_{Q} {〈 c (q), v^{N} (q) 〉}_{V^{*}, V} d π (q; ρ) = E [{〈 c (q), v^{N} (q) 〉}_{V^{*}, V} ∣ ∣ π (ρ)],

(5.4)

where $v^{N} \in U^{N}$ , and $u \in R^{μ}$ .

With these definitions, we can now state the finite dimensional approximating problems.

Assume that data of the form $\{(\{{\tilde{u}}_{i, j}\}_{j = 0}^{n_{i} - 1}, \{{\tilde{y}}_{i, j}\}_{j = 0}^{n_{i}})\}_{i = 1}^{m}$ has been given. Determine $ρ^{N *} = ({\vec{a}}^{N *}, {\vec{b}}^{N *}, {\vec{θ}}^{N *}) \in Ξ$ , Ξ a compact subset of $R^{2 p} \times Θ \subset R^{2 p + r}$ , ${\vec{a}}^{N *} = {[a_{i}^{N *}]}_{i = 1}^{p}$ , ${\vec{b}}^{N *} = {[b_{i}^{N *}]}_{i = 1}^{p}$ , which minimizes

J^{N} (ρ) = \sum_{i = 1}^{m} \sum_{j = 0}^{n_{i}} {∣ Y_{i, j}^{N} (\{{\tilde{u}}_{i, k}\}_{k = 0}^{n_{i} - 1}, ρ) - {\tilde{y}}_{i, j} ∣}^{2},

(5.5)

where in (5.5), for i = 1, 2, …, m, $Y_{i, j}^{N} (\{{\tilde{u}}_{i, k}\}_{k = 0}^{n_{i} - 1}, ρ) = C^{N} (ρ) x_{i, j}^{N}$ is given by (4.7) and (4.9) with $u_{j} = {\tilde{u}}_{i, j}$ , j = 0, …, n_i, i = 1, 2, …, m, $x_{j}$ replaced by $x_{i, j}^{N} \in U^{N}$ , $\hat{A} (ρ)$ replaced by

{\hat{A}}^{N} (ρ) = T^{N} (τ; ρ) = e^{A^{N} (ρ) τ} \in L (U^{N}, U^{N}),

$\hat{B} (ρ)$ replaced by ${\hat{B}}^{N} (ρ) = \int_{0}^{τ} e^{A^{N} (ρ) s} B^{N} (ρ) d s \in L (R^{μ}, U^{N})$ , $\hat{C} (ρ)$ replaced by ${\hat{C}}^{N} (ρ) = C^{N} (ρ) \in L (U^{N}, R^{ν})$ , and $x_{i, 0}$ replaced by $x_{i, 0}^{N} = J^{N} x_{i, 0} \in U^{N}$ . It follows that for i = 1, 2, …, m,

x_{i, j + 1}^{N} = {\hat{A}}^{N} (ρ) x_{i, j}^{N} + {\hat{B}}^{N} (ρ) {\tilde{u}}_{i, j}, Y_{i, j}^{N} = C^{N} (ρ) x_{i, j}^{N}, j = 0, 1, 2, \dots,

(5.6)

with the operators $A^{N} (ρ)$ , $B^{N} (ρ)$ , and $C^{N} (ρ)$ underlying (5.6) as defined above in (5.2)-(5.4).
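A minimal sketch of how (5.2)-(5.6) might look in the simplest possible setting follows. It assumes a scalar state (so that V = H = R), the form a(q; φ, ψ) = q φ ψ, b(q) = c(q) = 1, a single random parameter on Q = [a, b] with a truncated exponential density (an illustrative member of an exponential family), and U^N spanned by the indicator functions of N equal subintervals of Q. All of these choices, and the function names, are assumptions made for illustration rather than the discretization used in the numerical studies. With this basis the matrices of A^N, B^N and C^N are diagonal, a vector of ones, and the vector of cell probabilities, respectively.

```python
import numpy as np

GL_X, GL_W = np.polynomial.legendre.leggauss(8)

def density(q, a, b, theta):
    """Truncated exponential density on [a, b] (an illustrative choice)."""
    Z = (np.exp(-theta * a) - np.exp(-theta * b)) / theta
    return np.exp(-theta * q) / Z

def quad_points(a, b, N):
    """Gauss-Legendre points and weights on each of N equal cells of [a, b]."""
    edges = np.linspace(a, b, N + 1)
    half = 0.5 * np.diff(edges)[:, None]
    mid = 0.5 * (edges[:-1] + edges[1:])[:, None]
    return mid + half * GL_X, half * GL_W       # each of shape (N, 8)

def approx_output(rho, u, tau, N):
    """Y_j^N from (5.6) with a piecewise constant basis chi_1, ..., chi_N."""
    a, b, theta = rho
    q, w = quad_points(a, b, N)
    fw = density(q, a, b, theta) * w
    m = fw.sum(axis=1)                  # m_k = int chi_k f dq (diagonal mass matrix)
    kq = (q * fw).sum(axis=1)           # a(rho; chi_k, chi_k) = int q chi_k f dq
    A_N = -kq / m                       # diagonal of A^N from (5.2)
    A_hat = np.exp(A_N * tau)
    B_hat = (A_hat - 1.0) / A_N         # (A_hat - I)(A^N)^{-1} B^N, with B^N = ones
    x = np.zeros(N)
    Y = [0.0]
    for uj in u:
        x = A_hat * x + B_hat * uj
        Y.append(m @ x)                 # C^N x = sum_k m_k x_k, from (5.4)
    return np.array(Y)

def reference_output(rho, u, tau):
    """E[y_j(q)] by fine quadrature of the pointwise-in-q recurrence."""
    a, b, theta = rho
    q, w = quad_points(a, b, 50)
    fw = density(q, a, b, theta) * w
    a_hat = np.exp(-q * tau)
    b_hat = (1.0 - a_hat) / q
    x = np.zeros_like(q)
    Y = [0.0]
    for uj in u:
        x = a_hat * x + b_hat * uj
        Y.append(np.sum(fw * x))
    return np.array(Y)

rho, tau, u = (0.5, 2.0, 1.0), 0.25, np.ones(8)
Y_ref = reference_output(rho, u, tau)
errors = [np.max(np.abs(approx_output(rho, u, tau, N) - Y_ref)) for N in (2, 8, 32)]
print(errors)
```

In this toy setting the approximate outputs should approach the reference as N grows, which is the behavior the convergence theory is designed to guarantee.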

In the following sections we prove that there exists a subsequence of solutions to the sequence of approximating problems that converges to the solution of our original estimation/optimization problem.

5.2. A Version of the Trotter-Kato Semigroup Approximation Theorem

Our convergence proof is based on a version of the Trotter-Kato semigroup approximation theorem ([5, 21, 26]) that does not require the approximating spaces to be subspaces of the underlying infinite dimensional state space. Banks, Burns and Cliff [1] proved just such a result but unfortunately they do not state their hypotheses in terms of resolvent convergence which is what we require here. Consequently we establish the result in its requisite form here.

Let $\hat{H}$ be a Hilbert space with norm ∣ · ∣ and let $\{{\hat{H}}^{N}\}$ be a sequence of Hilbert spaces, each equipped with norm ∣ · ∣_N. Assume that for each $N \in \mathbb{N}$ , ${\hat{U}}^{N}$ is a closed (finite dimensional) subspace of ${\hat{H}}^{N}$ . Assume that the operators $\hat{A}$ on $\hat{H}$ , and for each $N \in \mathbb{N}$ , ${\hat{A}}^{N}$ on ${\hat{U}}^{N}$ , are in G(M, λ₀) with M and λ₀ independent of N; that is, they are the infinitesimal generators of C₀-semigroups $\hat{S} (t)$ on $\hat{H}$ and ${\hat{S}}^{N} (t)$ on ${\hat{U}}^{N}$ , respectively, that are exponentially bounded uniformly in N. (We note that if $\hat{A}$ is obtained from a bounded and coercive sesquilinear form and the ${\hat{U}}^{N}$'s are subspaces with ${\hat{A}}^{N}$ defined as the restrictions of $\hat{A}$ to ${\hat{U}}^{N}$ , then this latter assumption is easily verified [3, 5].)

Theorem 5.1. Let $\hat{H}$ , ${\hat{H}}^{N}$ , and ${\hat{U}}^{N}$ be Hilbert spaces as defined above. Let $I^{N} : \hat{H} \to {\hat{H}}^{N}$ be an operator such that $Im (I^{N}) = {\hat{H}}^{N}$ and ${∣ I^{N} z ∣}_{N} \leq ∣ z ∣$ . Let ${\hat{P}}^{N} : {\hat{H}}^{N} \to {\hat{U}}^{N}$ be the canonical projection of ${\hat{H}}^{N}$ onto ${\hat{U}}^{N}$ and define $P^{N} ≔ {\hat{P}}^{N} \circ I^{N}$ . Let $\hat{A} \in G (M, λ_{0})$ on $\hat{H}$ , and ${\hat{A}}^{N} \in G (M, λ_{0})$ on ${\hat{U}}^{N}$ . Suppose that for some λ ≥ λ₀,

{∣ P^{N} R_{λ} (\hat{A}) z - R_{λ} ({\hat{A}}^{N}) P^{N} z ∣}_{N} \to 0, \text{ as } N \to \infty,

(5.7)

for every $z \in \hat{H}$ , where $R_{λ} (\hat{A}) = {(λ I - \hat{A})}^{- 1}$ and $R_{λ} ({\hat{A}}^{N}) = {(λ I - {\hat{A}}^{N})}^{- 1}$ denote respectively the resolvent operators of $\hat{A}$ and ${\hat{A}}^{N}$ at λ. Then

{∣ P^{N} \hat{S} (t) z - {\hat{S}}^{N} (t) P^{N} z ∣}_{N} \to 0, \text{ as } N \to \infty,

(5.8)

in ${\hat{H}}^{N}$ , for every $z \in \hat{H}$ uniformly in t on compact t-intervals.

Proof. For ease of exposition and without loss of generality, let λ₀ = 0. Then, since $\hat{S} (t) R_{λ} (\hat{A})$ and ${\hat{S}}^{N} (t) R_{λ} ({\hat{A}}^{N})$ are both strongly differentiable in t, we have

\frac{d}{d t} \hat{S} (t) R_{λ} (\hat{A}) = \hat{A} \hat{S} (t) R_{λ} (\hat{A}) = \hat{S} (t) \hat{A} R_{λ} (\hat{A}) = \hat{S} (t) [λ R_{λ} (\hat{A}) - I] .

(5.9)

Then, using an identity for ${\hat{S}}^{N} (t) R_{λ} ({\hat{A}}^{N})$ analogous to (5.9) , we obtain

\frac{d}{d s} [{\hat{S}}^{N} (t - s) R_{λ} ({\hat{A}}^{N}) P^{N} \hat{S} (s) R_{λ} (\hat{A})] = {\hat{S}}^{N} (t - s) [P^{N} R_{λ} (\hat{A}) - R_{λ} ({\hat{A}}^{N}) P^{N}] \hat{S} (s) .

(5.10)

Then, since

{{\hat{S}}^{N} (t - s) R_{λ} ({\hat{A}}^{N}) P^{N} \hat{S} (s) R_{λ} (\hat{A}) ∣}_{s = 0}^{s = t} = R_{λ} ({\hat{A}}^{N}) [P^{N} \hat{S} (t) - {\hat{S}}^{N} (t) P^{N}] R_{λ} (\hat{A}),

(5.11)

(5.10) and (5.11) imply that

R_{λ} ({\hat{A}}^{N}) [P^{N} \hat{S} (t) - {\hat{S}}^{N} (t) P^{N}] R_{λ} (\hat{A}) = \int_{0}^{t} {\hat{S}}^{N} (t - s) [P^{N} R_{λ} (\hat{A}) - R_{λ} ({\hat{A}}^{N}) P^{N}] \hat{S} (s) d s .

(5.12)

Equation (5.12) and $∣ {\hat{S}}^{N} (t - s) ∣ \leq M$ (recall λ₀ = 0), for any $u \in \hat{H}$ , yield

{∣ R_{λ} ({\hat{A}}^{N}) [P^{N} \hat{S} (t) - {\hat{S}}^{N} (t) P^{N}] R_{λ} (\hat{A}) u ∣}_{N} \leq M \int_{0}^{t} {∣ [P^{N} R_{λ} (\hat{A}) - R_{λ} ({\hat{A}}^{N}) P^{N}] \hat{S} (s) u ∣}_{N} d s .

(5.13)

By (5.7), we know that the integrand in (5.13) converges to 0 for a fixed s, and also it is bounded by 2M²∣u∣/λ, and therefore, by the Lebesgue Dominated Convergence Theorem, the right-hand side of (5.13) converges to 0 as N → ∞, where the convergence is uniform in t on compact t-intervals.

Letting $v = R_{λ} (\hat{A}) u$ , and using the fact that $D (\hat{A})$ is dense in $\hat{H}$ , we have that

∣ R_{λ} ({\hat{A}}^{N}) [P^{N} \hat{S} (t) - {\hat{S}}^{N} (t) P^{N}] v ∣_{N} \to 0, \text{ as } N \to \infty,

(5.14)

for all $v \in \hat{H}$ . Then, since $∣ {\hat{S}}^{N} (t) ∣ \leq M$ , (5.7) implies that

∣ R_{λ} ({\hat{A}}^{N}) {\hat{S}}^{N} (t) P^{N} v - {\hat{S}}^{N} (t) P^{N} R_{λ} (\hat{A}) v ∣_{N} = ∣ {\hat{S}}^{N} (t) [R_{λ} ({\hat{A}}^{N}) P^{N} v - P^{N} R_{λ} (\hat{A}) v] ∣_{N} \to 0,

(5.15)

and similarly, $∣ {\hat{S}}^{N} (t) ∣ \leq M$ and (5.7) imply that

∣ R_{λ} ({\hat{A}}^{N}) P^{N} \hat{S} (t) v - P^{N} \hat{S} (t) R_{λ} (\hat{A}) v ∣_{N} = ∣ [R_{λ} ({\hat{A}}^{N}) P^{N} - P^{N} R_{λ} (\hat{A})] \hat{S} (t) v ∣_{N} \to 0 .

(5.16)

Combining (5.15), (5.16), and the triangle inequality we get

∣ R_{λ} ({\hat{A}}^{N}) [{\hat{S}}^{N} (t) P^{N} - P^{N} \hat{S} (t)] v + [P^{N} \hat{S} (t) - {\hat{S}}^{N} (t) P^{N}] R_{λ} (\hat{A}) v ∣_{N} \to 0,

(5.17)

as N → ∞. Then, because of (5.14), and again by the triangle inequality, we obtain that

∣ [P^{N} \hat{S} (t) - {\hat{S}}^{N} (t) P^{N}] R_{λ} (\hat{A}) v ∣_{N} \to 0, \text{ as } N \to \infty .

(5.18)

Letting $w = R_{λ} (\hat{A}) v$ , we have $w \in Dom ({\hat{A}}^{2})$ ; and since $Dom ({\hat{A}}^{2})$ is dense in $\hat{H}$ , it follows from (5.7), (5.17) and (5.18) that

∣ {\hat{S}}^{N} (t) P^{N} z - P^{N} \hat{S} (t) z ∣_{N} \to 0, \text{ as } N \to \infty,

for all $z \in \hat{H}$ uniformly in t on compact t-intervals. □
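The conclusion (5.8) can be observed numerically in a deliberately simple setting. The sketch below is an illustration under stated assumptions, not a reproduction of the theorem's full generality: it takes the state space to be L²(0, 1), the generator to be multiplication by −(1 + q), so that the semigroup is (S(t)z)(q) = e^{−(1+q)t} z(q), and the approximating subspaces to be piecewise constants on N equal cells, with all the approximating state spaces equal to L²(0, 1) and P^N the orthogonal projection. The Galerkin generator is then multiplication by the negative of the cell mean of 1 + q. The test function `z` and all names are hypothetical.

```python
import numpy as np

GL_X, GL_W = np.polynomial.legendre.leggauss(10)

def cell_average(g, N):
    """Coefficients of P^N g: the mean of g over each of N equal cells of [0, 1]."""
    edges = np.linspace(0.0, 1.0, N + 1)
    half = 0.5 * np.diff(edges)[:, None]
    mid = 0.5 * (edges[:-1] + edges[1:])[:, None]
    return 0.5 * np.sum(GL_W * g(mid + half * GL_X), axis=1)

def semigroup_gap(z, t, N):
    """L2(0,1) norm of P^N S(t) z - S^N(t) P^N z."""
    q_bar = 1.0 + (np.arange(N) + 0.5) / N            # cell means of 1 + q
    lhs = cell_average(lambda q: np.exp(-(1.0 + q) * t) * z(q), N)
    rhs = np.exp(-q_bar * t) * cell_average(z, N)     # S^N(t) acts diagonally
    return np.sqrt(np.sum((lhs - rhs) ** 2) / N)

z = lambda q: np.sin(3.0 * q) + q ** 2
times = np.linspace(0.0, 2.0, 9)
gaps = [max(semigroup_gap(z, t, N) for t in times) for N in (4, 16, 64)]
print(gaps)
```

The maximum over the sampled times is used to mimic uniformity in t on a compact interval; the printed values should shrink as N increases.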

5.3. Application to the Density Estimation Problem

Let {ρ^N}, ρ ∈ Ξ be such that f^N(q) → f(q), for almost every q ∈ Q, where f^N(q) = f(q; ρ^N) and f(q) = f(q; ρ). Let $\overset{‒}{H}$ , $\overset{‒}{V}$ , $V^{N}$ , $U^{N}$ , $I^{N} : \overset{‒}{H} \to H^{N}$ , $P^{N} : H^{N} \to U^{N}$ , and $J^{N} : \overset{‒}{H} \to U^{N}$ be as they were defined earlier. Set $A = A (ρ)$ and consider it to be an operator on $\overset{‒}{H}$ and $\overset{‒}{V}$ by extending f(·, ρ), which is defined on Q, to $\overset{‒}{Q}$ by setting it equal to zero on $\overset{‒}{Q} ∖ Q$ and let $A^{N} = A^{N} (ρ^{N})$ . Then it follows from Assumptions (i) - (iii) that $A$ is in G(M, λ₀) on $\overset{‒}{H}$ and $A^{N}$ is in G(M, λ₀) on $H^{N}$ with M and λ₀ independent of N.

In the statement of Theorem (5.1), set $\hat{H} = \overset{‒}{H}$ , ${\hat{H}}^{N} = H^{N}$ , ${\hat{U}}^{N} = U^{N}$ , $P^{N} = J^{N}$ , $\hat{A} = A$ , and ${\hat{A}}^{N} = A^{N}$ . To apply Theorem (5.1) and conclude that in this case, (5.8) holds, we need only verify (5.7). In order to do this, we require the following two additional assumptions:

(iv) There exist positive real numbers γ and δ such that for any ρ ∈ Ξ, we have 0 < γ ≤ f(q; ρ) ≤ δ < ∞ for π(ρ)-a.e. q ∈ Q.
(v) For all $w \in \overset{‒}{V}$ , there exists $u^{N} \in U^{N}$ such that ${‖ u^{N} - J^{N} w ‖}_{V^{N}} \to 0$ as N → ∞.

We are now able to prove the following theorem.

Theorem 5.2. Let assumptions (i) - (v) be satisfied and let {ρ^N}, ρ ∈ Ξ be such that f^N(q) → f(q), for almost every $q \in \overset{‒}{Q}$ , where f^N(q) = f(q; ρ^N) and f(q) = f(q; ρ). Then, with the definitions above, the conditions of Theorem (5.1) (and in particular the resolvent convergence specified in (5.7)) are satisfied. Consequently, it follows that

‖ T^{N} (t; ρ^{N}) J^{N} z - J^{N} T (t; ρ) z ‖_{H^{N}} \to 0, \text{ as } N \to \infty,

(5.19)

for every $z \in H$ , uniformly in t on compact t-intervals where $T^{N} = \{T^{N} (t; ρ^{N}) : t \geq 0\}$ is the semigroup on $H^{N} (ρ)$ given by $T^{N} (t; ρ^{N}) = e^{A^{N} (ρ^{N}) t}$ and $T = \{T (t; ρ) : t \geq 0\}$ is the semigroup on $H$ and $\overset{‒}{H}$ given by $T (t; ρ) = e^{A t} = e^{A (ρ) t}$ .

Proof. First, note that if we can show resolvent convergence for every $z \in \overset{‒}{V}$ , then since $\overset{‒}{V}$ is dense in $\overset{‒}{H}$ , and $J^{N} R_{λ_{0}} (A)$ and $R_{λ_{0}} (A^{N}) J^{N}$ are uniformly bounded, the desired resolvent convergence for every $z \in \overset{‒}{H}$ will have been demonstrated. In what follows, for any $ρ = (\vec{a}, \vec{b}, \vec{θ}) \in Ξ$ , f (·; ρ) is defined on $Q = \prod_{i = 1}^{p} [a_{i}, b_{i}]$ , but it can be extended to be defined on $\overset{‒}{Q}$ by setting it equal to zero on $\overset{‒}{Q} ∖ Q$ . We will use this fact frequently below without further remark.

Let $z \in \overset{‒}{V}$ and define $w = R_{λ_{0}} (A) z$ , and $w^{N} = R_{λ_{0}} (A^{N}) J^{N} z$ . Let $u^{N} \in U^{N}$ be as in Assumption (v) for $w = R_{λ_{0}} (A) z$ .

Then, by the triangle inequality, we have

‖ J^{N} w - w^{N} ‖_{V^{N}} = ‖ J^{N} w - u^{N} + u^{N} - w^{N} ‖_{V^{N}} \leq ‖ J^{N} w - u^{N} ‖_{V^{N}} + ‖ u^{N} - w^{N} ‖_{V^{N}} .

(5.20)

Thus, (5.20), Assumption (v) and the continuous embedding of $V^{N}$ in $H^{N}$ imply that it is enough to show that ${‖ u^{N} - w^{N} ‖}_{V^{N}} \to 0$ as N → ∞. Let z^N = w^N − u^N. Then, since $w^{N} \in U^{N} \subset V^{N}$ ,

a (ρ^{N}; w^{N}, z^{N}) = {〈 - A^{N} w^{N}, z^{N} 〉}_{H^{N}} = {〈 (λ_{0} I - A^{N}) R_{λ_{0}} (A^{N}) J^{N} z, z^{N} 〉}_{H^{N}} - λ_{0} {〈 w^{N}, z^{N} 〉}_{H^{N}} = {〈 J^{N} z, z^{N} 〉}_{H^{N}} - λ_{0} {〈 w^{N}, z^{N} 〉}_{H^{N}} .

(5.21)

Also, since $w \in Dom (A)$ ,

a (ρ; w, I^{N^{+}} z^{N}) = {〈 - A w, I^{N^{+}} z^{N} 〉}_{\overset{‒}{H}} = {〈 (λ_{0} I - A) R_{λ_{0}} (A) z, I^{N^{+}} z^{N} 〉}_{\overset{‒}{H}} - λ_{0} {〈 w, I^{N^{+}} z^{N} 〉}_{\overset{‒}{H}} = {〈 z, I^{N^{+}} z^{N} 〉}_{\overset{‒}{H}} - λ_{0} {〈 w, I^{N^{+}} z^{N} 〉}_{\overset{‒}{H}},

(5.22)

where $I^{N^{+}}$ denotes the Moore-Penrose generalized inverse [11] of $I^{N}$ . We note that for $ψ \in H^{N}$ , $I^{N^{+}} ψ$ is the function in $\overset{‒}{H}$ that agrees with ψ on Q^N and is zero on $\overset{‒}{Q} ∖ Q^{N}$ . Then, from (5.21) and (5.22), we obtain

a (ρ^{N}; w^{N}, z^{N}) - a (ρ; w, I^{N^{+}} z^{N}) = {〈 I^{N} z, z^{N} 〉}_{H^{N}} - λ_{0} {〈 w^{N}, z^{N} 〉}_{H^{N}} - {〈 z, I^{N^{+}} z^{N} 〉}_{\overset{‒}{H}} + λ_{0} {〈 w, I^{N^{+}} z^{N} 〉}_{\overset{‒}{H}} .

(5.23)

Recalling Assumptions (i) and (ii) for the form a(·; ·, ·) on V × V, let ${\tilde{α}}_{0}$ , ${\tilde{μ}}_{0}$ , ${\tilde{λ}}_{0}$ denote the boundedness and coercivity coefficients for the forms $a (ρ; \cdot, \cdot)$ . Then, using boundedness, coercivity, Assumptions (iv) and (v), Young's inequality and the Cauchy-Schwarz inequality, the continuous embedding of the space V in the space H (i.e. that there exists a constant k such that ∣ · ∣_H ≤ k∥ · ∥_V), and (5.23), for any ε > 0, we obtain

\begin{aligned}
{\tilde{μ}}_{0} {‖ z^{N} ‖}_{V^{N}}^{2} &\leq a (ρ^{N}; z^{N}, z^{N}) + {\tilde{λ}}_{0} {∣ z^{N} ∣}_{H^{N}}^{2} \\
&= a (ρ^{N}; w^{N}, z^{N}) - a (ρ^{N}; u^{N}, z^{N}) + {\tilde{λ}}_{0} {∣ z^{N} ∣}_{H^{N}}^{2} \\
&= a (ρ^{N}; w^{N}, z^{N}) - a (ρ; w, I^{N^{+}} z^{N}) + a (ρ; w, I^{N^{+}} z^{N}) - a (ρ^{N}; u^{N}, z^{N}) + {\tilde{λ}}_{0} {∣ z^{N} ∣}_{H^{N}}^{2} \\
&= {〈 I^{N} z, z^{N} 〉}_{H^{N}} - {\tilde{λ}}_{0} {〈 w^{N}, z^{N} 〉}_{H^{N}} - {〈 z, I^{N^{+}} z^{N} 〉}_{\overset{‒}{H}} + {\tilde{λ}}_{0} {〈 w, I^{N^{+}} z^{N} 〉}_{\overset{‒}{H}} + \int_{\overset{‒}{Q}} (a (q; w, z^{N}) f (q) - a (q; u^{N}, z^{N}) f^{N} (q)) d q + {\tilde{λ}}_{0} {∣ z^{N} ∣}_{H^{N}}^{2} \\
&= \int_{\overset{‒}{Q}} {〈 z, z^{N} 〉}_{H} (f^{N} (q) - f (q)) d q + {\tilde{λ}}_{0} \int_{\overset{‒}{Q}} ({〈 w, z^{N} 〉}_{H} f (q) - {〈 u^{N}, z^{N} 〉}_{H} f^{N} (q)) d q + \int_{\overset{‒}{Q}} (a (q; w, z^{N}) f (q) - a (q; u^{N}, z^{N}) f^{N} (q)) d q \\
&= \int_{\overset{‒}{Q}} {〈 z, z^{N} 〉}_{H} (f^{N} (q) - f (q)) d q + {\tilde{λ}}_{0} \int_{\overset{‒}{Q}} {〈 w, z^{N} 〉}_{H} (f (q) - f^{N} (q)) d q + {\tilde{λ}}_{0} \int_{\overset{‒}{Q}} {〈 w - u^{N}, z^{N} 〉}_{H} f^{N} (q) d q \\
&\quad + \int_{\overset{‒}{Q}} a (q; w, z^{N}) (f (q) - f^{N} (q)) d q + \int_{\overset{‒}{Q}} a (q; w - u^{N}, z^{N}) f^{N} (q) d q \\
&\leq \int_{\overset{‒}{Q}} {∣ z ∣}_{H} {∣ z^{N} ∣}_{H} ∣ f^{N} (q) - f (q) ∣ d q + {\tilde{λ}}_{0} \int_{\overset{‒}{Q}} {∣ w ∣}_{H} {∣ z^{N} ∣}_{H} ∣ f (q) - f^{N} (q) ∣ d q + {\tilde{λ}}_{0} \int_{\overset{‒}{Q}} {∣ w - u^{N} ∣}_{H} {∣ z^{N} ∣}_{H} f^{N} (q) d q \\
&\quad + α_{0} \int_{\overset{‒}{Q}} {‖ w ‖}_{V} {‖ z^{N} ‖}_{V} ∣ f (q) - f^{N} (q) ∣ d q + α_{0} \int_{\overset{‒}{Q}} {‖ w - u^{N} ‖}_{V} {‖ z^{N} ‖}_{V} f^{N} (q) d q \\
&\leq \frac{ε k^{2}}{2 α} \int_{Q^{N}} {‖ z^{N} ‖}_{V}^{2} f^{N} (q) d q + \frac{1}{2 ε} \int_{\overset{‒}{Q}} {∣ z ∣}_{H}^{2} {∣ f^{N} (q) - f (q) ∣}^{2} d q + \frac{{\tilde{λ}}_{0} ε k^{2}}{2 α} \int_{Q^{N}} {‖ z^{N} ‖}_{V}^{2} f^{N} (q) d q + \frac{{\tilde{λ}}_{0} k^{2}}{2 ε} \int_{\overset{‒}{Q}} {‖ w ‖}_{V}^{2} {∣ f^{N} (q) - f (q) ∣}^{2} d q \\
&\quad + \frac{{\tilde{λ}}_{0} ε k^{2}}{2} \int_{Q^{N}} {‖ z^{N} ‖}_{V}^{2} f^{N} (q) d q + \frac{{\tilde{λ}}_{0} k^{2}}{2 ε} \int_{Q^{N}} {‖ w - u^{N} ‖}_{V}^{2} f^{N} (q) d q + \frac{α_{0} ε}{2 α} \int_{Q^{N}} {‖ z^{N} ‖}_{V}^{2} f^{N} (q) d q \\
&\quad + \frac{α_{0}}{2 ε} \int_{\overset{‒}{Q}} {‖ w ‖}_{V}^{2} {∣ f^{N} (q) - f (q) ∣}^{2} d q + \frac{α_{0} ε}{2} \int_{Q^{N}} {‖ z^{N} ‖}_{V}^{2} f^{N} (q) d q + \frac{α_{0}}{2 ε} \int_{Q^{N}} {‖ w - u^{N} ‖}_{V}^{2} f^{N} (q) d q .
\end{aligned}

(5.24)

Then, letting $\tilde{c} = {\tilde{μ}}_{0} - \frac{ε}{2 α} (k^{2} ({\tilde{λ}}_{0} + 1) + (α + 1) ({\tilde{λ}}_{0} k^{2} + α_{0}))$ , it follows from (5.24) that

\tilde{c} ‖ z^{N} ‖_{V^{N}}^{2} \leq \frac{1}{2 ε} \int_{\overset{‒}{Q}} ∣ z ∣_{H}^{2} ∣ f^{N} (q) - f (q) ∣^{2} d q + \frac{{\tilde{λ}}_{0} k^{2} + α_{0}}{2 ε} \int_{Q^{N}} ‖ w - u^{N} ‖_{V}^{2} f^{N} (q) d q + \frac{{\tilde{λ}}_{0} k^{2} + α_{0}}{2 ε} \int_{\overset{‒}{Q}} ‖ w ‖_{V}^{2} ∣ f^{N} (q) - f (q) ∣^{2} d q = \frac{1}{2 ε} \int_{\overset{‒}{Q}} ∣ z ∣_{H}^{2} ∣ f^{N} (q) - f (q) ∣^{2} d q + \frac{{\tilde{λ}}_{0} k^{2} + α_{0}}{2 ε} ‖ J^{N} w - u^{N} ‖_{V^{N}}^{2} + \frac{{\tilde{λ}}_{0} k^{2} + α_{0}}{2 ε} \int_{\overset{‒}{Q}} ‖ w ‖_{V}^{2} ∣ f^{N} (q) - f (q) ∣^{2} d q .

(5.25)

Choosing ε positive, but sufficiently small in (5.25), it follows from Assumption (v) and the hypotheses of the theorem that

‖ w^{N} - u^{N} ‖_{V^{N}} = ‖ z^{N} ‖_{V^{N}} \to 0 \text{ as } N \to \infty .

(5.26)

Thus (5.26) together with (5.20), and Assumption (v) yield resolvent convergence and the theorem is proved. □

We note that in the proof of Theorem (5.2) we were in fact able to establish resolvent convergence in the $V^{N}$ norm. Consequently we may conclude that the semigroup convergence in (5.19) is in the $V^{N}$ norm as well. Moreover, it is not difficult to establish the following corollary to Theorem (5.2).

Corollary 5.1. Under the hypotheses of Theorem (5.2), we have

‖ x_{i, j}^{N} (ρ^{N}) - J^{N} x_{i, j} (ρ) ‖_{V^{N}} \to 0 \text{ as } N \to \infty, \quad ∣ Y_{i, j}^{N} (ρ^{N}) - Y_{i, j} (ρ) ∣_{R^{ν}} \to 0 \text{ as } N \to \infty,

(5.27)

for every i = 1, 2, …, m, uniformly in j, for j = 0, 1, 2, …, n_i, where $x_{i, j}^{N} (ρ^{N})$ and $Y_{i, j}^{N} (ρ^{N})$ are given in (5.6) and $x_{i, j} (ρ)$ and $Y_{i, j} (ρ)$ are given in (4.7).

The assumption that the feasible parameter set Ξ is closed and bounded in $R^{2 p + r}$ , together with (5.27) in the statement of Corollary (5.1) and Theorem (2.1) then yield the following result.

Theorem 5.3. If, in addition to Assumptions (i)-(v), we assume that the maps ρ ↦ f (q; ρ) from Ξ to $R$ are continuous for π(ρ) a.e. $q \in \overset{‒}{Q}$ , then each of the approximating estimation problems admits a solution, ρ^N*. Moreover, the sequence {ρ^N*} has a convergent subsequence, {ρ^{N_k}*}, with ρ^{N_k}* → ρ* and ρ* a solution to the original estimation problem.

It is also possible to establish a consistency result for the estimator $ρ^{*} = ({\vec{a}}^{*}, {\vec{b}}^{*}, {\vec{θ}}^{*}) \in Ξ$ . We require the following additional assumptions:

(a) The measurement noise {ε_j,i} is i.i.d. with respect to a probability space {Ω, Σ, P} with $E [ε_{j, i} ‖ P] = 0$ and Var[ε_j,i∥P] = σ²,

(b) The feasible set of parameters Ξ is compact (i.e. closed and bounded since it is finite dimensional) and has nonempty interior,

(c) For i = 1, 2, …, m, n_i = n and nτ = T for some positive integer n and some T > 0, where τ is the sampling time defined in Section 3,

(d) That ${\tilde{y}}_{i, j} = Y_{i, j} (\{{\tilde{u}}_{i, k}\}_{k = 0}^{n_{i} - 1}, ρ_{0}) + ε_{j, i}$ , for some ρ₀ ∈ int{Ξ}, where for i = 1, 2, …, m, $Y_{i, j} (\{{\tilde{u}}_{i, k}\}_{k = 0}^{n_{i} - 1}, ρ)$ is given by (4.7) with $u_{j} = {\tilde{u}}_{i, j}$ , j = 0, …, n_i, i = 1, 2, …, m, and (4.9), and

(e) For each i = 1, 2, …, m, ρ₀ ∈ Ξ is the unique minimizer of J_i,0 in Ξ, where

J_{i, 0} (ρ) = σ^{2} + \int_{0}^{T} (Y (t; {\tilde{u}}_{i}, ρ_{0}) - Y (t; {\tilde{u}}_{i}, ρ))^{2} d t,

(5.28)

and $Y (t; {\tilde{u}}_{i}, ρ)$ is given by (4.4) -(4.6) with $u = {\tilde{u}}_{i}$ .

A straightforward application of Theorem 4.2 in [7] can then be used to establish the following lemma and theorem (see [32]).

Lemma 5.1. If in addition to Assumptions (i)-(iv) and (a)-(e) above we assume that the maps ρ ↦ f(q; ρ) from Ξ to $R$ are continuous for π(ρ) a.e. $q \in \overset{‒}{Q}$ , then there exists an event A ∈ Σ with P(A) = 1 such that for all ω ∈ A and J as given in (5.1) we have

\frac{1}{m} \sum_{i = 1}^{m} \{\frac{1}{n} J_{i} (ρ) - J_{i, 0} (ρ)\} \to 0,

as n, m → ∞ and τ → 0, with nτ = T, uniformly in ρ for ρ ∈ Ξ, where J_i is given by (5.1) and J_i,0 by (5.28).

Theorem 5.4. (Consistency of the estimator ρ*) Let ρ* ∈ Ξ be as defined in (5.1) in Section 5.1. Then under the assumptions of Lemma (5.1) the estimator $ρ^{*} = ({\vec{a}}^{*}, {\vec{b}}^{*}, {\vec{θ}}^{*}) \in Ξ$ is consistent for ρ₀. That is, ρ* → ρ₀ in probability with respect to the probability measure P, as m, n → ∞, and τ → 0 with nτ = T.

6. Examples and Numerical Results

6.1. The Adjoint Method

The approximating optimization problems are solved numerically by using an iterative gradient-based scheme. Once a basis for the space $U^{N}$ is chosen, matrix forms of the operators ${\hat{A}}^{N}$ , ${\hat{B}}^{N}$ , and ${\hat{C}}^{N}$ can be computed. The gradient of J^N(ρ) with respect to the 2p + r parameters in ρ can be computed accurately (in fact exactly, with the exception of finite precision arithmetic round-off) and efficiently (which is especially important if the dimension of the approximating system (5.6) and/or the number of parameters is large) using the adjoint method (see [23]). For each i = 1, …, m, set $v_{i, j}^{N} = 2 {[{\hat{C}}^{N}]}^{T} ({\hat{C}}^{N} x_{i, j}^{N} - {\tilde{y}}_{i, j}) \in R^{K^{N}}$ , j = 0, …, n_i, where K^N is the number of basis elements for $U^{N}$ . Then for each i = 1, …, m, the adjoint systems are defined to be

z_{i, j - 1}^{N} = [{\hat{A}}^{N}]^{T} z_{i, j}^{N} + v_{i, j - 1}^{N}, z_{i, n_{i}}^{N} = v_{i, n_{i}}^{N}, j = n_{i}, n_{i} - 1, \dots, 2, 1 .

(6.1)

The gradient of J^N at $ρ = (\vec{a}, \vec{b}, \vec{θ})$ can then be computed from

\vec{\nabla} J^{N} (ρ) = \sum_{i = 1}^{m} \sum_{j = 1}^{n_{i}} [z_{i, j}^{N}]^{T} (\frac{\partial {\hat{A}}^{N}}{\partial ρ} x_{i, j - 1}^{N} - (A^{N})^{- 1} (\frac{\partial A^{N}}{\partial ρ} (A^{N})^{- 1} ({\hat{A}}^{N} - I) B^{N} {\tilde{u}}_{i, j - 1} - \frac{\partial {\hat{A}}^{N}}{\partial ρ} B^{N} {\tilde{u}}_{i, j - 1} - ({\hat{A}}^{N} - I) \frac{\partial B^{N}}{\partial ρ} {\tilde{u}}_{i, j - 1})) + \sum_{i = 1}^{m} \sum_{j = 0}^{n_{i}} (Y_{i, j}^{N} - {\tilde{y}}_{i, j})^{T} \frac{\partial {\hat{C}}^{N}}{\partial ρ} x_{i, j}^{N} .

(6.2)
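As a minimal illustration of the backward recursion (6.1) and the leading term of (6.2), the following Python sketch treats a single individual (m = 1) and a scalar parameter ρ. It assumes, purely for illustration, that ${\hat{A}}^{N}$ depends on ρ as A0 + ρ A1 and that ${\hat{B}}^{N}$ and ${\hat{C}}^{N}$ do not depend on ρ, so that the remaining terms in (6.2) vanish. The matrices, input and data are hypothetical placeholders, and the adjoint gradient is checked against a central finite difference.

```python
import numpy as np

def simulate(A_hat, B_hat, u):
    """Return states x_0, ..., x_n of x_j = A_hat x_{j-1} + B_hat u_{j-1}, x_0 = 0."""
    n = len(u)
    x = np.zeros((n + 1, A_hat.shape[0]))
    for j in range(1, n + 1):
        x[j] = A_hat @ x[j - 1] + B_hat * u[j - 1]
    return x

def cost(rho, A0, A1, B_hat, C_hat, u, y):
    """Least squares cost: sum over j of (C_hat x_j - y_j)^2."""
    x = simulate(A0 + rho * A1, B_hat, u)
    return float(np.sum((x @ C_hat - y) ** 2))

def adjoint_gradient(rho, A0, A1, B_hat, C_hat, u, y):
    """dJ/drho from one forward sweep and one backward (adjoint) sweep."""
    A_hat = A0 + rho * A1
    x = simulate(A_hat, B_hat, u)
    n = len(u)
    v = 2.0 * np.outer(x @ C_hat - y, C_hat)  # row j holds v_j = 2 C^T (C x_j - y_j)
    z = v[n].copy()                           # terminal condition z_n = v_n
    grad = 0.0
    for j in range(n, 0, -1):
        grad += z @ (A1 @ x[j - 1])           # z_j^T (dA_hat/drho) x_{j-1}
        z = A_hat.T @ z + v[j - 1]            # z_{j-1} = A_hat^T z_j + v_{j-1}
    return float(grad)

# Hypothetical 2-state example; all numbers are placeholders (assumptions).
A0 = np.array([[0.5, 0.1], [0.0, 0.3]])
A1 = np.array([[0.1, 0.0], [0.05, 0.2]])
B_hat = np.array([1.0, 0.5])
C_hat = np.array([1.0, -1.0])
u = np.cos(0.1 * np.arange(30))
y = 0.2 * np.sin(0.1 * np.arange(31))
rho = 0.7

g_adjoint = adjoint_gradient(rho, A0, A1, B_hat, C_hat, u, y)
step = 1e-6
g_fd = (cost(rho + step, A0, A1, B_hat, C_hat, u, y)
        - cost(rho - step, A0, A1, B_hat, C_hat, u, y)) / (2.0 * step)
```

The cost of the adjoint sweep does not grow with the number of parameters, whereas finite differencing needs additional forward solves for each parameter.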

Using (6.1) and (6.2) to compute the gradient requires the calculation of the tensor $\frac{\partial {\hat{A}}^{N}}{\partial ρ}$ . This can be done using the sensitivity equations. For t ≥ 0 set $Φ^{N} (t) = e^{A^{N} t}$ from which differentiation yields

{\dot{Φ}}^{N} (t) = A^{N} Φ^{N} (t), Φ^{N} (0) = I .

(6.3)

Then, setting $Ψ^{N} (t) = \frac{\partial Φ^{N} (t)}{\partial ρ}$ , differentiating (6.3) with respect to ρ, and interchanging the order of differentiation, we obtain

{\dot{Ψ}}^{N} (t) = A^{N} Ψ^{N} (t) + \frac{\partial A^{N}}{\partial ρ} Φ^{N} (t), Ψ^{N} (0) = 0 .

(6.4)

Combining (6.3) and (6.4), and solving the resulting system, we obtain

[\begin{matrix} Ψ^{N} (t) \\ Φ^{N} (t) \end{matrix}] = \exp ([\begin{matrix} A^{N} & \partial A^{N} / \partial ρ \\ 0 & A^{N} \end{matrix}] t) [\begin{matrix} 0 \\ I \end{matrix}] .

(6.5)

Setting t = τ in (6.5), we obtain that $\frac{\partial {\hat{A}}^{N}}{\partial ρ} = Ψ^{N} (τ)$ .
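The block matrix exponential in (6.5) can be sketched in Python for a scalar parameter ρ as follows. The helper expm_taylor is a simple scaling and squaring routine written for this illustration (an assumption; a library matrix exponential could be used instead), and the matrices A0 and A1 in A(ρ) = A0 + ρ A1 are hypothetical placeholders. The result is compared with a central finite difference.

```python
import numpy as np

def expm_taylor(M, terms=20):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    norm = np.linalg.norm(M, 1)
    s = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    Ms = M / (2 ** s)
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ Ms / k      # Ms^k / k!
        E = E + term
    for _ in range(s):
        E = E @ E                 # undo the scaling
    return E

def exp_and_sensitivity(A, dA, t):
    """Return Phi(t) = exp(A t) and Psi(t) = d/drho exp(A(rho) t) via (6.5)."""
    k = A.shape[0]
    block = np.block([[A, dA], [np.zeros((k, k)), A]])
    E = expm_taylor(block * t)
    Psi = E[:k, k:]   # upper block of E @ [0; I]
    Phi = E[k:, k:]   # lower block of E @ [0; I]
    return Phi, Psi

# Hypothetical example: A(rho) = A0 + rho * A1, so dA/drho = A1.
A0 = np.array([[-2.0, 1.0], [1.0, -2.0]])
A1 = np.array([[-1.0, 0.0], [0.5, -3.0]])
rho, tau = 0.8, 0.1
Phi, Psi = exp_and_sensitivity(A0 + rho * A1, A1, tau)

step = 1e-6
Psi_fd = (expm_taylor((A0 + (rho + step) * A1) * tau)
          - expm_taylor((A0 + (rho - step) * A1) * tau)) / (2.0 * step)
```

For a vector of parameters, the same computation would be repeated once per component of ρ.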

To illustrate our approach, we consider the case of a one dimensional heat/diffusion equation on the interval [0, 1] with random (thermal) diffusivity and two different sets of boundary conditions. Consider the partial differential equation, boundary conditions and output operator given by

\frac{\partial x}{\partial t} (t, η) = q_{1} \frac{\partial^{2} x}{\partial η^{2}} (t, η), 0 < η < 1, t > 0,

(6.6)

Γ_{D} x (t, \cdot) = x (t, 0) = 0, t > 0,

(6.7)

Γ_{R} x (t, \cdot) = q_{1} \frac{\partial x}{\partial η} (t, 0) - x (t, 0) = 0, t > 0,

(6.8)

Γ_{1} x (t, \cdot) = \frac{q_{1}}{q_{2}} \frac{\partial x}{\partial η} (t, 1) = u (t), t > 0,

(6.9)

x (0, η) = 0, 0 < η < 1,

(6.10)

y (t) = x (t, η_{0}), t > 0,

(6.11)

where 0 < η₀ < 1. In the examples below, we consider the parameterized family of probability density functions defined as follows.

Definition 6.1. Let φ(q; θ), $q \in R^{n}$ be a member of an exponential family [12], and let Φ denote its cumulative distribution function. Let θ represent a vector of parameters, and let $D \subset R^{n}$ be a bounded region to which φ will be restricted. Then define Φ_D(θ) = ∫_D φ(q; θ)dq. Then the family of pdfs f(·; ρ) given by

f (q; ρ) = \frac{φ (q; θ) χ_{D} (q)}{Φ_{D} (θ)} = \frac{1}{Φ_{D} (θ)} h (q) c (θ) \exp (\sum_{i = 1}^{k} w_{i} (θ) t_{i} (q)) χ_{D} (q)

where the parameters ρ include the parameters θ and parameters $\vec{a}$ and $\vec{b}$ to describe the domain D, is called a truncated exponential family.
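As a concrete instance of Definition 6.1, consider the exponential density with rate θ restricted to D = [0, R]. In the notation of the definition one may take h(q) = 1, c(θ) = θ, w₁(θ) = −θ and t₁(q) = q, so that Φ_D(θ) = 1 − e^{−θR}. The following Python sketch evaluates this truncated density and checks numerically that it integrates to one. The function names and the midpoint rule are choices made for this illustration only, and the values θ = 1/3 and R = 11.08 are used merely as sample inputs.

```python
import math

def truncated_exponential_pdf(q, theta, R):
    """Exponential density with rate theta, restricted to D = [0, R] and renormalized."""
    if q < 0.0 or q > R:
        return 0.0                              # chi_D(q) = 0 outside D
    mass_on_D = 1.0 - math.exp(-theta * R)      # Phi_D(theta)
    return theta * math.exp(-theta * q) / mass_on_D

def midpoint_integral(g, lo, hi, cells=20000):
    """Composite midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / cells
    return h * sum(g(lo + (k + 0.5) * h) for k in range(cells))

theta, R = 1.0 / 3.0, 11.08   # sample values only
total = midpoint_integral(lambda q: truncated_exponential_pdf(q, theta, R), 0.0, R)
```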

It is clear that this family of densities satisfies Assumption (iv) and the hypotheses of Theorem (5.1).

All of the numerical results presented here use simulation data. Our studies involving actual experimental/clinical data are discussed elsewhere (see [32]). The simulated data were generated by first sampling the target distribution to obtain 100 samples of $q$ . A spline based Galerkin approximation to the system (6.6)-(6.11) using a 128 equally spaced point grid on [0, 1] was then solved using each $q$ -sample. The resulting 100 output signals were then averaged at each time point. The approximating estimation problems were all solved on either Mac or PC laptops using the MATLAB Optimization Toolbox routine FMINCON for constrained optimization. Gradients were computed using either FMINCON built-in finite differencing or the adjoint method, (6.1)-(6.5). Which method was used had only a negligible effect on the results. The input signal used was u(t) = ∣cos(t)∣χ_[0,2](t), t ∈ [0, 20], and the sampling interval was τ = 0.1. In all of our examples below, the admissible parameter space Q is assumed to be either in $R^{+}$ in the case of the uni-variate examples, or in the first quadrant of the plane $R^{2}$ in the bivariate examples. Consequently, when the approximating optimization problems were solved, the lower bounds for the supports of the random parameters, a and c, were constrained to be strictly positive. This is based on the requirements of the physical model (6.6)-(6.11) and the assumption that properties (i)-(iii) in Section 3 hold.
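The sampled input signal and the pointwise averaging of the simulated outputs described above can be sketched in Python as follows. The names u, t_grid and average_outputs are hypothetical and introduced only for this illustration; the forward solve that produces each output signal is not shown.

```python
import math

tau = 0.1                       # sampling interval
T = 20.0                        # final time
n_steps = int(round(T / tau))   # number of sampling intervals

def u(t):
    """Input u(t) = |cos(t)| on [0, 2] and zero elsewhere."""
    return abs(math.cos(t)) if 0.0 <= t <= 2.0 else 0.0

t_grid = [j * tau for j in range(n_steps + 1)]
u_samples = [u(t) for t in t_grid]

def average_outputs(runs):
    """Pointwise average of equally long output signals, one list per q-sample."""
    return [sum(values) / len(runs) for values in zip(*runs)]
```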

6.2. Examples 6.1, 6.2 and 6.3; One Random Parameter; Truncated Uniform, Exponential and Normal Distributions

In this series of examples we consider the system (6.6), (6.7), (6.9)-(6.11) with q₁ random and q₂ = 1. In this case we have q = q₁ ∈ Q = [a, b], W = {φ ∈ H²(0, 1), Γ_Dφ = 0}, $V = H_{L}^{1} (0, 1) = \{φ \in H^{1} (0, 1), Γ_{D} φ = 0\}$ , Dom(A(q)) = {φ ∈ V : Γ_1φ = 0}, and Γ(q) = Γ₁. It follows that

a (q; φ, ψ) = q \int_{0}^{1} φ^{'} (η) ψ^{'} (η) d η, φ, ψ \in V,

and ⟨b(q), ψ⟩_V*,V = ⟨b, ψ⟩_V*,V = ψ(1), ψ ∈ V, so that b = δ(· − 1), and ⟨c(q), ψ⟩_V*,V = ⟨c, ψ⟩_V*,V = ψ(1/3), ψ ∈ V, where in this case η₀ = 1/3. Standard arguments [3, 5] show that Assumptions (i)-(iii) are satisfied.

To carry out the finite dimensional discretization, we let n, m be positive integers and set N = (n, m). In this case we have either D = [a, b] (uniform and normal) or D = [0, R] (exponential). In what follows we describe the q or Q discretization for the uniform and normal cases; the exponential is similar. The bases for the approximating subspaces $U^{N}$ were taken to be tensor products of the standard linear spline basis elements $φ_{i}^{n}$ corresponding to the uniform mesh $\{0, \frac{1}{n}, \frac{2}{n}, \dots, \frac{n - 1}{n}, 1\}$ on [0, 1], and the characteristic function basis $χ_{j}^{m}$ for the interval [a, b]. The j^th element corresponds to the j^th sub-interval $[a + (j - 1) \frac{b - a}{m}, a + j \frac{b - a}{m})$ , j = 1, 2, …, m. In this way $U^{N} = span \{ξ_{i, j}^{N}\}$ , i = 1, 2, …, n, j = 1, 2, …, m, where $ξ_{i, j}^{N} (η, q) = φ_{i}^{n} (η) χ_{j}^{m} (q)$ , η ∈ [0, 1], q ∈ [a, b] with $\dim (U^{N}) = n m$ . Using standard estimates [29] it is not difficult to show that Assumption (v) holds.
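The tensor product basis elements described above can be evaluated with a few lines of Python. This is a minimal sketch: the function names hat, chi and xi are hypothetical, and closing the last sub-interval at q = b is a convention adopted here so that the whole interval [a, b] is covered.

```python
def hat(i, n, eta):
    """Linear spline phi_i^n on the uniform mesh {0, 1/n, ..., 1}, centered at i/n."""
    return max(0.0, 1.0 - abs(n * eta - i))

def chi(j, m, a, b, q):
    """Characteristic function of the j-th sub-interval of [a, b], j = 1, ..., m."""
    width = (b - a) / m
    lo, hi = a + (j - 1) * width, a + j * width
    # Half-open sub-intervals; the last one also contains the endpoint q = b.
    return 1.0 if (lo <= q < hi) or (j == m and q == b) else 0.0

def xi(i, j, n, m, a, b, eta, q):
    """Tensor product basis element xi_{i,j}(eta, q) = phi_i(eta) * chi_j(q)."""
    return hat(i, n, eta) * chi(j, m, a, b, q)
```

For example, with n = m = 4 and [a, b] = [2, 4], the call xi(2, 3, 4, 4, 2.0, 4.0, 0.5, 3.2) evaluates the element centered at η = 1/2 on the sub-interval [3, 3.5).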

Re-numbering the $ξ_{i, j}^{N}$ 's so that $ξ_{i, j}^{N} = ξ_{k}^{N}$ where k = (j − 1)n + i and letting $Ψ_{k}^{N} = {[ψ_{i}^{N}]}_{i = 1}^{n m} \in R^{n m}$ , the matrix representation for the operator $A^{N}$ is given by $[A^{N}] = - {(M^{N})}^{- 1} K^{N}$ with

M_{r, s}^{N} = M_{r, s}^{N} (a, b, θ) = 〈 ξ_{r}^{N}, ξ_{s}^{N} 〉_{H} = \int_{a}^{b} \int_{0}^{1} ξ_{r}^{N} ξ_{s}^{N} f (q; a, b, θ) d η d q = \int_{a}^{b} χ_{j}^{m} χ_{l}^{m} f (q; a, b, θ) d q \int_{0}^{1} φ_{i}^{n} φ_{k}^{n} d η,

K_{r, s}^{N} = K_{r, s}^{N} (a, b, θ) = A (q; ξ_{r}^{N}, ξ_{s}^{N}) = \int_{a}^{b} q \int_{0}^{1} \frac{\partial ξ_{r}^{N}}{\partial η} \frac{\partial ξ_{s}^{N}}{\partial η} f (q; a, b, θ) d η d q = \int_{a}^{b} q χ_{j}^{m} χ_{l}^{m} f (q; a, b, θ) d q \int_{0}^{1} φ_{i}^{n'} φ_{k}^{n'} d η,

where r = (j − 1)n + i, s = (l − 1)n + k, i, k = 1, 2, …, n, j, l = 1, 2, …, m.
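Because the characteristic functions have disjoint supports, the q-integrals above vanish unless j = l, and with the ordering r = (j − 1)n + i the matrices M^N and K^N are Kronecker products of a diagonal q-factor with the usual linear spline mass and stiffness matrices. The following Python sketch assembles them for the truncated uniform density f = 1/(b − a) on [a, b]. The function names are hypothetical, and the closed-form cell integrals are specific to the uniform case, which is chosen here only as the simplest illustration.

```python
import numpy as np

def spline_matrices(n):
    """Mass and stiffness matrices for linear splines on [0, 1] vanishing at eta = 0."""
    h = 1.0 / n
    M = np.zeros((n, n))
    K = np.zeros((n, n))
    for i in range(n):
        last = (i == n - 1)                 # node at eta = 1 has half the support
        M[i, i] = h / 3.0 if last else 2.0 * h / 3.0
        K[i, i] = 1.0 / h if last else 2.0 / h
        if i + 1 < n:
            M[i, i + 1] = M[i + 1, i] = h / 6.0
            K[i, i + 1] = K[i + 1, i] = -1.0 / h
    return M, K

def uniform_cell_integrals(a, b, m):
    """Integrals of f and q*f over each sub-interval for f = 1/(b - a)."""
    edges = np.linspace(a, b, m + 1)
    f1 = np.diff(edges) / (b - a)
    f2 = np.diff(edges ** 2) / (2.0 * (b - a))
    return f1, f2

def assemble(n, m, a, b):
    """Return M^N, K^N and [A^N] = -(M^N)^{-1} K^N with ordering r = (j-1)n + i."""
    M_eta, K_eta = spline_matrices(n)
    f1, f2 = uniform_cell_integrals(a, b, m)
    M = np.kron(np.diag(f1), M_eta)
    K = np.kron(np.diag(f2), K_eta)
    A = -np.linalg.solve(M, K)
    return M, K, A

M_N, K_N, A_N = assemble(4, 4, 2.0, 4.0)
```

For a general truncated density the vectors f1 and f2 would instead be obtained by integrating f and q f over each sub-interval, analytically or by quadrature.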

We also have

B_{r}^{N} = B_{r}^{N} (a, b, θ) = \int_{a}^{b} ξ_{r}^{N} (1, q) f (q; a, b, θ) d q = φ_{i}^{n} (1) \int_{a}^{b} χ_{j}^{m} f (q; a, b, θ) d q, C_{s}^{N} (a, b, θ) = \int_{a}^{b} ξ_{s}^{N} (1 / 3, q) f (q; a, b, θ) d q = φ_{k}^{n} (1 / 3) \int_{a}^{b} χ_{l}^{m} (q) f (q; a, b, θ) d q,

r, s = 1, 2, …, nm, r = (j − 1)n + i, s = (l − 1)n + k, i, k = 1, 2, …, n, j, l = 1, 2, …, m.

With the density f = f₀(·; ρ) = f₀(·; (a, b, θ)) as given in Definition (6.1) above, if we define

f_{1} (α, β; ρ) = \int_{α}^{β} f (q; ρ) d q and f_{2} (α, β; ρ) = \int_{α}^{β} q f (q; ρ) d q,

it is a straightforward, albeit somewhat tedious, exercise to compute the partial derivatives $\frac{\partial f_{i}}{\partial α}$ , $\frac{\partial f_{i}}{\partial β}$ , $\frac{\partial f_{i}}{\partial θ}$ , $\frac{\partial f_{i}}{\partial a}$ , $\frac{\partial f_{i}}{\partial b}$ , i = 0, 1, 2. These partial derivatives show up in the matrices that appear in the adjoint equations (6.1)-(6.5). We tested our scheme on truncated uniform (ρ = (a, b)), exponential (ρ = (R, θ)) and normal (ρ = (a, b, μ, σ)) distributions. Our results are shown in Table (6.1) and Figure (6.1) below.

In panels (a)-(c) of Figure (6.1), we have plotted the converged estimated population models together with the data and the 75% credible band for the truncated uniform, exponential and normal densities. The credible bands can be obtained directly from the solution to the population model. Indeed, $q$ is sampled using the estimated distribution and then $C (q) X_{j}^{N} (\cdot, q)$ is evaluated at the sampled values of q, where $X_{j}^{N}$ is given by (5.6). Now the q dependence of the solution to the population model is only valid π almost everywhere and our convergence framework is an L₂ (in q) theory. Consequently, pointwise evaluation is, strictly speaking, undefined. However, the results appear to be useful so we have included them. We are currently working on an extension of the results presented here that involves introducing parabolic regularization in q. This will potentially allow us to justify pointwise evaluation in q of the population model to obtain credible bands.

It is interesting to note that the credible band for the exponential distribution is quite wide, almost to the point of making the population model not that useful. This is because the exponential distribution, especially one with a mean and standard deviation of μ = 1/θ = 3, has a rather “fat” tail.

Panels (d) and (f) of Figure (6.1) show the converging estimated pdfs for the truncated exponential and normal distributions, respectively. Panel (e) shows how the output of the population model compares to the data when the resolution of the finite element discretizations of q and η and the truncation point of the densities are varied. It appears from the figure that it is the q discretization that determines the rate of convergence, while a rather coarse η discretization seems to suffice. We believe that this explains the slow convergence of θ (the exponential parameter) and σ (the standard deviation of the normal) observed in Table (6.1) and panel (f) of Figure (6.1). The truncation of the density appears to have only a negligible effect. We are currently investigating whether using smoother first order splines for the q elements produces improved estimates and more rapid convergence.
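One way to turn sampled outputs into a 75% credible band is to take pointwise percentiles across the samples. The Python sketch below does this under the assumption that the band is the central one, bounded by the 12.5th and 87.5th percentiles at each time point; the text does not specify the construction, so this is an illustrative choice. The array outputs is a synthetic placeholder standing in for the evaluations of the population model at the sampled values of q.

```python
import numpy as np

def credible_band(outputs, level=0.75):
    """Pointwise central band from an array of shape (num_samples, num_times)."""
    tail = 100.0 * (1.0 - level) / 2.0
    lower = np.percentile(outputs, tail, axis=0)
    upper = np.percentile(outputs, 100.0 - tail, axis=0)
    return lower, upper

# Synthetic placeholder outputs (an assumption, not the population model itself).
rng = np.random.default_rng(0)
q_samples = rng.uniform(2.0, 4.0, size=1000)
t = 0.1 * np.arange(201)
outputs = np.outer(1.0 / q_samples, np.exp(-0.2 * t))

lower, upper = credible_band(outputs)
inside = np.mean((outputs[:, 0] >= lower[0]) & (outputs[:, 0] <= upper[0]))
```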

Table 6.1:

Convergence results for Examples 6.1, 6.2 and 6.3; estimation of the parameters in truncated uniform, exponential and normal distributions.

n	m	Uniform a*	Uniform b*	Exponential θ*	Exponential R*	Normal a*	Normal b*	Normal μ*	Normal σ*
4	4	1.76	4.27	2e-5	3.61	2.61	5.44	4.05	0.62
8	8	1.91	4.05	4e-5	3.81	2.29	5.42	4.01	0.40
16	16	1.94	4.00	0.20	4.34	2.17	5.42	4.01	0.37
32	32	1.95	3.99	0.30	5.95	2.15	5.42	4.00	0.35
64	64	1.96	3.99	0.30	11.08	2.14	5.42	4.00	0.35
True Values		2	4	1/3	—	—	—	4	0.25


Figure 6.1: Top row, starting from the left: Data, converged estimated population model and 75% credible band for (a) Example 6.1 Truncated uniform distribution; (b) Example 6.2 Truncated exponential distribution; (c) Example 6.3 Truncated normal distribution. Bottom row, starting from the left: (d) Example 6.2 Converged pdfs for truncated exponential distribution; (e) Example 6.2 Data and estimated population model for various values of R, n and m; (f) Example 6.3 Converged pdfs for truncated normal distribution.

6.3. Example 6.4; Two Random Parameters; Truncated Bi-variate Normal Distribution

In this example we consider the system (6.6)-(6.11), but instead of the Dirichlet boundary condition (6.7) at η = 0, we take the Robin boundary condition (6.8) at η = 0. In this case, q = [q₁, q₂] is the vector of random parameters with q ∈ D = Q = [a, b] × [c, d], H = L²(0, 1), V = H¹(0, 1), W = H²(0, 1), Dom(A(q)) = {φ ∈ H²(0, 1) : Γ_Rφ = 0, Γ_1φ = 0} and Γ(q) = Γ₁. The sesquilinear form on V × V is given by $a (q; φ, ψ) = q_{1} \int_{0}^{1} φ^{'} ψ^{'} d η + φ (0) ψ (0)$ with ⟨b(q), ψ⟩_V*,V = q₂ψ(1), ψ ∈ V, so that b(q) = q₂δ(· − 1), and ⟨c(q), ψ⟩_V*,V = ⟨c, ψ⟩_V*,V = ψ(0), ψ ∈ V, where we have set η₀ = 0. In this case N = (n, m₁, m₂), where n is again the level of discretization of the space variable η and m_i is the level of discretization of q_i, i = 1, 2. Once again the approximating subspaces were constructed using tensor products, $U^{N} = span \{ξ_{i, j, k}^{N}\}$ , i = 0, 1, 2, …, n, j = 1, 2, …, m₁, k = 1, 2, …, m₂, where $ξ_{i, j, k}^{N} (η, q_{1}, q_{2}) = φ_{i}^{n} (η) χ_{j}^{m_{1}} (q_{1}) χ_{k}^{m_{2}} (q_{2})$ , η ∈ [0, 1], q₁ ∈ [a, b], q₂ ∈ [c, d] with $\dim (U^{N}) = (n + 1) m_{1} m_{2}$ .

In this example the truncated exponential family was based on the bivariate normal. Once again, it is possible to compute all the partial derivatives (although of course their evaluation requires the numerical evaluation of single and double integrals) that are required to form the matrices that appear in the state and adjoint equations (6.1)-(6.5). We obtained simulated data by generating samples for $q$ from a $N (\bar{μ}, \bar{Σ})$ distribution with $\bar{μ} = [\begin{matrix} 12 \\ 10 \end{matrix}]$ and $\bar{Σ} = [\begin{matrix} 9 & 3 \\ 3 & 5 \end{matrix}]$ .

Our results are shown in Table (6.2) and Figure (6.2), where it can be seen that we obtained reasonably good approximations to the actual parameters that we used to simulate the data. We parameterized the covariance matrix as Σ = L^TL, where the 2 × 2 matrix L is upper triangular with L₁₁ and L₂₂ both positive, so as to guarantee that at each step in the optimization Σ is symmetric positive definite. The plot of the optimal joint density in the left hand panel of Figure (6.2) corresponds to n = 16 and m₁ = m₂ = 8. In the right hand panel of Figure (6.2) we have plotted the output of the fit population model and the 75% credible band. Once again, we believe that the rate of convergence could be improved by using linear splines rather than piece-wise constant elements to discretize the random parameters q.
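The parameterization Σ = L^TL can be illustrated with a short Python sketch. The function name covariance_from_factor is hypothetical; the matrix Sigma_true is the covariance used above to simulate the data, and its upper triangular factor is recovered from a Cholesky factorization.

```python
import numpy as np

def covariance_from_factor(l11, l12, l22):
    """Build Sigma = L^T L from an upper triangular L with positive diagonal."""
    if l11 <= 0.0 or l22 <= 0.0:
        raise ValueError("diagonal entries of L must be positive")
    L = np.array([[l11, l12], [0.0, l22]])
    return L.T @ L

Sigma_true = np.array([[9.0, 3.0], [3.0, 5.0]])
# numpy returns a lower triangular factor G with Sigma = G G^T, so L = G^T.
L_true = np.linalg.cholesky(Sigma_true).T
Sigma_back = covariance_from_factor(L_true[0, 0], L_true[0, 1], L_true[1, 1])
```

Optimizing over the three entries of L, with simple positivity bounds on the two diagonal entries, keeps every iterate in the set of symmetric positive definite matrices without imposing a nonlinear constraint on Σ itself.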

Table 6.2:

Convergence results for Example 6.4; estimation of the parameters in truncated bivariate normal distribution.

n	m₁	m₂	a*	b*	c*	d*	μ*	Σ*
4	8	8	5.88	18.15	4.85	14.63	$[\begin{matrix} 11.72 \\ 9.88 \end{matrix}]$	$[\begin{matrix} 12.13 & 5.76 \\ 5.76 & 7.35 \end{matrix}]$
8	8	8	5.67	18.35	5.17	14.46	$[\begin{matrix} 11.68 \\ 9.87 \end{matrix}]$	$[\begin{matrix} 10.15 & 4.04 \\ 4.04 & 5.97 \end{matrix}]$
16	8	8	5.79	18.17	5.06	14.66	$[\begin{matrix} 11.67 \\ 9.86 \end{matrix}]$	$[\begin{matrix} 9.29 & 3.03 \\ 3.03 & 5.21 \end{matrix}]$


Figure 6.2: — Left hand panel: Example 6.4 Estimated bivariate normal joint density with n = 16 and m₁ = m₂ = 8; Right hand panel: Example 6.4 Data, estimated population model and 75% credible band for truncated bivariate normal distribution.

7. Concluding Remarks

We are currently working on a number of applications and extensions of the results presented here. Specifically, we are looking at applying our approach to actual experimental and clinical BrAC and TAC data collected in both the lab/clinic and the field using two different transdermal alcohol biosensors from a number of different individuals that include several drinking episodes occurring over a time period of several days. We are developing deconvolution schemes based on population models fit using the approach discussed here that, given an output signal, will provide a population based estimate for the input together with credible bands obtained directly from the deconvolved input signal and not requiring simulation. We are also looking at extensions of the ideas presented here to the solution of the LQR and LQG compensator problems wherein the infinite dimensional linear regularly dissipative dynamics and quadratic performance index involve random parameters.

In our treatment here, we assumed that the probability measures describing the distribution of the random parameters were defined in terms of parameterized families of joint density functions. We are looking at developing numerical schemes and an associated convergence theory for estimating the shape of the density directly. We also hope to be able to apply the convergence theory based on the Prohorov metric on a space of measures developed in [7] more directly to the class of problems that we have discussed here. More precisely, we would like to be able to eliminate the assumption that the measures are defined in terms of a density, and estimate the measure directly. We believe that such a theory may be possible by assuming that our approximating subspaces are required to satisfy additional regularity (i.e. smoothness) assumptions; in particular that they are required to be contained in the domain of the operator. Then by making use of a slightly different version of the Trotter-Kato semigroup approximation theorem (see, for example, [1]) we believe it may now be possible to verify the hypotheses of the more general convergence theorem established in [7] for the estimation of the probability measures directly, rather than by estimating an associated density.

Footnotes

^†

This research was supported in part by grants R21AA017711 and R01AA026368 from the National Institute on Alcohol Abuse and Alcoholism (NIAAA).

Contributor Information

Melike Sirlanci, Department of Computing and Mathematical Sciences, California Institute of Technology.

Susan E. Luczak, Department of Psychology, University of Southern California

I. G. Rosen, Department of Mathematics, University of Southern California.

References

[1] Banks HT, Burns JA, and Cliff EM. Parameter estimation and identification for systems with delays. SIAM Journal on Control and Optimization, 19(6):791–828, 1981. doi: 10.1137/0319051.
[2] Banks HT, Flores KB, Rosen IG, Rutter EM, Sirlanci Melike, and Thompson Clayton. The Prohorov metric framework and aggregate data inverse problems for random PDEs. Communications in Applied Analysis, 22(3):415–446, 2018. URL: https://acadsol.eu/en/articles/22/3/6.pdf, doi: 10.12732/caa.v22i3.6.
[3] Banks HT and Ito K. A unified framework for approximation in inverse problems for distributed parameter systems. Control Theory Advanced Technology, 4(1):73–90, 1988. URL: https://apps.dtic.mil/dtic/tr/fulltext/u2/a193780.pdf.
[4] Banks HT, Kareiva P, and Lamm PK. Estimation techniques for transport equations. In Mathematics in Biology and Medicine, pages 428–438. Springer, 1985. URL: https://link.springer.com/chapter/10.1007/978-3-642-93287-8_58.
[5] Banks HT and Kunisch Karl. Estimation Techniques for Distributed Parameter Systems. Springer Science & Business Media, 2012. URL: https://www.springer.com/us/book/9780817634339.
[6] Banks HT and Lamm PK. Estimation of variable coefficients in parabolic distributed systems. IEEE Transactions on Automatic Control, 30(4):386–398, 1985. URL: https://ieeexplore.ieee.org/document/1103955, doi: 10.1109/TAC.1985.1103955.
[7] Banks HT and Thompson W Clayton. Least squares estimation of probability measures in the Prohorov metric framework. Technical report, DTIC Document, 2012. URL: https://www.researchgate.net/publication/268353806_Least_Squares_Estimation_of_Probability_Measures_in_the_Prohorov_Metric_Framework.
[8] Bui-Thanh Tan, Burstedde Carsten, Ghattas Omar, Martin James, Stadler Georg, and Wilcox Lucas C. Extreme-scale UQ for Bayesian inverse problems governed by PDEs. In SC12, November 10–16. URL: https://ieeexplore.ieee.org/document/6468442.
[9] Bui-Thanh Tan, Ghattas Omar, Martin James, and Stadler Georg. A computational framework for infinite-dimensional Bayesian inverse problems Part I: The linearized case, with application to global seismic inversion. SIAM J. Sci. Stat. Comp, 35(6):A2494–A2523. URL: https://epubs.siam.org/doi/abs/10.1137/12089586X?journalCode=sjoce3, doi: 10.1137/12089586X.
[10] Calvetti Daniela, Kaipio Jari P., and Somersalo Erkki. Inverse problems in the Bayesian framework. Inverse Problems, 30:1–4, 2014. URL: iopscience.iop.org/article/10.1088/0266-5611/30/11/110301/pdf, doi: 10.1088/0266-5611/30/11/110301.
[11] Campbell Stephen L and Meyer Carl D. Generalized Inverses of Linear Transformations. SIAM, 2009. URL: http://bookstore.siam.org/cl56/.
[12] Casella George and Berger Roger L. Statistical Inference, volume 2. Duxbury, Pacific Grove, CA, 2002. URL: https://books.google.com/books/about/Statistical_Inference.html?id=0x_vAAAAMAAJ.
[13] Curtain Ruth F and Salamon Dietmar. Finite-dimensional compensators for infinite-dimensional systems with unbounded input operators. SIAM Journal on Control and Optimization, 24(4):797–816, 1986. URL: https://epubs.siam.org/doi/10.1137/0324050, doi: 10.1137/0324050.
[14] Dai Zheng, Rosen I Gary, Wang Chunming, Barnett Nancy, and Luczak Susan E. Using drinking data and pharmacokinetic modeling to calibrate transport model and blind deconvolution based data analysis software for transdermal alcohol biosensors. Mathematical Biosciences and Engineering: MBE, 13(5):911, 2016. URL: http://www.aimsciences.org/journals/displayArticlesnew.jsp?paperID=12739.
[15] Dashti Masoumeh and Stuart Andrew M. The Bayesian approach to inverse problems. In Ghanem R et al., editor, Handbook of Uncertainty Quantification, pages 311–428. Springer International Publishing Switzerland, 2017. URL: https://www.springer.com/us/book/9783319123844.
[16] Davidian M and Giltinan D. Nonlinear Models for Repeated Measurement Data. Chapman and Hall, New York, 1995. URL: https://www.crcpress.com/Nonlinear-Models-for-Repeated-Measurement-Data/Davidian-Giltinan/p/book/9780412983412.
[17] Davidian M and Giltinan DM. Nonlinear models for repeated measurement data: An overview and update. Journal of Agricultural, Biological and Environmental Statistics, 8:387–419, 2003. URL: https://link.springer.com/article/10.1198/1085711032697.
[18] Demidenko E. Mixed Models, Theory and Applications, Second Edition. John Wiley and Sons, Hoboken, 2013. URL: https://www.wiley.com/en-us/Mixed+Models%3A+Theory+and+Applications+with+R%2C+2nd+Edition-p-9781118091579.
[19] Dumett Miguel A, Rosen I Gary, Sabat J, Shaman A, Tempelman L, Wang C, and Swift RM. Deconvolving an estimate of breath measured blood alcohol concentration from biosensor collected transdermal ethanol data. Applied Mathematics and Computation, 196(2):724–743, 2008. URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2597868/, doi: 10.1016/j.amc.2007.07.026.
[20] Gittelson Claude Jeffrey, Andreev Roman, and Schwab Christoph. Optimality of adaptive Galerkin methods for random parabolic partial differential equations. J. Computational Applied Mathematics, 263:189–201, 2014. doi: 10.1016/j.cam.2013.12.031.
[21] Kato Tosio. Perturbation Theory for Linear Operators, volume 132. Springer Science & Business Media, 2013. URL: https://www.springer.com/us/book/9783540586616.
[22] Labianca Dominick A. The chemical basis of the breathalyzer: a critical analysis. J. Chem. Educ, 67(3):259–261, 1990. URL: https://pubs.acs.org/doi/abs/10.1021/ed067p259?journalCode=jceda8, doi: 10.1021/ed067p259.
[23] Levi AFJ and Rosen I Gary. A novel formulation of the adjoint method in the optimal design of quantum electronic devices. SIAM Journal on Control and Optimization, 48(5):3191–3223, 2010. doi: 10.1137/070708330.
[24] Lions JL. Optimal Control of Systems Governed by Partial Differential Equations. Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen mit besonderer Berücksichtigung der Anwendungsgebiete. Springer-Verlag, 1971. URL: https://books.google.com/books?id=aL9tlwEACAAJ.
[25] Nyman E and Palmlöv A. The elimination of ethyl alcohol in sweat. Acta Physiologica, 74(2):155–159, 1936. URL: https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1748-1716.1936.tb01150.x, doi: 10.1111/j.1748-1716.1936.tb01150.x.
[26] Pazy A. Semigroups of Linear Operators and Applications to Partial Differential Equations. Applied Mathematical Sciences. Springer, 1983. URL: https://books.google.com/books?id=80XYPwAACAAJ.
[27] Pritchard Anthony J and Salamon Dietmar. The linear quadratic control problem for infinite dimensional systems with unbounded input and output operators. SIAM Journal on Control and Optimization, 25(1):121–144, 1987. URL: https://epubs.siam.org/doi/abs/10.1137/0325009, doi: 10.1137/0325009.
[28] Rosen I Gary, Luczak Susan E, and Weiss Jordan. Blind deconvolution for distributed parameter systems with unbounded input and output and determining blood alcohol concentration from transdermal biosensor data. Applied Mathematics and Computation, 231:357–376, 2014. URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3972634/, doi: 10.1016/j.amc.2013.12.099.
[29] Schultz MH. Spline Analysis. Prentice-Hall Series in Automatic Computation. Pearson Education, Limited, 1972. URL: https://books.google.com/books?id=AdRQAAAAMAAJ.
[30] Schwab Christoph and Gittelson Claude Jeffrey. Sparse tensor discretizations of high-dimensional parametric and stochastic PDEs. Acta Numerica, 20:291–467, 2011. URL: https://www.cambridge.org/core/journals/acta-numerica/article/sparse-tensor-discretizations-of-highdimensional-parametric-and-stochastic-pdes/A46BD443A2D1176B448132A271057DCC, doi: 10.1017/S0962492911000055.
[31] Sirlanci Melike, Luczak Susan, and Rosen I Gary. Approximation and convergence in the estimation of random parameters in linear holomorphic semigroups generated by regularly dissipative operators. In American Control Conference (ACC), 2017, pages 3171–3176. IEEE, 2017. URL: https://www.researchgate.net/publication/318333926_Approximation_and_convergence_in_the_estimation_of_random_parameters_in_linear_holomorphic_semigroups_generated_by_regularly_dissipative_operators, doi: 10.23919/ACC.2017.7963435.
[32] Sirlanci Melike, Luczak Susan E., Fairbairn Catharine E., Kang Dayheon, Pan Ruoxi, Yu Xin, and Rosen I Gary. Estimating the distribution of random parameters in a diffusion equation forward model for a transdermal alcohol biosensor. Automatica, 2018. to appear, arXiv:1808.04058. URL: https://arxiv.org/abs/1808.04058.
[33] Sirlanci Melike, Rosen I Gary, Luczak Susan E., Fairbairn Catharine E., Bresin Konrad, and Kang Dayheon. Deconvolving the input to random abstract parabolic systems; a population model-based approach to estimating blood/breath alcohol concentration from transdermal alcohol biosensor data. Inverse Problems, 34(12), 2018. arXiv:1807.05088v1. URL: http://iopscience.iop.org/article/10.1088/1361-6420/aae791/pdf.
[34] Stuart AM. Inverse problems: A Bayesian perspective. Acta Numerica, pages 451–559, 2010. URL: https://www.cambridge.org/core/journals/acta-numerica/article/inverse-problems-a-bayesian-perspective/587A3A0D480A1A7C2B1B284BCEDF7E23, doi: 10.1017/S0962492910000061.
[35] Swift Robert M. Transdermal measurement of alcohol consumption. Addiction, 88(8):1037–1039, 1993. doi: 10.1111/j.1360-0443.1993.tb02122.x.
[36] Swift Robert M. Transdermal alcohol measurement for estimation of blood alcohol concentration. Alcoholism: Clinical and Experimental Research, 24(4):422–423, 2000. doi: 10.1111/j.1530-0277.2000.tb02006.x.
[37] Swift Robert M. Direct measurement of alcohol and its metabolites. Addiction, 98:73–80, 2003. doi: 10.1046/j.1359-6357.2003.00605.x.
[38] Swift Robert M., Martin Christopher S, Swette Larry, LaConti Anthony, and Kackley Nancy. Studies on a wearable, electronic, transdermal alcohol sensor. Alcoholism: Clinical and Experimental Research, 16(4):721–725, 1992. URL: https://www.ncbi.nlm.nih.gov/pubmed/1530135, doi: 10.1111/j.1530-0277.1992.tb00668.x.
[39] Tanabe H. Equations of Evolution. Monographs and Studies in Mathematics. Pitman, 1979. URL: https://books.google.com/books?id=Dn6zAAAAIAAJ.

[R1] [1].Banks HT, Burns JA, and Cliff EM. Parameter estimation and identification for systems with delays. SIAM Journal on Control and Optimization, 19(6):791–828, 1981. URL: 10.1137/0319051, arXiv: 10.1137/0319051, doi: 10.1137/0319051. [DOI] [Google Scholar]

[R2] [2].Banks HT, Flores KB, Rosen IG, Rutter EM, Sirlanci Melike, and Thompson Clayton. The prohorov metric framework and aggregate data inverse problems for random pdes. Communications in Applied Analysis, 22(3):415–446, 2018. URL: https://acadsol.eu/en/articles/22/3/6.pdf, doi: 10.12732/caa.v22i3.6. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R3] [3].Banks HT and Ito K. A unified framework for approximation in inverse problems for distributed parameter systems. Control Theory Advanced Technology, 4(1):73–90, 1988. URL: https://apps.dtic.mil/dtic/tr/fulltext/u2/a193780.pdf. [Google Scholar]

[R4] [4].Banks HT, Kareiva P, and Lamm PK. Estimation techniques for transport equations In Mathematics in Biology and Medicine, pages 428–438. Springer, 1985. URL: https://link.springer.com/chapter/10.1007/978-3-642-93287-8_58. [Google Scholar]

[R5] [5].Banks HT and Kunisch Karl. Estimation Techniques for Distributed Parameter Systems. Springer Science & Business Media, 2012. URL: https://www.springer.com/us/book/9780817634339. [Google Scholar]

[R6] [6].Banks HT and Lamm PK. Estimation of variable coefficients in parabolic distributed systems. IEEE Transactions on Automatic Control, 30(4):386–398, 1985. URL: https://ieeexplore.ieee.org/document/1103955, doi:DOI: 10.1109/TAC.1985.1103955. [DOI] [Google Scholar]

[R7] [7].Banks HT and Thompson W Clayton. Least squares estimation of probability measures in the prohorov metric framework. Technical report, DTIC Document, 2012. URL: https://www.researchgate.net/publication/268353806_Least_Squares_Estimation_of_Probability_Measures_in_the_Prohorov_Metric_Framework. [Google Scholar]

[R8] [8].Bui-Thanh Tan, Burstedde Carsten, Ghattas Omar, Martin James, Stadler Georg, and Wilcox Lucas C.. Extreme-scale UQ for Bayesian inverse problems governed by PDEs. In SC12, November 10–16. URL: https://ieeexplore.ieee.org/document/6468442. [Google Scholar]

[R9] [9].Bui-Thanh Tan, Ghattas Omar, Martin James, and Stadler Georg. A computational framework for infinite-dimensional Bayesian inverse problems Part I: The linearized case, with application to global seismic inversion. SIAM J. Sci. Stat. Comp, 35(6):A2494A2523 URL: https://epubs.siam.org/doi/abs/10.1137/12089586X?journalCode=sjoce3, doi: 10.1137/12089586X. [DOI] [Google Scholar]

[R10] [10].Calvetti Daniela, Kaipio Jario P., and Somersalo Erkki. Inverse problems in the Bayesian framework. Inverse Problems, 30:1–4, 2014. URL: iopscience.iop.org/article/10.1088/0266-5611/30/11/110301/pdf, doi: 10.1088/0266-5611/30/11/110301. [DOI] [Google Scholar]

[R11] [11].Campbell Stephen L and Meyer Carl D. Generalized Inverses of Linear Transformations. SIAM, 2009. URL: http://bookstore.siam.org/cl56/. [Google Scholar]

[R12] [12].Casella George and Berger Roger L. Statistical Inference, volume 2 Duxbury, Pacific Grove, CA, 2002. URL: https://books.google.com/books/about/Statistical_Inference.html?id=0x_vAAAAMAAJ. [Google Scholar]

[R13] [13].Curtain Ruth F and Salamon Dietmar. Finite-dimensional compensators for infinite-dimensional systems with unbounded input operators. SIAM Journal on Control and Optimization, 24(4):797–816, 1986. URL: https://epubs.siam.org/doi/10.1137/0324050, doi: 10.1137/0324050. [DOI] [Google Scholar]

[14] Dai Zheng, Rosen I Gary, Wang Chunming, Barnett Nancy, and Luczak Susan E. Using drinking data and pharmacokinetic modeling to calibrate transport model and blind deconvolution based data analysis software for transdermal alcohol biosensors. Mathematical Biosciences and Engineering: MBE, 13(5):911, 2016. URL: http://www.aimsciences.org/journals/displayArticlesnew.jsp?paperID=12739.

[15] Dashti Masoumeh and Stuart Andrew M. The Bayesian approach to inverse problems. In Ghanem R et al., editors, Handbook of Uncertainty Quantification, pages 311–428. Springer International Publishing Switzerland, 2017. URL: https://www.springer.com/us/book/9783319123844.

[16] Davidian M and Giltinan D. Nonlinear Models for Repeated Measurement Data. Chapman and Hall, New York, 1995. URL: https://www.crcpress.com/Nonlinear-Models-for-Repeated-Measurement-Data/Davidian-Giltinan/p/book/9780412983412.

[17] Davidian M and Giltinan DM. Nonlinear models for repeated measurement data: An overview and update. Journal of Agricultural, Biological and Environmental Statistics, 8:387–419, 2003. URL: https://link.springer.com/article/10.1198/1085711032697.

[18] Demidenko E. Mixed Models, Theory and Applications, Second Edition. John Wiley and Sons, Hoboken, 2013. URL: https://www.wiley.com/en-us/Mixed+Models%3A+Theory+and+Applications+with+R%2C+2nd+Edition-p-9781118091579.

[19] Dumett Miguel A, Rosen I Gary, Sabat J, Shaman A, Tempelman L, Wang C, and Swift RM. Deconvolving an estimate of breath measured blood alcohol concentration from biosensor collected transdermal ethanol data. Applied Mathematics and Computation, 196(2):724–743, 2008. URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2597868/, doi: 10.1016/j.amc.2007.07.026.

[20] Gittelson Claude Jeffrey, Andreev Roman, and Schwab Christoph. Optimality of adaptive Galerkin methods for random parabolic partial differential equations. J. Computational Applied Mathematics, 263:189–201, 2014. doi: 10.1016/j.cam.2013.12.031.

[21] Kato Tosio. Perturbation Theory for Linear Operators, volume 132. Springer Science & Business Media, 2013. URL: https://www.springer.com/us/book/9783540586616.

[22] Labianca Dominick A. The chemical basis of the breathalyzer: a critical analysis. J. Chem. Educ., 67(3):259–261, 1990. URL: https://pubs.acs.org/doi/abs/10.1021/ed067p259?journalCode=jceda8, doi: 10.1021/ed067p259.

[23] Levi AFJ and Rosen I Gary. A novel formulation of the adjoint method in the optimal design of quantum electronic devices. SIAM Journal on Control and Optimization, 48(5):3191–3223, 2010. doi: 10.1137/070708330.

[24] Lions JL. Optimal Control of Systems Governed by Partial Differential Equations. Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen mit besonderer Berücksichtigung der Anwendungsgebiete. Springer-Verlag, 1971. URL: https://books.google.com/books?id=aL9tlwEACAAJ.

[25] Nyman E and Palmlöv A. The elimination of ethyl alcohol in sweat. Acta Physiologica, 74(2):155–159, 1936. URL: https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1748-1716.1936.tb01150.x, doi: 10.1111/j.1748-1716.1936.tb01150.x.

[26] Pazy A. Semigroups of Linear Operators and Applications to Partial Differential Equations. Applied Mathematical Sciences. Springer, 1983. URL: https://books.google.com/books?id=80XYPwAACAAJ.

[27] Pritchard Anthony J and Salamon Dietmar. The linear quadratic control problem for infinite dimensional systems with unbounded input and output operators. SIAM Journal on Control and Optimization, 25(1):121–144, 1987. URL: https://epubs.siam.org/doi/abs/10.1137/0325009, doi: 10.1137/0325009.

[28] Rosen I Gary, Luczak Susan E, and Weiss Jordan. Blind deconvolution for distributed parameter systems with unbounded input and output and determining blood alcohol concentration from transdermal biosensor data. Applied Mathematics and Computation, 231:357–376, 2014. URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3972634/, doi: 10.1016/j.amc.2013.12.099.

[29] Schultz MH. Spline Analysis. Prentice-Hall Series in Automatic Computation. Pearson Education, Limited, 1972. URL: https://books.google.com/books?id=AdRQAAAAMAAJ.

[30] Schwab Christoph and Gittelson Claude Jeffrey. Sparse tensor discretizations of high-dimensional parametric and stochastic PDEs. Acta Numerica, 20:291–467, 2011. URL: https://www.cambridge.org/core/journals/acta-numerica/article/sparse-tensor-discretizations-of-highdimensional-parametric-and-stochastic-pdes/A46BD443A2D1176B448132A271057DCC, doi: 10.1017/S0962492911000055.

[31] Sirlanci Melike, Luczak Susan, and Rosen I Gary. Approximation and convergence in the estimation of random parameters in linear holomorphic semigroups generated by regularly dissipative operators. In American Control Conference (ACC), 2017, pages 3171–3176. IEEE, 2017. URL: https://www.researchgate.net/publication/318333926_Approximation_and_convergence_in_the_estimation_of_random_parameters_in_linear_holomorphic_semigroups_generated_by_regularly_dissipative_operators, doi: 10.23919/ACC.2017.7963435.

[32] Sirlanci Melike, Luczak Susan E., Fairbairn Catharine E., Kang Dayheon, Pan Ruoxi, Yu Xin, and Rosen I Gary. Estimating the distribution of random parameters in a diffusion equation forward model for a transdermal alcohol biosensor. Automatica, 2018. To appear, arXiv:1808.04058. URL: https://arxiv.org/abs/1808.04058.

[33] Sirlanci Melike, Rosen I Gary, Luczak Susan E., Fairbairn Catharine E., Bresin Konrad, and Kang Dayheon. Deconvolving the input to random abstract parabolic systems; a population model-based approach to estimating blood/breath alcohol concentration from transdermal alcohol biosensor data. Inverse Problems, 34(12), 2018. arXiv:1807.05088v1. URL: http://iopscience.iop.org/article/10.1088/1361-6420/aae791/pdf.

[34] Stuart AM. Inverse problems: A Bayesian perspective. Acta Numerica, pages 451–559, 2010. URL: https://www.cambridge.org/core/journals/acta-numerica/article/inverse-problems-a-bayesian-perspective/587A3A0D480A1A7C2B1B284BCEDF7E23, doi: 10.1017/S0962492910000061.

[35] Swift Robert M. Transdermal measurement of alcohol consumption. Addiction, 88(8):1037–1039, 1993. doi: 10.1111/j.1360-0443.1993.tb02122.x.

[36] Swift Robert M. Transdermal alcohol measurement for estimation of blood alcohol concentration. Alcoholism: Clinical and Experimental Research, 24(4):422–423, 2000. doi: 10.1111/j.1530-0277.2000.tb02006.x.

[37] Swift Robert M. Direct measurement of alcohol and its metabolites. Addiction, 98:73–80, 2003. doi: 10.1046/j.1359-6357.2003.00605.x.

[38] Swift Robert M., Martin Christopher S, Swette Larry, LaConti Anthony, and Kackley Nancy. Studies on a wearable, electronic, transdermal alcohol sensor. Alcoholism: Clinical and Experimental Research, 16(4):721–725, 1992. URL: https://www.ncbi.nlm.nih.gov/pubmed/1530135, doi: 10.1111/j.1530-0277.1992.tb00668.x.

[39] Tanabe H. Equations of Evolution. Monographs and Studies in Mathematics. Pitman, 1979. URL: https://books.google.com/books?id=Dn6zAAAAIAAJ.
