Abstract
A finite dimensional abstract approximation and convergence theory is developed for estimation of the distribution of random parameters in infinite dimensional discrete time linear systems with dynamics described by regularly dissipative operators and involving, in general, unbounded input and output operators. By taking expectations, the system is re-cast as an equivalent abstract parabolic system in a Gelfand triple of Bochner spaces wherein the random parameters become new space-like variables. Estimating their distribution is now analogous to estimating a spatially varying coefficient in a standard deterministic parabolic system. The estimation problems are approximated by a sequence of finite dimensional problems. Convergence is established using a state space-varying version of the Trotter-Kato semigroup approximation theorem. Numerical results for a number of examples involving the estimation of exponential families of densities for random parameters in a diffusion equation with boundary input and output are presented and discussed.
Keywords: Distribution estimation, Random parameters, Distributed parameter systems, Abstract parabolic systems, Regularly dissipative operators
1. Introduction
The work we report on here was motivated by a compound inverse or blind deconvolution problem involving the interpretation of data from a transdermal alcohol biosensor. The observation (dating back to the 1930s [25, 35, 36, 37, 38]) that ethanol is highly miscible and finds its way into all the water in the body, and in particular, sweat, has, in the past two decades, led to the development of technology to measure the amount of ethanol excreted from the body transdermally (i.e., through the skin) through perspiration and to then use it to quantitatively assess intoxication level. The basis for the measurement is an oxidation-reduction (redox) reaction that produces four electrons for each ethanol molecule oxidized. This results in a continuous current whose level is proportional to the amount of ethanol evaporating from the surface of the skin beneath the sensor. Now while these devices have been available and in use, both experimentally and commercially, for a number of years, they have been used primarily as abstinence monitors because transdermal alcohol level or concentration (TAC) data cannot consistently be converted to breath and blood alcohol concentrations (BrAC/BAC) across individuals, devices, and environmental conditions. (BAC and BrAC are currently, and historically have been, the standard measures of intoxication among alcohol researchers and clinicians, as well as in the courts.) Indeed, unlike a breath analyzer, which relies on a relatively simple model from basic chemistry (i.e., Henry's Law) for the exchange of gases between circulating pulmonary blood and alveolar air (see, for example, [22]) that has been found to be reasonably robust across the population, the transport and filtering of alcohol by the skin is physiologically more complex and is affected by a number of factors that differ across individuals (e.g., skin layer thickness, porosity and tortuosity, etc.) and even drinking episodes within individuals (e.g., body and ambient temperature, skin hydration, vasodilation). The challenge in making these devices practicable is to develop a means to reliably convert biosensor measured TAC into BAC or BrAC.
In our earlier work ([14, 19, 28]) we have taken a strictly deterministic approach to converting TAC to either BAC or BrAC. We fit first principles physics-based models in the form of a distributed parameter (diffusion) system with unbounded input and output, and used individual calibration data to capture the dynamics of the forward process - the propagation of alcohol from the blood, through the skin, and its measurement by the sensor (i.e. the forward model) by estimating the parameters (diffusivity, input/output gain, propagation inertia, etc.) that appear in the model via nonlinear least squares. Then in a second phase of processing, we use the fit model to deconvolve BAC or BrAC from the TAC signal measured by the biosensor in the field. However, because of the challenges described above, this approach was not entirely satisfying. Indeed, while it was possible to fit the models quite well to any particular drinking episode, we observed significant variance in the values of the parameters across different individuals and across different drinking episodes for the same individual. Consequently, the fit models did not yield the desired level of accuracy when they were used to deconvolve BAC or BrAC from TAC for a drinking episode that they were not specifically trained on.
To deal with this problem we have been looking at the idea of fitting a population forward model (having BAC or BrAC as input and TAC as output) in the form of a random partial differential equation, to data from multiple drinking episodes and multiple individuals and then using the population model to solve the deconvolution problem. Fitting a population model of this form implies that rather than estimate particular values for the parameters, we treat the parameters as random variables and estimate their distributions. In this way, it will become possible to produce not only an estimate for the BAC or BrAC, but also some form of credible bands to go along with it providing a quantitative estimate of the level of uncertainty in the estimate.
The basic underlying assumption in such an approach is that our first principles physics/physiological based model, in essence, describes the dynamics common to the entire population (population interpreted broadly here to include not only all individuals, but also all devices, environmental conditions, and in effect, all ethanol molecules), and that all unmodeled sources of uncertainty (primarily due to variations in physiology, hardware, and the environment) observed in individual data can be attributed to random effects. Moreover, we assume that what we observe in any individual data set is the combination or average of these random effects. Thus, this approach is realized by letting the parameters in the PDE model be random variables, the distributions of which are to be estimated based on aggregate population data.
In this paper, we develop an abstract approximation framework and convergence theory for formulating and solving just such an estimation problem. In addition to the theory, we have also included a number of examples and numerical results. However, we do not discuss here the application of these ideas to either the alcohol biosensor problem discussed above or even the deconvolution problem. Those results are presented elsewhere ([31, 32, 33]). In our treatment here, we are strictly concerned with the problem of estimating the distributions of random parameters in a forward model from a particular class of abstract linear infinite dimensional systems for which the input is known and observations of the output for a sampling of members of the target population are available. That is, we are referring to the problem of fitting the population model.
The class of systems we consider here consists of those governed by abstract parabolic or hyperbolic operators with damping formulated in a Gelfand triple setting together with input and observations on the boundary of the domain. These types of operators are sometimes referred to as being regularly dissipative, and can typically be shown to generate holomorphic or analytic semigroups. We formulate the estimation problem in much the same way as it is formulated in standard linear regression. That is, each data point is assumed to be an observation of the mean population behavior plus random error. We then formulate the estimation problem as an optimization problem over the space of feasible distributions for the random parameters. The objective of the optimization problem is to minimize prediction error in the form of the difference between the observed output signal and the expectation of the output of the model. Next, we consider a sequence of approximating estimation problems in each of which the infinite dimensional system is replaced by a finite dimensional approximating system. Finally, we demonstrate that under appropriate (and readily verifiable) assumptions, the solutions to the approximating estimation problems converge to a solution to the original estimation problem with the infinite dimensional state. These convergence results are formulated in a functional analytic or operator theoretic setting and are based on ideas and results from linear semigroup theory.
Our general approach relies heavily on three relatively recent papers: 1) Banks and Thompson’s [7] framework for the estimation of probability measures in random abstract evolution equations and the convergence of finite dimensional approximations in the Prohorov metric, 2) a more recent and enhanced version of the previous paper, [2], and 3) Gittelson, Andreev, and Schwab’s [20] theory for random abstract parabolic partial differential equations with dynamics defined in terms of coercive sesquilinear forms. While our effort here is similar in spirit and takes its cue from the treatment in [2] and [7], it is somewhat different in that we are forced to assume that the probability measures that describe the distribution of our random parameters can be defined in terms of a joint density function; that is, that the random parameters are jointly absolutely continuous.
The approach in [20] is novel in the way that it treats the random parameters in the PDE as another space-like independent variable. This is done by appropriately defining corresponding Bochner spaces in which the weak formulation of the problem is stated and shown to be well-posed. In fact, it turns out that the random parameter dependent regularly dissipative operators that determine the underlying PDE are regularly dissipative when embedded in these Bochner spaces. Consequently, we are able to use linear semigroup theory to develop our approximation framework in much the same way as we have in our earlier deterministic treatments. In this way, finite dimensional approximation is handled in much the same way that it is for the standard deterministic space variables, and the estimation of the distribution of the random parameters effectively becomes analogous to the problem of estimating a variable coefficient in a deterministic PDE, a problem which has been studied extensively over the last thirty years ([4] and [6]).
We use the framework in [20] together with generation and approximation results from linear semigroup theory (i.e., the Hille-Yosida-Phillips theorem and a version of the Trotter-Kato approximation theorem) to establish that the sufficient conditions for a Banks-Thompson-like convergence result are satisfied. These theoretical results allow us to develop rigorously established convergent computational algorithms that yield numerical approximations to the desired distributions. Moreover, the solutions in the Bochner spaces and their finite dimensional approximations directly capture the explicit dependence of the state and output (and eventually the deconvolved input) on the random parameters. Using this together with the estimated distributions for the random parameters, it becomes straightforward to directly identify credible intervals for the output without having to re-solve the PDE many times, as would be necessary if one were attempting to identify these credible intervals by naive sampling.
An outline of the remainder of the paper is as follows. In Section (2) we formally develop the estimation problem, reformulate it as a nonlinear least squares optimization problem and establish the existence of solutions. In Section (3) we discuss infinite dimensional systems described by regularly dissipative operators involving unbounded input and output (this is typically the case for a PDE with input and output on the boundary). In Section (4) we discuss the framework in [20] for treating systems of the form discussed in Section (3) but now involving random parameters. Our approximation and convergence results are presented in Section (5) and a discussion of examples and our numerical results are in Section (6). Section (7) has a few concluding remarks regarding where we plan to go next with this line of research.
In our discussions to follow we will on occasion use the notation E[X∥f], E[X∥F], or E[X∥π] to denote the expectation of the random variable X with respect to the probability density function f, the cumulative distribution function F, or the probability measure π, respectively. We use the “double bar” as opposed to a “single bar” to distinguish this notation from that for conditional expectation.
2. Estimation of Random Discrete Time Dynamical Systems
We consider the family of discrete or sampled time initial value problems that are set in an, in general, infinite dimensional Hilbert state space, , given by
(2.1) |
(2.2) |
where and for j = 0, …, ni and i = 1, 2, …, m, ui = {ui,j} is an external input or control with , and tj = jτ, with τ > 0 the length of the sampling interval, describing the dynamics of a process common to the entire population. In addition, we assume that we can observe some function of the solutions of (2.1)-(2.2), xj,i, as given by the output equation
(2.3) |
where .
In equations (2.1)-(2.3), we assume q ∈ Q, where Q is the set of admissible parameters (a subset of Euclidean space endowed with Lebesgue measure), and the values of the parameters are specific to each individual in the population. Therefore, assuming that the parameters, q, are samples from a random vector , the objective is to estimate their (joint) distribution based on the aggregate data sampled from the population. For this purpose, we assume that the distribution of these random vectors is described by the joint pdf , where represents a set of feasible pdfs with support in Q.
There are a number of ways to formulate the statistical model that will be used as the basis for the estimation of the distribution of the random parameters. One approach is to treat (2.1)- (2.3) as an, in general, nonlinear mixed effects model (see, for example, [16, 17, 18, 32]) wherein randomness in the parameters, q, are used to quantify uncertainty between subjects, and randomness in the output or measurements, yj,i given in (2.3) is intended to capture uncertainty within individual subjects. In this case we assume that the observed data points are of the form
where εj,i, j = 0, …, ni, i = 1, …, m, representing measurement noise, are assumed to be independent across subjects (i.e., with respect to i), conditionally independent within subjects (i.e., with respect to j), identically distributed with mean 0 and known common variance σ2, and with εj,i ~ φ, j = 0, …, ni, i = 1, …, m. In this case, for example, using conditional probability and the total probability formula, a likelihood function could be defined formally as
Once one deals with a number of computational issues, specifically, the discretization or parameterization of f0, finite dimensional approximation of the in general infinite dimensional state equation (2.1), the efficient evaluation of a potentially high dimensional integral, the loss of precision and underflow issues due to the fact that the evaluation of requires the computation of products of small numbers, etc., one could then seek a maximum likelihood estimator for f0 by maximizing or, more typically, an expression involving to avoid having to deal with the products. Under appropriate regularity assumptions on φ, f0, and the system (2.1)- (2.3), one way to do this might be via a gradient based search. Another might be via stochastic optimization. One could also treat direct observations of as missing data and then use the iterative E-M algorithm to find the MLE (see, for example, [12]).
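The underflow issue mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, a normal noise density φ and a hypothetical set of residuals; the function names are our own and are not part of the framework developed in this paper. The product of the density values underflows to zero in double precision, whereas the sum of their logarithms remains finite.

```python
import math

def normal_pdf(x, sigma):
    """Density of a mean-zero normal with standard deviation sigma (assumed form of phi)."""
    z = x / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def likelihood_product(residuals, sigma):
    """Naive likelihood: a product of many small density values."""
    prod = 1.0
    for r in residuals:
        prod *= normal_pdf(r, sigma)
    return prod

def log_likelihood(residuals, sigma):
    """Same quantity on the log scale: a sum instead of a product."""
    const = -math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(const - 0.5 * (r / sigma) ** 2 for r in residuals)

# Hypothetical residuals: 2000 observations, each three standard deviations out.
residuals = [3.0] * 2000
sigma = 1.0

naive = likelihood_product(residuals, sigma)   # underflows to 0.0
stable = log_likelihood(residuals, sigma)      # finite, large negative number
print(naive, stable)
```

In a mixed effects setting the likelihood also involves an integral over q, so in practice a log-sum-exp style evaluation would be needed as well; that step is not shown here.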
Alternatively, one could use the likelihood function defined above and take a Bayesian approach (see, for example, [8, 9, 10, 15, 33, 34]). One way of doing this would be to assume f0 = f0(·; ρ) has been parameterized by a parameter vector , where denotes a parameter set. Then assume a prior on ρ and apply Bayes to obtain the posterior as
where Z is the normalizing constant given by
Still another Bayesian approach could be used to estimate the distribution of directly where now the posterior for serves as the estimator for f0. In this case we assume that εj,i, j = 0, …, ni, i = 1, …, m are simply independent both across and within subjects, identically distributed with mean 0 and known common variance σ2, and with εj,i ~ φ, j = 0, …, ni, i = 1, …, m. If we now let denote the prior for , then Bayes yields
where the normalizing constant Z is now given by
Both of these Bayesian approaches also have some of the same computational issues as the MLE approach when some sort of MCMC technique such as Metropolis-Hastings or the Gibbs Sampler is used to sample the posterior distribution.
In our study here, however, we take a statistically somewhat less sophisticated approach. We consider the naive pooled data estimator. We do this for a number of reasons: 1) our primary focus here is the finite dimensional approximation of the infinite dimensional state equation and the convergence of the corresponding estimators, and the computational challenges described above would only serve to confound our findings; 2) the naive pooled estimator meshes especially well with the approach we take in dealing with the randomness in the family of PDEs (i.e. abstract parabolic, and eventually, damped hyperbolic) of particular interest to us here in the context of the alcohol biosensor problem described earlier; and 3) a reasonable argument could be made that the data we observe are best described as pooled or averaged. We note that it in fact turns out that the approximation and convergence results we present here are highly relevant to the MLE and Bayesian approaches described in the previous paragraphs; we are currently investigating that and we will report on our findings and results in those cases elsewhere. Finally, it is interesting to note that in the Bayesian approach, if the prior f0 and the distribution of the measurement noise process, εj,i, as described by the density φ are both assumed to be normal, then the naive pooled data estimator we find here is in fact the Maximum A-Posteriori, or MAP, estimator.
In light of this, our statistical model assumes that the observed data points can be represented by the mean output of the model plus random error. Thus, we assume that we have random observations of the process given by a random array with components
(2.4) |
where in (2.4), εj,i, j = 0, …, ni, i = 1, …, m, represent measurement noise and are assumed to be independent and identically distributed with mean 0 and known common variance σ2. For , define
(2.5) |
the mean behavior at time tj, j = 0, …, ni, if .
The estimation problem is to estimate the pdf, f0, using a least squares approach
(2.6) |
where the vi(tj; f) are as given in (2.5).
Solving the optimization problem given in (2.6) will typically require finite dimensional approximation of the dynamical system given in (2.1)-(2.2), and the parameterization of the feasible set of pdfs, . Indeed, in our treatment here, we assume that the set of pdfs, , is parameterized by a vector of parameters θ ∈ Θ, where is a set of feasible parameters. In this case, we denote the set of pdfs by .
We approximate the estimation problem given in (2.6) by a sequence of finite dimensional estimation problems by replacing vi(tj; f) with a finite dimensional approximation . We obtain
(2.7) |
We note that ultimately, we will want to dispense with the assumption that has been parametrized by the finite dimensional parameter θ ∈ Θ and actually estimate the shape of f directly. In this case, will also have to be approximated or discretized with the level, or dimension of the parameterization having to grow in order to establish convergence. We are currently studying this extension to the results presented here and will discuss our findings elsewhere. Analogous to theorem 5.1 in [7], we have the following convergence result for the .
Theorem 2.1. Let be compact. If
A. The maps on Θ, θ ↦ f (q; θ), for almost every q ∈ Q, and θ ↦ JN (f (·; θ); V), for all N and are continuous,
B. For any sequence of densities with limN→∞ fN(q) = f (q), a.e. q ∈ Q, for some , we have converging to vi(tj; f) for all i ∈ {1, …, m} and j ∈ {0, …, ni} as N → ∞, and
C. The vi(tj; f) and are uniformly bounded for all j ∈ {0, …, ni}, i ∈ {1, …, m} and ,
then it will follow that there exist solutions to the estimation problems over , given in (2.7), and there exists a subsequence of the that converges to a solution of the estimation problem over given in (2.6).
Proof. Finding the solution to the problem in (2.7) is equivalent to finding the parameters θ ∈ Θ such that JN (f; V) is minimized. Since Θ is a compact set and the map θ → JN (f(·; θ); V) is continuous for all N by (A), a solution to the estimation problem (2.7) over exists.
Next, let be any sequence with limN→∞ fN(q) = f (q), a.e. q ∈ Q for some and consider that
for some M > 0, since vi(tj; f) and are uniformly bounded for all i ∈ {1, …, m} and j ∈ {0, …, ni} (by assumption (C)), and . Then, by (B), we obtain
(2.8) |
as N → ∞. On the other hand, since , where , is the minimizer of JN(f; V), we have
(2.9) |
for all and N = 1, 2, …. Since , compact, there exists a subsequence with as k → ∞. Thus, taking the limit as k → ∞ in (2.9) with N replaced by Nk, and using (2.8) (with , all k =1, 2, … when the limit is taken on the right hand side of (2.9)), we obtain
(2.10) |
for all , where . Thus, (2.10) implies that is a solution of the estimation problem given in (2.6) over . □
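To make the pooled least squares formulation concrete, the following is a minimal sketch under several simplifying assumptions that are ours and not part of the theory above: a single subject (m = 1), a scalar toy state equation with decay rate q, a one-parameter family of uniform densities f(·; θ) on [θ − 0.5, θ + 0.5], a midpoint rule for the expectation, noise-free synthetic data, and a grid search over the compact set Θ = [1, 3]. All names (model_output, mean_output, cost, estimate) are hypothetical.

```python
import math

TAU = 0.5                              # sampling interval (assumed value)
INPUTS = [1.0] * 10 + [0.0] * 10       # hypothetical known input sequence u_j
HALF_WIDTH = 0.5                       # half-width of the assumed uniform density

def model_output(q, inputs=INPUTS, tau=TAU):
    """Outputs y_0, ..., y_n of the toy system x_{j+1} = e^{-q tau} x_j + ((1 - e^{-q tau})/q) u_j, x_0 = 0, y_j = x_j."""
    a_hat = math.exp(-q * tau)
    b_hat = (1.0 - a_hat) / q
    x, ys = 0.0, [0.0]
    for u in inputs:
        x = a_hat * x + b_hat * u
        ys.append(x)
    return ys

def mean_output(theta, n_quad=50):
    """Midpoint-rule approximation of the mean output when q ~ Uniform[theta - w, theta + w]."""
    lo = theta - HALF_WIDTH
    dq = 2.0 * HALF_WIDTH / n_quad
    total = [0.0] * (len(INPUTS) + 1)
    for k in range(n_quad):
        ys = model_output(lo + (k + 0.5) * dq)
        total = [t + y for t, y in zip(total, ys)]
    return [t / n_quad for t in total]

def cost(theta, data):
    """Sum of squared differences between the data and the mean model output."""
    return sum((v_obs - v) ** 2 for v_obs, v in zip(data, mean_output(theta)))

def estimate(data, theta_grid):
    """Grid search for the least squares minimizer over a compact parameter set."""
    return min(theta_grid, key=lambda th: cost(th, data))

theta_true = 2.0
data = mean_output(theta_true)                       # noise-free synthetic observations
theta_grid = [1.0 + k / 100.0 for k in range(201)]   # Theta = [1, 3]
theta_hat = estimate(data, theta_grid)
print(theta_hat)
```

In this toy setting the state is already finite dimensional, so the sketch illustrates only the structure of the optimization problem and not the state approximation that is the subject of the convergence result.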
3. Abstract Parabolic Systems with Unbounded Input and Output
Let V and H be in general complex (but in many instances, real would suffice) Hilbert spaces with V ↪ H, i.e. V is continuously and densely embedded in H. By identifying H with its dual H*, we obtain the Gelfand triple V ↪ H ↪ V*. Let < ·, · >H denote the H inner product and ∣·∣H, ∥·∥V denote norms on H and V, respectively, and assume that (Q, dQ) is a compact metric space contained in Euclidean space endowed with Lebesgue measure. In what follows all multi-dimensional vectors, whether in Euclidean or some abstract space, are assumed to be column vectors, unless explicitly stated otherwise. For q ∈ Q, let be a sesquilinear form that has the following properties
(i) Boundedness There exists a constant α0 > 0 such that ∣a(q; ψ1, ψ2)∣ ≤ α0∥ψ1∥V∥ψ2∥V, ψ1, ψ2 ∈ V, q ∈ Q,
(ii) Coercivity There exist constants λ0 ∈ ℝ and μ0 > 0 such that a(q; ψ, ψ) + λ0∣ψ∣H² ≥ μ0∥ψ∥V², ψ ∈ V, q ∈ Q,
(iii) Measurability For all ψ1, ψ2 ∈ V, the map q ↦ a(q; ψ1, ψ2) is measurable on Q with respect to all measures defined in terms of the densities in , where is the set of feasible parameters.
Assume further that b(q), c(q) are respectively μ and ν dimensional row vectors in V* with the maps q ↦< b(q), ψ >V*,V and q ↦< c(q), ψ >V*, V measurable on Q for ψ ∈ V, where < ·, · >V*,V denotes the duality pairing between V and V*. We consider the system which is written in weak form as
(3.1) |
where T > 0, and φ(t)(s) = φ(t − s)χ[0, T](s), s ∈ [0, T]. For , it can be shown that (3.1) has a unique solution (see [24, 39]) which depends continuously on . It follows that .
For q ∈ Q, under the assumptions (i),(ii), the sesquilinear form a(q; ·, ·) defines a bounded linear operator A(q) : V → V* by < A(q)ψ1, ψ2 >V*,V = −a(q; ψ1, ψ2) where ψ1, ψ2 ∈ V. It can be shown further that (see [3, 5, 39]) A(q) restricted to the set Dom(A(q)) = {ϕ ∈ V : A(q)ϕ ∈ H} is the infinitesimal generator of a holomorphic or analytic semigroup of bounded linear operators on H. Moreover, this semigroup can be restricted to be a holomorphic semigroup on V and extended to a holomorphic semigroup on V* by appropriately restricting or extending the domain, Dom(A(q)), of the operator A(q) (see, for example, [3] and [39]).
For q ∈ Q, define the operators and by , for , φ ∈ V, and ψ ∈ L2([0, T], V), and rewrite the system in (3.1) as
(3.2) |
The mild solution of (3.2) is given by the variation of constants formula as
(3.3) |
Moreover, since the semigroup {eA(q)t : t ≥ 0} is analytic it follows that
(3.4) |
is well defined.
3.1. The Discrete Time Formulation
Now let τ > 0 be a sampling time and consider zero-order hold inputs of the form u(t) = uj, t ∈ [jτ, (j + 1)τ), j = 0, 1, 2, …. Setting xj = x(jτ), for j = 0, 1, 2, …, (3.3) and (3.4) yield that
(3.5) |
where now we let x0 ∈ V. Here, again by the properties of the analytic semigroup (see [26, 39]), we have {eA(q)t : t ≥ 0}, xj ∈ V, and . The operator appearing in (3.5) is defined by recalling (3.4). We set
(3.6) |
where x(j) in (3.6) denotes the function in L2(0, T, V) given by
(3.7) |
Now, in light of the coercivity assumption, Assumption (ii), by making the change of variables z(t) = e−λ0tx(t) and v(t) = e−λ0tu(t), without loss of generality we may assume that the operator A(q) is invertible with bounded inverse. Thus we have that . It follows that the recurrence given in (3.5) is a recurrence in V with and . Thus it now becomes possible to allow the discrete time output operator defined in (3.6) and (3.7), if so desired, to take on the much simpler form . In what follows we shall assume that the output operator takes this simpler form.
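The passage from continuous to sampled time can be illustrated in the simplest possible setting. The sketch below assumes a scalar system dx/dt = a x + b u with a ≠ 0 in place of the operators A(q) and B(q), and uses the standard zero-order hold expressions e^{aτ} and a⁻¹(e^{aτ} − 1)b for the discrete time state transition and input coefficients; the names a_hat and b_hat are hypothetical. The discrete recurrence is compared at the sampling instants with a Runge-Kutta integration of the continuous time equation.

```python
import math

def zoh_discretize(a, b, tau):
    """Discrete time coefficients for dx/dt = a*x + b*u under zero-order hold (requires a != 0)."""
    a_hat = math.exp(a * tau)
    b_hat = (a_hat - 1.0) / a * b      # a^{-1} (e^{a tau} - 1) b
    return a_hat, b_hat

def simulate_discrete(a, b, tau, inputs, x0=0.0):
    """States x_0, x_1, ... of the sampled recurrence x_{j+1} = a_hat x_j + b_hat u_j."""
    a_hat, b_hat = zoh_discretize(a, b, tau)
    xs = [x0]
    for u in inputs:
        xs.append(a_hat * xs[-1] + b_hat * u)
    return xs

def simulate_rk4(a, b, tau, inputs, x0=0.0, substeps=100):
    """Reference solution: classical RK4 on each sampling interval with the input held constant."""
    h = tau / substeps
    xs, x = [x0], x0
    for u in inputs:
        for _ in range(substeps):
            k1 = a * x + b * u
            k2 = a * (x + 0.5 * h * k1) + b * u
            k3 = a * (x + 0.5 * h * k2) + b * u
            k4 = a * (x + h * k3) + b * u
            x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        xs.append(x)
    return xs

inputs = [1.0, 0.5, 0.0, 2.0, 0.0]     # hypothetical piecewise constant input
exact = simulate_discrete(-2.0, 1.5, 0.25, inputs)
reference = simulate_rk4(-2.0, 1.5, 0.25, inputs)
max_gap = max(abs(p - r) for p, r in zip(exact, reference))
print(max_gap)
```

The requirement a ≠ 0 in the sketch plays the role of the bounded invertibility of A(q) assumed above.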
3.2. Systems with Boundary Input
Of primary interest to us here are systems of the form (3.1) or (3.2) where the input u is on the boundary of the spatial domain. The theory developed in [13] and [27] tells us how in this case to define the input operator B(q) and the notion of a mild solution upon which our approach is based. Let W be a Hilbert space which is densely and continuously embedded in H. Let and and assume that , Γ(q) is surjective and Δ(q) = A(q) on Dom(A(q)). We then consider the system with input on the boundary given by
(3.8) |
In [13], Curtain and Salamon define a solution to the system (3.8) for the case where and x0 ∈ W with Γ(q)x0 = u(0), to be a function x ∈ C([0, T]; W) ∩ C1([0, T]; H) that satisfies (3.8) at every t ∈ (0, T). The operator A(q) densely defined implies that it has an adjoint operator A(q)* : Dom(A(q)*) ⊆ H → H which is also densely defined and closed. Defining Z* to be the Hilbert space Dom(A(q)*) endowed with the graph Hilbert space norm associated with A(q)*, Z* will be continuously and densely embedded in H. So, the Gelfand triple Z* ↪ H ↪ Z is obtained where Z = Z** represents the dual space of Z*. By definition and consequently therefore, . It follows that the semigroup {eA(q)t : t ≥ 0} can be uniquely extended to a holomorphic semigroup on Z with infinitesimal generator A(q) : H ⊆ Z → Z, the extension A(q) to H defined via the duality pairing < A(q)ψ, ϕ>Z,Z*=< ψ, A(q)* ϕ >H, for ψ ∈ H, and ϕ ∈ Z* = Dom(A(q)*).
For each q ∈ Q, let be any right inverse of , and define the operator by B(q) = (Δ(q) – A(q))Γ+(q). It is not difficult to show that B(q) is well defined (i.e. that it does not depend on the particular choice of the right inverse Γ+ (q)). Then for any x0 ∈ H and , the mild solution, x ∈ C([0, T]; Z), of the initial boundary value problem in (3.8) is the Z-valued function given by
(3.9) |
It is shown in [13] that if (3.8) has a solution, then it is given by (3.9) where x ∈ C([0, T], H) ∩ H 1 ((0, T), Z) and moreover, we have that the estimate given by holds.
We note that if in fact we have that W ⊂ V, which is often the case (for example, in a one dimensional diffusion equation with either Neumann or Robin boundary input (see our examples in Section (6) below), but may not be the case if, for example, the boundary input is Dirichlet), then in the above formulation we may take Z* = V and Z = V*. In this case it will follow that and consequently that the theory presented at the beginning of Section (3), and in particular, the discrete time theory presented in Section (3.1), applies. For ease of exposition, we will assume that this is indeed the case for what follows below. We note that all the results continue to follow in the more general case where Z* = Dom(A(q)*). It then follows that and that and therefore that
and . Note that now we have
(3.10) |
and if Γ+(q) can be chosen so that , then the expression in (3.10) becomes . Then, if x0 = 0 ∈ H, yi is given by
(3.11) |
where the operator appearing in (3.11) is the gain that represents the contribution of the jth input channel to the ith output channel.
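With zero initial data, a discrete time linear recurrence of this kind yields an output that is a discrete convolution of the input with a kernel built from powers of the state transition operator. The following sketch checks this in a scalar toy setting; it assumes the kernel takes the form h_k = c â^k b̂, which is the standard finite dimensional expression and our assumption about the structure of (3.11), and all names are hypothetical.

```python
import math

def zoh_coefficients(a, b, tau):
    """Scalar zero-order hold coefficients (requires a != 0)."""
    a_hat = math.exp(a * tau)
    return a_hat, (a_hat - 1.0) / a * b

def output_by_recursion(a, b, c, tau, inputs):
    """y_i = c x_i with x_{i+1} = a_hat x_i + b_hat u_i and x_0 = 0."""
    a_hat, b_hat = zoh_coefficients(a, b, tau)
    x, ys = 0.0, [0.0]
    for u in inputs:
        x = a_hat * x + b_hat * u
        ys.append(c * x)
    return ys

def convolution_kernel(a, b, c, tau, n):
    """Kernel values h_k = c * a_hat**k * b_hat for k = 0, ..., n - 1."""
    a_hat, b_hat = zoh_coefficients(a, b, tau)
    return [c * a_hat ** k * b_hat for k in range(n)]

def output_by_convolution(h, inputs):
    """y_i = sum_{j=0}^{i-1} h_{i-j-1} u_j, with y_0 = 0 (empty sum)."""
    n = len(inputs)
    return [sum(h[i - j - 1] * inputs[j] for j in range(i)) for i in range(n + 1)]

inputs = [0.0, 1.0, 1.0, 0.5, 0.0, 0.0, 2.0]    # hypothetical input sequence
a, b, c, tau = -1.2, 0.8, 2.0, 0.3             # hypothetical scalar parameters

y_rec = output_by_recursion(a, b, c, tau, inputs)
y_conv = output_by_convolution(convolution_kernel(a, b, c, tau, len(inputs)), inputs)
max_diff = max(abs(p - r) for p, r in zip(y_rec, y_conv))
print(max_diff)
```

Viewing the forward model as a convolution is what makes recovering the input from the output a deconvolution problem, as described in the Introduction.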
4. Random Regularly Dissipative Operators and Their Associated Semigroups
In this section, we summarize the key ideas from the framework developed in [20] and [30] which are central to our approach. We assume that is a p-dimensional random vector whose support is in where for all i = 1, 2, …, p. Letting , and let for some r be closed and bounded. We assume that the distribution of can be represented by an absolutely continuous cumulative distribution function , or equivalently, by a (push forward) measure , where . Let a(·; ·, ·) be a sesquilinear form satisfying (i)-(iii) given in Section (3), where the assumed measurability is with respect to all of the measures .
Define the Bochner spaces and . The assumptions from Section (3) on the spaces V and H guarantee that the spaces , and form the Gelfand triple (see [20]) where is identified with its dual and is identified with .
For , satisfying for all i = 1, 2, …, p, and , set . Then we define the π(ρ)-averaged sesquilinear forms (note, the spaces , , and now of course depend on ρ, but in our notation here we will not explicitly show this dependence unless clarity demands it) by
(4.1) |
where and . It is not difficult to show that Assumptions (i)-(iii) imply that is a bounded and coercive sesquilinear form on . Consequently, this sesquilinear form defines a bounded linear map by which when appropriately restricted or extended is the infinitesimal generator of analytic semigroups of bounded linear operators on , and (see [3, 5, 39]). We assume that the maps q ↦< b(q), ψ(q) >V*,V and q ↦< c(q), ψ(q) >V*,V are π(ρ)-measurable for any , and that ∥b(q)∥V*, ∥c(q)∥V* are uniformly bounded for a.e. q ∈ Q. We then define and by
(4.2) |
(4.3) |
for and .
With the definitions (4.1) - (4.3) of the operators , , and , consider the abstract evolution system given by
(4.4) |
whose mild solution is given by
(4.5) |
where is the analytic semigroup generated by the operator . From (4.4) and (4.5), it follows that
(4.6) |
As in Section (3), we obtain a discrete or sampled time version of (4.4). Now let x0 ∈ V, let τ > 0 be the sampling time, and consider zero-order hold inputs of the form u(t) = uj, t ∈ [jτ, (j + 1)τ), j = 0, 1, 2, …. Setting and , j = 0, 1, 2, …, (4.5) and (4.6) yield
(4.7) |
with and , , and . Note that the operators and are bounded since is an analytic semigroup on , , and (see [3, 5, 24, 39]). If has bounded inverse, then .
It is shown in [20] and [30] that the solutions of systems (4.4) and (3.2) and (4.7) and (3.5) agree for π-a.e. q ∈ Q. It follows that
(4.8) |
and hence, from (4.8), that
(4.9) |
where in (4.8) and (4.9) denotes expectation with respect to the measure π.
5. Approximation and Convergence
In this section, we can now formally state our estimation problem and the sequence of finite dimensional approximating problems. We will also state and prove a convergence theorem.
5.1. The Estimation Problem
Assume that data of the form , has been given. Determine a compact subset of , , , which minimizes
(5.1) |
where for i = 1, 2, …, m, is given by (4.7) with , j = 0, …, n, i = 1, 2, …, m, and (4.9).
Recalling the assumption that for i ∈ {1, 2, …, p}, , let . Let , and . Then, for N = 1, 2, …, let , be such that , and let . Set , , and let be a finite dimensional subspace of . Let be a linear map defined by for any , let denote the orthogonal projection of onto , and define by .
In addition, recall that we have assumed that for ρ ∈ Ξ, the probability distributions described by π(ρ) are all absolutely continuous; that is, π(ρ) ~ f(ρ), where f(ρ) = f(·; ρ) is a joint density for the random vector .
Noting that in this formulation, is neither a subspace of nor , we define the operators on to be what are essentially the restrictions of to the spaces . More precisely, we set
(5.2) |
where , .
Define the operators and by
(5.3) |
(5.4) |
where , and .
With these definitions, we can now state the finite dimensional approximating problems.
Assume that data of the form , has been given. Determine , Ξ a compact subset of , , , which minimizes
(5.5) |
where in (5.5), for i = 1, 2, …, m, is given by (4.7) and (4.9) with , j = 0, …, ni, i = 1, 2, …, m, replaced by , replaced by
replaced by , replaced by , and replaced by . It follows that for i = 1, 2, …, m,
(5.6) |
where the operators , , and appearing in (5.6) are as defined above using (5.2)-(5.4).
In the following sections we prove that there exists a subsequence of solutions to the sequence of approximating problems that converges to the solution of our original estimation/optimization problem.
5.2. A Version of the Trotter-Kato Semigroup Approximation Theorem
Our convergence proof is based on a version of the Trotter-Kato semigroup approximation theorem ([5, 21, 26]) that does not require the approximating spaces to be subspaces of the underlying infinite dimensional state space. Banks, Burns and Cliff [1] proved just such a result but unfortunately they do not state their hypotheses in terms of resolvent convergence which is what we require here. Consequently we establish the result in its requisite form here.
Let be a Hilbert space with norm ∣ · ∣ and let be a sequence of Hilbert spaces, each equipped with norm ∣ · ∣N. Assume that for each , is a closed (finite dimensional) subspace of . Assume that the operators on , and for each , on , are in G(M, λ0) with M and λ0 independent of N; that is they are the infinitesimal generators of C0-semigroups on and , on , respectively, that are uniformly (uniformly in N) exponentially bounded. (We note that if is obtained from a bounded and coercive sesquilinear form and the are subspaces with defined as the restrictions of to , then this latter assumption is easily verified [3, 5].)
Theorem 5.1. Let , , and be Hilbert spaces as defined above. Let be an operator such that and . Let be the canonical projection of onto and define . Let on , and on . Suppose that for some λ ≥ λ0,
(5.7) |
for every , where and denote respectively the resolvent operators of and at λ. Then
(5.8) |
in , for every uniformly in t on compact t-intervals.
Proof. For ease of exposition and without loss of generality, let λ0 = 0. Then, since and are both strongly differentiable in t, we have
(5.9) |
Then, using an identity for analogous to (5.9) , we obtain
(5.10) |
Then, since
(5.11) |
(5.12) |
Equation (5.12) and (recall λ0 = 0), for any , yield
(5.13) |
By (5.7), the integrand in (5.13) converges to 0 for each fixed s and is bounded by 2M²∣u∣/λ. Therefore, by the Lebesgue Dominated Convergence Theorem, the right-hand side of (5.13) converges to 0 as N → ∞, where the convergence is uniform in t on compact t-intervals.
Letting , and using the fact that is dense in , we have that
(5.14) |
for all . Then, since , (5.7) implies that
(5.15) |
and similarly, and (5.7) imply that
(5.16) |
Combining (5.15), (5.16), and the triangle inequality we get
(5.17) |
as N → ∞. Then, because of (5.14), and again by the triangle inequality, we obtain that
(5.18) |
Letting , we have ; and since is dense in , it follows from (5.7), (5.17) and (5.18) that
for all uniformly in t on compact t-intervals. □
5.3. Application to the Density Estimation Problem
Let {ρN}, ρ ∈ Ξ be such that fN(q) → f(q), for almost every q ∈ Q, where fN(q) = f(q; ρN) and f(q) = f(q; ρ). Let , , , , , , , and be as they were defined earlier. Set and consider it to be an operator on and by extending f(·, ρ), which is defined on Q, to by setting it equal to zero on and let . Then it follows from Assumptions (i) - (iii) that is in G(M, λ0) on and is in G(M, λ0) on with M and λ0 independent of N.
In the statement of Theorem (5.1), set , , , , , and . To apply Theorem (5.1) and conclude that in this case, (5.8) holds, we need only verify (5.7). In order to do this, we require the following two additional assumptions:
(iv) There exist positive real numbers γ and δ such that for any ρ ∈ Ξ, we have 0 < γ ≤ f(q; ρ) ≤ δ < ∞ for π(ρ)-a.e. q ∈ Q.
(v) For all , there exists such that as N → ∞.
We are now able to prove the following theorem.
Theorem 5.2. Let assumptions (i) - (v) be satisfied and let {ρN}, ρ ∈ Ξ be such that fN(q) → f(q), for almost every , where fN(q) = f(q; ρN) and f(q) = f(q; ρ). Then, with the definitions above, the conditions of Theorem (5.1) (and in particular the resolvent convergence specified in (5.7)) are satisfied. Consequently, it follows that
(5.19) |
for every , uniformly in t on compact t-intervals where is the semigroup on given by and is the semigroup on and given by .
Proof. First, note that if we can show resolvent convergence for every , then since is dense in , and and are uniformly bounded, the desired resolvent convergence for every will have been demonstrated. In what follows, for any , f (·; ρ) is defined on , but it can be extended to be defined on by setting it equal to zero on . We will use this fact frequently below without further remark.
Let and define , and . Also let be as in Assumption (v) for .
Then, by the triangle inequality, we have
(5.20) |
Thus, (5.20), Assumption (v) and the continuous embedding of in imply that it is enough to show that as N → ∞. Let zN = wN – uN. Then, since ,
(5.21) |
Also, since ,
(5.22) |
where denotes the Moore-Penrose generalized inverse [11] of . We note that for , is the function in that agrees with ψ on QN and is zero on . Then, from (5.21) and (5.22), we obtain
(5.23) |
Recalling Assumptions (i) and (ii) for the form α(·; ·, ·) on V × V, let , , denote the boundedness and coercivity coefficients for the forms . Then, using boundedness, coercivity, Assumptions (iv) and (v), Young's inequality, the Cauchy-Schwarz inequality, the continuous embedding of the space V in the space H (i.e., that there exists a constant k such that ∣ · ∣H ≤ k∥ · ∥V), and (5.23), for any ε > 0, we obtain
(5.24) |
Then, letting , it follows from (5.24) that
(5.25) |
Choosing ε positive, but sufficiently small in (5.25), it follows from Assumption (v) and the hypotheses of the theorem that
(5.26) |
Thus (5.26) together with (5.20), and Assumption (v) yield resolvent convergence and the theorem is proved. □
We note that in the proof of Theorem (5.2) we were in fact able to establish resolvent convergence in the norm. Consequently we may conclude that the semigroup convergence in (5.19) is in the norm as well. Moreover, it is not difficult to establish the following corollary to Theorem (5.2).
Corollary 5.1. Under the same hypotheses of Theorem (5.2), we have
(5.27) |
for every i = 1, 2, …, m, uniformly in j, for j = 0, 1, 2,…, ni, where and are given in (5.6) and and are given in (4.7).
The assumption that the feasible parameter set Ξ is closed and bounded in , together with (5.27) in the statement of Corollary (5.1) and Theorem (2.1) then yield the following result.
Theorem 5.3. If, in addition to Assumptions (i)-(v), we assume that the maps ρ ↦ f (q; ρ) from Ξ to are continuous for π(ρ) a.e. , then each of the approximating estimation problems admits a solution, ρN*. Moreover, the sequence {ρN*} has a convergent subsequence, {ρNk*}, with ρNk* → ρ* and ρ* a solution to the original estimation problem.
It is also possible to establish a consistency result for the estimator . We require the following additional assumptions:
(a) The measurement noise {εj,i} is i.i.d. with respect to a probability space {Ω, Σ, P} with and Var[εj,i∥P] = σ²,
(b) The feasible set of parameters Ξ is compact (i.e. closed and bounded since it is finite dimensional) and has nonempty interior,
(c) For i = 1, 2, …, m, ni = n and nτ = T for some positive integer n and some T > 0, where τ is the sampling time defined in Section 3,
(d) That , for some ρ0 ∈ int{Ξ}, where for i = 1, 2, …, m, is given by (4.7) with , j = 0, …, ni, i = 1, 2, …, m, and (4.9), and
(e) For each i = 1, 2, …, m, ρ0 ∈ Ξ is the unique minimizer of Ji,0 in Ξ where
(5.28) |
and is given by (4.4) -(4.6) with .
A straightforward application of Theorem 4.2 in [7] can then be used to establish the following lemma and theorem (see [32]).
Lemma 5.1. If in addition to Assumptions (i)-(iv) and (a)-(e) above we assume that the maps ρ ↦ f(q; ρ) from Ξ to are continuous for π(ρ) a.e. , then there exists an event A ∈ Σ with P(A) = 1 such that for all ω ∈ A and J as given in (5.1) we have
as n, m → ∞ and τ → 0, with nτ = T, uniformly in ρ for ρ ∈ Ξ, where Ji is given by (5.1) and Ji,0 by (5.28).
Theorem 5.4. (Consistency of the estimator ρ*) Let ρ* ∈ Ξ be as defined in (5.1) in Section 5.1. Then under the assumptions of Lemma (5.1) the estimator is consistent for ρ0. That is, ρ* → ρ0 in probability with respect to the probability measure P, as m, n → ∞, and τ → 0 with nτ = T.
6. Examples and Numerical Results
6.1. The Adjoint Method
The approximating optimization problems are solved numerically by using an iterative gradient-based scheme. Once a basis for the space N is chosen, matrix forms of the operators , , and can be computed. The gradient of JN(ρ), with respect to the 2p + r parameters in ρ can be computed accurately (in fact exactly with the exception of finite precision arithmetic round-off) and efficiently (which is especially important if the dimension of the approximating system (5.6) and/or the number of parameters is large) using the adjoint method (see [23]). For each i = 1, …, m, set , j = 0, …, ni where KN is the number of basis elements for . Then for each i = 1, …, m, the adjoint systems are defined to be
(6.1) |
The gradient of JN at can then be computed from
(6.2) |
Using (6.1) and (6.2) to compute the gradient requires the calculation of the tensor . This can be done using the sensitivity equations. For t ≥ 0 set from which differentiation yields
(6.3) |
Then, setting , differentiating (6.3) with respect to ρ, and interchanging the order of differentiation, we obtain
(6.4) |
Combining (6.3) and (6.4), and solving the resulting system, we obtain
(6.5) |
Setting t = τ in (6.5), we obtain that .
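The idea behind an adjoint gradient computation can be sketched on a toy problem. The code below is an illustration under stated assumptions, not the adjoint system (6.1)-(6.2) itself: it assumes a scalar model x' = −ρ x + u sampled with a zero-order hold, a least-squares cost on the sampled state, and a single parameter ρ; all function names are hypothetical. One forward sweep and one backward sweep give the exact derivative of the discrete cost, which can be checked against a finite difference.

```python
import math

def discrete_ops(rho, tau):
    """Discrete-time operators of x' = -rho*x + u and their derivatives in rho."""
    e = math.exp(-rho * tau)
    a_hat = e
    b_hat = (1.0 - e) / rho
    da_hat = -tau * e
    db_hat = (rho * tau * e - (1.0 - e)) / rho ** 2
    return a_hat, b_hat, da_hat, db_hat

def forward(rho, tau, inputs):
    """States x_0, ..., x_n of x_{j+1} = a_hat*x_j + b_hat*u_j with x_0 = 0."""
    a_hat, b_hat, _, _ = discrete_ops(rho, tau)
    x = [0.0]
    for u in inputs:
        x.append(a_hat * x[-1] + b_hat * u)
    return x

def cost(rho, tau, inputs, data):
    """Least-squares misfit between the model states and the data."""
    return sum((xj - dj) ** 2 for xj, dj in zip(forward(rho, tau, inputs), data))

def adjoint_gradient(rho, tau, inputs, data):
    """dJ/drho from one forward sweep and one backward (adjoint) sweep."""
    a_hat, _, da_hat, db_hat = discrete_ops(rho, tau)
    x = forward(rho, tau, inputs)
    n = len(inputs)
    lam = [0.0] * (n + 2)  # terminal condition lam[n + 1] = 0
    for j in range(n, -1, -1):
        lam[j] = a_hat * lam[j + 1] + 2.0 * (x[j] - data[j])
    return sum(lam[j + 1] * (da_hat * x[j] + db_hat * inputs[j]) for j in range(n))

tau = 0.1
inputs = [abs(math.cos(j * tau)) for j in range(50)]
data = forward(1.5, tau, inputs)          # synthetic data with rho = 1.5
grad = adjoint_gradient(1.0, tau, inputs, data)
```

The cost of the backward sweep does not grow with the number of parameters, since each additional parameter only adds one more weighted sum over the stored adjoint states.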
To illustrate our approach, we consider the case of a one dimensional heat/diffusion equation on the interval [0, 1] with random (thermal) diffusivity and two different sets of boundary conditions. Consider the partial differential equation, boundary conditions and output operator given by
(6.6) |
(6.7) |
(6.8) |
(6.9) |
(6.10) |
(6.11) |
where 0 < η0 < 1. In the examples below, we consider the parameterized family of probability density functions defined as follows.
Definition 6.1. Let φ(q; θ) be a member of an exponential family [12], and let Φ denote its cumulative distribution function. Let θ represent a vector of parameters, and let be a bounded region to which φ will be restricted. Define ΦD(θ) = ∫D φ(q; θ)dq. Then the family of pdfs f(·; ρ) given by
f(q; ρ) = φ(q; θ) χD(q) / ΦD(θ),
where the parameters ρ include the parameters θ and parameters and to describe the domain D, is called a truncated exponential family.
It is clear that this family of densities satisfies Assumption (iv) and the hypotheses of Theorem (5.1).
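As a small illustration of a truncated family, the sketch below evaluates two univariate cases, assuming the usual construction in which the parent density is restricted to D and divided by the probability mass of D. The function names and the parameter values in the usage lines are hypothetical and chosen only for the example.

```python
import math

def normal_pdf(q, mu, sigma):
    z = (q - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(q, mu, sigma):
    return 0.5 * (1.0 + math.erf((q - mu) / (sigma * math.sqrt(2.0))))

def truncated_normal_pdf(q, a, b, mu, sigma):
    """Normal density restricted to D = [a, b] and renormalized."""
    if q < a or q > b:
        return 0.0
    mass = normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)
    return normal_pdf(q, mu, sigma) / mass

def truncated_exponential_pdf(q, big_r, theta):
    """Exponential density with rate theta restricted to D = [0, R]."""
    if q < 0.0 or q > big_r:
        return 0.0
    mass = 1.0 - math.exp(-theta * big_r)
    return theta * math.exp(-theta * q) / mass

def integrate(f, lo, hi, m=2000):
    """Midpoint rule, used here only to check the normalization."""
    h = (hi - lo) / m
    return sum(f(lo + (k + 0.5) * h) for k in range(m)) * h

normal_mass = integrate(lambda q: truncated_normal_pdf(q, 2.0, 5.5, 4.0, 0.25), 2.0, 5.5)
exponential_mass = integrate(lambda q: truncated_exponential_pdf(q, 5.0, 1.0 / 3.0), 0.0, 5.0)
```

On a bounded interval both densities are bounded above and bounded away from zero, which is the kind of two-sided bound that Assumption (iv) asks for.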
All of the numerical results presented here use simulation data. Our studies involving actual experimental/clinical data are discussed elsewhere (see [32]). The simulated data was generated by first sampling the target distribution to obtain 100 samples q of . A spline based Galerkin approximation to the system (6.6)-(6.11) using a grid of 128 equally spaced points on [0,1] was then solved using each -sample. The resulting 100 output signals were then averaged at each time point. The approximating estimation problems were all solved on either Mac or PC laptops using the MATLAB optimization toolbox routine FMINCON for constrained optimization. Gradients were computed using either FMINCON's built-in finite differencing or the adjoint method, (6.1)-(6.5). Which method was used had only a negligible effect on the results. The input signal used was u(t) = ∣cos(t)∣χ[0,2](t), t ∈ [0, 20], and the sampling interval was τ = 0.1. In all of our examples below, the admissible parameter space Q is assumed to be either in in the case of the uni-variate examples, or in the first quadrant of the plane in the bivariate examples. Consequently, when the approximating optimization problems were solved, the lower bounds for the supports of the random parameters, a and c, were constrained to be strictly positive. This is based on the requirements of the physical model (6.6)-(6.11) and the assumption that properties (i)-(iii) in Section 3 hold.
6.2. Examples 6.1,6.2 and 6.3; One Random Parameter; Truncated Uniform, Exponential and Normal Distributions
In this series of examples we consider the system (6.6),(6.7),(6.9)-(6.11) with q1 random and q2 = 1. In this case we have q = q1 ∈ Q = [a, b], W = {φ ∈ H²(0, 1) : ΓDφ = 0}, , Dom(A(q)) = {φ ∈ V : Γ1φ = 0}, and Γ(q) = Γ1. It follows that
and ⟨b(q), ψ⟩V*,V = ⟨b, ψ⟩V*,V = ψ(1) = δ(· − 1), ψ ∈ V, and ⟨c(q), ψ⟩V*,V = ⟨c, ψ⟩V*,V = ψ(1/3), ψ ∈ V, where in this case η0 = 1/3. Standard arguments [3, 5] show that Assumptions (i)-(iii) are satisfied.
To carry out the finite dimensional discretization, we let n, m be positive integers and set N = (n, m). In this case we have either D = [a, b] (uniform and normal) or D = [0, R] (exponential). In what follows we describe the q or Q discretization for the uniform and normal cases; the exponential is similar. The bases for the approximating subspaces were taken to be tensor products of the standard linear spline basis elements corresponding to the uniform mesh on [0, 1], and the characteristic function basis for the interval [a, b]. The jth element corresponds to the jth sub-interval , j = 1, 2, …, m. In this way , i = 1, 2, …, n, j = 1, 2, …, m where , η ∈ [0, 1], q ∈ [a, b] with . Using standard estimates [29] it is not difficult to show that Assumption (v) holds.
Re-numbering so that where k = (i − 1)n + j and letting , the matrix representations of the operators are given by with
where r = (j − 1)n + i, s = (l − 1)n + k, i, k = 1, 2, …, n, j, l = 1, 2, …, m.
We also have
r, s = 1, 2, …, nm, r = (j − 1)n + i, s = (l − 1)n + k, i, k = 1, 2, …, n, j, l = 1, 2, …, m.
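The block structure produced by a tensor-product basis with the index map r = (j − 1)n + i can be sketched as follows. This is an illustration under assumptions rather than the matrices of this section: it assumes the inner product is the H = L²(0, 1) inner product weighted by the density f in q, linear splines at the nodes h, 2h, …, 1 (the node at 0 removed by the Dirichlet condition), and characteristic functions of m equal sub-intervals of [a, b]. All names are hypothetical.

```python
def spline_mass_matrix(n):
    """L2(0,1) Gram matrix of the n hat functions at nodes h, 2h, ..., 1."""
    h = 1.0 / n
    mass = [[0.0] * n for _ in range(n)]
    for i in range(n):
        mass[i][i] = 2.0 * h / 3.0 if i < n - 1 else h / 3.0
        if i + 1 < n:
            mass[i][i + 1] = mass[i + 1][i] = h / 6.0
    return mass

def interval_weights(density, a, b, m, sub=50):
    """w_j = integral of the density over the j-th sub-interval (midpoint rule)."""
    width = (b - a) / m
    step = width / sub
    weights = []
    for j in range(m):
        left = a + j * width
        weights.append(sum(density(left + (k + 0.5) * step) for k in range(sub)) * step)
    return weights

def gram_matrix(n, m, density, a, b):
    """Gram matrix of the tensor-product basis, zero-based index r = j*n + i."""
    mass = spline_mass_matrix(n)
    weights = interval_weights(density, a, b, m)
    size = n * m
    gram = [[0.0] * size for _ in range(size)]
    for j in range(m):          # characteristic functions with different j
        for i in range(n):      # have disjoint supports, so only the
            for k in range(n):  # diagonal blocks are nonzero
                gram[j * n + i][j * n + k] = weights[j] * mass[i][k]
    return gram

n, m = 4, 3
gram = gram_matrix(n, m, lambda q: 0.5, 2.0, 4.0)  # uniform density on [2, 4]
total = sum(sum(row) for row in gram)
```

With piecewise constant elements in q the matrix is block diagonal, so the density parameters enter only through the m scalar weights, which keeps differentiation with respect to ρ inexpensive.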
With the density f = f0(·; ρ) = f0(·; (a, b, θ)) as given in Definition (6.1) above, if we define
it is a straightforward, albeit somewhat tedious, exercise to compute the partial derivatives , , , , , i = 0, 1, 2. These partial derivatives show up in the matrices that appear in the adjoint equations (6.1)-(6.5). We tested our scheme on truncated uniform (ρ = (a, b)), exponential (ρ = (R, θ)) and normal (ρ = (a, b, μ, σ)) distributions. Our results are shown in Table (6.1) and Figure (6.1) below. In panels (a) - (c) of Figure (6.1), we have plotted the converged estimated population models together with the data and the 75% credible band for the truncated uniform, exponential and normal densities. The credible bands can be obtained directly from the solution to the population model. Indeed, is sampled using the estimated distribution and then is evaluated at the sample q's where is given by (5.6). Now the q dependence of the solution to the population model is only valid π almost everywhere and our convergence framework is an L² (in q) theory. Consequently, pointwise evaluation is, strictly speaking, undefined. However, the results appear to be useful so we have included them. We are currently working on an extension of the results presented here that involves introducing parabolic regularization in q. This will potentially allow us to justify pointwise evaluation in q of the population model to obtain credible bands. It is interesting to note that the credible band for the exponential distribution is quite wide, almost to the point of making the population model not that useful. This is because the exponential distribution, especially one with a mean and standard deviation of 1/θ = 3, has a rather "fat" tail. Panels (d) and (f) of Figure (6.1) show the converging estimated pdfs for the truncated exponential and normal distributions, respectively. Panel (e) shows how the output of the population model compares to the data when the resolution of the finite element discretizations of q and η and the truncation point of the densities are varied.
It appears from the figure that it is the q discretization that determines the rate of convergence, while a rather coarse η discretization seems to suffice. We believe that this explains the slow convergence of θ (the exponential parameter) and σ (the standard deviation of the normal) observed in Table (6.1) and panel (f) of Figure (6.1). The truncation of the density appears to have only a negligible effect. We are currently investigating whether using smoother first order splines for the q elements produces improved estimates and more rapid convergence.
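A sampling-based credible band of the kind described above can be sketched as follows. How the band is extracted from the sampled outputs is not spelled out in the text, so this sketch assumes a pointwise central band: at each time point the sampled outputs are sorted and the order statistics that leave equal probability in the two tails are taken. The function name and the toy trajectories are hypothetical.

```python
import math

def credible_band(sample_outputs, level=0.75):
    """Pointwise central band from a list of equally long output trajectories.

    sample_outputs[k][j] is the output at time index j for the k-th sampled q.
    Returns (lower, upper), each a list with one value per time index.
    """
    tail = (1.0 - level) / 2.0
    count = len(sample_outputs)
    lower, upper = [], []
    for values in zip(*sample_outputs):
        ordered = sorted(values)
        lower.append(ordered[int(math.floor(tail * (count - 1)))])
        upper.append(ordered[int(math.ceil((1.0 - tail) * (count - 1)))])
    return lower, upper

# Toy input: 101 trajectories with a single time point, taking the values 0, 1, ..., 100.
lower, upper = credible_band([[float(k)] for k in range(101)], level=0.75)
```

In practice each trajectory would come from evaluating the fitted population model at one sampled value of q, subject to the caveat about pointwise evaluation in q noted in the text.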
Table 6.1:
n | m | Uniform a* | Uniform b* | Exponential θ* | Exponential R* | Normal a* | Normal b* | Normal μ* | Normal σ* |
---|---|---|---|---|---|---|---|---|---|
4 | 4 | 1.76 | 4.27 | 2e-5 | 3.61 | 2.61 | 5.44 | 4.05 | 0.62 |
8 | 8 | 1.91 | 4.05 | 4e-5 | 3.81 | 2.29 | 5.42 | 4.01 | 0.40 |
16 | 16 | 1.94 | 4.00 | 0.20 | 4.34 | 2.17 | 5.42 | 4.01 | 0.37 |
32 | 32 | 1.95 | 3.99 | 0.30 | 5.95 | 2.15 | 5.42 | 4.00 | 0.35 |
64 | 64 | 1.96 | 3.99 | 0.30 | 11.08 | 2.14 | 5.42 | 4.00 | 0.35 |
True values | | 2 | 4 | 1/3 | — | — | — | 4 | 0.25 |
6.3. Example 6.4; Two Random Parameters; Truncated Bi-variate Normal Distribution
In this example we consider the system (6.6)-(6.11), but instead of the Dirichlet boundary condition (6.7) at η = 0, we take the Robin boundary condition (6.8) at η = 0. In this case, q = [q1, q2] is the vector of random parameters with q ∈ D = Q = [a, b] × [c, d], H = L²(0, 1), V = H¹(0, 1), W = H²(0, 1), and Dom(A(q)) = {φ ∈ H²(0, 1) : ΓRφ = 0, Γ1φ = 0} and Γ(q) = Γ1. The sesquilinear form on V × V is given by with ⟨b(q), ψ⟩V*,V = q2ψ(1) = q2δ(· − 1), ψ ∈ V, and ⟨c(q), ψ⟩V*,V = ⟨c, ψ⟩V*,V = ψ(0), ψ ∈ V, where we have set η0 = 0. In this case N = (n, m1, m2), where n is again the level of discretization of the space variable η and mi is the level of discretization of qi, i = 1, 2. Once again the approximating subspaces were constructed using tensor products, , i = 0, 1, 2, …, n, j = 1, 2, …, m1, k = 1, 2, …, m2 where , η ∈ [0, 1], q1 ∈ [a, b], q2 ∈ [c, d] with .
In this example the truncated exponential family was based on the bivariate normal. Once again, it is possible to compute all the partial derivatives (although of course their evaluation requires the numerical evaluation of single and double integrals) that are required to form the matrices that appear in the state and adjoint equations (6.1)-(6.5). We obtained simulated data by generating samples for from a distribution with and .
Our results are shown in Table (6.2) and Figure (6.2), where it can be seen that we obtained reasonably good approximations to the actual parameters that we used to simulate the data. We parameterized the covariance matrix as Σ = LᵀL, where the 2 × 2 matrix L is upper triangular with L11 and L22 both positive, so as to guarantee that at each step in the optimization, Σ is symmetric and positive definite. The plot of the optimal joint density in the left hand panel of Figure (6.2) corresponds to n = 16 and m1 = m2 = 8. In the right hand panel of Figure (6.2) we have plotted the output of the fit population model and the 75% credible band. Once again, we believe that the rate of convergence could be improved by using linear splines rather than piece-wise constant elements to discretize the random parameters q.
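The factored parameterization of the covariance matrix is easy to check directly. The following sketch builds Σ = LᵀL from the three free entries of a 2 × 2 upper triangular L; the function name and the sample values are hypothetical.

```python
def covariance_from_factor(l11, l12, l22):
    """Return Sigma = L^T L for the upper triangular L = [[l11, l12], [0, l22]]."""
    if l11 <= 0.0 or l22 <= 0.0:
        raise ValueError("diagonal entries of L must be positive")
    s11 = l11 * l11
    s12 = l11 * l12
    s22 = l12 * l12 + l22 * l22
    return [[s11, s12], [s12, s22]]

sigma = covariance_from_factor(2.0, -1.0, 0.5)
# det(Sigma) = (l11 * l22)^2 > 0 and Sigma[0][0] > 0, so Sigma is positive definite.
determinant = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]
```

Optimizing over (L11, L12, L22) with simple positivity bounds on the diagonal entries therefore never leaves the set of valid covariance matrices, which avoids imposing a nonlinear positive definiteness constraint on Σ itself.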
Table 6.2:
n | m1 | m2 | a* | b* | c* | d* | μ* | σ* | |
---|---|---|---|---|---|---|---|---|---|
4 | 8 | 8 | 5.88 | 18.15 | 4.85 | 14.63 | |||
8 | 8 | 8 | 5.67 | 18.35 | 5.17 | 14.46 | |||
16 | 8 | 8 | 5.79 | 18.17 | 5.06 | 14.66 |
7. Concluding Remarks
We are currently working on a number of applications and extensions of the results presented here. Specifically, we are looking at applying our approach to actual experimental and clinical BrAC and TAC data collected in both the lab/clinic and the field using two different transdermal alcohol biosensors from a number of different individuals that include several drinking episodes occurring over a time period of several days. We are developing deconvolution schemes based on population models fit using the approach discussed here that, given an output signal, will provide a population based estimate for the input together with credible bands obtained directly from the deconvolved input signal and not requiring simulation. We are also looking at extensions of the ideas presented here to the solution of the LQR and LQG compensator problems wherein the infinite dimensional linear regularly dissipative dynamics and quadratic performance index involve random parameters.
In our treatment here, we assumed that the probability measures describing the distribution of the random parameters were defined in terms of parameterized families of joint density functions. We are looking at developing numerical schemes and an associated convergence theory for estimating the shape of the density directly. We also hope to be able to apply the convergence theory based on the Prohorov metric on a space of measures developed in [7] more directly to the class of problems that we have discussed here. More precisely, we would like to be able to eliminate the assumption that the measures are defined in terms of a density, and estimate the measure directly. We believe that such a theory may be possible by assuming that our approximating subspaces are required to satisfy additional regularity (i.e. smoothness) assumptions; in particular that they are required to be contained in the domain of the operator. Then by making use of a slightly different version of the Trotter-Kato semigroup approximation theorem (see, for example, [1]) we believe it may now be possible to verify the hypotheses of the more general convergence theorem established in [7] for the estimation of the probability measures directly, rather than by estimating an associated density.
Footnotes
This research was supported in part by grants R21AA017711 and R01AA026368 from the National Institute on Alcohol Abuse and Alcoholism (NIAAA).
Contributor Information
Melike Sirlanci, Department of Computing and Mathematical Sciences, California Institute of Technology.
Susan E. Luczak, Department of Psychology, University of Southern California
I. G. Rosen, Department of Mathematics, University of Southern California.
References
- [1] Banks HT, Burns JA, and Cliff EM. Parameter estimation and identification for systems with delays. SIAM Journal on Control and Optimization, 19(6):791–828, 1981. doi: 10.1137/0319051.
- [2] Banks HT, Flores KB, Rosen IG, Rutter EM, Sirlanci Melike, and Thompson Clayton. The Prohorov metric framework and aggregate data inverse problems for random PDEs. Communications in Applied Analysis, 22(3):415–446, 2018. URL: https://acadsol.eu/en/articles/22/3/6.pdf, doi: 10.12732/caa.v22i3.6.
- [3] Banks HT and Ito K. A unified framework for approximation in inverse problems for distributed parameter systems. Control Theory Advanced Technology, 4(1):73–90, 1988. URL: https://apps.dtic.mil/dtic/tr/fulltext/u2/a193780.pdf.
- [4] Banks HT, Kareiva P, and Lamm PK. Estimation techniques for transport equations. In Mathematics in Biology and Medicine, pages 428–438. Springer, 1985. URL: https://link.springer.com/chapter/10.1007/978-3-642-93287-8_58.
- [5] Banks HT and Kunisch Karl. Estimation Techniques for Distributed Parameter Systems. Springer Science & Business Media, 2012. URL: https://www.springer.com/us/book/9780817634339.
- [6] Banks HT and Lamm PK. Estimation of variable coefficients in parabolic distributed systems. IEEE Transactions on Automatic Control, 30(4):386–398, 1985. URL: https://ieeexplore.ieee.org/document/1103955, doi: 10.1109/TAC.1985.1103955.
- [7] Banks HT and Thompson W Clayton. Least squares estimation of probability measures in the Prohorov metric framework. Technical report, DTIC Document, 2012. URL: https://www.researchgate.net/publication/268353806_Least_Squares_Estimation_of_Probability_Measures_in_the_Prohorov_Metric_Framework.
- [8] Bui-Thanh Tan, Burstedde Carsten, Ghattas Omar, Martin James, Stadler Georg, and Wilcox Lucas C. Extreme-scale UQ for Bayesian inverse problems governed by PDEs. In SC12, November 10–16. URL: https://ieeexplore.ieee.org/document/6468442.
- [9] Bui-Thanh Tan, Ghattas Omar, Martin James, and Stadler Georg. A computational framework for infinite-dimensional Bayesian inverse problems Part I: The linearized case, with application to global seismic inversion. SIAM J. Sci. Stat. Comp, 35(6):A2494–A2523. URL: https://epubs.siam.org/doi/abs/10.1137/12089586X?journalCode=sjoce3, doi: 10.1137/12089586X.
- [10] Calvetti Daniela, Kaipio Jari P., and Somersalo Erkki. Inverse problems in the Bayesian framework. Inverse Problems, 30:1–4, 2014. URL: iopscience.iop.org/article/10.1088/0266-5611/30/11/110301/pdf, doi: 10.1088/0266-5611/30/11/110301.
- [11] Campbell Stephen L and Meyer Carl D. Generalized Inverses of Linear Transformations. SIAM, 2009. URL: http://bookstore.siam.org/cl56/.
- [12] Casella George and Berger Roger L. Statistical Inference, volume 2. Duxbury, Pacific Grove, CA, 2002. URL: https://books.google.com/books/about/Statistical_Inference.html?id=0x_vAAAAMAAJ.
- [13] Curtain Ruth F and Salamon Dietmar. Finite-dimensional compensators for infinite-dimensional systems with unbounded input operators. SIAM Journal on Control and Optimization, 24(4):797–816, 1986. URL: https://epubs.siam.org/doi/10.1137/0324050, doi: 10.1137/0324050.
- [14] Dai Zheng, Rosen I Gary, Wang Chunming, Barnett Nancy, and Luczak Susan E. Using drinking data and pharmacokinetic modeling to calibrate transport model and blind deconvolution based data analysis software for transdermal alcohol biosensors. Mathematical Biosciences and Engineering: MBE, 13(5):911, 2016. URL: http://www.aimsciences.org/journals/displayArticlesnew.jsp?paperID=12739.
- [15] Dashti Masoumeh and Stuart Andrew M. The Bayesian approach to inverse problems. In Ghanem R et al., editor, Handbook of Uncertainty Quantification, pages 311–428. Springer International Publishing Switzerland, 2017. URL: https://www.springer.com/us/book/9783319123844.
- [16] Davidian M and Giltinan D. Nonlinear Models for Repeated Measurement Data. Chapman and Hall, New York, 1995. URL: https://www.crcpress.com/Nonlinear-Models-for-Repeated-Measurement-Data/Davidian-Giltinan/p/book/9780412983412.
- [17] Davidian M and Giltinan DM. Nonlinear models for repeated measurement data: An overview and update. Journal of Agricultural, Biological and Environmental Statistics, 8:387–419, 2003. URL: https://link.springer.com/article/10.1198/1085711032697.
- [18] Demidenko E. Mixed Models, Theory and Applications, Second Edition. John Wiley and Sons, Hoboken, 2013. URL: https://www.wiley.com/en-us/Mixed+Models%3A+Theory+and+Applications+with+R%2C+2nd+Edition-p-9781118091579.
- [19] Dumett Miguel A, Rosen I Gary, Sabat J, Shaman A, Tempelman L, Wang C, and Swift RM. Deconvolving an estimate of breath measured blood alcohol concentration from biosensor collected transdermal ethanol data. Applied Mathematics and Computation, 196(2):724–743, 2008. URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2597868/, doi: 10.1016/j.amc.2007.07.026.
- [20] Gittelson Claude Jeffrey, Andreev Roman, and Schwab Christoph. Optimality of adaptive Galerkin methods for random parabolic partial differential equations. J. Computational Applied Mathematics, 263:189–201, 2014. doi: 10.1016/j.cam.2013.12.031.
- [21] Kato Tosio. Perturbation Theory for Linear Operators, volume 132. Springer Science & Business Media, 2013. URL: https://www.springer.com/us/book/9783540586616.
- [22] Labianca Dominick A. The chemical basis of the breathalyzer: a critical analysis. J. Chem. Educ, 67(3):259–261, 1990. URL: https://pubs.acs.org/doi/abs/10.1021/ed067p259?journalCode=jceda8, doi: 10.1021/ed067p259.
- [23] Levi AFJ and Rosen I Gary. A novel formulation of the adjoint method in the optimal design of quantum electronic devices. SIAM Journal on Control and Optimization, 48(5):3191–3223, 2010. doi: 10.1137/070708330.
- [24] Lions JL. Optimal Control of Systems Governed by Partial Differential Equations. Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen mit besonderer Berücksichtigung der Anwendungsgebiete. Springer-Verlag, 1971. URL: https://books.google.com/books?id=aL9tlwEACAAJ.
- [25] Nyman E and Palmlöv A. The elimination of ethyl alcohol in sweat. Acta Physiologica, 74(2):155–159, 1936. URL: https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1748-1716.1936.tb01150.x, doi: 10.1111/j.1748-1716.1936.tb01150.x.
- [26] Pazy A. Semigroups of Linear Operators and Applications to Partial Differential Equations. Applied Mathematical Sciences. Springer, 1983. URL: https://books.google.com/books?id=80XYPwAACAAJ.
- [27] Pritchard Anthony J and Salamon Dietmar. The linear quadratic control problem for infinite dimensional systems with unbounded input and output operators. SIAM Journal on Control and Optimization, 25(1):121–144, 1987. URL: https://epubs.siam.org/doi/abs/10.1137/0325009, doi: 10.1137/0325009.
- [28] Rosen I Gary, Luczak Susan E, and Weiss Jordan. Blind deconvolution for distributed parameter systems with unbounded input and output and determining blood alcohol concentration from transdermal biosensor data. Applied Mathematics and Computation, 231:357–376, 2014. URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3972634/, doi: 10.1016/j.amc.2013.12.099.
- [29] Schultz MH. Spline Analysis. Prentice-Hall Series in Automatic Computation. Pearson Education, Limited, 1972. URL: https://books.google.com/books?id=AdRQAAAAMAAJ.
- [30] Schwab Christoph and Gittelson Claude Jeffrey. Sparse tensor discretizations of high-dimensional parametric and stochastic PDEs. Acta Numerica, 20:291–467, 2011. URL: https://www.cambridge.org/core/journals/acta-numerica/article/sparse-tensor-discretizations-of-highdimensional-parametric-and-stochastic-pdes/A46BD443A2D1176B448132A271057DCC, doi: 10.1017/S0962492911000055.
- [31] Sirlanci Melike, Luczak Susan, and Rosen I Gary. Approximation and convergence in the estimation of random parameters in linear holomorphic semigroups generated by regularly dissipative operators. In American Control Conference (ACC), 2017, pages 3171–3176. IEEE, 2017. URL: https://www.researchgate.net/publication/318333926_Approximation_and_convergence_in_the_estimation_of_random_parameters_in_linear_holomorphic_semigroups_generated_by_regularly_dissipative_operators, doi: 10.23919/ACC.2017.7963435.
- [32] Sirlanci Melike, Luczak Susan E., Fairbairn Catharine E., Kang Dayheon, Pan Ruoxi, Yu Xin, and Rosen I Gary. Estimating the distribution of random parameters in a diffusion equation forward model for a transdermal alcohol biosensor. Automatica, 2018. To appear, arXiv:1808.04058. URL: https://arxiv.org/abs/1808.04058.
- [33] Sirlanci Melike, Rosen I Gary, Luczak Susan E., Fairbairn Catharine E., Bresin Konrad, and Kang Dayheon. Deconvolving the input to random abstract parabolic systems; a population model-based approach to estimating blood/breath alcohol concentration from transdermal alcohol biosensor data. Inverse Problems, 34(12), 2018. arXiv:1807.05088v1. URL: http://iopscience.iop.org/article/10.1088/1361-6420/aae791/pdf.
- [34] Stuart AM. Inverse problems: A Bayesian perspective. Acta Numerica, pages 451–559, 2010. URL: https://www.cambridge.org/core/journals/acta-numerica/article/inverse-problems-a-bayesian-perspective/587A3A0D480A1A7C2B1B284BCEDF7E23, doi: 10.1017/S0962492910000061.
- [35] Swift Robert M. Transdermal measurement of alcohol consumption. Addiction, 88(8):1037–1039, 1993. doi: 10.1111/j.1360-0443.1993.tb02122.x.
- [36] Swift Robert M. Transdermal alcohol measurement for estimation of blood alcohol concentration. Alcoholism: Clinical and Experimental Research, 24(4):422–423, 2000. doi: 10.1111/j.1530-0277.2000.tb02006.x.
- [37] Swift Robert M. Direct measurement of alcohol and its metabolites. Addiction, 98:73–80, 2003. doi: 10.1046/j.1359-6357.2003.00605.x.
- [38] Swift Robert M., Martin Christopher S, Swette Larry, LaConti Anthony, and Kackley Nancy. Studies on a wearable, electronic, transdermal alcohol sensor. Alcoholism: Clinical and Experimental Research, 16(4):721–725, 1992. URL: https://www.ncbi.nlm.nih.gov/pubmed/1530135, doi: 10.1111/j.1530-0277.1992.tb00668.x.
- [39] Tanabe H. Equations of Evolution. Monographs and Studies in Mathematics. Pitman, 1979. URL: https://books.google.com/books?id=Dn6zAAAAIAAJ.