An Inverse Problem for a Class of Conditional Probability Measure-Dependent Evolution Equations

Inom Mirzaev; Erin C Byrne; David M Bortz

doi:10.1088/0266-5611/32/9/095005

Author manuscript; available in PMC: 2017 Jul 15.

Published in final edited form as: Inverse Probl. 2016 Jul 15;32(9):095005. doi: 10.1088/0266-5611/32/9/095005

PMCID: PMC5352987 NIHMSID: NIHMS804840 PMID: 28316360

Abstract

We investigate the inverse problem of identifying a conditional probability measure in measure-dependent evolution equations arising in size-structured population modeling. We formulate the inverse problem as a least squares problem for the probability measure estimation. Using the Prohorov metric framework, we prove existence and consistency of the least squares estimates and outline a discretization scheme for approximating a conditional probability measure. For this scheme, we prove general method stability.

The work is motivated by Partial Differential Equation (PDE) models of flocculation for which the shape of the post-fragmentation conditional probability measure greatly impacts the solution dynamics. To illustrate our methodology, we apply the theory to a particular PDE model that arises in the study of population dynamics for flocculating bacterial aggregates in suspension, and provide numerical evidence for the utility of the approach.

Keywords: Conditional probability measures, inverse problem, measure-dependent evolution equations, size-structured populations, flocculation, fragmentation, bacterial aggregates

1. Introduction

In this paper, we examine an inverse problem involving a general conditional probability measure-dependent partial differential equation (PDE) arising in structured population modeling. We consider a general abstract evolution equation with solution b on a Banach space H, defined on an interval Q ⊂ ℝ⁺ ∪ {0}, depending on the conditional probability measure F:

b_{t} = g (b, F)

(1)

b (0, x) = b_{0} (x) \forall x \in Q

(2)

for t ∈ T = [0, t_f] with t_f < ∞.

Although estimation of conditional probability measures in statistics is common [35, 47, 50, 54], results on the estimation of probability distributions within the context of size-structured models are not widespread. Banks et al. [15, 16] formulated a Sinko-Streifer model in which growth rates vary probabilistically across individuals of the population and presented inverse problem techniques for estimating this growth distribution from aggregate population data. Banks and Bihari [9] developed an inverse problem framework for identifying a single probability measure in general measure-dependent dynamical systems. Later, Banks and Bortz [10] extended this framework to a class of dynamical systems with distributed temporal delays and a countable number of probability measures. All these inverse problem techniques are based on a Prohorov metric framework, for reviews of which we refer readers to [17, 21].

Our study of the class of models in (1)–(2) is motivated by our interest in fragmentation phenomena, which arise in a wide variety of areas including algal populations [1, 2, 7], cancer metastases [38, 49, 81], and mining [44, 67]. The most prominent approach to modeling fragmentation is based on structured population equations, which can be traced back to the works of Smoluchowski [76, 77] and Becker and Döring [23]. The forward problem for structured population models has been studied extensively over the years, and the mathematical techniques for its analysis are highly developed. For a review of these mathematical techniques we refer interested readers to the books by Webb [79] and Metz and Diekmann [60]. For applications of the theory in an engineering context we recommend the book by Ramkrishna [72].

Inverse problems for structured population models have also received substantial interest in recent years [43, 46, 68, 69]. Efficient methodologies for the estimation of growth and mortality rates in the physiologically structured Sinko-Streifer [73] equation have been developed and used to measure the lethal effects of pesticides in insect populations [8, 22]. Estimation of the model rates in the size-structured Bell and Anderson model [24], which describes the dynamics of proliferation and death processes of a population of cells, has also been widely investigated. Perthame et al. [41, 68] developed an inverse problem methodology for determining the cell division (birth) rate from the measured stable size distribution of the population. Their approach is based on a novel regularization technique relying on generalized relative entropy estimates [62]. Later, this methodology was extended by Doumic et al. [28, 42] to recover fragmentation rates in growth-fragmentation equations. For the same model, Luzyanina et al. [56–58] formulated a numerical method, based on a maximum likelihood approach, for the robust identification of the cell division rates from CFSE (a type of cell division tracking dye) histogram time-series data. Moreover, several improvements and extensions of this numerical approach have been suggested in [19].

In [26], we developed a size-structured partial differential equation (PDE) model for bacterial flocculation, the process whereby flocs, i.e., aggregates, in suspension adhere and separate. For the breakage term in that PDE model, the fragmentation of each parent particle generates child particles according to a post-fragmentation conditional probability distribution. In the literature, it is common to assume that this distribution is normal for all parent floc sizes [55, 66]. However, in [29], we focused only on the fragmentation and developed a microscale mathematical model which contradicts this assumption and predicts that the distribution is non-normal and conditionally dependent on parent size. Thus there is a clear need for a methodology to identify this conditional distribution from available data.

Building upon the work in [10], in this paper we present and investigate an inverse problem for estimating the conditional probability measures in size-structured population models from size-distribution measurements. In Section 2, we formulate the inverse problem as a least squares problem for the probability measure estimation. We use the Prohorov metric (convergence in which is equivalent to weak convergence of measures) in a functional-analytic setting and show existence and consistency of the estimators for the least squares problem. Then, in Section 3, we develop an approximation approach for computational implementation and prove that solutions of the approximate inverse problem converge to solutions of the original inverse problem. In Section 4, we show that the flocculation dynamics of bacterial aggregates in suspension is one realization of a system satisfying the hypotheses of our framework. Furthermore, we present numerical examples to demonstrate the feasibility of our inverse methodology for artificial data sets.

2. Least squares problem for estimation of conditional probability distributions

In this section, we consider the inverse problem of inferring a conditional probability distribution from aggregate population data. First, in Section 2.1, we formulate the inverse problem as a least squares problem for the probability measure estimation. Then, in Section 2.2, we develop the theoretical results needed to prove the existence and consistency of the estimates of the least squares problem. Along the way, we state some assumptions which are motivated by the features of the validating data.

2.1. Theoretical framework for the least squares problem

Let 𝒫(Q) be the space of all probability distributions on (Q, 𝒜), where 𝒜 is the Borel σ-algebra on Q. Since we are primarily concerned with systems with unknown conditional probability measures, we restrict the space of probability distributions to those that can be solutions to our inverse problem. Towards this end, we define regular conditional probability measures F : Q × Q → [0, 1] such that F(·, y) is a probability measure in 𝒫(Q) for all y ∈ Q. We then define our space of solutions to the inverse problem ℱ(Q × Q) as the set of all conditional probability measures defined on Q, i.e., F ∈ ℱ(Q × Q) if and only if F(·, y) ∈ 𝒫(Q) for all y ∈ Q.

We define a metric on the space ℱ to create a metric topology, and we accomplish this by making use of the well-known Prohorov metric (see [25] for a full description). Convergence in the Prohorov metric is equivalent to weak convergence of measures and we direct the interested reader to [45] for a summary of its relationship to a variety of other metrics on probability measures. For F, F̃ ∈ ℱ and fixed y, we use the Prohorov metric ρ_proh to denote the distance ρ_proh(F(·, y), F̃(·, y)) between the measures. We extend this concept to define the metric ρ on the space ℱ(Q × Q) by taking the supremum of ρ_proh over all y ∈ Q,

ρ (F, \tilde{F}) = sup_{y \in Q} ρ_{proh} (F (\cdot, y), \tilde{F} (\cdot, y)) .

The most widely available, high-fidelity data for flocculating particles are in the form of particle size histograms from, e.g., flow-cytometers, Coulter counters, etc. Accordingly, we will define our inverse problem with the goal of comparing with histograms of floc sizes. Let n_j(t_i) represent the number of flocculated biomasses with volume between x_{j-1} and x_j at time t_i. We assume that the data is generated by an actual post-fragmentation probability distribution. In other words, n^d is representable as the partial zeroth moment of the solution

n_{j i}^{d} = \int_{x_{j - 1}}^{x_{j}} b (t_{i}, x; F_{0}) d x + ℰ_{j i}

(3)

for some true conditional probability-measure F₀ ∈ ℱ. The random variables ℰ_ji represent measurement noise. We also assume, as it is commonly assumed, that the random variables ℰ_ji are independent, identically distributed, with mean E[ℰ_ji] = 0 and variance

Var [ℰ_{j i}] = σ^{2} < \infty

(4)

(which is generally true for flow-cytometers [36]). Thus our inverse problem entails finding a minimizer F ∈ ℱ of the least squares cost functional, defined as

J (F; n^{d}) = \sum_{i = 1}^{N_{t}} \sum_{j = 1}^{N_{x}} {(\int_{x_{j - 1}}^{x_{j}} b (t_{i}, x; F) d x - n_{j i}^{d})}^{2},

(5)

where the data n^d ∈ ℝ^{N_x × N_t} consists of the number of flocs in each of the N_x bins for floc volume at N_t time points. The superscript d denotes the dimension of the data, d = N_x × N_t. The function b is the solution to (1)–(2) corresponding to the probability measure F.
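The cost functional (5) can be sketched numerically as follows. This is a minimal illustration only: `toy_solution` is a hypothetical stand-in for the model solution b(t, x; F), and the helper names `binned_counts` and `least_squares_cost`, the bin edges, and the noise level are all assumptions made for the example, not part of the methodology above.

```python
import numpy as np

def binned_counts(b, t_obs, edges, n_quad=200):
    """Partial zeroth moments: integral of b(t_i, x) over each bin [x_{j-1}, x_j]."""
    counts = np.empty((len(edges) - 1, len(t_obs)))
    for i, t in enumerate(t_obs):
        for j in range(len(edges) - 1):
            # midpoint rule on n_quad subintervals of the j-th bin
            h = (edges[j + 1] - edges[j]) / n_quad
            x = edges[j] + h * (np.arange(n_quad) + 0.5)
            counts[j, i] = np.sum(b(t, x)) * h
    return counts

def least_squares_cost(b, t_obs, edges, data):
    """J of (5) for a model solution b that is already tied to a candidate F."""
    return float(np.sum((binned_counts(b, t_obs, edges) - data) ** 2))

def toy_solution(rate):
    """Hypothetical stand-in for b(t, x; F): a decaying exponential profile."""
    return lambda t, x: np.exp(-rate * t) * np.exp(-x)

rng = np.random.default_rng(0)
t_obs = np.linspace(0.0, 1.0, 5)      # N_t = 5 observation times
edges = np.linspace(0.0, 2.0, 9)      # N_x = 8 volume bins
clean = binned_counts(toy_solution(1.0), t_obs, edges)
data = clean + rng.normal(scale=1e-3, size=clean.shape)  # additive noise as in (3)

cost_true = least_squares_cost(toy_solution(1.0), t_obs, edges, data)
cost_wrong = least_squares_cost(toy_solution(2.0), t_obs, edges, data)
```

With small noise, the candidate that generated the data yields a much smaller cost than a mismatched candidate, which is the behavior the least squares formulation relies on.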

For given data n^d, the cost functional J may not have a unique minimizer; we therefore denote the corresponding solution set of probability distributions by ℱ^*(n^d). We then define the distance between two such sets of solutions, $ℱ^{*} (n_{1}^{d_{1}})$ and $ℱ^{*} (n_{2}^{d_{2}})$ (for data $n_{1}^{d_{1}}$ and $n_{2}^{d_{2}}$), to be the well-known Hausdorff distance [53]

d_{H} (ℱ^{*} (n_{1}^{d_{1}}), ℱ^{*} (n_{2}^{d_{2}})) = inf {ρ (F, \tilde{F}) : F \in ℱ^{*} (n_{1}^{d_{1}}), \tilde{F} \in ℱ^{*} (n_{2}^{d_{2}})} .
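To make the two constructions above concrete, the following sketch evaluates a supremum-over-y metric and the set distance d_H for conditional cdfs sampled on a grid. Computing the Prohorov metric itself is not attempted here: as a loudly labeled assumption, the Kolmogorov (sup-norm) distance between cdfs is swapped in as the pointwise distance, and the function names are hypothetical.

```python
import numpy as np

def sup_cdf_distance(cdf_a, cdf_b):
    """Kolmogorov (sup-norm) distance between two cdfs sampled on a shared x-grid.
    NOTE: a stand-in for the Prohorov metric, not the Prohorov metric itself."""
    return float(np.max(np.abs(np.asarray(cdf_a) - np.asarray(cdf_b))))

def rho(F, F_tilde, pointwise=sup_cdf_distance):
    """Supremum over parent sizes y of a pointwise distance.
    F[k, :] holds the cdf in x for the k-th sampled parent size."""
    return max(pointwise(fa, fb) for fa, fb in zip(F, F_tilde))

def set_distance(A, B, metric=rho):
    """d_H as defined in the text: infimum of metric(F, F_tilde) over pairs."""
    return min(metric(F, G) for F in A for G in B)

# Two sampled conditional cdfs: uniform and Beta(2, 2), both rescaled to [0, y].
x = np.linspace(0.0, 1.0, 101)
parents = np.array([0.5, 1.0])
u = np.clip(x[None, :] / parents[:, None], 0.0, 1.0)
F_unif = u
F_beta = 3.0 * u**2 - 2.0 * u**3

d_same = set_distance([F_unif, F_beta], [F_beta])   # sets share an element
d_diff = rho(F_unif, F_beta)
```

Because d_H is defined through an infimum over pairs, two solution sets that share an element are at distance zero in this sketch.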

2.2. Existence and consistency of the least squares estimates

In this section we establish existence and consistency of the estimates of the least squares problem defined in (5). In particular, we first show that for a given data set n^d with dimension d, the least squares estimator defined in (5) has at least one minimizer. Next, we investigate the behavior of minimizers of (5) as more data is collected. Specifically, we show that the least squares estimator is consistent, i.e., as the dimension of the data increases (N_t → ∞ and N_x → ∞) the minimizers of the estimator (5) converge to the true probability measure F₀ generating the data n^d.

2.2.1. Existence of the estimator

In this section we prove that the cost functional defined in (5) possesses at least one minimizer. We use the well-known result that a continuous function on a compact metric space has a minimum. In particular, first we show that (ℱ, ρ) is a compact metric space. Next, we establish continuous dependence of the solution b on the conditional probability measure F.

For much of the following analysis, we require the operator g to satisfy a Lipschitz-type condition. We detail that condition in the following.

Condition 2.1

Suppose that b and b̃ are solutions to the evolution equation (1)–(2). For fixed t, the function g : H × ℱ → H must satisfy

‖ g (b, F) - g (\tilde{b}, \tilde{F}) ‖ \leq C ‖ b - \tilde{b} ‖ + 𝒯 (F, \tilde{F}),

where C > 0, and 𝒯(F, F̃) is some functional such that |𝒯(F, F̃)| < ∞ and 𝒯(F, F̃) → 0 as ρ(F, F̃) → 0.

We begin by proving that (ℱ, ρ) is a compact metric space.

Lemma 2.2

(ℱ, ρ) is a compact metric space.

Proof

Consider a Cauchy sequence {F_n} ∈ ℱ. Then ∀ ε > 0, ∃ N such that ∀ n, m ≥ N,

sup_{y \in Q} ρ_{proh} (F_{n} (\cdot, y), F_{m} (\cdot, y)) < ε .

It is easy to see we have a Cauchy sequence {F_n(·, y)} ∈ 𝒫(Q) which converges uniformly in y ∈ Q. From results in [21, Corollary 2.16], 𝒫(Q) is a compact metric space with respect to the Prohorov metric, and thus there exists F(·, y) ∈ 𝒫(Q) such that ρ_proh(F_n(·, y), F(·, y)) < ε for all n ≥ N. Thus

sup_{y \in Q} ρ_{proh} (F_{n} (\cdot, y), F (\cdot, y)) < ε

and (ℱ, ρ) is a complete metric space. Analogously, we can show that (ℱ, ρ) is sequentially compact. Therefore, we conclude that (ℱ, ρ) is a compact metric space.

Now that we have a compact metric space, it remains to show that the cost functional on that space is continuous with respect to the function F. It suffices to prove point-wise continuity.

Lemma 2.3

If t ∈ T, F ∈ ℱ, and the operator g in (1) satisfies Condition 2.1, then the unique solution b to (1) is point-wise continuous at F ∈ ℱ. Moreover, since ℱ is a compact space, the unique solution b is uniformly continuous on ℱ.

Proof

For the function b to be point-wise continuous at F, we need to show that ‖b(t, ·; F_i) − b(t, ·; F)‖ → 0 as ρ(F_i, F) → 0 for {F_i} ∈ ℱ and fixed t. We begin by re-writing (1) as an integral equation

b (t, x) = b_{0} (x) + \int_{0}^{t} g (b (s, x), F) d s .

For fixed t, consider b to be a function of F

b (t, x; F) = b_{0} (x) + \int_{0}^{t} g (b (s, x; F), F) d s .

By definition of solutions, we have

‖ b (t, \cdot; F_{i}) - b (t, \cdot; F) ‖ \leq \int_{0}^{t} ‖ g (b (s, \cdot; F_{i}), F_{i}) - g (b (s, \cdot; F), F) ‖ d s .

Based on Condition 2.1, we obtain

‖ b (t, \cdot; F_{i}) - b (t, \cdot; F) ‖ \leq C \int_{0}^{t} ‖ b (s, \cdot; F_{i}) - b (s, \cdot; F) ‖ d s + 𝖳 (F_{i}, F),

where we define $𝖳 (F_{i}, F) = \int_{0}^{t_{f}} 𝒯 (F_{i}, F) d s$ , independent of t. An application of Gronwall’s inequality yields

‖ b (t, \cdot; F_{i}) - b (t, \cdot; F) ‖ \leq 𝖳 (F_{i}, F) e^{\int_{0}^{t} C d s} \leq 𝖳 (F_{i}, F) e^{C t_{f}} \to 0

since we know that 𝒯(F_i, F) → 0 as F_i → F in (ℱ, ρ). Thus the solutions b are point-wise continuous at F ∈ ℱ.
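For readers who want the Gronwall step spelled out, the following restates it with the shorthand u(t) = ‖b(t, ·; F_i) − b(t, ·; F)‖ and α = 𝖳(F_i, F). These two symbols are introduced here only for illustration. The integral form of Gronwall's inequality with a constant α ≥ 0 and C > 0 gives:

```latex
u(t) \le \alpha + C \int_{0}^{t} u(s)\, ds \quad \text{for all } t \in T
\quad \Longrightarrow \quad
u(t) \le \alpha\, e^{C t} \le \alpha\, e^{C t_{f}} \quad \text{for all } t \in T .
```

Since the bound α e^{C t_f} does not depend on t, letting α = 𝖳(F_i, F) → 0 forces u(t) → 0 for every t ∈ T at once.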

We use the results of the above two lemmas to establish existence of a solution to our inverse problem.

Theorem 2.4

There exists a solution to the inverse problem as described in (5).

Proof

It is well known that a continuous function on a compact set attains both a maximum and a minimum. We have shown (ℱ, ρ) is compact, and from Lemma 2.3, for fixed t ∈ T, we have that F ↦ b(t, ·; F) is continuous. Since J is therefore continuous with respect to F, we can conclude there exist minimizers for J.

2.2.2. Consistency of the estimator

In the previous section we proved that, for given data, there exist minimizers of the least squares cost functional defined in (5). In this section we investigate the behavior of the least squares estimators as the number of observations increases. In particular, the estimator is said to be consistent if the estimators for the data n^d converge to the true probability measure F₀ as N_t → ∞ and N_x → ∞. Consistency of least squares estimators is well studied in statistics, and the results of this section closely follow the theoretical results of [20] and [14]. Hence, as in [20, Theorem 4.3] and [14, Corollary 3.2], we make the following two assumptions required for the convergence of the estimators to the unique true probability measure F₀.

(A1)
Let us denote the space of positive functions T × Q ↦ ℝ⁺, which are bounded and Riemann integrable by ℜ(T × Q, ℝ⁺). Then, the model function b(t, x; ·) : ℱ → ℜ(T × Q, ℝ⁺) is continuous on (ℱ, ρ).
(A2)
The functional
$J_{0} (F) = σ^{2} + \int_{T} \int_{Q} {(b (t, x; F) - b (t, x; F_{0}))}^{2} d x d t$
is uniquely (up to L¹ norm) minimized at F₀ ∈ ℱ. Here σ² is the variance of the measurement noise defined in (4).

Assumption (A2) is often referred to as an identifiability condition (or output least squares identifiability [31]) and addresses the question of whether the least squares inverse problem (5) has a unique solution for a given data set. Establishing identifiability conditions is generally mathematically challenging and depends to a great extent on the model involved. Chavent [32] presented general sufficient conditions for identifiability of parameters when the parameter space is finite dimensional, which require additional smoothness of the model function b. In some cases, non-identifiability of parameters can be eliminated by reducing the dimensionality of the parameter space [30] or by adding a regularization term to the cost functional J [33]. For further details about the identifiability of parameters in inverse problems, we refer readers to the review articles by Yeh [82] and Miao et al. [61].

In this paper, we will not develop any detailed rules about the identifiability of the parameters. However, a few general conclusions can be drawn. For instance, from the continuity arguments of Lemma 2.3 it follows that the function J₀ : ℱ → ℝ⁺, defined in assumption (A2), is continuous with respect to the Prohorov metric. Therefore, the function J₀ has at least one minimizer on (ℱ, ρ). Moreover, it is easy to see that the function J₀ is minimized at F₀. Suppose that there is another minimizer F₁ of J₀ such that ρ(F₁, F₀) ≠ 0. Then

\int_{T} \int_{Q} {(b (t, x; F_{1}) - b (t, x; F_{0}))}^{2} d x d t = 0 .

This in turn implies that

b (t, x; F_{1}) = b (t, x; F_{0})

almost everywhere on (t, x) ∈ T × Q, which is a very strict condition to fulfill for significantly different F₁ and F₀. Furthermore, the value of b(t₁, x; F) at time t₁ for fixed F ∈ ℱ depends on the profile of b(t, x; F) for t ∈ [0, t₁), and thus by choosing the observation interval T = [0, t_f] sufficiently large one can ensure uniqueness of the minimizer of the function J₀.

Having the required assumptions in hand, we now present the following theorem about the consistency of the estimators of the least squares cost functional (5).

Theorem 2.5

Under assumptions (A1) and (A2)

d_{H} (ℱ^{*} (n^{d}), F_{0}) \to 0

as N_t → ∞ and N_x → ∞.

Proof

The specific details of this proof are nearly identical to a similar theorem in [20] and so here we simply provide an overview. Briefly, one first shows that J(F; n^d) converges to J₀(F) for each F ∈ ℱ as N_t → ∞ and N_x → ∞. Then, using the fact that J₀(F) is uniquely (up to the metric ρ) minimized at F₀, one can show that for each sequence {F^d ∈ ℱ^*(n^d)} the distance ρ(F^d, F₀) converges to zero as N_t → ∞ and N_x → ∞, which yields the result.

3. Approximate Inverse Problem

Since the original problem involves minimizing over the infinite dimensional space ℱ, pursuing this optimization is challenging without some type of finite dimensional approximation. Thus we define some approximation spaces over which the optimization problem becomes computationally tractable. Similar to the partitioning presented in [10], let $Q_{M} = \{q_{j}^{M}\}_{j = 0}^{M}$ be partitions of Q = [0, x̅] for M = 1, 2, … and

Q_{D} = \cup_{M = 1}^{\infty} Q_{M}

(6)

where the sequences are chosen such that Q_D is dense in Q.

For positive integers M, L, let the approximation space be defined as

ℱ^{M L} = {F \in ℱ | \forall Ω \subseteq Q, F (Ω, y) = \sum_{m = 1}^{M} p_{ℓ m} Δ_{q_{m}^{M}} (Ω) 𝟙_{(q_{ℓ - 1}^{L}, q_{ℓ}^{L}]} (y), q_{m}^{M} \in Q_{M}, q_{ℓ}^{L} \in Q_{L}, p_{i j} \geq 0, \sum_{m = 1}^{ℓ} p_{ℓ m} = 1, ℓ = 1, 2, \dots, L},

where Δ_q is the Dirac measure with atom x = q defined for all Ω ⊆ Q as

Δ_{q} (Ω) = \begin{cases} 1, & q \in Ω \\ 0, & q \notin Ω \end{cases} .

The function 𝟙_A is the indicator function on the interval A.
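A minimal sketch of how an element of such an approximation space might be stored and evaluated is given below. It assumes, purely for illustration, that each row of weights is obtained from a known cdf by lumping the mass of every partition cell onto the cell's right node, and that the right endpoint of each y-interval serves as the representative parent size. The helper names `dirac_weights` and `evaluate` are hypothetical.

```python
import numpy as np

def dirac_weights(cdf, nodes_x, nodes_y):
    """Weights p[l, m] of a Dirac-sum approximation to a conditional cdf.

    cdf(x, y) is F([0, x], y); nodes_x and nodes_y are the partitions Q_M and Q_L.
    Row l uses the right endpoint of the l-th y-interval as the parent size and
    places the mass of each x-cell on the cell's right node.
    """
    parents = nodes_y[1:]
    F_nodes = cdf(nodes_x[None, :], parents[:, None])   # shape (L, M + 1)
    p = np.diff(F_nodes, axis=1)                         # shape (L, M)
    return p / p.sum(axis=1, keepdims=True)              # guard: each row sums to one

def evaluate(p, nodes_x, nodes_y, x, y):
    """Approximate F([0, x], y): total weight of atoms <= x in the row containing y."""
    l = np.searchsorted(nodes_y, y, side="left") - 1     # y lies in (q_l, q_{l+1}]
    l = int(np.clip(l, 0, p.shape[0] - 1))
    return float(p[l, nodes_x[1:] <= x].sum())

# Uniform post-fragmentation cdf, F(x, y) = min(x / y, 1), as a test case.
uniform_cdf = lambda x, y: np.minimum(x / y, 1.0)
nodes_x = np.linspace(0.0, 1.0, 11)   # M = 10
nodes_y = np.linspace(0.0, 1.0, 5)    # L = 4
p = dirac_weights(uniform_cdf, nodes_x, nodes_y)
value = evaluate(p, nodes_x, nodes_y, x=0.35, y=0.6)
```

The weights form an L × M row-stochastic matrix, so the infinite dimensional unknown F is replaced by finitely many nonnegative numbers subject to linear equality constraints.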

Next, define the space ℱ_D as

ℱ_{D} = \cup_{M, L = 1}^{\infty} ℱ^{M L} .

Consequently, since Q is a complete, separable metric space, and by Theorem 3.1 in [9] and properties of the sup norm, ℱ_D is dense in ℱ in the ρ metric. Therefore we can directly conclude that any function F ∈ ℱ can be approximated by a sequence {F_{M_j L_k}}, F_{M_j L_k} ∈ ℱ^{M_j L_k}, such that ρ(F_{M_j L_k}, F) → 0 as M_j, L_k → ∞.

Similar to the discussion concerning Theorem 4.1 in [9], we now state the theorem regarding the continuous dependence of the inverse problem upon the given data, as well as stability under approximation of the inverse problem solution space ℱ.

Theorem 3.1

Let Q = [0, x̅], assume that for fixed t ∈ T, x ∈ Q, F ↦ b(t, x; F) is continuous on ℱ, and let Q_D be a countable dense subset of Q as defined in (6). Suppose that ℱ^{*ML}(n^d) is the set of minimizers for J(F; n^d) over F ∈ ℱ^{ML} corresponding to the data n^d. Then, d_H(ℱ^{*ML}(n^d), F₀) → 0 as M, L, N_t, N_x → ∞.

Proof

Suppose that ℱ^*(n^d) is the set of minimizers for J(F; n^d) over F ∈ ℱ corresponding to the data n^d. Using continuous dependence of solutions on F, compactness of (ℱ, ρ), and the density of ℱ_D in ℱ, the arguments follow precisely those for Theorem 4.1 in [9]. In particular, one would argue in the present context that any sequence $F_{d}^{* M L} \in ℱ^{* M L} (n^{d})$ has a subsequence $F_{d_{k}}^{* M_{j} L_{i}}$ that converges to a F̃ ∈ ℱ^*(n^d). Therefore, we can claim that

d_{H} (ℱ^{* M L} (n^{d}), ℱ^{*} (n^{d})) \to 0

(7)

as M, L, N_t, N_x → ∞. Moreover, a simple application of the triangle inequality yields that

d_{H} (ℱ^{* M L} (n^{d}), F_{0}) \leq d_{H} (ℱ^{* M L} (n^{d}), ℱ^{*} (n^{d})) + d_{H} (ℱ^{*} (n^{d}), F_{0}) .

This in turn, from (7) and Theorem 2.5, implies that d_H(ℱ^{*ML}(n^d), F₀) converges to zero as M, L, N_t, N_x → ∞.

Since we do not have direct access to an analytical solution to (1), our efforts are focused on finding a minimizer F ∈ ℱ of the approximate least squares cost functional

J^{N} (F, n^{d}) = \sum_{i = 1}^{N_{t}} \sum_{j = 1}^{N_{x}} {(\int_{x_{j - 1}}^{x_{j}} b^{N} (t_{i}, x; F) d x - n_{j i}^{d})}^{2} .

(8)

Here, N_t is the number of data observations, N_x is the number of data bins for floc volume, and b^N is the semi-discrete approximation to b. In Section 4, we will define a uniformly (in time) convergent discretization scheme and its corresponding approximation space H^N ⊂ H. The discretized version of (1)–(2) is represented by

b_{t}^{N} = g^{N} (b^{N}, F)

(9)

b^{N} (0, x) = b_{0}^{N} (x)

(10)

where g^N : H^N × ℱ → H^N denotes the discretized version of g. We will need that g^N exhibits a type of local Lipschitz continuity and accordingly define the following condition.

Condition 3.2

Suppose that the discretization given in (9)–(10) is a convergent scheme. Let (b^N, F), (b̃^N, F̃) ∈ H^N × ℱ. For fixed t, the function g^N : H^N × ℱ → H^N must satisfy

‖ g^{N} (b^{N}, F) - g^{N} ({\tilde{b}}^{N}, \tilde{F}) ‖ \leq C_{N} ‖ b^{N} - {\tilde{b}}^{N} ‖ + 𝒯^{N} (F, \tilde{F}),

where C_N > 0, and 𝒯^N(F, F̃) is some function such that |𝒯^N(F, F̃)| < ∞ and 𝒯^N(F, F̃) → 0 as ρ(F, F̃) → 0.

General method stability [18] requires b^N(t, x; F_i) → b(t, x; F) as F_i → F in the ρ metric and as N → ∞; we will now prove this.

Lemma 3.3

Let t ∈ T, F ∈ ℱ, and {F_i} ∈ ℱ such that lim_{i→∞} ρ(F_i, F) = 0. For fixed N, if b^N(t, x; F_i) is the solution to (9)–(10) and Condition 3.2 holds, then b^N is pointwise continuous at F ∈ ℱ.

Proof

The proof of this lemma is identical to that for Lemma 2.3. We first recast (9) as an integral equation and then apply Condition 3.2 and Gronwall’s inequality to obtain the desired result.

Corollary 3.4

Under Condition 3.2 and Lemma 3.3, we can conclude that ‖b^N(t, ·; F_N) − b(t, ·; F)‖ → 0 as N → ∞ uniformly in t on T.

Proof

A standard application of the triangle inequality yields

‖ b^{N} (t, \cdot; F_{N}) - b (t, \cdot; F) ‖ \leq ‖ b^{N} (t, \cdot; F_{N}) - b^{N} (t, \cdot; F) ‖ + ‖ b^{N} (t, \cdot; F) - b (t, \cdot; F) ‖ .

The first term converges by Lemma 3.3, while the second term converges because the proposed numerical scheme is assumed to converge uniformly.

With this corollary, we now consider the existence of a solution to the approximate least squares cost functional in (8), as well as the solution’s dependence on the given data n^d.

Theorem 3.5

Assume that there exist solutions to both the original and the approximate inverse problems in (5) and (8), respectively. For fixed data n^d, there exists a subsequence of the estimators ${F_{N}}_{N = 1}^{\infty}$ of (8) that converges to a solution of the original inverse problem (5).

Proof

As noted above, (ℱ, ρ) is compact. By Lemmas 2.3 and 3.3, we have that both F ↦ b(t, x; F) and F ↦ b^N(t, x; F), for fixed t ∈ T, are continuous with respect to F. We therefore know there exist minimizers in ℱ to the original and approximate cost functionals J and J^N respectively.

Let ${F_{N}^{*}} \in ℱ$ be any sequence of solutions to (8) and ${F_{N_{k}}^{*}}$ a convergent (in ρ) subsequence of minimizers. Recall that minimizers are not necessarily unique, but one can always select a convergent subsequence of minimizers in ℱ. Denote the limit of this subsequence with F^*. By the minimizing properties of $F_{N_{k}}^{*} \in ℱ$ , we then know that

J^{N_{k}} (F_{N_{k}}^{*}, n^{d}) \leq J^{N_{k}} (F, n^{d}) for all F \in ℱ .

(11)

By Corollary 3.4, we have the convergence of b^N(t, x; F_N) → b(t, x; F) and thus J^N(F_N) → J(F) as N → ∞ when ρ(F_N, F) → 0. Thus in the limit as N_k → ∞, the inequality in (11) becomes

J (F^{*}, n^{d}) \leq J (F, n^{d}) for all F \in ℱ

with F^* providing a (not necessarily unique) minimizer of (5).

Theorem 3.6

Assume that for fixed t ∈ T, F ↦ b(t, x; F) is continuous on ℱ in ρ, b^N is the approximate solution to the forward problem given by (9)–(10), J^N is the approximation given in (8), and Q_D is a countable dense subset of Q as defined in (6). Moreover, suppose that $ℱ_{N}^{* M L} (n^{d})$ is the set of minimizers for J^N(F; n^d) over F ∈ ℱ^{ML} corresponding to the data n^d. Similarly, suppose that ℱ^*(n^d) is the set of minimizers for J(F; n^d) over F ∈ ℱ corresponding to the data n^d. Then, $d_{H} (ℱ_{N}^{* M L} (n^{d}), F_{0}) \to 0$ as N, M, L, N_t, N_x → ∞.

Proof

Observe that an application of a simple triangle inequality yields

d_{H} (ℱ_{N}^{* M L} (n^{d}), F_{0}) \leq d_{H} (ℱ_{N}^{* M L} (n^{d}), ℱ^{* M L} (n^{d})) + d_{H} (ℱ^{* M L} (n^{d}), F_{0}) .

Therefore, combining the arguments of Theorem 3.1 and Theorem 3.5, we readily obtain that $d_{H} (ℱ_{N}^{* M L} (n^{d}), F_{0})$ converges to zero as N, M, L, N_t, N_x → ∞.

With the results of these two theorems, we can claim both that there exists a solution to the approximate inverse problem defined in (8) and that it depends continuously on the given data. We have also established method stability under approximation of the state space and parameter space of our inverse problem. Therefore, we can conclude the existence and consistency of the estimators of the approximate least squares problem.

4. Application to flocculation equations

Modeling flocculation, a process whereby destabilized suspended particles (i.e., flocs) reversibly aggregate and fragment, has received considerable attention over the years [3, 37, 48, 74]. Flocculation is ubiquitous in many diverse areas such as water treatment, biofuel production, beer fermentation, etc. In modeling the dynamics of flocculation, four important mechanisms arise in a wide range of applications: aggregation, fragmentation, proliferation and sedimentation. Mathematical modeling of flocculation is usually based on size-structured population equations that take into account one or more of the above listed mechanisms.

The particular flocculation model we study here accounts for aggregation, fragmentation and sedimentation of the flocs. The equations for the flocculation model track the time-evolution of the particle size number and are given by the following integro-differential equation

b_{t} = 𝖠 [b] + ℬ [b] + ℛ [b],

(12)

b (0, x) = b_{0} (x),

(13)

where b(t, x) dx is the number of aggregates with volumes in [x, x + dx] at time t, and 𝖠, ℬ and ℛ are the aggregation, breakage (fragmentation) and removal operators, respectively. We consider x ∈ Q = [0, x̅], where x̅ is the maximum floc size and t ∈ T = [0, t_f], t_f < ∞. As investigated in our previous work [26, 64], the function space for both the initial condition b₀(·) and the solution b(t, x) is H = L¹(Q, ℝ⁺), where Q = [0, x̅], x̅ ∈ ℝ⁺ and g : H × ℱ → H.

The aggregation, fragmentation and removal operators are defined by:

𝖠 [p] (t, x) ≔ \frac{1}{2} \int_{0}^{x} k_{a} (x - y, y) p (t, x - y) p (t, y) d y - p (t, x) \int_{0}^{\bar{x}} k_{a} (x, y) p (t, y) d y,

(14)

ℬ [p] (t, x) ≔ \int_{x}^{\bar{x}} Γ (x; y) k_{f} (y) p (t, y) d y - \frac{1}{2} k_{f} (x) p (t, x)

(15)

and

ℛ [p] (t, x) ≔ - μ (x) p (t, x) .

(16)

The aggregation kernel, k_a(x, y), describes the rate at which flocs of volume x and y combine to form a floc of volume x + y and is a symmetric function satisfying k_a(x, y) = 0 for x + y > x̅. The fragmentation kernel k_f(x) describes the rate at which a floc of volume x fragments. The function Γ(x, y) is the post-fragmentation probability density function, for the conditional probability of producing a daughter floc of size x from a mother floc of size y. This probability density function is used to characterize the stochastic nature of floc fragmentation (e.g., see the discussions in [6, 26, 29, 48]).
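As an illustration of how the linear operators in (15) and (16) might be discretized, the following sketch applies a midpoint rule on a uniform grid of cell centers. It is not the scheme used later in the paper: the grid, the constant rates, the uniform post-fragmentation density Γ(x; y) = 1/y, and the function names are all assumptions made for the example, and the nonlinear aggregation operator (14) is omitted. For a density that is symmetric about y/2, such as the uniform one, the gain and loss terms of (15) should balance in total volume, which the sketch checks up to quadrature error.

```python
import numpy as np

def breakage(p, x, h, k_f, gamma):
    """Midpoint-rule discretization of the breakage operator (15) on cell centers x."""
    # G[i, j] = Gamma(x_i; y_j) for x_i <= y_j and zero otherwise (integral over y >= x)
    X, Y = np.meshgrid(x, x, indexing="ij")
    G = np.where(X <= Y, gamma(X, Y), 0.0)
    gain = G @ (k_f(x) * p) * h        # fragments arriving at size x_i
    loss = 0.5 * k_f(x) * p            # parents of size x_i breaking up
    return gain - loss

def removal(p, x, mu):
    """Removal operator (16)."""
    return -mu(x) * p

N, x_bar = 200, 1.0
h = x_bar / N
x = (np.arange(N) + 0.5) * h                      # cell centers
p = np.ones(N)                                    # hypothetical size density
k_f = lambda s: np.ones_like(s)                   # assumed constant fragmentation rate
mu = lambda s: 0.1 * np.ones_like(s)              # assumed constant removal rate
uniform_gamma = lambda s, y: 1.0 / y              # uniform density on [0, y]

B = breakage(p, x, h, k_f, uniform_gamma)
number_rate = np.sum(B) * h        # rate of change of floc number due to breakage
volume_rate = np.sum(x * B) * h    # rate of change of total volume due to breakage
R = removal(p, x, mu)
```

In this example breakage increases the number of flocs while leaving the total volume nearly unchanged, with the small residual coming from the first-order quadrature of the diagonal cells.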

The flocculation equations, presented in (12)–(16), are a generalization of many mathematical models appearing in the size-structured population modeling literature. The forward problem for the flocculation equations has also been the focus of considerable mathematical analysis. There is a significant literature on the existence and uniqueness of solutions to flocculation equations, see for example [34, 80]. Nevertheless, because of the nonlinear term introduced by the aggregation, derivation of analytical solutions for the flocculation equations has proven elusive except for some special cases [5]. However, many discretization schemes for numerical simulation of these population balance equations (PBEs) have been proposed: the least squares spectral method [40], finite volume methods [27] and the finite element method [65]. For further mathematical results, we refer readers to the review article by Wattis [78].

We now consider the application of our inverse methodology to the evolution equation defined in (12)–(13). For fixed t ∈ T, b(t, ·) ∈ H, F ∈ ℱ, consider the right side of (12), represented by (1),

g (b, F) = 𝖠 [b] + ℬ [b; F] + ℛ [b],

(17)

where the conditional probability measure F(·, y) on Ω ⊆ Q and for fixed y ∈ Q is defined as

F (Ω, y) = \int_{Ω} Γ (ξ, y) d ξ .

Note that 0 ≤ F(Ω, y) ≤ 1 for all Ω ⊆ Q, since Γ(·, y) is a probability density function for each fixed y.

Setting Ω = [0, x] yields

F ([0, x], y) = \int_{0}^{x} Γ (ξ, y) d ξ,

which is the cumulative probability of getting flocs smaller than size x when a parent floc of size y fragments. Hereafter, for convenience and in mild abuse of notation, we refer to this quantity as the cumulative density function (cdf) and denote it simply by F(x, y). Note that a fragmentation cannot result in a daughter floc larger than the original floc; therefore F(x, y) ≡ 1 for x ≥ y and fixed y ∈ Q. Furthermore, one can infer the post-fragmentation density function Γ(x, y) by numerical differentiation of the cdf F(x, y). Figure 1 depicts the relationship between the cdf F(x, y) and the post-fragmentation density function Γ(x, y) for several different probability distributions.

Figure 1. Relationship between post-fragmentation density functions Γ(*x, y*) (top row) and cumulative density functions F(*x, y*) (bottom row). a) Γ(*x, y*), Beta distribution with α = β = 2. b) Γ(*x, y*), Beta distribution with α = 5 and β = 1. c) Γ(*x, y*), uniform distribution in x for fixed y.
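As a minimal illustration of how the density can be recovered from the cdf, the following Python sketch differentiates F(x, y) by central differences for the Beta distribution with α = β = 2. The function names, the step size h, and the sample points are illustrative assumptions and are not taken from the authors' implementation.

```python
def beta22_cdf(x, y):
    """cdf F(x, y) of a Beta(2, 2) density rescaled to [0, y]."""
    if x <= 0.0:
        return 0.0
    if x >= y:
        return 1.0  # no daughter floc can exceed the parent size
    s = x / y
    return 3.0 * s**2 - 2.0 * s**3


def beta22_density(x, y):
    """Density Gamma(x, y) = 6 x (y - x) / y^3 on [0, y], zero elsewhere."""
    if 0.0 <= x <= y:
        return 6.0 * x * (y - x) / y**3
    return 0.0


def density_from_cdf(cdf, x, y, h=1e-5):
    """Central-difference approximation of dF/dx at (x, y); h is an assumed step."""
    return (cdf(x + h, y) - cdf(x - h, y)) / (2.0 * h)


y = 0.8  # hypothetical parent floc size
for x in (0.2, 0.4, 0.6):
    approx = density_from_cdf(beta22_cdf, x, y)
    exact = beta22_density(x, y)
    print(f"x={x:.1f}  numerical={approx:.6f}  exact={exact:.6f}")
```

With noisy or coarsely discretized cdf estimates, differentiation amplifies the error, so some smoothing would likely be needed in practice.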

Before we proceed to the inverse problem, we need to establish existence and uniqueness, i.e., well-posedness, of the solutions of the forward problem. Well-posedness of the forward problem for x̅ < ∞ and t_f < ∞ was first established by Ackleh and Fitzpatrick [2] in an L²-space setting and later by Banasiak and Lamb [7] for x̅ = ∞ and t_f = ∞ in an L¹-space setting. For the sake of completeness, in the following lemma we summarize the assumptions needed for well-posedness for x̅ < ∞ and t_f < ∞ in an L¹-space setting and refer readers to [26, §3] for the proof.

Lemma 4.1

Suppose that k_f, μ ∈ L^∞(Q), k_a ∈ L^∞ (Q × Q). Moreover, assume that the post-fragmentation density function Γ(·, y) ∈ L¹ (Q) for all y ∈ (0, x̅] and

F (x, y) = \int_{0}^{x} Γ (ξ, y) d ξ = 1 for all x \geq y .

Then the evolution equation (12)–(13) is well-posed on H = L¹ (Q, ℝ⁺), and for any compact set T = [0, t_f] and b₀ ≥ 0, the classical solution of (12)–(13) satisfies

C_{0} = sup_{t \in T, x \in Q} | b (t, x) | < \infty .

(18)

Having established the well-posedness of the forward problem, we now show that the solution b(t, x, ·) is uniformly continuous on the compact space ℱ. This in turn, by Theorem 2.4, proves the existence of minimizers for the least squares problem defined in (5). Towards this end, we establish the following lemma, which we will need later to show that g, defined in (17), satisfies the locally Lipschitz property of Condition 2.1.

Lemma 4.2

The operator 𝖠 + ℛ is locally Lipschitz

‖ 𝖠 [b] + ℛ [b] - 𝖠 [\tilde{b}] - ℛ [\tilde{b}] ‖ \leq C_{1} ‖ b - \tilde{b} ‖

where C₁ = 3C₀ ‖k_a‖_∞ + ‖μ‖_∞. Furthermore, the fragmentation operator ℬ satisfies the locally Lipschitz property of Condition 2.1.

Proof

To show that 𝖠 + ℛ is locally Lipschitz, first observe that

‖ 𝖠 [b] - 𝖠 [\tilde{b}] ‖ \leq \frac{1}{2} \int_{Q} | \int_{0}^{x} k_{a} (x - y, y) b (x - y) b (y) d y - \int_{0}^{x} k_{a} (x - y, y) \tilde{b} (x - y) \tilde{b} (y) d y | d x + \int_{Q} | b (x) \int_{Q} k_{a} (x, y) b (y) d y - \tilde{b} (x) \int_{Q} k_{a} (x, y) \tilde{b} (y) d y | d x \leq {‖ k_{a} ‖}_{\infty} [\frac{1}{2} \int_{Q} | \int_{0}^{x} b (x - y) (b (y) - \tilde{b} (y)) d y | d x + \frac{1}{2} \int_{Q} | \int_{0}^{x} \tilde{b} (y) (b (x - y) - \tilde{b} (x - y)) d y | d x + \int_{Q} | b (x) \int_{Q} (b (y) - \tilde{b} (y)) d y | d x + \int_{Q} | \tilde{b} (x) \int_{Q} (b (y) - \tilde{b} (y)) d y | d x] .

At this point, applying Young's inequality [4, Theorem 2.24] for the first two integrals yields the desired result

‖ 𝖠 [b] - 𝖠 [\tilde{b}] ‖ \leq {‖ k_{a} ‖}_{\infty} [\frac{1}{2} ‖ b ‖ ‖ b - \tilde{b} ‖ + \frac{1}{2} ‖ \tilde{b} ‖ ‖ b - \tilde{b} ‖ + ‖ b ‖ ‖ b - \tilde{b} ‖ + ‖ \tilde{b} ‖ ‖ b - \tilde{b} ‖] \leq 3 C_{0} {‖ k_{a} ‖}_{\infty} ‖ b - \tilde{b} ‖ .
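For readers who wish to see the convolution estimate written out, the following derivation sketches the bound on the first integral; the second is handled identically. It assumes Q = [0, \bar{x}] and that ‖·‖ denotes the L¹(Q) norm, and it uses only Fubini's theorem, which is the content of Young's inequality in the case p = q = r = 1.

```latex
\begin{aligned}
\int_{Q} \left| \int_{0}^{x} b(x-y)\,\bigl(b(y)-\tilde{b}(y)\bigr)\,dy \right| dx
 &\le \int_{0}^{\bar{x}} \int_{0}^{x} |b(x-y)|\,|b(y)-\tilde{b}(y)|\,dy\,dx \\
 &= \int_{0}^{\bar{x}} |b(y)-\tilde{b}(y)| \int_{y}^{\bar{x}} |b(x-y)|\,dx\,dy \\
 &\le \|b\|\,\|b-\tilde{b}\| .
\end{aligned}
```

The last step holds because the inner integral runs over a subset of Q after the substitution z = x − y.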

For the second part of the lemma, examining the fragmentation term, we find

‖ ℬ (b, F) - ℬ (\tilde{b}, \tilde{F}) ‖ \leq ‖ \frac{1}{2} k_{f} (x) (\tilde{b} (t, x) - b (t, x)) ‖ + ‖ \int_{x}^{\bar{x}} k_{f} (y) (b (t, y) Γ (x, y) - \tilde{b} (t, y) \tilde{Γ} (x, y)) d y ‖ \leq \frac{1}{2} C_{frag} ‖ b - \tilde{b} ‖ + C_{frag} ‖ \int_{Q} b (t, y) (Γ (x, y) - \tilde{Γ} (x, y)) d y ‖ + C_{frag} ‖ \int_{Q} (b (t, y) - \tilde{b} (t, y)) \tilde{Γ} (x, y) d y ‖

where C_frag = ‖k_f‖_∞. The second term on the right hand side becomes

C_{frag} ‖ \int_{Q} b (t, y) (Γ (x, y) - \tilde{Γ} (x, y)) d y ‖ \leq C_{frag} \int_{Q} \int_{Q} | b (t, y) | | Γ (x, y) - \tilde{Γ} (x, y) | d y d x \leq C_{frag} \int_{Q} | b (t, y) | \int_{Q} | Γ (x, y) d x - \tilde{Γ} (x, y) d x | d y \leq C_{frag} \int_{Q} | b (t, y) | (\int_{Q} | d F_{y} - d {\tilde{F}}_{y} |) d y \leq C_{frag} sup_{y \in Q} \int_{Q} | d F_{y} - d {\tilde{F}}_{y} | \int_{Q} | b (t, y) | d y \leq C_{frag} \bar{x} C_{0} sup_{y \in Q} \int_{Q} | d F_{y} - d {\tilde{F}}_{y} |,

where dF_y denotes the Radon–Nikodym derivative of the measure F(·, y) and C₀ is the upper bound of the solutions defined in (18). Since ∫_Q|dF_y − dF̃_y| → 0 is equivalent to ρ_Proh(F_y, F̃_y) → 0, we know that

sup_{y \in Q} \int_{Q} | d F_{y} - d {\tilde{F}}_{y} | \to 0 as ρ (F, \tilde{F}) \to 0 .

Therefore,

C_{frag} ‖ \int_{Q} b (t, y) (Γ (x, y) - \tilde{Γ} (x, y)) d y ‖ \to 0 as ρ (F, \tilde{F}) \to 0 .

Similar analysis for the third term leads to the bound

C_{frag} ‖ \int_{Q} (b (t, y) - \tilde{b} (t, y)) \tilde{Γ} (x, y) d y ‖ \leq C_{frag} \bar{x} {‖ Γ ‖}_{\infty} ‖ b - \tilde{b} ‖ .

Combining these results we find the overall fragmentation term can be bounded by

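‖ ℬ (b, F) - ℬ (\tilde{b}, \tilde{F}) ‖ \leq C_{frag} (\frac{1}{2} + \bar{x} {‖ Γ ‖}_{\infty}) ‖ b - \tilde{b} ‖ + 𝒯 (F, \tilde{F}),

where 𝒯 (F, \tilde{F}) = C_{frag} \bar{x} C_{0} sup_{y \in Q} \int_{Q} | d F_{y} - d {\tilde{F}}_{y} | collects the bound on the second term and tends to zero as ρ (F, \tilde{F}) \to 0.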

At this point we are ready to make the following claim.

Claim 4.3

The function g satisfies the locally Lipschitz property of Condition 2.1.

Proof

Consider

‖ g (b, F) - g (\tilde{b}, \tilde{F}) ‖ = ‖ 𝖠 [b] - 𝖠 [\tilde{b}] + ℬ [b; F] - ℬ [\tilde{b}; \tilde{F}] + ℛ [b] - ℛ [\tilde{b}] ‖ \leq ‖ 𝖠 [b] - 𝖠 [\tilde{b}] ‖ + ‖ ℬ [b; F] - ℬ [\tilde{b}; \tilde{F}] ‖ + ‖ ℛ [b] - ℛ [\tilde{b}] ‖ .

Using the Lipschitz constants from the fragmentation and aggregation terms,

‖ g (b, F) - g (\tilde{b}, \tilde{F}) ‖ \leq C ‖ b - \tilde{b} ‖ + 𝒯 (F, \tilde{F})

where $C = C_{frag} (\frac{1}{2} + \bar{x} {‖ Γ ‖}_{\infty}) + C_{1}$ and C₁ and C_frag are defined in Lemma 4.2.

The above claim proves continuity and the existence of minimizers of the least squares functional J, defined in (5), on the space of admissible probability distributions. Furthermore, Lemma 4.1 establishes that the classical solution of (12)–(13) is bounded on T × Q. Moreover, since the space of Riemann integrable functions is dense in L¹(Q, ℝ⁺), we can assume that the classical solution is also Riemann integrable. Therefore, the evolution equation (12)–(13) satisfies the consistency conditions of Theorem 2.5, and thus the estimators of the least squares problem, defined in (5), are consistent for this flocculation model.

4.1. Numerical Implementation

In this section, we outline a discretization scheme for approximating the flocculation equations. We first form an approximation to H. We define basis elements

β_{i}^{N} (x) = \begin{cases} 1, & x_{i - 1}^{N} \leq x \leq x_{i}^{N}, \\ 0, & otherwise, \end{cases} \quad i = 1, \dots, N

for positive integer N and $\{x_{i}^{N}\}_{i = 0}^{N}$ a uniform partition of $[0, \bar{x}] = [x_{0}^{N}, x_{N}^{N}]$ , and $Δ x = x_{j}^{N} - x_{j - 1}^{N}$ for all j. The β^N functions form an orthogonal basis for the approximate solution space

H^{N} = {h \in H | h = \sum_{i = 1}^{N} α_{i} β_{i}^{N}, α_{i} \in ℝ},

and accordingly, we define the orthogonal projections π^N : H ↦ H^N

π^{N} h = \sum_{j = 1}^{N} α_{j} β_{j}^{N}, where α_{j} = \frac{1}{Δ x} \int_{x_{j - 1}^{N}}^{x_{j}^{N}} h (x) d x .
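The projection amounts to replacing h by its average over each cell of the partition. A minimal Python sketch is given below; the midpoint-rule quadrature, the number of quadrature points, and the function name are illustrative assumptions rather than part of the scheme described above.

```python
def project_piecewise_constant(h, x_bar, N, quad_points=200):
    """Return the coefficients alpha_j of pi^N h on a uniform partition of [0, x_bar].

    Each alpha_j is the cell average (1/dx) * integral of h over [x_{j-1}, x_j],
    approximated here with a composite midpoint rule (an assumed quadrature).
    """
    dx = x_bar / N
    sub = dx / quad_points
    alphas = []
    for j in range(N):
        left = j * dx
        integral = sum(h(left + (k + 0.5) * sub) for k in range(quad_points)) * sub
        alphas.append(integral / dx)
    return alphas


# Example: project h(x) = x on [0, 1] onto N = 4 cells.
alphas = project_piecewise_constant(lambda x: x, 1.0, 4)
print(alphas)
```

For a linear function the cell averages coincide with the values at the cell midpoints, which gives a quick sanity check of the routine.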

Thus our approximating formulation of (12)–(13) becomes the following system of N ODEs for b^N ∈ H^N and F ∈ ℱ:

b_{t}^{N} = π^{N} (𝖠 [b^{N}] + ℬ [b^{N}; F] + ℛ [b^{N}]),

(19)

b^{N} (0, x) = π^{N} b_{0} (x),

(20)

where

π^{N} 𝖠 [b^{N}] = (\begin{matrix} - α_{1} \sum_{j = 1}^{N - 1} k_{a} (x_{1}, x_{j}) α_{j} Δ x \\ \frac{1}{2} k_{a} (x_{1}, x_{1}) α_{1} α_{1} Δ x - α_{2} \sum_{j = 1}^{N - 2} k_{a} (x_{2}, x_{j}) α_{j} Δ x \\ ⋮ \\ \frac{1}{2} \sum_{j = 1}^{N - 2} k_{a} (x_{j}, x_{N - 1 - j}) α_{j} α_{N - 1 - j} Δ x - α_{N - 1} k_{a} (x_{N - 1}, x_{1}) α_{1} Δ x \\ \frac{1}{2} \sum_{j = 1}^{N - 1} k_{a} (x_{j}, x_{N - j}) α_{j} α_{N - j} Δ x \end{matrix})

and

π^{N} (ℬ [b^{N}; F] + ℛ [b^{N}]) = (\begin{matrix} \sum_{j = 2}^{N} Γ (x_{1}; x_{j}) k_{f} (x_{j}) α_{j} Δ x - \frac{1}{2} k_{f} (x_{1}) α_{1} - μ (x_{1}) α_{1} \\ \sum_{j = 3}^{N} Γ (x_{2}; x_{j}) k_{f} (x_{j}) α_{j} Δ x - \frac{1}{2} k_{f} (x_{2}) α_{2} - μ (x_{2}) α_{2} \\ ⋮ \\ Γ (x_{N - 1}; x_{N}) k_{f} (x_{N}) α_{N} Δ x - \frac{1}{2} k_{f} (x_{N - 1}) α_{N - 1} - μ (x_{N - 1}) α_{N - 1} \\ - \frac{1}{2} k_{f} (x_{N}) α_{N} - μ (x_{N}) α_{N} \end{matrix}) .
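The row pattern of the two vectors above can be sketched as a single right-hand-side routine. The following Python code is a plain-loop illustration under the assumption that the grid points are x_i = iΔx; it does not use the Toeplitz-based assembly, and all names are hypothetical.

```python
def flocculation_rhs(alpha, x, dx, k_a, k_f, mu, gamma):
    """Right-hand side of the semi-discrete system for coefficients alpha.

    alpha[i-1] is alpha_i and x[i-1] is the grid point x_i (assumed x_i = i*dx).
    k_a, k_f, mu, gamma are callables for the model rates.
    """
    N = len(alpha)
    rhs = [0.0] * N
    for i in range(1, N + 1):  # 1-based row index, matching the text
        a_i, x_i = alpha[i - 1], x[i - 1]
        # Aggregation gain: pairs (j, i - j) combining into size class i.
        gain = 0.5 * dx * sum(
            k_a(x[j - 1], x[i - j - 1]) * alpha[j - 1] * alpha[i - j - 1]
            for j in range(1, i)
        )
        # Aggregation loss: class i combining with class j while i + j <= N.
        loss = a_i * dx * sum(
            k_a(x_i, x[j - 1]) * alpha[j - 1] for j in range(1, N - i + 1)
        )
        # Fragmentation gain from larger classes j > i.
        frag_in = dx * sum(
            gamma(x_i, x[j - 1]) * k_f(x[j - 1]) * alpha[j - 1]
            for j in range(i + 1, N + 1)
        )
        rhs[i - 1] = gain - loss + frag_in - 0.5 * k_f(x_i) * a_i - mu(x_i) * a_i
    return rhs


# Tiny example with constant (hypothetical) rates on N = 3 cells.
N, x_bar = 3, 1.0
dx = x_bar / N
x = [(i + 1) * dx for i in range(N)]
alpha = [1.0, 0.5, 0.25]
print(flocculation_rhs(alpha, x, dx,
                       k_a=lambda u, v: 1.0,
                       k_f=lambda u: 0.1,
                       mu=lambda u: 0.01,
                       gamma=lambda u, v: 1.0 / v))
```

A vectorized version would be preferable for large N; the loops are kept here so that each term can be compared directly with the corresponding matrix row.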

In the following claim we show that the numerical scheme satisfies Condition 3.2.

Claim 4.4

The function g^N : H^N × ℱ → H^N as defined by

g^{N} (b^{N}, F) = 𝖠 [b^{N}] + ℬ [b^{N}; F] + ℛ [b^{N}]

(21)

satisfies the Lipschitz-type property in Condition 3.2.

Proof

We consider the integrand

‖ π^{N} (𝖠 [b^{N}] + ℛ [b^{N}] + ℬ [b^{N}; F] - 𝖠 [{\tilde{b}}^{N}] - ℛ [{\tilde{b}}^{N}] - ℬ [{\tilde{b}}^{N}; \tilde{F}]) ‖,

and note that it is bounded above by

‖ π^{N} ‖ (‖ 𝖠 [b^{N}] - 𝖠 [{\tilde{b}}^{N}] ‖ + ‖ ℛ [b^{N}] - ℛ [{\tilde{b}}^{N}] ‖ + ‖ ℬ [b^{N}; F] - ℬ [{\tilde{b}}^{N}; \tilde{F}] ‖) .

The induced L¹-norm on the projection operator will not be an issue as

‖ π^{N} ‖ = sup_{h \in H, ‖ h ‖ = 1} ‖ π^{N} h ‖ = sup_{h \in H, ‖ h ‖ = 1} ‖ \sum_{j = 1}^{N} \frac{β_{j}^{N} (\cdot)}{Δ x} \int_{x_{j - 1}^{N}}^{x_{j}^{N}} h (x) d x ‖ = 1 .
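The value of the operator norm can be checked directly. The short derivation below is a sketch assuming that ‖·‖ is the L¹(Q) norm and that the cells of the partition overlap only at their endpoints.

```latex
\|\pi^{N} h\|
 = \sum_{j=1}^{N} |\alpha_{j}|\,\Delta x
 = \sum_{j=1}^{N} \left| \int_{x_{j-1}^{N}}^{x_{j}^{N}} h(x)\,dx \right|
 \le \sum_{j=1}^{N} \int_{x_{j-1}^{N}}^{x_{j}^{N}} |h(x)|\,dx
 = \|h\| .
```

Equality holds for every nonnegative h, so the supremum over ‖h‖ = 1 is attained and equals 1.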

As illustrated in the proof of Claim 4.3, the bounding constants for 𝖠 + ℛ and ℬ are 3C₀ ‖k_a‖_∞ + ‖μ‖_∞ and ${‖ k_{f} ‖}_{\infty} (\frac{1}{2} + \bar{x} {‖ Γ ‖}_{\infty})$ , respectively. Combining these results, we have that

‖ {\tilde{b}}^{N} (t, x; \tilde{F}) - b^{N} (t, x; F) ‖ \leq C_{N} ‖ {\tilde{b}}^{N} (t, x; \tilde{F}) - b^{N} (t, x; F) ‖ + 𝖳^{N} (\tilde{F}, F)

where $𝖳^{N} (\tilde{F}, F) = \int_{0}^{t_{f}} π^{N} 𝒯 (\tilde{F}, F) d s$ , independent of t, and $C_{N} = {‖ k_{f} ‖}_{\infty} (\frac{1}{2} + \bar{x} {‖ Γ ‖}_{\infty}) + 3 C_{0} {‖ k_{a} ‖}_{\infty} + {‖ μ ‖}_{\infty}$ .

Corollary 4.5

The semi-discrete solutions to (19) converge uniformly in L¹-norm to the unique solution of (12) on a bounded time interval as N → ∞.

Proof

From results in [26], we can obtain semi-discrete solutions b^N to the forward problem that converge uniformly in norm to the unique solution of (12)–(13) on a bounded time interval as N → ∞.

For fixed N, we rewrite (19) in integral form and consider

‖ b^{N} (t, x; F) - π^{N} b (t, x; F) ‖ \leq \int_{0}^{t} ‖ π^{N} (ℛ [b^{N} (s, x; F)] - ℛ [b (s, x; F)]) ‖ d s + \int_{0}^{t} ‖ π^{N} (𝖠 [b^{N} (s, x; F)] + ℬ [b^{N} (s, x; F)] - 𝖠 [b (s, x; F)] - ℬ [b (s, x; F)]) ‖ d s

for t ∈ T. The general strategy is to use the fact that the discretized version of g, defined in (21), is locally Lipschitz and then apply Gronwall's inequality. We refer readers to [1, 26] for the detailed discussion about the convergence of the numerical scheme.
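For reference, the integral form of Gronwall's inequality used in arguments of this type can be stated as follows. The symbols u, ε^N and the identification of the constant with C_N are schematic assumptions made here for illustration; the precise error terms are not reproduced.

```latex
u(t) \le \varepsilon^{N} + C_{N} \int_{0}^{t} u(s)\,ds \quad \text{for all } t \in T
\qquad \Longrightarrow \qquad
u(t) \le \varepsilon^{N} e^{C_{N} t} \le \varepsilon^{N} e^{C_{N} t_{f}} ,
```

where u is continuous and nonnegative and ε^N, C_N ≥ 0 are constants. Taking u(t) = ‖b^N(t, ·; F) − π^N b(t, ·; F)‖, convergence would then follow whenever ε^N → 0 as N → ∞ with C_N bounded independently of N.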

4.2. Convergence of the numerical scheme

In this section, we provide numerical evidence for the linear convergence of the approximation scheme described in Section 4.1. Towards this end, we choose the following model rates.

To describe the aggregation within a laminar shear field (orthokinetic aggregation [39]) we used the kernel,

k_{a} (x, y) = 10^{- 6} {(x^{1 / 3} + y^{1 / 3})}^{3} .

(22)

As in [2, 26, 63, 64] we assume that the breakage and removal rates of a floc of volume x are proportional to its radius,

k_{f} (x) = 10^{- 1} x^{1 / 3}, \quad μ (x) = 10^{- 3} x^{1 / 3} .

(23)

For the post-fragmentation density function we chose a uniform distribution in x for fixed y,

Γ (x, y) = 𝟙_{[0, y]} (x) \frac{1}{y},

(24)

which coincides with a Beta distribution with α = β = 1.
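The model rates (22)–(24) translate directly into code. The following Python definitions are a minimal sketch; the function names are illustrative assumptions.

```python
def k_a(x, y):
    """Aggregation kernel (22) for flocs of volumes x and y."""
    return 1e-6 * (x ** (1.0 / 3.0) + y ** (1.0 / 3.0)) ** 3


def k_f(x):
    """Fragmentation rate in (23), proportional to the floc radius x^(1/3)."""
    return 1e-1 * x ** (1.0 / 3.0)


def mu(x):
    """Removal rate in (23), proportional to the floc radius x^(1/3)."""
    return 1e-3 * x ** (1.0 / 3.0)


def gamma_uniform(x, y):
    """Post-fragmentation density (24): uniform in x on [0, y] for fixed y > 0."""
    return 1.0 / y if (y > 0.0 and 0.0 <= x <= y) else 0.0


# Midpoint-rule check that gamma_uniform(., y) integrates to one over [0, y].
y, n = 0.5, 1000
mass = sum(gamma_uniform((k + 0.5) * y / n, y) for k in range(n)) * y / n
print(mass)
```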

The main advantage of this semi-discrete scheme, defined in (19)–(20), is that it can be initialized very quickly using Toeplitz matrices [59]. This in turn proves useful in the optimization process, where the approximate forward problem is initialized and solved multiple times at each iteration. For solving the approximate forward problem we used an adaptive step size integration method implemented in an open-source Python library^§. The simulation was run with initial size-distribution b₀(x) = exp(x) on Q = [0, 1] for t_f = 1. The L¹-error between the actual solution u(t, x) and the approximate solution u_N(t, x) was computed as

‖ u - u_{N} ‖ = \int_{0}^{t_{f}} \int_{0}^{\bar{x}} | u (t, x) - u_{N} (t, x) | d x d t .

Since no analytical solution is available for the nonlinear flocculation equations defined in (12)–(13), we estimate the error with a fine grid solution, i.e., u(t, x) ≈ u₁₀₀₀(t, x).

Figure 2 depicts a log-log plot of the error, which indicates that the numerical algorithm has a linear convergence rate. This is because we chose zeroth-order functions as basis functions for the approximate subspaces. In general, if one desires a higher order of convergence for Galerkin-type approximations, choosing higher-order basis functions gives a higher convergence rate [51].
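A convergence rate of this kind is typically read off as the slope of the log-log error curve. The Python sketch below estimates that slope by least squares; the error values are hypothetical placeholders that decay like 1/N and are not the data behind Figure 2.

```python
import math


def observed_order(Ns, errors):
    """Estimate p in error ~ C * N**(-p) from a least-squares fit in log-log scale."""
    xs = [math.log(n) for n in Ns]
    ys = [math.log(e) for e in errors]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    slope = sum((a - x_mean) * (b - y_mean) for a, b in zip(xs, ys)) / sum(
        (a - x_mean) ** 2 for a in xs
    )
    return -slope


Ns = [10, 20, 40, 80, 160]
errors = [0.5 / n for n in Ns]  # hypothetical first-order errors, NOT measured data
print(observed_order(Ns, errors))
```

An estimated order close to one would be consistent with the piecewise constant basis used here.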

4.3. Numerical optimization and results

As an initial investigation into the utility of this approach, we apply the framework presented in this paper to the flocculation equations. Towards this end, we generate two sets of artificial data.

In [29], we found that the resulting post-fragmentation density for small parent flocs resembles a Beta distribution with α = β = 2 (see Figure 1a for an illustration). Therefore, the first artificial data set was generated from the forward problem by assuming the model rates given in (22)–(23) and a post-fragmentation density function given by a Beta distribution with α = β = 2,

Γ_{true} (x, y) = 𝟙_{[0, y]} (x) \frac{6 x (y - x)}{y^{3}} .

(25)
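As a quick consistency check on (25), the density integrates to one over [0, y] and has a closed-form cdf. The computation below is a worked sketch for fixed y > 0.

```latex
\int_{0}^{y} \frac{6 x (y - x)}{y^{3}}\,dx
 = \frac{6}{y^{3}} \left( \frac{y^{3}}{2} - \frac{y^{3}}{3} \right) = 1,
\qquad
F_{true}(x, y) = \int_{0}^{x} \frac{6 \xi (y - \xi)}{y^{3}}\,d\xi
 = 3 \left( \frac{x}{y} \right)^{2} - 2 \left( \frac{x}{y} \right)^{3}
 \quad \text{for } 0 \le x \le y ,
```

with F_true(x, y) = 1 for x ≥ y, in agreement with the constraint that daughter flocs cannot exceed the parent size.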

As in Section 4.2, we chose an exponential initial size-distribution, b₀(x) = 10³ exp(x) on Q = [0, 1], for t_f = 10. We also note that the constants for the rate functions were chosen to emphasize fragmentation as a driving factor. Moreover, Figure 3 illustrates the simulation of the forward problem for different post-fragmentation density functions. When fragmentation is the driving mechanism, one can observe in Figures 3a–3c that the model behaves significantly differently for various post-fragmentation density functions.

Figure 3. Direct model simulation results. a) Simulation of the forward problem for Beta distribution with α = β = 2. b) Simulation of the forward problem using Beta distribution with α = 5 and β = 1. c) Simulation of the forward problem using a uniform distribution in x for fixed y.

Recall from Section 2.1 that the data required for the inverse problem need to be of the form n_j(t_i), representing the number of flocs with volume between x_j and x_{j+1} at time t_i. In general, the number of bins for floc volume, N_x, is fixed by the measurement device (flow cytometers, Coulter counters, etc.). Therefore, for the synthetic data generation we choose a fixed number of volume bins, N_x = 10. Nevertheless, one has control over the number of measurements taken in time, N_t.

The simulation results with a fine grid (N = 1000 and Δt = 0.001) were interpolated onto the function b(t, x, F₀) using linear interpolation^‖. Consequently, aggregate data of the form (3) were obtained by integrating the interpolated function b(t, x, F₀) over the interval [x_j, x_{j+1}] at time t_i. Furthermore, i.i.d. normal noise with zero mean and standard deviation σ = 20 was added to the aggregate data. For this choice of initial size-distribution and volume bins N_x, the values of the data are in the range [100, 300], and thus σ = 20 represents a significant level of noise.
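The data-generation step can be sketched as follows. This Python code bins a density over uniform volume intervals and adds Gaussian noise; the stand-in density (the initial condition 10³ exp(x) frozen in time), the midpoint quadrature, and all names are assumptions made for illustration, whereas the procedure described above uses the interpolated fine-grid solution.

```python
import math
import random


def aggregate_data(b, t_obs, x_bar, n_bins, sigma, seed=0, quad_points=100):
    """Return rows of noisy bin counts n_j(t_i), one row per observation time.

    n_j(t_i) is the integral of b(t_i, x) over the j-th uniform bin of [0, x_bar],
    approximated by a midpoint rule, plus N(0, sigma^2) noise.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    dx = x_bar / n_bins
    sub = dx / quad_points
    data = []
    for t in t_obs:
        row = []
        for j in range(n_bins):
            left = j * dx
            n_j = sum(b(t, left + (k + 0.5) * sub) for k in range(quad_points)) * sub
            row.append(n_j + rng.gauss(0.0, sigma))
        data.append(row)
    return data


# Stand-in for the forward solution: the initial condition, frozen in time.
b_placeholder = lambda t, x: 1e3 * math.exp(x)

clean = aggregate_data(b_placeholder, [0.0], 1.0, 10, sigma=0.0)
noisy = aggregate_data(b_placeholder, [0.0], 1.0, 10, sigma=20.0)
print([round(v, 1) for v in clean[0]])
print([round(v, 1) for v in noisy[0]])
```

For this stand-in, the noise-free bin values lie between roughly 105 and 259, which matches the range [100, 300] quoted for the data.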

To minimize the approximate cost functional in (8) we used nonlinear constrained optimization^¶ employing Powell's iterative direct search algorithm [70, 71]. At each iteration the algorithm forms linear approximations to nonlinear objective and constraint functions and thus performs well even when no derivative information is available. For better results we set the maximum number of iterations to 10⁴. The optimization was seeded with an initial density comprised of a uniform distribution in x for fixed y, illustrated in Figure 1c. Naturally, we constrained Γ(·, y) to be a probability density for each fixed y, i.e.,

F (x, y) = \int_{0}^{x} Γ (ξ; y) d ξ = 1 for all x \in (y, \bar{x}] .

The optimization procedure is computationally very expensive, because at each iteration the algorithm solves the approximate forward problem with dimension N. Moreover, recall from Section 3 that for each approximate space 𝖥^ML the optimization entails finding L probability measures discretized with M Dirac measures. Therefore, the time required for minimizing the cost functional (8) increases substantially for larger dimensions of the approximate space 𝖥^ML, defined in Section 3. To keep the computation tractable, the dimension of the approximate space and of the approximate forward solution was set to 30; i.e., N = M = L = 30. For this case note that, since F(x, y) ≡ 1 for x ≥ y and fixed y ∈ Q,

\frac{N (N - 1)}{2} = 435

discrete parameters need to be optimized for the simulation of the inverse framework.
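The parameter count can be reproduced by enumerating the grid pairs that are not fixed by the constraint F(x, y) ≡ 1 for x ≥ y. The Python sketch below assumes that x and y share the same N-point grid, so that the free entries are exactly those with x_i < y_j; the names are hypothetical.

```python
def free_parameter_mask(N):
    """mask[i][j] is True when F(x_i, y_j) is free, i.e. when x_i < y_j."""
    return [[i < j for j in range(N)] for i in range(N)]


def count_free_parameters(N):
    """Number of cdf values left to the optimizer on an N-by-N grid."""
    return sum(sum(row) for row in free_parameter_mask(N))


print(count_free_parameters(30))  # 435
```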

For computational convenience, the error plots are in terms of the total variation metric (also called the statistical distance) defined as

ρ_{T V} (F, \tilde{F}) = sup_{Ω \in 𝒜} | F (Ω) - \tilde{F} (Ω) |,

where 𝒜 is the Borel σ-algebra on Q as defined in Section 2.1. Note that convergence in the total variation metric implies the convergence of probability measures in the Prohorov metric [45], i.e.,

ρ_{proh} (F, \tilde{F}) \leq ρ_{T V} (F, \tilde{F}) .

(26)
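For measures supported on a common finite set of nodes, the supremum in the total variation metric reduces to half the ℓ¹ distance between the weight vectors. The Python sketch below computes it both ways; the brute-force version enumerates all subsets and is only meant as a check on small examples. The weight vectors are hypothetical.

```python
from itertools import combinations


def total_variation(p, q):
    """Total variation distance between two discrete probability vectors."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def total_variation_bruteforce(p, q):
    """sup over all subsets of |P(subset) - Q(subset)|; exponential cost."""
    n = len(p)
    best = 0.0
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            diff = abs(sum(p[i] for i in subset) - sum(q[i] for i in subset))
            best = max(best, diff)
    return best


p = [0.5, 0.25, 0.25]   # hypothetical weights
q = [0.25, 0.25, 0.5]
print(total_variation(p, q), total_variation_bruteforce(p, q))
```

The identity relies on both vectors summing to one, which holds for the discretized probability measures considered here.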

The result of the optimization for the density function (25) is shown in Figure 4. The optimization was carried out with N = M = L = 30, N_x = 10, N_t = 20. In Figure 4d, we have illustrated error plots for different observation durations t_f ∈ {1, 2, ⋯, 20}. Observe the general trend that a longer observation duration reduces the error in the estimates. Figures 4a and 4b depict the result of the optimization for t_f = 10. One can see that the fit between the true F₀ and the approximate probability measure F₃₀ is satisfactory (though there is room for improvement). Moreover, Figure 4c also illustrates the satisfactory fit between F₀ and F₃₀ for two fixed values of y.

Figure 4. Simulation results for the artificial data generated using Beta distribution with α = β = 2, and normal i.i.d error with mean zero and standard deviation σ = 20. (a) *True* cdf F₀(*x, y*) (b) Approximate cdf F₃₀(*x, y*) for N = M = L = 30, *N_x* = 10, *N_t* = 20 and *t_f* = 10. (c) Comparison of F₀(*x, y*) and F₃₀(*x, y*) for y = 0.5 and y = 1.0 (d) Error plots in the total variation metric for *t_f* ∈ {1, 2, ⋯, 20} and *N_t* = 20.

To investigate the behavior of the estimators for different probability distributions, we applied our inverse methodology to another artificial data set. This data set was generated with a post-fragmentation density function equal to a Beta distribution with α = 5 and β = 1 (see Figure 1b for an illustration),

Γ_{true} (x, y) = 𝟙_{[0, y]} (x) \frac{5 x^{4}}{y^{5}} .

(27)

Other model rates were chosen to be the same as in the first artificial data set. The application of our inverse problem framework to this second artificial data set is depicted in Figure 5. Figure 5d illustrates error plots in the total variation metric for different observation durations t_f ∈ {1, 2, ⋯, 20}. Once again, a longer observation duration generally reduces the error in the estimates. Furthermore, Figures 5a–5c show that the fit between F₀ and F₃₀ is satisfactory.

Figure 5. Simulation results for the artificial data generated using Beta distribution with α = 5 and β = 1 and normal i.i.d error with mean zero and standard deviation σ = 20. (a) *True* cdf F₀(*x, y*). (b) Approximate cdf F₃₀(*x, y*) for N = M = L = 30, *N_x* = 10, *N_t* = 20 and *t_f* = 10. (c) Comparison of F₀(*x, y*) and F₃₀(*x, y*) for y = 0.5 and y = 1.0 (d) Error plots in the total variation metric for *t_f* ∈ {1, 2, ⋯, 20} and *N_t* = 20.

We also note that our simulations gave better convergence results for larger values of t_f ≥ 5, as depicted in Figures 4d and 5d. For the last few values of t_f, the error does appear to exhibit an increasing trend, possibly due to a lack of sufficient resolution in time. This result is consistent with the literature [11, 52, 75] in that the size and resolution of the observation interval can have a substantial effect on the identifiability of parameters. For instance, Thomaseth and Cobelli [75] developed generalized sensitivity functions that can be used for the qualitative analysis of the impact of the observation intervals on the identifiability of parameters in dynamical systems. These sensitivity functions help to identify the most relevant data and time subdomains for the identification of certain parameters. Later, Banks et al. [12, 13] offered a quantitative means to choose the duration t_f required for an optimal experiment design. Moreover, Keck and Bortz [52] provided an extension of these sensitivity functions to size-structured population models. Hence, as a future research direction, we plan to incorporate these sensitivity functions for choosing the optimal observation duration t_f.

Figure 6 depicts the effect of increasing noise on the reconstruction of the conditional probability measures. In particular, in Figure 6a (and Figure 6b) we plotted the error between the true cdf F₀ and the approximate cdf F₃₀ for increasing standard deviation σ of the artificial noise added to the first data set (and the second data set). The optimization was carried out with t_f = 10, N = M = L = 30 and N_t = N_x = 10. One can observe that for both data sets the optimization performs well for standard deviations in the range [0, 25].

Figure 6. Effect of noise on reconstruction of the conditional probability measure with N = M = L = 30 and *N_t* = *N_x* = 10. a) Artificial data set generated using Beta distribution with α = β = 2 and increasing normal i.i.d error with mean zero and standard deviation σ ∈ [0, 50]. b) Artificial data set generated using Beta distribution with α = 5 and β = 1 and increasing normal i.i.d error with mean zero and standard deviation σ ∈ [0, 50].

5. Concluding Remarks

Our efforts here are motivated by a class of mathematical models which characterize a random process, such as fragmentation, by a probability distribution. We are concerned with the inverse problem for inferring the conditional probability distribution in measure-dependent evolution equations, and present the specific problem for the flocculation dynamics of aggregates in suspension which motivated this study. We then developed the mathematical framework in which we formulate the inverse problem as a least squares problem for inferring the conditional probability distributions. We prove existence and consistency of the least squares estimates using the Prohorov metric framework. We also include results for overall method stability for numerical approximation, confirming a computationally feasible methodology. Finally, we verify that our motivating example in flocculation dynamics conforms to the developed framework, and illustrate its utility by identifying sample distributions.

To conclude, this work is one piece of a larger effort aimed at advancing our abilities for identifying microscale phenomena from size-structured population measurements. In particular, we are interested in the propensity of suspended bacterial aggregates to fragment in a flowing system. The model proposed in [29] uses knowledge of the hydrodynamics to predict a breakage event and thus the post fragmentation density Γ. With this work, we now have a tool to bridge the gap between the experimental and microscale modeling efforts for fragmentation. Our future effort will focus on using experimental evidence to validate (or refute) our proposed fragmentation model.

Supplementary Material

f_fit_alpha_2.png

f_fit_alpha_5.png

Acknowledgments

The authors would like to thank J. G. Younger (University of Michigan) for insightful discussions. This research was supported in part by grants NIH-NIBIB-1R01GM081702-01A2, NIH-NIGMS 2R01GM069438-06A2, NSF-DMS-1225878, and DOD-AFOSR-FA9550-09-1-0404. This work utilized the Janus supercomputer, which is supported by the National Science Foundation (award number CNS-0821794) and the University of Colorado Boulder. The Janus supercomputer is a joint effort of the University of Colorado Boulder, the University of Colorado Denver and the National Center for Atmospheric Research.

Footnotes

^‡

Also known as population balance equation (PBE) in engineering literature [72]

^§

scipy.integrate.odeint

^‖

scipy.interpolate.interp2d

^¶

scipy.optimize.fmin_cobyla

References

1.Ackleh AS. Parameter estimation in a structured algal coagulation-fragmentation model. Nonlinear Analysis. 1997;28:837–854. [Google Scholar]
2.Ackleh AS, Fitzpatrick BG. Modeling aggregation and growth processes in an algal population model: analysis and computations. Journal of Mathematical Biology. 1997;35:480–502. [Google Scholar]
3.Adachi Y. Dynamic aspects of coagulation and flocculation. Advances in Colloid and Interface Science. 1995;56:1–31. [Google Scholar]
4.Adams R, Fournier J. Sobolev spaces. Oxford, UK: Elsevier Ltd; 2003. [Google Scholar]
5.Aldous D. Deterministic and Stochastic Models for Coalescence (Aggregation, Coagulation): A Review of the Mean-Field Theory for Probabilists. Bernoulli. 1999;5:3–48. [Google Scholar]
6.Bäbler MU, Morbidelli M, Baldyga J. Modelling the breakup of solid aggregates in turbulent flows. Journal of Fluid Mechanics. 2008;612:261–289. [Google Scholar]
7.Banasiak J, Lamb W. Coagulation, fragmentation and growth processes in a size structured population. Discrete and Continuous Dynamical Systems - Series B. 2009;11:563–585. [Google Scholar]
8.Banks HT, Banks JE, Dick LK, Stark JD. Estimation of Dynamic Rate Parameters in Insect Populations Undergoing Sublethal Exposure to Pesticides. Bull. Math. Biol. 2007;69:2139–2180. doi: 10.1007/s11538-007-9207-z. [DOI] [PubMed] [Google Scholar]
9.Banks HT, Bihari KL. Modeling and Estimating Uncertainty in Parameter Estimation. Inverse Problems. 2001;17:95–111. [Google Scholar]
10.Banks HT, Bortz DM. Inverse problems for a class of measure dependent dynamical systems. Journal of Inverse and Ill-posed Problems. 2005;13:103–121. [Google Scholar]
11.Banks HT, Dediu S, Ernstberger SL. Sensitivity functions and their uses in inverse problems. Journal of Inverse and Ill-posed Problems jiip. 2007;15:683–708. [Google Scholar]
12.Banks HT, Dediu S, Ernstberger SL. Sensitivity functions and their uses in inverse problems. Journal of Inverse and Ill-posed Problems. 2007;15 [Google Scholar]
13.Banks HT, Dediu S, Ernstberger SL, Kappel F. Generalized sensitivities and optimal experimental design. Journal of Inverse and Ill-posed Problems. 2010;18 [Google Scholar]
14.Banks HT, Fitzpatrick BG. Statistical methods for model comparison in parameter estimation problems for distributed systems. Journal of Mathematical Biology. 1990;28:501–527. [Google Scholar]
15.Banks HT, Fitzpatrick BG. Estimation of growth rate distributions in size structured population models. Quarterly of Applied Mathematics. 1991;49:215–235. [Google Scholar]
16.Banks HT, Fitzpatrick BG, Potter LK, Zhang Y. Estimation of Probability Distributions for Individual Parameters Using Aggregate Population Data. In: McEneaney WM, Yin GG, Zhang Q, editors. Stochastic Analysis, Control, Optimization and Applications. Birkhäuser Boston: 1999. pp. 353–371. Systems & Control: Foundations & Applications. [Google Scholar]
17.Banks HT, Kenz ZR, Thompson WC. A review of selected techniques in inverse problem nonparametric probability distribution estimation. Journal of Inverse and Ill-Posed Problems. 2012;20:429–460. [Google Scholar]
18.Banks HT, Kunisch K. vol. 1 of Systems & Control: Foundations & Applications. Birkhäuser, Boston, MA: 1989. Estimation Techniques for Distributed Parameter Systems. [Google Scholar]
19.Banks HT, Sutton KL, Thompson WC, Bocharov G, Roose D, Schenkel T, Meyerhans A. Estimation of Cell Proliferation Dynamics Using CFSE Data. Bull. Math. Biol. 2010;73:116–150. doi: 10.1007/s11538-010-9524-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
20.Banks HT, Thompson WC. Technical Report CRSC-TR12-21. Raleigh, NC: North Carolina State University Center for Research in Scientific Computation; 2012. Least Squares Estimation of Probability Measures in the Prohorov Metric Framework. [Google Scholar]
21.Banks HT, Thompson WC. Existence and consistency of a nonparametric estimator of probability measures in the prohorov metric framework. International Journal of Pure and Apllied Mathematics. 2015;103 [Google Scholar]
22.Banks JE, Dick LK, Banks HT, Stark JD. Time-varying vital rates in ecotoxicology: Selective pesticides and aphid population dynamics. Ecological Modelling. 2008;210:155–160. [Google Scholar]
23.Becker R, Döring W. Kinetische Behandlung der Keimbildung in übersättigten Dämpfen. Ann. Phys. 1935;416:719–752. [Google Scholar]
24.Bell G, Anderson E. Cell growth and division: I. a mathematical model with applications to cell volume distributions in mammalian suspension cultures. Biophysical journal. 1967 doi: 10.1016/S0006-3495(67)86592-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
25.Billingsley P. Convergence of Probability Measures. New York, NY: John Wiley & Sons; 1968. [Google Scholar]
26.Bortz DM, Jackson TL, Taylor KA, Thompson AP, Younger JG. Klebsiella pneumoniae Flocculation Dynamics. Bull. Math. Biology. 2008;70:745–768. doi: 10.1007/s11538-007-9277-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
27.Bourgade J-P, Filbet F. Convergence of a Finite Volume Scheme for Coagulation-Fragmentation Equations. Mathematics of Computation. 2008;77:851–882. [Google Scholar]
28.Bourgeron T, Doumic M, Escobedo M. Estimating the division rate of the growth-fragmentation equation with a self-similar kernel. Inverse Problems. 2014;30:025007. [Google Scholar]
29.Byrne E, Dzul S, Solomon M, Younger J, Bortz DM. Postfragmentation density function for bacterial aggregates in laminar flow. Physical Review E. 2011;83:041911. doi: 10.1103/PhysRevE.83.041911. [DOI] [PMC free article] [PubMed] [Google Scholar]
30.Carrera J, Neuman SP. Estimation of Aquifer Parameters Under Transient and Steady State Conditions: 2. Uniqueness, Stability, and Solution Algorithms. Water Resour. Res. 1986;22:211–227. [Google Scholar]
31.Chavent G. Identification of distributed parameter systems: about the output least square method, its implementation and identifiability. Proc. 5th IFAC Symposium on Identification and System Parameter Estimation. 1979;1:85–97. [Google Scholar]
32.Chavent G. Local stability of the output least square parameter estimation technique. Math. Appl. Comp. 1983;2:3–22. [Google Scholar]
33.Cooley RL. Incorporation of prior information on parameters into nonlinear regression groundwater flow models: 1. Theory. Water Resour. Res. 1982;18:965–976. [Google Scholar]
34.Dacosta FP. Existence and Uniqueness of Density Conserving Solutions to the Coagulation-Fragmentation Equations with Strong Fragmentation. Journal of Mathematical Analysis and Applications. 1995;192:892–914. [Google Scholar]
35.Dagan G. Stochastic modeling of groundwater flow by unconditional and conditional probabilities: 1. Conditional simulation and the direct problem. Water Resources Research. 1982;18:813–833. [Google Scholar]
36.Darzynkiewicz Z, Robinson JP, Crissman HA. Methods in Cell Biology. 2nd. San Diego, CA: Academic Press; 1994. Flow Cytometry; pp. 1–697. [Google Scholar]
37.Davis RH, Hunt TP. Modeling and Measurement of Yeast Flocculation. Biotechnol Progress. 1986;2:91–97. doi: 10.1002/btpr.5420020208. [DOI] [PubMed] [Google Scholar]
38.DeVita VT, Lawrence S, Theodore, Rosenberg SA. Cancer: Principles and Practice of Oncology. Vol. 1. Lippincott Williams & Wilkins; 2008. [Google Scholar]
39.Dobias B. Coagulation and Flocculation: Theory and Applications. CRC Press; 1993. Jan. [Google Scholar]
40. Dorao CA, Jakobsen HA. A least squares method for the solution of population balance problems. Computers & Chemical Engineering. 2006;30:535–547.
41. Doumic M, Perthame B, Zubelli JP. Numerical solution of an inverse problem in size-structured population dynamics. Inverse Problems. 2009;25:045008.
42. Doumic M, Tine LM. Estimating the division rate for the growth-fragmentation equation. J. Math. Biol. 2012;67:69–103. doi: 10.1007/s00285-012-0553-6.
43. Engl HW, Rundell W, Scherzer O. A Regularization Scheme for an Inverse Problem in Age-Structured Populations. Journal of Mathematical Analysis and Applications. 1994;182:658–679.
44. Gamma CD, Jimeno CL. Rock fragmentation control for blasting cost minimization and environmental impact abatement. In: Roy PP, editor. Rock Fragmentation by Blasting. Amsterdam, The Netherlands: A. A. Balkema Publishers; 1993. p. 273.
45. Gibbs AL, Su FE. On Choosing and Bounding Probability Metrics. International Statistical Review. 2002;70:419–435.
46. Gyllenberg M, Osipov A, Päivärinta L. The inverse problem of linear age-structured population dynamics. J. Evol. Equ. 2002;2:223–239.
47. Hall P, Racine J, Li Q. Cross-Validation and the Estimation of Conditional Probability Densities. Journal of the American Statistical Association. 2004;99:1015–1026.
48. Han B, Akeprathumchai S, Wickramasinghe SR, Qian X. Flocculation of biological cells: Experiment vs. theory. AIChE J. 2003;49:1687–1701.
49. Ilan N, Elkin M, Vlodavsky I. Regulation, function and clinical significance of heparanase in cancer metastasis and angiogenesis. The International Journal of Biochemistry & Cell Biology. 2006;38:2018. doi: 10.1016/j.biocel.2006.06.004.
50. Iori G, Kapar B, Olmo J. Bank characteristics and the interbank money market: a distributional approach. Studies in Nonlinear Dynamics & Econometrics. 2015;19.
51. Kappel F. Spline approximation for autonomous nonlinear functional differential equations. Nonlinear Analysis: Theory, Methods & Applications. 1986;10:503–513.
52. Keck DD, Bortz DM. Generalized sensitivity functions for size-structured population models. Journal of Inverse and Ill-posed Problems. 2016 (to appear).
53. Kelley JL. General Topology. Princeton, NJ: Van Nostrand-Reinhold; 1955.
54. Krishnaswamy S, Spitzer MH, Mingueneau M, Bendall SC, Litvin O, Stone E, Pe’er D, Nolan GP. Conditional density-based analysis of T cell signaling in single-cell data. Science. 2014;346:1250689. doi: 10.1126/science.1250689.
55. Lu CF, Spielman LA. Kinetics of floc breakage and aggregation in agitated liquid suspensions. Journal of Colloid and Interface Science. 1985;103:95–105.
56. Luzyanina T, Mrusek S, Edwards JT, Roose D, Ehl S, Bocharov G. Computational analysis of CFSE proliferation assay. J. Math. Biol. 2006;54:57–89. doi: 10.1007/s00285-006-0046-6.
57. Luzyanina T, Roose D, Bocharov G. Distributed parameter identification for a label-structured cell population dynamics model using CFSE histogram time-series data. J. Math. Biol. 2008;59:581–603. doi: 10.1007/s00285-008-0244-5.
58. Luzyanina T, Roose D, Schenkel T, Sester M, Ehl S, Meyerhans A, Bocharov G. Numerical modelling of label-structured cell population growth using CFSE distribution data. Theor Biol Med Model. 2007;4:1–15. doi: 10.1186/1742-4682-4-26.
59. Matveev SA, Smirnov AP, Tyrtyshnikov EE. A fast numerical method for the Cauchy problem for the Smoluchowski equation. Journal of Computational Physics. 2015;282:23–32.
60. Metz JAJ, Diekmann O. The Dynamics of Physiologically Structured Populations. Springer-Verlag; 1986.
61. Miao H, Xia X, Perelson A, Wu H. On Identifiability of Nonlinear ODE Models and Applications in Viral Dynamics. SIAM Rev. 2011;53:3–39. doi: 10.1137/090757009.
62. Michel P, Mischler S, Perthame B. General relative entropy inequality: an illustration on growth models. Journal de Mathématiques Pures et Appliquées. 2005;84:1235–1260.
63. Mirzaev I, Bortz DM. Criteria for linearized stability for a size-structured population model. 2015. arXiv:1502.02754.
64. Mirzaev I, Bortz DM. Stability of steady states for a class of flocculation equations with growth and removal. 2015. arXiv:1507.07127.
65. Nicmanis M, Hounslow MJ. Finite-element methods for steady-state population balance equations. AIChE J. 1998;44:2258–2272.
66. Pandya JD, Spielman LA. Floc breakage in agitated suspensions: Theory and data processing strategy. Journal of Colloid and Interface Science. 1982;90:517–531.
67. Persson P-A, Holmberg R, Lee J. Rock Blasting and Explosives Engineering. CRC Press; 1994.
68. Perthame B, Zubelli JP. On the inverse problem for a size-structured population model. Inverse Problems. 2007;23:1037–1052.
69. Pilant M, Rundell W. Determining a Coefficient in a First-Order Hyperbolic Equation. SIAM J. Appl. Math. 1991;51:494–506.
70. Powell MJ. Advances in optimization and numerical analysis. Springer; 1994. A direct search optimization method that models the objective and constraint functions by linear interpolation; pp. 51–67.
71. Powell MJD. Direct search algorithms for optimization calculations. Acta Numerica. 1998;7:287–336.
72. Ramkrishna D. Population Balances: Theory and Applications to Particulate Systems in Engineering. Academic Press; 2000.
73. Sinko JW, Streifer W. A new model for age-size structure of a population. Ecology. 1967;48:910–918.
74. Thomas DN, Judd SJ, Fawcett N. Flocculation modelling: a review. Water Research. 1999;33:1579–1592.
75. Thomaseth K, Cobelli C. Generalized sensitivity functions in physiological system identification. Annals of Biomedical Engineering. 1999;27:607–616. doi: 10.1114/1.207.
76. von Smoluchowski M. Drei Vorträge über Diffusion, Brownsche Bewegung und Koagulation von Kolloidteilchen. Zeitschrift für Physik. 1916;17:557–585.
77. von Smoluchowski M. Versuch einer mathematischen Theorie der Koagulationskinetik kolloider Lösungen. Zeitschrift für physikalische Chemie. 1917;92:129–168.
78. Wattis JA. An introduction to mathematical models of coagulation–fragmentation processes: A discrete deterministic mean-field approach. Physica D: Nonlinear Phenomena. 2006;222:1–20.
79. Webb GF. Theory of nonlinear age-dependent population dynamics. CRC Press; 1985.
80. White WH. A Global Existence Theorem for Smoluchowski’s Coagulation Equations. Proceedings of the American Mathematical Society. 1980;80:273–276.
81. Wyckoff JB, Jones JG, Condeelis JS, Segall JE. A critical step in metastasis: in vivo analysis of intravasation at the primary tumor. Cancer Research. 2000;60:2504–2511.
82. Yeh WW-G. Review of Parameter Identification Procedures in Groundwater Hydrology: The Inverse Problem. Water Resour. Res. 1986;22:95–108.

Associated Data

Supplementary Materials

f_fit_alpha_2.png: NIHMS804840-supplement-f_fit_alpha_2_png.png (117.1 KB, PNG)

f_fit_alpha_5.png: NIHMS804840-supplement-f_fit_alpha_5_png.png (116.1 KB, PNG)
