Summary
Partial differential equations (PDEs) are used to model complex dynamical systems in multiple dimensions, and their parameters often have important scientific interpretations. In some applications, PDE parameters are not constant but can change depending on the values of covariates, a feature that we call varying coefficients. We propose a parameter cascading method to estimate varying coefficients in PDE models from noisy data. Our estimates of the varying coefficients are shown to be consistent and asymptotically normally distributed. The performance of our method is evaluated by a simulation study and by an empirical study estimating three varying coefficients in a PDE model arising from LIDAR data.
Keywords: B-splines, Dynamical models, Inverse problems, LIDAR data, Parameter cascading, System identification
1. Introduction
Partial differential equations (PDEs) are used to describe a wide variety of phenomena such as heat, sound, fluid flow and many others. PDEs model these complicated multidimensional dynamical systems by linking multivariate functions and their partial derivatives. PDEs are generally much more complicated than ordinary differential equations (ODEs), as ODEs only deal with functions of a single variable. Most PDEs cannot be solved analytically, and many numerical methods have been developed to solve them, for example, the finite element method and the finite difference method (Jost, 2002). Parameters in PDEs have important scientific interpretations, and their values determine the behaviour of the dynamical system. However, the values of these parameters are often unknown. There is therefore a need to estimate these parameters from measurements or observations of the multidimensional dynamical systems in the presence of measurement errors.
Methods for estimating ODE parameters in the presence of noise have developed rapidly in recent years. Popular methods include the nonlinear least squares approach (Bard, 1974; Biegler et al., 1986; Sauer, 2006), the two-step method (Ramsay and Silverman, 2005; Chen and Wu, 2008; Brunel, 2008), Bayesian methods (Gelman et al., 1996; Huang et al., 2006), and the parameter cascading method (Ramsay et al., 2007; Cao et al., 2008; Qi and Zhao, 2010). Gugushvili and Klaassen (2012) provided a rigorous analysis of the two-step method. Chen and Wu (2008), Liang et al. (2010), and Cao et al. (2012) proposed the two-step method, the multi-stage smoothing-based approach, and the parameter cascading method, respectively, for estimating time-varying parameters. Xue et al. (2010) considered both numerical error and measurement error and proposed a sieve method for estimating constant and time-varying coefficients in nonlinear ODEs. Cao et al. (2011) proposed a robust method for estimating ODE parameters. Lu et al. (2011) and Wu et al. (2014) proposed multi-stage methods to estimate parameters for high dimensional ODEs. Miao et al. (2014) proposed a generalized ODE model for discrete data. Vujacic et al. (2016) proposed a unifying framework based on generalized Tikhonov regularization and extremum estimation for some existing ODE estimation methods.
The statistical literature on estimating PDE parameters is relatively sparse. Three main methods have been proposed to estimate PDE parameters from noisy data. The first method uses the framework of nonlinear regression by minimizing the distance of the noisy data to the numerical solutions of PDEs for given parameter values (Vallette et al., 1997). Because PDE solutions are very sensitive to parameter values, the nonlinear regression method often has convergence problems, and is computationally costly. This nonlinear regression method also requires the estimation of initial conditions of the PDEs. Müller and Timmer (2002) proposed an extended multiple shooting method to improve the optimization convergence.
The second method is a two-step method (Bär et al., 1999; Voss et al., 1999; Parlitz and Merkwirth, 2000; Coca and Billings, 2000). In the first step, all the state variables and their derivatives are estimated from the noisy data by using multivariate polynomials or nonparametric regression methods such as smoothing splines (Ramsay and Silverman, 2005) or local polynomial regression (Fan and Gijbels, 1996). In the second step, the PDE parameters are estimated by minimizing the distance between the smooth fit and the PDEs. Software is readily available for both steps, and the computation is very fast. However, this method depends heavily on the accuracy of the estimated derivatives, which are difficult to estimate because they are not directly observed. The two-step method also requires that all state variables in the PDEs are observed or measured, which is often hard to achieve in real applications. Müller and Timmer (2004) compared the parameter estimates of the two-step method and the extended multiple shooting method at different noise levels and data resolutions. They concluded that the extended multiple shooting method obtained more accurate parameter estimates than the two-step method, and that the superiority of the former increased with the complexity of the PDE model. The two-step method failed completely when the noise level was higher than a critical level, due to the difficulty of estimating derivatives. Müller and Timmer (2004) also pointed out that the computational cost of the multiple shooting method is a factor of 1.5 × 10³ higher than that of the two-step method. Dattner and Klaassen (2015) proposed a direct integral approach for estimating ODE parameters that avoids estimating derivatives.
The third method is parameter cascading, which has been shown to obtain more accurate PDE parameter estimates than the two-stage method (Xun et al., 2013). Xun et al. (2013) also established the asymptotic behaviour of the parameter estimates. Frasso et al. (2015) also proposed frequentist and Bayesian approaches for estimating parameters in linear PDEs. Recently, Chkrebtii et al. (2013) considered the uncertainty coming from discretization of system states defined by ODEs or PDEs, and proposed a Bayesian solution to quantify this uncertainty.
The work referenced above assumes that the PDE parameters are constants. In this article, we consider the estimation of PDE parameters which change as functions of covariates, and hence the parameters are varying coefficients. These varying coefficients are often very flexible and easy to interpret, as we show in Section 6. We consider a multidimensional dynamic process g(x,w), where x = (x1,…, xp)T ∈ Rp is a multi-dimensional argument and w is a covariate determining the varying coefficients. Suppose this dynamic process can be modeled with a varying coefficient PDE model
| (1) |
where θ(w) = {θ1(w),…, θm(w)}T is the varying coefficient vector depending on the scalar w. In practice, we do not observe g(x, w) but instead its surrogate Y(x, w). We assume that g(x, w) is observed with measurement error over a meshgrid of size n, and that w is observed at A locations, indexed by a = 1,…, A. Thus, for i = 1,…, n and a = 1,…, A, we observe the data (Yai, xai, wa) satisfying
$$Y_{ai} = g(x_{ai}, w_a) + \varepsilon_{ai}, \qquad (2)$$
where the εai's are independent and identically distributed measurement errors with mean zero and variance σ². Our goal is to estimate the varying coefficient vector θ(w) in the PDE model (1) from noisy data, and to quantify the uncertainty of the estimates.
We propose a parameter cascading method to estimate the varying coefficients in PDE models from noisy data (2) in two nested levels of optimization. Without knowing the boundary conditions of the PDEs, it is impossible to obtain the numerical solutions to PDEs. To address this problem, we approximate the PDE solutions using flexible nonparametric functions, which are expressed as linear combinations of basis functions. For given values of the varying coefficients in PDEs, we estimate the basis coefficients by minimizing the tradeoff between fitting to the data and fidelity to the PDE models in the inner optimization level. Therefore, the estimated basis coefficients, as well as the estimated PDE solution, can be treated as a function of PDE varying coefficients.
Without strong parametric assumptions, we also represent the varying coefficients using flexible nonparametric functions. In the outer optimization level, we estimate the varying coefficients by minimizing the penalized squared errors of the estimated PDE solution to the noisy data in (2). The smoothness of the varying coefficients is controlled by adding a roughness penalty to the squared fitting errors.
The rest of the article is organized as follows. Section 2 introduces our method for estimating varying coefficients in PDEs. Section 3 gives the limiting distribution of the estimates of the varying coefficients. Smoothing parameter selection is described in Section 4. Simulation studies are presented in Section 5 to evaluate the finite-sample performance of our method. Our method is illustrated using LIDAR data in Section 6. Section 7 provides some concluding remarks. Conditions and proofs for the asymptotic theory are provided in the Supplementary Materials, as are the computer codes for our simulation studies.
2. Method for Estimating Varying Coefficients in PDEs
We estimate the varying coefficient vector θ(w) in the varying-coefficient PDE model (1) using the parameter cascading method (Cao and Ramsay, 2007, 2010) in two nested levels of optimization. In the inner level of optimization, we estimate the (p + 1)-dimensional dynamical process g(x, w) for any given value of the varying coefficient vector θ(w). Hence, the estimated ĝ(x, w) can be viewed as a function of θ(w), which is denoted as ĝ {x, w, θ(w)}. In the outer level of optimization, we estimate the varying coefficient vector θ(w) by plugging in ĝ{x, w, θ(w)}. The inner level of optimization of g(x, w) for any given θ(w) is embedded inside the outer level of optimization of θ(w). The two levels of optimization are implemented iteratively until convergence. The algorithm for the parameter cascading method is also outlined at the end of Section 2.2.
2.1 Inner Level of Optimization: Estimation of g(x, w)
Here we describe the estimation of g(x, w) in (2) for a given value of θ(w). The (p + 1)-dimensional dynamical process, g(x, w), is represented by $g(x, w) = b(x)^{T}\beta(w)$, where b(x) = {b1(x),…,bK(x)}T is a vector of basis functions depending on x, and β(w) = {β1(w), …, βK(w)}T is the corresponding vector of basis coefficients depending on w. We further express each βk(w) as a linear combination of basis functions
$$\beta_k(w) = \phi_k(w)^{T}\alpha_k, \qquad k = 1, \ldots, K, \qquad (3)$$
where ϕk(w) = {ϕk1(w),…, ϕkVk(w)}T is an additional vector of basis functions, and αk = (αk1,…, αkVk)T is the corresponding vector of basis coefficients. Let $\Phi(w) = \mathrm{diag}\{\phi_1(w)^{T}, \ldots, \phi_K(w)^{T}\}$ and $\alpha = (\alpha_1^{T}, \ldots, \alpha_K^{T})^{T}$, where diag(A1,…, AK) means a block-diagonal matrix with diagonal elements being A1,…, AK. Then, from (3), β(w) = Φ(w)α and
$$g(x, w) = b(x)^{T}\Phi(w)\alpha. \qquad (4)$$
For any given varying coefficient vector, θ(w), the vector of all basis coefficients α is estimated by minimizing
| (5) |
where the first term measures the fit of (4) to the data, and the second term measures the fidelity of (4) to the PDE model defined in (1). The smoothing parameter λ1 controls the trade-off between fitting to the data and fidelity to the PDE model.
The estimated basis coefficients α̂ can be obtained for any given varying coefficient vector θ(w) by minimizing (5), so α̂ can be viewed as a function of θ(w), which is denoted as α̂ {θ(w)}. The (p + 1)-dimensional dynamical process can then be estimated as
$$\hat{g}\{x, w, \theta(w)\} = b(x)^{T}\Phi(w)\hat{\alpha}\{\theta(w)\}. \qquad (6)$$
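The basis representation of this subsection, β(w) = Φ(w)α followed by g(x, w) = b(x)Tβ(w), can be sketched numerically as follows. This is only an illustration under stated assumptions: x is taken to be scalar, monomials stand in for the B-spline bases used in the paper, and the helper names `poly_basis`, `Phi` and `g_eval` are hypothetical.

```python
import numpy as np

def poly_basis(u, size):
    """Stand-in basis: monomials 1, u, ..., u^(size-1). The paper uses B-splines."""
    return np.array([u ** j for j in range(size)], dtype=float)

def Phi(w, V):
    """Block-diagonal matrix diag{phi_1(w)^T, ..., phi_K(w)^T}, shape (K, sum(V))."""
    out = np.zeros((len(V), sum(V)))
    col = 0
    for k, Vk in enumerate(V):
        out[k, col:col + Vk] = poly_basis(w, Vk)  # row k holds phi_k(w)^T
        col += Vk
    return out

def g_eval(x, w, alpha, V):
    """Evaluate g(x, w) = b(x)^T Phi(w) alpha for scalar x and scalar w."""
    beta_w = Phi(w, V) @ alpha              # beta(w) = Phi(w) alpha, length K
    return poly_basis(x, len(V)) @ beta_w   # b(x)^T beta(w)

V = [2, 2, 3]                          # V_k: number of w-basis functions per beta_k
alpha = np.arange(1.0, sum(V) + 1.0)   # stacked coefficients (alpha_1, ..., alpha_K)
value = g_eval(0.5, 2.0, alpha, V)
```

Stacking the αk into one vector is what makes the inner criterion a function of a single coefficient vector α.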
2.2 Outer Level of Optimization: Estimation of θ(w)
The PDE parameter θ(w) is then estimated in the outer level of optimization. For k = 1, …, m, write each θk(w) as a linear combination of basis functions
$$\theta_k(w) = \psi_k(w)^{T}c_k, \qquad (7)$$
where ψk(w) = {ψk1(w), …, ψkUk(w)}T is the vector of basis functions, and ck = (ck1,…, ckUk)T is the vector of basis coefficients. Because the estimator α̂ {θ(w)} is already regularized, we propose to estimate θ(w) by minimizing the penalized least squares measure of fit
| (8) |
where λ2 is a smoothing parameter, which controls the roughness of the estimated varying coefficients θ(w).
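Write $\Psi(w) = \mathrm{diag}\{\psi_1(w)^{T}, \ldots, \psi_m(w)^{T}\}$ and $c = (c_1^{T}, \ldots, c_m^{T})^{T}$ such that
$$\theta(w) = \Psi(w)c. \qquad (9)$$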
Let Ωk be a Uk × Uk penalty matrix with the (u1,u2)th entry
and Ω = diag(Ω1,…, Ωm). Following (8), we estimate c by minimizing
| (10) |
The function H(c) can be optimized by the Newton-Raphson method, and the gradient and Hessian functions can be calculated analytically. Magnus and Neudecker (2007) is an excellent reference book for matrix differential calculus. Since ĝ{xai, wa, Ψ(wa)c} may have no analytical form, it may be an implicit function of c, in which case the implicit function theorem can be applied to obtain the gradient and Hessian functions analytically. Ramsay et al. (2007) provided some detailed expressions for this optimization procedure for ordinary differential equations, which can be directly extended to partial differential equations. Denote the minimizer of (10) by $\hat{c} = (\hat{c}_1^{T}, \ldots, \hat{c}_m^{T})^{T}$; the estimates of varying coefficients are then given by $\hat{\theta}_k(w) = \psi_k(w)^{T}\hat{c}_k$ and θ̂(w) = {θ̂1(w),…, θ̂m(w)}T.
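The role of the implicit function theorem can be made explicit with a generic worked step. Writing J(α, c) for the inner criterion (5) with θ(w) = Ψ(w)c (the name J is our notation for illustration, not the paper's), the inner estimate satisfies a first-order condition, and differentiating that condition with respect to c gives the derivative of α̂(c) needed for the gradient of H:

```latex
% J(\alpha, c): inner criterion (5) evaluated at \theta(w) = \Psi(w)c  (J is an assumed name)
\frac{\partial J}{\partial \alpha}\{\hat{\alpha}(c), c\} = 0
\;\Longrightarrow\;
\frac{d\hat{\alpha}}{dc^{T}}
  = -\left(\frac{\partial^{2} J}{\partial \alpha\,\partial \alpha^{T}}\right)^{-1}
     \frac{\partial^{2} J}{\partial \alpha\,\partial c^{T}},
\qquad
\frac{dH}{dc}
  = \frac{\partial H}{\partial c}
  + \left(\frac{d\hat{\alpha}}{dc^{T}}\right)^{T}\frac{\partial H}{\partial \alpha},
% all partial derivatives are evaluated at \alpha = \hat{\alpha}(c)
```

The first identity requires the Hessian of J in α to be invertible at α̂(c), which is the usual condition for the theorem to apply.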
Below is the algorithm for the parameter cascading method.
Algorithm 1 Parameter Cascading Method
Inner Level of Optimization - For any given θ(w), estimate α by minimizing the criterion (5)
Outer Level of Optimization - Estimate θ(w) by minimizing H {θ(w)}
2.3 Specialization to Linear PDE Models
The PDE model (1) can be expressed by substituting (4) into model (1) as
| (11) |
When the PDE model (11) is linear in α, it can be expressed as
| (12) |
In that case, α̂ {θ(w)} has a closed form, and the algorithm can be stated as follows.
By using (12), the inner level of optimization criterion (5) becomes
Let N = nA, Ba = {b(x1a), …, b(xna)}T for a = 1, …, A, B* = diag(B1, …, BA), D = {ΦT(w1), …, ΦT(wA)}T, B = B*D being a basis matrix, Y = (Y11, Y12, …, YAn)T, and
Then, the inner level of optimization criterion (5) can be expressed in the matrix notation
| (13) |
By minimizing (13) for given θ(w), the estimator for α can be obtained in closed form:
$$\hat{\alpha}\{\theta(w)\} = \left[B^{T}B + \lambda_1 R\{\theta(w), w\}\right]^{-1} B^{T}Y. \qquad (14)$$
In addition, because of (9), we can write α̂{θ(w)} = α̂(c) and R{θ(w), w} = R(c) for simplicity. Substituting (9) into (13) gives
| (15) |
and substituting (6), (9) and (14) into (10) gives
| (16) |
The vector of basis coefficients c can be estimated by minimizing H(c), thus we obtain the estimates for the varying coefficients in the PDE model (1).
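To make the two nested levels concrete, here is a deliberately simplified toy sketch, not the authors' implementation. It assumes a first-order linear ODE g'(t) = -θ g(t) with a constant θ instead of a varying-coefficient PDE, represents g by its values on the observation grid instead of a B-spline expansion, replaces the integral PDE penalty with a finite-difference penalty, and minimizes the outer criterion by grid search without a roughness penalty. What it keeps is the structure: a closed-form inner estimate for each candidate θ, and an outer criterion that measures the fit of that inner estimate to the data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: g'(t) = -theta * g(t) with theta = 0.5, observed with Gaussian noise.
n, h = 200, 0.05
t = h * np.arange(n)
theta_true = 0.5
y = np.exp(-theta_true * t) + rng.normal(0.0, 0.02, size=n)

# D: forward-difference approximation of d/dt, shape (n-1, n).
# E: selects g(t_i) for i = 0, ..., n-2, so (D + theta * E) g approximates g' + theta * g.
D = (np.eye(n, k=1) - np.eye(n))[:-1] / h
E = np.eye(n)[:-1]

def inner(theta, lam=1e3):
    """Inner level: closed-form minimizer of ||y - g||^2 + lam * ||(D + theta E) g||^2."""
    L = D + theta * E
    return np.linalg.solve(np.eye(n) + lam * (L.T @ L), y)

def outer_criterion(theta):
    """Outer level: squared error of the inner fit, viewed as a function of theta."""
    return float(np.sum((y - inner(theta)) ** 2))

# Grid search stands in for the Newton-Raphson iterations used in the paper.
grid = np.linspace(0.1, 1.0, 91)
theta_hat = grid[int(np.argmin([outer_criterion(th) for th in grid]))]
```

Because the inner problem is quadratic in the unknown function values, each evaluation of the outer criterion costs one linear solve; this is the same reason the linear PDE case admits a closed-form α̂.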
3. Limit Distribution of Varying Coefficient Estimator for Linear PDE Models
For simplicity, we denote c11, c12, …, cmUm by c1, c2,…, cQ. Let c0 = (c01, …, c0Q)T be the true value of c; see Condition C.2 in the Supplementary Materials for its definition. Let λ̃1 = λ1/N, S = N⁻¹BᵀB, G(c) = S + λ̃1R(c), ϒ1 = E(S), and g = {g(x11, w1), g(x12, w1),…, g(xAn, wA)}T. For q = 1,…, Q, we define Ṙq(c) = ∂R(c)/∂cq, 𝒱q(c) = R(c)G⁻¹(c)Ṙq(c), and . Assume that as N → ∞, λ̃1R(c) converges to a matrix, denoted by R̃(c). Let ϒ2 (c) = ϒ1 + R̃(c), α(c) = N⁻¹G⁻¹(c)Bᵀg, and 𝒞(c) = {𝒲1(c)α(c),…, 𝒲Q(c)α(c)}T. Also, let ∧(c) be a Q × Q matrix with the (q1, q2)th element
where .
In the Supplementary Materials, we show that under Conditions (C.1)-(C.4)
| (17) |
as N → ∞. Let Πk be a projection matrix such that ck = Πkc. Then, for any w, we have
| (18) |
as N → ∞. To estimate Δ, we replace c0 by ĉ and estimate σ2 by
where Q is the dimension of c.
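Asymptotic normality of a basis expansion translates into pointwise confidence intervals in the usual way: since θ̂k(w) is linear in ĉk, its variance at each w is a quadratic form in the covariance of ĉk. The sketch below assumes that an estimated covariance matrix of ĉk is already available (how it is obtained from the estimate of Δ is not reproduced here); the function name `pointwise_ci` and its arguments are hypothetical.

```python
import numpy as np

def pointwise_ci(psi_w, c_hat_k, cov_c_k, z=1.96):
    """Pointwise normal-theory intervals for theta_k(w) = psi_k(w)^T c_k.

    psi_w:   (n_w, U_k) basis functions psi_k evaluated at n_w values of w
    c_hat_k: (U_k,) estimated basis coefficients
    cov_c_k: (U_k, U_k) estimated covariance matrix of c_hat_k (assumed given)
    """
    theta_hat = psi_w @ c_hat_k
    # Var{psi(w)^T c_hat} = psi(w)^T Cov(c_hat) psi(w), computed for each row of psi_w
    se = np.sqrt(np.einsum("ij,jk,ik->i", psi_w, cov_c_k, psi_w))
    return theta_hat - z * se, theta_hat + z * se

# Illustrative inputs only: a linear basis {1, w} at w = 0 and w = 1.
psi_w = np.array([[1.0, 0.0], [1.0, 1.0]])
c_hat_k = np.array([1.0, 2.0])
cov_c_k = np.diag([0.01, 0.04])
lower, upper = pointwise_ci(psi_w, c_hat_k, cov_c_k)
```

These intervals are pointwise in w; they do not provide simultaneous coverage over the whole curve.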
4. Smoothing Parameter Selection
We propose a two-step procedure for the smoothing parameter selection. In the first step we set and , both of which satisfy Condition (C.1) in the Supplementary Materials. For any λ1 and λ2, we rewrite ĉ as ĉ(λ1, λ2) and Δ as Δ(λ1, λ2). From (17), we have that as N → ∞, .
Let be an estimator of . In the second step, we select (λ1, λ2) by minimizing
| (19) |
The first term of 𝒟(λ1, λ2) in (19) is to ensure that the resulting estimator of c0 is √N-consistent.
Let 𝒰 be a candidate set of (λ1,λ2), containing . The estimated smoothing parameters are (λ̂1, λ̂2) = argmin(λ1, λ2)∈𝒰𝒟(λ1,λ2). Define 𝒰* to be a subset of 𝒰 such that ĉ(λ1, λ2) − c0 = O(N^{−1/2}) for any (λ1, λ2) ∈ 𝒰*. Obviously, 𝒰* is not empty because belongs to 𝒰*. Assuming converges to a positive definite matrix as N → ∞, we can prove that
| (20) |
as N → ∞. See the Supplementary Materials for the proof of (20). This result means that the estimate ĉ(λ̂1, λ̂2) is √N-consistent.
5. Simulations
5.1 Data Generating Mechanism
The following varying coefficient PDE model is used to simulate data
| (21) |
with varying coefficients θD(w) = 1 − 0.2(w − 1)², θS(w) = 0.1 + 0.02 cos(wπ) and θA(w) = 0.1 + 0.02 sin(1.2wπ), where w ∈ {0.1, 0.2,…, 2.0}.
Model (21) is a PDE of parabolic type in one space dimension, and is also called a (one-dimensional) linear reaction-convection-diffusion equation. Here θD(·) is the diffusion rate, θS(·) is the drift rate, and θA(·) is the reaction rate.
For each w, the PDE model (21) is solved numerically by setting the boundary condition as g(t, 0, w) = 0 and the initial condition as g(0, z, w) = {1 + 0.1 × (20 − z)²}⁻¹ over a meshgrid in the time domain t ∈ [1, 20] and the range domain z ∈ [1, 40].
Thus, for each w, our data is on a 20-by-40 meshgrid in the domain [1, 20] × [1, 40]. The resulting sample size N is 20 × 20 × 40 = 16000. Figure 1 displays the numerical solutions of the PDE (21) when w = 0.3, 1.0 and 1.7.
Figure 1.
3-D plots of the surface of the numerical solution of the PDE (21) when varying w = 0.3, 1, 1.7.
The observed error-prone data (2) are simulated by adding independent and identically distributed Gaussian noise with standard deviation σ to the PDE solutions. We considered two noise levels, σ = 0.02 and σ = 0.05, the same values used in the simulation study of Xun et al. (2013).
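The simulation design can be laid out in a few lines. The sketch below evaluates the three coefficient curves on the grid of w values, checks the sample size, and adds Gaussian noise. It does not solve PDE (21): the array `g_placeholder` is a hypothetical stand-in for the numerical PDE solutions, and equally spaced grid points in t and z are an assumption.

```python
import numpy as np

# Covariate grid w in {0.1, 0.2, ..., 2.0}
w = np.arange(1, 21) / 10.0

# Varying coefficients as specified in the text
theta_D = 1.0 - 0.2 * (w - 1.0) ** 2            # diffusion rate
theta_S = 0.1 + 0.02 * np.cos(np.pi * w)        # drift rate
theta_A = 0.1 + 0.02 * np.sin(1.2 * np.pi * w)  # reaction rate

# 20-by-40 meshgrid on [1, 20] x [1, 40] (equal spacing assumed)
t = np.linspace(1.0, 20.0, 20)
z = np.linspace(1.0, 40.0, 40)
N = len(w) * len(t) * len(z)

def add_noise(g, sigma, rng):
    """Add i.i.d. Gaussian measurement error with standard deviation sigma."""
    return g + rng.normal(0.0, sigma, size=g.shape)

rng = np.random.default_rng(1)
g_placeholder = np.zeros((len(w), len(t), len(z)))  # stand-in for the PDE solutions
y = add_noise(g_placeholder, sigma=0.05, rng=rng)
```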
5.2 Performance of Parameter Cascading
The parameter cascading method was applied to estimate the three varying coefficients in the PDE model (21) from the simulated data. The two smoothing parameters λ1 and λ2 were selected by using the method proposed in Section 4. We used B-splines of order 4 to form the basis functions, with 5, 15, and 3 equally spaced knots in t, z and w domains, respectively. The parameter cascading method was also compared with the two-stage method. The two-stage method estimated varying coefficient parameters for the PDE model in two stages. In the first stage, for each wa, a = 1,…, A, g(x, wa), and its partial derivatives were estimated by multidimensional penalized signal regression (Marx and Eilers, 2005). In the second stage, similar to (8), we estimated the vector of varying coefficient parameters, θ(wa), by minimizing
where θ(wa) was estimated as a linear combination of basis functions, as defined in (9), and the basis functions for expanding θ(wa) were chosen as the same as the parameter cascading method. The simulation was repeated 1000 times.
Figures S1 and S2 of the Supplementary Materials show the estimated varying coefficients θD(w), θS(w), and θA(w) for the PDE model (21) from the simulated data in the first 10 simulation replications, using the parameter cascading method and the two-stage method, respectively. The parameter cascading method estimates all three varying coefficients reasonably well, whereas the two-stage estimates of θD(w) are very different from the true curve. To assess the accuracy of the estimated varying coefficients, we use the square root of the average squared errors (RASEs), namely
| (22) |
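As a reading aid, the accuracy measure named above can be computed as follows. This sketch assumes the average is taken over the observed values of w, which is our reading of "average squared errors"; the function name `rase` is hypothetical.

```python
import numpy as np

def rase(theta_hat, theta_true):
    """Square root of the average squared error between two curves on a common w grid."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    theta_true = np.asarray(theta_true, dtype=float)
    return float(np.sqrt(np.mean((theta_hat - theta_true) ** 2)))

# Illustration: an estimate of theta_D(w) that is off by a constant 0.01
w = np.arange(1, 21) / 10.0
truth = 1.0 - 0.2 * (w - 1.0) ** 2
example = rase(truth + 0.01, truth)
```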
Table 1 summarizes the mean, median, and standard deviation of the RASEs for the estimated varying coefficients θ̂D(w), θ̂S(w) and θ̂A(w) in 1000 simulation replicates. It shows that our parameter cascading method outperforms the two-stage method. Particularly noticeable is that the two-stage method yields about ten times larger mean, median, and standard deviation of the RASE than the parameter cascading method does for θD(w) and θS(w).
Table 1.
Mean, Median, and Standard Deviation (SD) of the square root of the average squared errors (RASEs) defined in (22) for the estimated varying coefficients θ̂D(w), θ̂S(w) and θ̂A(w) in 1000 simulation replicates by using the proposed parameter cascading method (PC) and the two-stage method (TS). Here σ is the standard deviation of ε in (2), θD(·) is the diffusion rate, θS(·) is the drift rate, θA(·) is the reaction rate.
| Coefficient | Statistic | PC (σ = 0.05) | TS (σ = 0.05) | PC (σ = 0.02) | TS (σ = 0.02) |
|---|---|---|---|---|---|
| RASE of θ̂D(w) | Mean ×10² | 1.54 | 22.76 | 1.29 | 15.85 |
| | Median ×10² | 1.53 | 22.55 | 1.28 | 15.88 |
| | SD ×10³ | 3.53 | 40.22 | 1.58 | 21.29 |
| RASE of θ̂S(w) | Mean ×10³ | 1.86 | 14.94 | 1.09 | 10.03 |
| | Median ×10³ | 1.85 | 14.50 | 1.10 | 9.90 |
| | SD ×10⁴ | 6.08 | 43.58 | 2.78 | 21.09 |
| RASE of θ̂A(w) | Mean ×10³ | 1.88 | 3.46 | 1.87 | 2.22 |
| | Median ×10³ | 1.88 | 3.38 | 1.87 | 2.19 |
| | SD ×10⁴ | 0.30 | 7.19 | 0.11 | 1.85 |
For the parameter cascading method, when the standard deviation of the Gaussian noise added to the simulated data is increased from 0.02 to 0.05, the mean and median of the RASEs for θ̂D(w) and θ̂S(w) increase by around 20% and 71%, respectively, while the mean and median of the RASEs for θ̂A(w) increase only slightly. The standard deviation of the RASEs roughly doubles for θ̂D(w) and θ̂S(w) and nearly triples for θ̂A(w) when σ increases from 0.02 to 0.05.
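These percentages follow directly from the PC columns of Table 1. As a check on the arithmetic, using the reported means and standard deviations:

```latex
% Relative increase in mean RASE, PC method, sigma: 0.02 -> 0.05
\frac{1.54 - 1.29}{1.29} \approx 0.19 \;(\hat\theta_D), \qquad
\frac{1.86 - 1.09}{1.09} \approx 0.71 \;(\hat\theta_S), \qquad
\frac{1.88 - 1.87}{1.87} \approx 0.005 \;(\hat\theta_A).
% Ratio of the SDs of the RASEs, PC method
\frac{3.53}{1.58} \approx 2.2 \;(\hat\theta_D), \qquad
\frac{6.08}{2.78} \approx 2.2 \;(\hat\theta_S), \qquad
\frac{0.30}{0.11} \approx 2.7 \;(\hat\theta_A).
```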
6. Application
6.1 Background
We have access to a small subset of long range infrared light detection and ranging (LIDAR) data. The data set consists of samples collected for 28 aerosol clouds, 14 of them being biological and the other 14 being non-biological. For each sample, there is a transmitted signal that is sent into the aerosol cloud at 19 different laser wavelengths w1, …, w19, and at T time points. For each wavelength and time point, received LIDAR data were observed at equally spaced ranges z = 1,…, Z. The experiment also included background data which were collected before the aerosol cloud was released. The received LIDAR data from aerosol clouds were then background corrected.
An example of the background-corrected received data for the 13th aerosol cloud (non-biological) is given in Figure 2.
Figure 2.
Snapshots of the background-corrected received data for the 13th aerosol cloud (non-biological).
Such data are well described by
| (23) |
where the parameters θD(w), θS(w) and θA(w) describe the diffusion rate, the drift rate and the reaction rate, respectively, and they are varying with w. In order to estimate the PDE model (23) from the real data, we take T = 20 time points and Z = 60 range values, so that the sample size N is 20 × 60 × 19 = 22800. We use B-spline basis functions of order 4 constructed with 5 inner knots in the time domain, 17 inner knots in the range domain, and 3 inner knots in the wavelength domain.
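The sizes involved in this fit can be tallied quickly. The sketch below relies on the standard fact that a B-spline basis of a given order with simple interior knots has (number of interior knots + order) basis functions, and on the tensor-product construction mentioned in Section 7; the helper name `n_bspline_basis` is hypothetical.

```python
def n_bspline_basis(inner_knots, order):
    """Number of B-spline basis functions for simple interior knots."""
    return inner_knots + order

order = 4
n_time = n_bspline_basis(5, order)    # time domain
n_range = n_bspline_basis(17, order)  # range domain
n_wave = n_bspline_basis(3, order)    # wavelength domain

n_coef = n_time * n_range * n_wave    # tensor-product basis coefficients
N = 20 * 60 * 19                      # T x Z x number of wavelengths
```

Under these assumptions there are 9 × 21 × 7 = 1323 basis coefficients for 22800 observations, which illustrates how quickly a tensor-product basis grows with the number of dimensions.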
6.2 Estimation Results
Figure 3 displays the estimates for the three varying coefficients θD(w), θS(w) and θA(w) for the PDE model (23) from the real data shown in Figure 2. The three varying coefficients are clearly not constant over the range of w. Specifically, the diffusion rate, θD(w), decreases slightly over the early laser wavelengths and then levels off. The drift rate, θS(w), reaches its minimum around w = 10, and increases with the laser wavelength thereafter. The reaction rate, θA(w), decreases throughout most of the wavelengths. We used the theoretical result (18) to obtain the 95% confidence intervals for the three varying coefficients, which are also displayed in Figure 3.
Figure 3.
Estimated curves of θD(w) (diffusion rate, top), θS(w) (drift rate, center), and θA(w) (reaction rate, bottom) for the PDE model (23) and their 95% confidence intervals for the 13th aerosol cloud. This figure appears in color in the electronic version of this article.
Figures 4 and 5 display the means and variances of the estimated varying coefficients θD(w), θS(w) and θA(w) for biological and non-biological aerosol clouds. For θD(w) and θA(w), the biological clouds have higher mean varying coefficients than the non-biological clouds over the whole range of wavelengths, while for θS(w), the biological clouds have lower mean varying coefficients than the non-biological clouds over the whole range of wavelengths. For all three coefficients, the non-biological clouds yield larger variances than the biological clouds over the whole range of wavelengths.
Figure 4.
Mean of the estimated varying coefficients θ̂D(w) (diffusion rate, left), θ̂S(w) (drift rate, center), and θ̂A(w) (reaction rate, right) in the PDE model (23) for 14 biological aerosol clouds (solid lines) and 14 non-biological aerosol clouds (dashed lines). This figure appears in color in the electronic version of this article.
Figure 5.
Sample variance of the estimated varying coefficients θ̂D(w) (diffusion rate, top left), θ̂S(w) (drift rate, top right), and θ̂A(w) (reaction rate, lower left) in the PDE model (23) for 14 biological aerosol clouds (solid lines) and 14 non-biological aerosol clouds (dashed lines). Lower right: p-values to assess the hypothesis of equality of the variances from the two groups of clouds by using the Brown-Forsythe test. The solid line is for the estimated curve of θD(w) (diffusion rate), the dash-dotted line is for the estimated curve of θS(w) (drift rate), and the dashed line is for the estimated curve of θA(w) (reaction rate). The dotted lines represent p = 0.10 and p = 0.05. This figure appears in color in the electronic version of this article.
We further assessed the equality of the variances of the estimated varying coefficients θ̂D(w), θ̂S(w) and θ̂A(w) for biological and non-biological aerosol clouds by using the Brown-Forsythe test (Brown and Forsythe, 1974), which generally has better control of the test level than Levene's test (Levene, 1960). Figure 5 also shows the pointwise p-values for the Brown-Forsythe test. Most p-values are smaller than 0.10 and some are even smaller than 0.05, indicating that the variances of the estimated varying coefficients for biological and non-biological aerosol clouds are significantly different at most wavelengths. At the significance level of 0.10, the diffusion rate θD(·) shows significantly different variances at the wavelengths in [2,19], while the drift rate θS(·) shows significantly different variances at the wavelengths in [2,6] and [13,15]. The reaction rate θA(·) shows significantly different variances at the wavelengths in [1,18].
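For readers unfamiliar with the Brown-Forsythe test, its statistic is a one-way ANOVA F statistic computed on absolute deviations from each group's median. The sketch below computes only the statistic, using NumPy; the function name is hypothetical and the data are made up for illustration. With k groups and N observations in total, the statistic is referred to an F distribution with (k − 1, N − k) degrees of freedom, so two groups of 14 clouds give (1, 26). In the setting above it would be applied separately at each wavelength.

```python
import numpy as np

def brown_forsythe_stat(*groups):
    """ANOVA F statistic on absolute deviations from group medians."""
    dev = [np.abs(np.asarray(g, dtype=float) - np.median(g)) for g in groups]
    k = len(dev)
    n = np.array([len(d) for d in dev])
    N = n.sum()
    group_means = np.array([d.mean() for d in dev])
    grand_mean = np.concatenate(dev).mean()
    # Between-group and within-group mean squares of the absolute deviations
    between = np.sum(n * (group_means - grand_mean) ** 2) / (k - 1)
    within = sum(np.sum((d - m) ** 2) for d, m in zip(dev, group_means)) / (N - k)
    return float(between / within)

# Illustrative data only: the second group is twice as spread out as the first.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 6.0, 8.0, 10.0]
stat = brown_forsythe_stat(a, b)
```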
Next, we employ functional principal components analysis to investigate the major modes of variation of the estimated varying coefficients θ̂D(w), θ̂S(w) and θ̂A(w) for all 28 aerosol clouds. Figure S3 of the Supplementary Materials shows the first two functional principal components for the three estimated varying coefficients. The first functional principal component accounts for 91.5%, 98.9%, and 97.4% of the total variation of the estimated θ̂D(w), θ̂S(w) and θ̂A(w), respectively. The first functional principal components for all three estimated varying coefficients change with the wavelength, but they are positive over the whole range of wavelengths. This suggests that more than 91% of the total variation of these estimated varying coefficients comes from differences in their weighted means, where the weights are defined by the first functional principal components. The second functional principal components account for 7.3%, 0.7%, and 2.1% of the total variation of the estimated θ̂D(w), θ̂S(w) and θ̂A(w), respectively. The second functional principal components for θ̂D(w), θ̂S(w) and θ̂A(w) are negative when the wavelength is less than 10, 14, and 10, respectively, and are positive at larger wavelengths. This suggests that the second largest source of variation of the three estimated varying coefficients is the change of the varying coefficients from small wavelengths to large wavelengths.
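The quantities reported here, principal component curves and their shares of total variation, can be obtained from curves evaluated on a common grid by a singular value decomposition of the centered data matrix. The sketch below is a discretized stand-in for functional principal components analysis: it ignores quadrature weights (reasonable for an equally spaced grid) and basis smoothing, and the simulated curves are purely illustrative, not the LIDAR estimates.

```python
import numpy as np

def fpca(curves, n_components=2):
    """curves: (n_curves, n_grid) array of curves evaluated on a common grid."""
    centered = curves - curves.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)   # proportion of total variation per component
    return vt[:n_components], explained[:n_components]

# Illustrative data: 28 curves on 19 grid points that differ mostly by a level shift.
rng = np.random.default_rng(2)
scores = rng.normal(size=(28, 1))
curves = 1.0 + scores * np.ones((1, 19)) + 0.05 * rng.normal(size=(28, 19))
components, explained = fpca(curves)
```

In this toy example the first component has a constant sign across the grid and explains nearly all the variation, the same pattern that the text interprets as variation in a weighted mean. The sign of each component is arbitrary.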
7. Concluding Remarks
We have proposed a parameter cascading method to estimate varying coefficients in PDE models from noisy data. The method employs two nested levels of optimization. In the inner level of optimization, the PDE solution is approximated by a flexible nonparametric function. For any given value of PDE varying coefficients, the nonparametric function is estimated by minimizing the penalized squared errors of the approximated PDE solution to the noisy data, and the fitting penalty is defined by the fit of the nonparametric function to PDEs. In the outer level of optimization, we estimate the varying coefficients as a flexible nonparametric function and add a roughness penalty to control the smoothness of the estimated varying coefficients.
We show that our estimates for varying coefficients are consistent and asymptotically normally distributed. Our simulation study shows that we can obtain accurate estimates for the varying coefficients in PDEs from finite samples in the presence of noise at different levels. We demonstrate our method by estimating three varying coefficients in a PDE model from the LIDAR data. We find that the biological clouds have higher average diffusion and reaction rates than the non-biological clouds, but lower average drift rates, over the whole range of wavelengths. On the other hand, the non-biological clouds have larger variances for all three varying coefficients than the biological clouds.
Our method allows us to use any multidimensional basis functions to represent the dynamical systems. In our simulation and application, we use tensor product of B-spline basis functions. For this basis function system, the number of basis functions increases quickly with the dimensions of the dynamical system. We believe that we can increase the computational efficiency of our method by defining more efficient multidimensional basis functions. For example, in a bivariate dynamical system, the fast bivariate smoother method of Xiao et al. (2013) or the spatial spline regression method of Sangalli et al. (2013) are promising in this aspect.
Acknowledgments
The authors are very grateful for the very constructive comments from the Editor, an Associate Editor and two reviewers. These comments are extremely helpful for us to improve this work. Zhang's work was partially supported by National Natural Science Foundation of China (Grant numbers 11471324, 71522004 and 71631008) and “Chen Jingrun Future Star” Project. Cao's research was supported by a discovery grant (356044-2013) from the Natural Science and Engineering Research Council of Canada (NSERC). Carroll's research was supported by a grant from the National Cancer Institute (U01-CA057030).
Footnotes
Supplementary Materials: They include proofs for the asymptotic theory, Figures referenced in Sections 5 and 6, and some additional simulation studies. They are available with this paper at the Biometrics website on Wiley Online Library, as are the computer codes for our simulation studies.
References
- Bär M, Hegger R, Kantz H. Fitting partial differential equations to space-time dynamics. Physical Review E. 1999;59:337–342.
- Bard Y. Nonlinear Parameter Estimation. Academic Press; New York, NY, USA: 1974.
- Biegler L, Damiano JJ, Blau GE. Nonlinear parameter estimation: a case study comparison. AIChE Journal. 1986;32:29–45.
- Brown MB, Forsythe AB. Robust tests for the equality of variances. Journal of the American Statistical Association. 1974;69:364–367.
- Brunel NJ. Parameter estimation of ODE's via nonparametric estimators. Electronic Journal of Statistics. 2008;2:1242–1267.
- Cao J, Fussmann G, Ramsay JO. Estimating a predator-prey dynamical model with the parameter cascades method. Biometrics. 2008;64:959–967. doi: 10.1111/j.1541-0420.2007.00942.x.
- Cao J, Huang JZ, Wu H. Penalized nonlinear least squares estimation of time-varying parameters in ordinary differential equations. Journal of Computational and Graphical Statistics. 2012;21:42–56. doi: 10.1198/jcgs.2011.10021.
- Cao J, Ramsay J. Parameter cascades and profiling in functional data analysis. Computational Statistics. 2007;22:335–351.
- Cao J, Ramsay J. Linear mixed effects modeling by parameter cascading. Journal of the American Statistical Association. 2010;105:365–374.
- Cao J, Wang L, Xu J. Robust estimation for ordinary differential equation models. Biometrics. 2011;67:1305–1313. doi: 10.1111/j.1541-0420.2011.01577.x.
- Chen J, Wu H. Efficient local estimation for time-varying coefficients in deterministic dynamic models with applications to HIV-1 dynamics. Journal of the American Statistical Association. 2008;103:369–383.
- Chkrebtii OA, Campbell DA, Calderhead B, Girolami MA. Bayesian solution uncertainty quantification for differential equations. Preprint. 2013;arXiv:1306.2365.
- Coca D, Billings S. Direct parameter identification of distributed parameter systems. International Journal of Systems Science. 2000;31:11–17.
- Dattner I, Klaassen CAJ. Optimal rate of direct estimators in systems of ordinary differential equations linear in functions of the parameters. Electronic Journal of Statistics. 2015;9:1939–1973.
- Fan J, Gijbels I. Local Polynomial Modelling and its Applications. CRC Press; New York: 1996.
- Frasso G, Jaeger J, Lambert P. Parameter estimation and inference in dynamic systems described by linear partial differential equations. Advances in Statistical Analysis. 2015:1–29. doi: 10.1007/s10182-015-0257-5.
- Gelman A, Bois F, Jiang J. Physiological pharmacokinetic analysis using population modeling and informative prior distributions. Journal of the American Statistical Association. 1996;91:1400–1412.
- Gugushvili S, Klaassen CAJ. √n-consistent parameter estimation for systems of ordinary differential equations: bypassing numerical integration via smoothing. Bernoulli. 2012;18:1061–1098.
- Huang Y, Liu D, Wu H. Hierarchical Bayesian methods for estimation of parameters in a longitudinal HIV dynamic system. Biometrics. 2006;62:413–423. doi: 10.1111/j.1541-0420.2005.00447.x.
- Jost J. Partial Differential Equations. Springer-Verlag; New York: 2002.
- Levene H. In: Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling. Olkin I, et al., editors. Stanford University Press; 1960.
- Liang H, Miao H, Wu H. Estimation of constant and time-varying dynamic parameters of HIV infection in a nonlinear differential equation model. Annals of Applied Statistics. 2010;4:460–483. doi: 10.1214/09-AOAS290.
- Lu T, Liang H, Li H, Wu H. High dimensional ODEs coupled with mixed-effects modeling techniques for dynamic gene regulatory network identification. Journal of the American Statistical Association. 2011;106:1242–1258. doi: 10.1198/jasa.2011.ap10194.
- Magnus JR, Neudecker H. Matrix Differential Calculus with Applications in Statistics and Econometrics. John Wiley & Sons; New York: 2007.
- Marx B, Eilers P. Multidimensional penalized signal regression. Technometrics. 2005;47:13–22.
- Miao H, Wu H, Xue H. Generalized ordinary differential equation models. Journal of the American Statistical Association. 2014;109:1672–1682. doi: 10.1080/01621459.2014.957287.
- Müller TG, Timmer J. Fitting parameters in partial differential equations from partially observed noisy data. Physica D. 2002;171:1–7.
- Müller TG, Timmer J. Parameter identification techniques for partial differential equations. International Journal of Bifurcation and Chaos. 2004;14:2053–2060.
- Parlitz U, Merkwirth C. Prediction of spatiotemporal time series based on reconstructed local states. Physical Review Letters. 2000;84:1890–1893. doi: 10.1103/PhysRevLett.84.1890.
- Qi X, Zhao H. Asymptotic efficiency and finite-sample properties of the generalized profiling estimation of parameters in ordinary differential equations. The Annals of Statistics. 2010;38:435–481.
- Ramsay JO, Hooker G, Campbell D, Cao J. Parameter estimation for differential equations: a generalized smoothing approach (with discussion). Journal of the Royal Statistical Society, Series B. 2007;69:741–796.
- Ramsay JO, Silverman BW. Functional Data Analysis. Springer; New York: 2005.
- Sangalli L, Ramsay J, Ramsay T. Spatial spline regression models. Journal of the Royal Statistical Society, Series B. 2013;75:681–703.
- Sauer T. Numerical Analysis. Pearson Addison-Wesley; New York: 2006.
- Vallette D, Jacobs G, Gollub J. Oscillations and spatiotemporal chaos of one-dimensional fluid fronts. Physical Review E. 1997;55:4274–4287.
- Voss HU, Kolodner P, Abel M, Kurths J. Amplitude equations from spatiotemporal binary-fluid convection data. Physical Review Letters. 1999;83:3422–3425.
- Vujacic I, Mahmoudi SM, Wit E. Generalized Tikhonov regularization in estimation of ordinary differential equations models. Stat. 2016;5:132–143.
- Wu H, Lu T, Xue H, Liang H. Sparse additive ODEs for dynamic gene regulatory network modeling. Journal of the American Statistical Association. 2014;109:700–716. doi: 10.1080/01621459.2013.859617.
- Xiao L, Li Y, Ruppert D. Fast bivariate P-splines: the sandwich smoother. Journal of the Royal Statistical Society, Series B. 2013;75:577–599.
- Xue H, Miao H, Wu H. Sieve estimation of constant and time-varying coefficients in nonlinear ordinary differential equation models by considering both numerical error and measurement error. The Annals of Statistics. 2010;38:2351–2387. doi: 10.1214/09-aos784.
- Xun X, Cao J, Mallick B, Maity A, Carroll RJ. Parameter estimation of partial differential equation models. Journal of the American Statistical Association. 2013;108:1009–1020. doi: 10.1080/01621459.2013.794730.