Towards Identification of Wiener Systems with the Least Amount of a priori Information: IIR Cases

Er-Wei Bai; John Reyland, Jr

doi:10.1016/j.automatica.2008.11.020

Author manuscript; available in PMC: 2013 Mar 5.

Published in final edited form as: Automatica (Oxf). 2009 Apr;45(4):956–964. doi: 10.1016/j.automatica.2008.11.020

PMCID: PMC3587721 NIHMSID: NIHMS106936 PMID: 23471210

Abstract

In this paper, we investigate what constitutes the least amount of a priori information on the nonlinearity so that the linear part is identifiable in the non-Gaussian input case. Under the white noise input, three types of a priori information are considered including quadrant information, point information and monotonic information. In all three cases, identifiability has been established and the corresponding nonparametric identification algorithms are developed along with their convergence proofs.

Keywords: system identification, nonlinear systems, Wiener systems, a priori information

1 Introduction

The Wiener nonlinear system has been used in various applications and identification of such systems has been an active research area for a long time [3,4,6,7,10,14,15,18]. In Wiener system identification, several assumptions are made in the literature. In the case that not enough a priori information on the unknown system is available, a common assumption is a Gaussian random input [4,6,10,15]. Thanks to the Bussgang Theorem [2], identification of Wiener systems is then possible. Without the Gaussian assumption, identification of Wiener systems becomes non-trivial. In the case that the nonlinearity is known a priori, identification of the linear part is relatively easy [16,18]. If the nonlinearity is unknown, it is usually assumed that the nonlinearity or the inverse of the nonlinearity is expressed by some known basis functions [3,13] or by a piece-wise polynomial function [14,17]. Therefore, a nonlinear and non-parametric identification problem is reduced to a parameter estimation problem, which is often much simpler because there is no uncertainty in the structure anymore. The key assumption is either the invertibility of the unknown nonlinearity or the availability of some appropriate basis functions. Recently, another approach for Wiener system identification was proposed based on a monotonic assumption [19]. It was shown [19] that if the linear part is FIR and the nonlinearity is monotonic, identification of the FIR linear part is possible using only the input-output data, though the solution is not necessarily unique.

Since only the input-output data is available for identification purposes and no internal signals are available, identification of the linear part and/or the nonlinearity is generally impossible if no assumptions are made on the unknown system or the input. This is why conditions as discussed above are imposed in the literature. However, a fundamental question remains unanswered so far, i.e., what constitutes the least amount of a priori information required for a non-Gaussian white input case in Wiener system identification? Answers to this question have an impact on both the theoretical side and the application side. Unfortunately, answering this question requires not only a mathematical quantification of a priori information, but also that this quantification be expressed in practical terms for application purposes. This is a hard problem. A closely related but somewhat simplified question is what constitutes the least amount of a priori information on the unknown nonlinearity so that the linear part of the system can be uniquely identified based on input-output data. Obviously, if the linear part can be identified, the static nonlinearity can be consequently identified. Unfortunately, this reformulated and simplified question is again hard to answer because we are facing the same difficulty of mathematical quantification of a priori information. To overcome the difficulty, our approach is to study the problem in an indirect way, i.e., to develop identification algorithms for the linear part based on as little a priori information as possible on the unknown nonlinearity under a white noise input.

The paper is a continuation of our previous work [2], which discusses the same problem but was limited to Wiener systems with an FIR linear part. Several interesting results were obtained in [2]. It was shown, under the FIR assumption, that identification of the linear part is feasible with very little a priori information on the unknown nonlinearity. For instance, quadrant information on the unknown nonlinearity suffices for identification purposes. Note that no exact values of the nonlinearity but only sign information is needed. Further, the nonlinearity can be non-smooth and non-monotonic. In addition, it was shown in [2] that either point a priori or monotonic a priori information also suffices. It became clear [2] that for a Wiener system with an FIR linear part, identification is possible with very little a priori information. It was not clear, however, whether and how similar conclusions could be extended to Wiener systems with an IIR linear part. The proofs used in the FIR case are not easily modifiable to an IIR case. In this paper, we show that the main results derived for FIR cases can be extended to IIR cases, though the extensions are nontrivial and the derivations are more involved. In particular, it is shown that identification of the IIR part is feasible based only on quadrant, point or monotonicity a priori information on the unknown nonlinearity.

The layout of the paper is as follows. The system and problems are introduced in Section 2. Section 3 discusses the point a priori information. With this little a priori information, it is shown that the linear part can be uniquely identified and the corresponding numerical algorithm is also developed along with its convergence proof. In Section 4, the results are extended to a priori information in terms of monotonousness. Similar identifiability results are established. Section 5 is devoted to a priori information in terms of quadrant knowledge of the unknown nonlinearity. It is shown that with quadrant information, the linear part can be uniquely identified. Finally, some concluding remarks are provided in Section 6.

2 Problem statement and preliminary

The Wiener system considered in the paper is shown in Figure 1, where the unknown linear and nonlinear parts are represented by

G (z) = \sum_{i = 0}^{\infty} h (i) z^{- i}, and f (\cdot)

(2.1)

respectively. The linear part G(z) is assumed to be stable so that |h(i)| ≤ Mλⁱ for some M < ∞ and 0 < λ < 1. No order information on G(z) is available, unless otherwise specified. The input, internal signal and output at time k = 0, 1, …, N are represented by u(k), x(k) and y(k) respectively. The internal signal x(k) is unavailable for identification. The nonlinearity is unknown but bounded for bounded inputs. No structural a priori information on f(·) is assumed.

Because of embedded scaling ambiguity in Wiener systems, either the linear part or the nonlinear part has to be normalized for identification purpose [2]. It is assumed in the paper that

{∥ h ∥}^{2} = \sum_{i = 0}^{\infty} h^{2} (i) = 1

where h = (h(0), h(1), …)′ is an infinite-dimensional vector representing the impulse response of the linear part. Further, it is assumed that the first non-zero entry of h is positive. All these assumptions are standard to guarantee identifiability. Throughout the paper, it is also assumed that the input u(k) is a bounded independent identically distributed (iid) random sequence and its probability density function is positive over an interval [−a, a] for some 0 < a < 1. No specific distribution on the input is needed and the actual distribution could be unknown. Clearly, all the signals u(k), x(k) and y(k) are bounded because of the stability of the system and bounded inputs.

The goal of identification is to determine an estimate $\hat{G} (z) = \sum_{i = 0}^{\infty} \hat{h} (i) z^{- i}$ of G(z), based on the input-output data up to time N with little a priori information on the unknown nonlinearity f(·), specified later in subsequent sections, so that

\frac{1}{2 π} \int_{- π}^{π} {∣ G (e^{j ω}) - \hat{G} (e^{j ω}) ∣}^{2} d ω \to 0

(2.2)

as N → ∞ in some probability sense, preferably convergence with probability one. Note again that no order information on G(z) is available.

Now, observe that if the estimate Ĝ(z) is stable, then

{∥ h - \hat{h} ∥}^{2} = \sum_{i = 0}^{\infty} {∣ h (i) - \hat{h} (i) ∣}^{2} = \frac{1}{2 π} \int_{- π}^{π} {∣ G (e^{j ω}) - \hat{G} (e^{j ω}) ∣}^{2} d ω

where ĥ = (ĥ(0), ĥ(1), …)′ is the impulse response vector of the estimate Ĝ(z). Thus, the identification problem is equivalent to finding an estimate ĥ of h so that ĥ → h. Now, given a positive integer n, define

\begin{array}{l} h_{n} = {(h (0), h (1), \dots, h (n - 1))}^{'}, \\ {\hat{h}}_{n} = {(\hat{h} (0), \hat{h} (1), \dots, \hat{h} (n - 1))}^{'} . \end{array}

Because of the stability assumption and the normalization $\sum_{i = 0}^{\infty} h (i)^{2} = 1$, we have $\sum_{i = n}^{\infty} h (i)^{2} \to 0$ as n → ∞. Thus, ĥ → h if and only if ĥ_n → h_n as n → ∞. Further, ||h_n|| → 1 implies

\begin{array}{l} ∥ {\hat{h}}_{n} - h_{n} ∥ = ∥ {\hat{h}}_{n} - \frac{h_{n}}{∥ h_{n} ∥} + \frac{h_{n}}{∥ h_{n} ∥} - h_{n} ∥ \\ \leq ∥ {\hat{h}}_{n} - \frac{h_{n}}{∥ h_{n} ∥} ∥ + ∥ \frac{h_{n}}{∥ h_{n} ∥} - h_{n} ∥ \end{array}

The second term goes to zero as n → ∞ and therefore, as n → ∞,

∥ \hat{h} - h ∥ \to 0 \Leftrightarrow ∥ {\hat{h}}_{n} - h_{n} ∥ \to 0 \Leftrightarrow ∥ {\hat{h}}_{n} - \frac{h_{n}}{∥ h_{n} ∥} ∥ \to 0

What we have to do is to identify the normalized first n taps of the impulse response of the unknown linear part. In short, to overcome the problem of the unknown order, our way is to find the impulse response.
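The truncation argument above can be checked numerically. The following is a minimal sketch, assuming a hypothetical stable second-order filter (poles of modulus 0.8, not taken from the paper) and using 400 taps as a stand-in for the infinite impulse response. The helper names `impulse_response` and `truncated_estimate` are illustrative. It shows that the distance between the zero-padded h_n/||h_n|| and h shrinks as n grows, and that the time-domain and frequency-domain errors agree, as the Parseval-type identity above states.

```python
import numpy as np

def impulse_response(b, a, length):
    """First `length` taps of B(z^-1)/A(z^-1) by direct recursion (assumes a[0] == 1)."""
    h = np.zeros(length)
    for k in range(length):
        acc = b[k] if k < len(b) else 0.0
        for j in range(1, min(k, len(a) - 1) + 1):
            acc -= a[j] * h[k - j]
        h[k] = acc
    return h

# Hypothetical stable filter with poles of modulus 0.8 (not from the paper).
h = impulse_response([1.0, 0.5], [1.0, -0.8, 0.64], 400)
h /= np.linalg.norm(h)  # enforce ||h|| = 1; 400 taps stand in for the infinite sum

def truncated_estimate(h, n):
    """Zero-padded h_n / ||h_n||, the quantity the algorithms aim to recover."""
    h_hat = np.zeros_like(h)
    h_hat[:n] = h[:n] / np.linalg.norm(h[:n])
    return h_hat

orders = (5, 10, 20, 40)
gaps = [np.linalg.norm(truncated_estimate(h, n) - h) for n in orders]

# Parseval check on a frequency grid: time-domain and frequency-domain errors agree.
diff = truncated_estimate(h, 10) - h
time_err = np.sum(diff ** 2)
freq_err = np.mean(np.abs(np.fft.fft(diff, 1024)) ** 2)

for n, g in zip(orders, gaps):
    print(f"n = {n:2d}: distance to h = {g:.2e}")
print(f"time-domain error {time_err:.6e}, frequency-domain error {freq_err:.6e}")
```

The frequency-domain error is evaluated on a 1024-point grid, which reproduces the integral exactly here because the truncated response is shorter than the grid.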

3 Point a priori information

In this section, we consider identification of the linear part with point a priori information f(x₀) = y₀ on the unknown nonlinearity for some known x₀ and y₀. For simplicity, both x₀ and y₀ are assumed to be at the origin.

Assumption 3.1

It is assumed that

f (x) = 0 \Leftrightarrow x = 0

and f(·) is continuous in the neighborhood of the origin.

The condition is based on the local information f(0) = 0 but is stronger than the local point condition f(0) = 0. The implication f(x) = 0 ⇒ x = 0 provides some global information on the nonlinearity, since no other value of x can lead to f(x) = 0.

There are two aspects in identification based on the a priori information f(0) = 0. First, no other a priori information on the unknown f(·) is available, so the observed outputs y(k) ≠ 0 together with the corresponding inputs do not reveal much information on x(k) or on f(·). In other words, theoretically only the outputs y(k) = 0 together with the corresponding inputs u(k) are useful for identification. Practically, however, y(k) = 0 exactly is unlikely and relying on it is not robust in the presence of noise. The hope is that, by continuity of f(·) in the neighborhood of the origin, y(k) ≈ 0 implies x(k) ≈ 0, which would result in an estimate ĥ_n close to h_n/||h_n||. Thus, the analysis contains two parts. The first part is to show that h_n/||h_n|| can be identified if there is enough data available under the constraint y(k) = 0. Then, we will show that with the data set |y(k)| ≤ ε for some small ε > 0, the obtained estimate is a continuous function of ε and converges to h_n/||h_n|| as ε → 0.

For each n, consider a fictitious FIR system

x (k) = \frac{1}{∥ h_{n} ∥} (h (0), h (1), \dots, h (n - 1)) \underbrace{(\begin{matrix} u (k) \\ u (k - 1) \\ ⋮ \\ u (k - n + 1) \end{matrix})}_{φ_{n} (k)}

where h_n = (h(0), h(1), …, h(n − 1))′ ≠ 0, which is automatically satisfied for large n because ||h_n|| → ||h|| = 1. Given the input-output data set $\{φ_{n} (k), y (k)\}_{1}^{N}$, it can be easily verified [2] that h_n/||h_n|| is identifiable for this fictitious FIR linear system based on the point a priori information f(0) = 0 if and only if there exist some 1 ≤ p₁ < p₂ < … < p_k ≤ N so that x(p₁) = x(p₂) = … = x(p_k) = 0 (or equivalently y(p₁) = … = y(p_k) = 0) and the corresponding matrix Φ(p₁, p₂, …, p_k) satisfies

\operatorname{rank} \underbrace{(\begin{matrix} φ_{n}^{'} (p_{1}) \\ φ_{n}^{'} (p_{2}) \\ ⋮ \\ φ_{n}^{'} (p_{k}) \end{matrix})}_{Φ (p_{1}, \dots, p_{k})} = n - 1 .

(3.1)

Further, let Φ (p₁, …, p_k ) = U Σ(V₁, V₂, …, V_n)′ be the singular value decomposition (SVD) of Φ(p₁, …, p_k ). It follows that

V_{n} = h_{n} / ∥ h_{n} ∥

(3.2)

modulo the ± sign. Therefore, h_n/||h_n|| is identifiable from the SVD of Φ(p₁, …, p_k) for data y(p₁) = … = y(p_k) = 0. Now, the actual system is not FIR but IIR, so that

x (k) - \sum_{i = n}^{\infty} h (i) u (k - i) = h_{n}^{'} φ_{n} (k)

Define

z (k) = [x (k) - \sum_{i = n}^{\infty} h (i) u (k - i)] / ∥ h_{n} ∥ = \frac{h_{n}^{'}}{∥ h_{n} ∥} φ_{n} (k)

(3.3)

If z(p₁) = z(p₂) = … = z(p_k) = 0, the same conclusion as discussed above applies. Since $∣ \sum_{i = n}^{\infty} h (i) u (k - i) ∣ \leq M_{1} λ^{n} \to 0$ as n → ∞, y(k) ≈ 0 implies x(k) ≈ 0 and z(k) ≈ 0 for large n. The question is whether the SVD of Φ(p₁, …, p_k) provides a vector V_n that is close to h_n/||h_n|| when the data only satisfy |y(p_j)| ≤ ε for a small but non-zero ε. To this end, we need some preliminary work.
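The null-space argument behind (3.1) and (3.2), and the question of what happens for small non-zero ε, can be illustrated with a short numerical sketch. Everything here is synthetic and hypothetical: the "true" vector is random, and regressors with exactly zero projection are manufactured by projecting random vectors onto the orthogonal complement of that vector. In the paper's setting such regressors would instead be selected from the data. The helper names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rows = 6, 50

# Hypothetical unit-norm vector playing the role of h_n / ||h_n||.
h_n = rng.standard_normal(n)
h_n /= np.linalg.norm(h_n)

# Synthetic regressors with h_n' phi = 0 exactly, obtained by projecting random
# vectors onto the orthogonal complement of h_n (an illustrative shortcut).
raw = rng.uniform(-1.0, 1.0, size=(rows, n))
Phi = raw - np.outer(raw @ h_n, h_n)

def smallest_right_singular_vector(M):
    """Right singular vector of M associated with its smallest singular value."""
    _, _, Vt = np.linalg.svd(M, full_matrices=False)
    return Vt[-1]

def distance_mod_sign(v, w):
    """Distance between v and w up to the sign ambiguity of singular vectors."""
    return min(np.linalg.norm(v - w), np.linalg.norm(v + w))

rank_exact = np.linalg.matrix_rank(Phi)
err_exact = distance_mod_sign(smallest_right_singular_vector(Phi), h_n)

# Perturbed case: each row now has a projection of size at most eps on h_n.
eps = 1e-3
Phi_eps = Phi + eps * np.outer(rng.uniform(-1.0, 1.0, rows), h_n)
err_eps = distance_mod_sign(smallest_right_singular_vector(Phi_eps), h_n)

print("rank:", rank_exact, "exact error:", err_exact, "perturbed error:", err_eps)
```

In the exact case the last right singular vector recovers the direction up to sign and rounding. In the perturbed case the recovered direction stays close, which is the behavior Lemma 3.1 below makes precise.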

First, for each n, define an orthonormal basis e₁, e₂, …, e_{n−1} and e_n = h_n/||h_n|| of Rⁿ. Construct a truncated cone C_i around each e_i, i = 1, 2, …, n, as follows. For i = 1, 2, …, n − 1,

φ \in C_{i} \Leftrightarrow 0 < \frac{a}{2} \leq ∥ φ ∥ \leq 1 \text{ and } ∣ \cos (∠ (φ, e_{j})) ∣ \begin{cases} \geq \frac{8}{9} & \text{if } j = i \\ \leq \frac{a}{9 (n - 2)} & \text{if } j \neq i \end{cases}

(3.4)

where ∠(φ, e_j) is the angle between φ and e_j, and [−a, a] is the interval in which the input probability density function is positive. For i = n,

φ \in C_{n} \Leftrightarrow 0 < ∥ φ ∥ \leq ε (n) \text{ and } ∣ \cos (∠ (φ, e_{j})) ∣ \begin{cases} \geq \frac{8}{9} & \text{if } j = n \\ \leq \frac{a}{9 (n - 2)} & \text{if } j \neq n \end{cases}

(3.5)

where $\sqrt{n} ε (n) \to 0$ as n → ∞. Clearly, if φ ∈ C_n, we have

∣ \langle φ, e_{i} \rangle ∣ = ∥ φ ∥ ∥ e_{i} ∥ \cdot ∣ \cos (∠ (φ, e_{i})) ∣ \begin{cases} \leq ε \frac{a}{9 (n - 2)} & i = 1, 2, \dots, n - 1 \\ \leq ε & i = n \end{cases}

(3.6)

and similarly, if φ ∈ C_i, i = 1, 2, …, n − 1, we have

∣ \langle φ, e_{j} \rangle ∣ = ∥ φ ∥ ∥ e_{j} ∥ \cdot ∣ \cos (∠ (φ, e_{j})) ∣ \begin{cases} \leq \frac{a}{9 (n - 2)} & j \neq i \\ \geq \frac{4}{9} a & j = i \end{cases}

(3.7)

Now recall the definition of φ_n(i_j ) = (u(i_j ), u(i_j − 1), …, u(i_j − n + 1))′. Write each φ_n(i_j) in terms of the basis functions e_i’s

φ_{n} (i_{j}) = β_{j 1} e_{1} + β_{j 2} e_{2} + \dots + β_{j n} e_{n}

where β_{j,i} is the projection of φ_n(i_j) on e_i, and

\begin{array}{l} (\begin{matrix} x (i_{1}) \\ ⋮ \\ x (i_{n}) \end{matrix}) = (\begin{matrix} φ_{n}^{'} (i_{1}) \\ ⋮ \\ φ_{n}^{'} (i_{n}) \end{matrix}) h_{n} + \underbrace{(\begin{matrix} \sum_{i = n}^{\infty} h (i) u (i_{1} - i) \\ ⋮ \\ \sum_{i = n}^{\infty} h (i) u (i_{n} - i) \end{matrix})}_{R (i_{1}, \dots, i_{n})} \\ = (\begin{matrix} β_{1, 1} & \dots & β_{1, n} \\ ⋮ & ⋱ & ⋮ \\ β_{n, 1} & \dots & β_{n, n} \end{matrix}) (\begin{matrix} e_{1}^{'} \\ ⋮ \\ e_{n}^{'} \end{matrix}) h_{n} + R (i_{1}, \dots, i_{n}) \\ = \left\{ \underbrace{(\begin{matrix} β_{1, 1} & \dots & β_{1, n - 1} \\ ⋮ & ⋱ & ⋮ \\ β_{n - 1, 1} & \dots & β_{n - 1, n - 1} \\ β_{n, 1} & \dots & β_{n, n - 1} \end{matrix}) (\begin{matrix} e_{1}^{'} \\ ⋮ \\ e_{n - 1}^{'} \end{matrix})}_{Q} + \underbrace{(\begin{matrix} β_{1, n} \\ ⋮ \\ β_{n - 1, n} \\ β_{n, n} \end{matrix}) e_{n}^{'}}_{E (ε)} \right\} h_{n} + R (i_{1}, \dots, i_{n}) \end{array}

(3.8)

for some 1 ≤ i₁ < i₂ < … < i_n ≤ N.

Lemma 3.1

Consider the Wiener system shown in Figure 1 under Assumption (3.1). Then, we have

For any given large n and ε(n) satisfying $\sqrt{n} ε (n) \to 0$ as n → ∞, with probability one as N → ∞, there exists a sequence of φ_n(i_j), j = 1, 2, …, n, so that |y(i_j)| ≤ ε and

$\operatorname{rank} Φ (i_{1}, \dots, i_{n}) = \operatorname{rank} (\begin{matrix} φ_{n}^{'} (i_{1}) \\ φ_{n}^{'} (i_{2}) \\ ⋮ \\ φ_{n}^{'} (i_{n}) \end{matrix}) \geq n - 1.$

The matrix Φ(i₁, …, i_n) can be written as

$Φ (i_{1}, \dots, i_{n}) = Q + E (ε)$ (3.9)

for some Q and E(ε), where rank Q = n − 1 independent of ε and ||E(ε)|| → 0 as ε → 0. Further, let

$Φ (i_{1}, i_{2}, \dots, i_{n}) = U (ε) \Sigma (ε) {(V_{1} (ε), V_{2} (ε), \dots, V_{n} (ε))}^{'}$

be the SVD of Φ(i₁, i₂, …, i_n). Then, modulo the ± sign,

$∥ V_{n} (ε) - \frac{h_{n}}{∥ h_{n} ∥} ∥ \to 0 \text{ as } n \to \infty .$ (3.10)

Proof

The proof of the first part is essentially the same as for the FIR case [2] by noting ||R(i₁, …, i_n)|| → 0 as n → ∞. To show the second part, consider a submatrix as in (3.8)

(\begin{matrix} β_{1, 1} & \dots & β_{1, n - 1} \\ ⋮ & ⋱ & ⋮ \\ β_{n - 1, 1} & \dots & β_{n - 1, n - 1} \end{matrix})

By the construction of the cones C_i’s and the definition of φ_n(i_j), it follows that

∣ β_{i i} ∣ \geq \frac{4}{9} a, ∣ β_{i j} ∣ \leq \frac{a}{9 (n - 2)}, i \neq j

which leads to

∣ β_{i i} ∣ - \frac{1}{2} \left\{ \sum_{j = 1, j \neq i}^{n - 1} ∣ β_{j i} ∣ + \sum_{j = 1, j \neq i}^{n - 1} ∣ β_{i j} ∣ \right\} \geq \frac{4}{9} a - \frac{1}{9} a = \frac{3}{9} a

By the Gershgorin Theorem [11] on singular values, the singular values of the above submatrix satisfy

σ_{1} \geq σ_{2} \geq \dots \geq σ_{n - 1} \geq \frac{3}{9} a

independent of n. Further, by the fact that

(\begin{matrix} e_{1}^{'} \\ ⋮ \\ e_{n - 1}^{'} \end{matrix}) (e_{1}, \dots, e_{n - 1}) = (\begin{matrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & 1 \end{matrix})

and

\begin{array}{l} Q = (\begin{matrix} β_{1, 1} & \dots & β_{1, n - 1} \\ ⋮ & ⋱ & ⋮ \\ β_{n - 1, 1} & \dots & β_{n - 1, n - 1} \\ 0 & \dots & 0 \end{matrix}) (\begin{matrix} e_{1}^{'} \\ ⋮ \\ e_{n - 1}^{'} \end{matrix}) \\ + (\begin{matrix} 0 & \dots & 0 \\ ⋮ & ⋱ & ⋮ \\ 0 & \dots & 0 \\ β_{n, 1} & \dots & β_{n, n - 1} \end{matrix}) (\begin{matrix} e_{1}^{'} \\ ⋮ \\ e_{n - 1}^{'} \end{matrix}) \end{array}

(3.11)

and

∣ β_{n, 1} ∣ \leq ε, \dots, ∣ β_{n, n - 1} ∣ \leq ε .

(3.12)

we have from the Wielandt-Hoffman Theorem [8] that the first n − 1 singular values of Q satisfy

σ_{1} \geq σ_{2} \geq \dots \geq σ_{n - 1} \geq \frac{3}{9} a - O (\sqrt{n} ε) \geq \frac{2}{9} a

for large n. Moreover, Qh_n = 0, because h_n ⊥ e_i for i = 1, 2, …, n − 1, implies that the last or the smallest singular value σ_n = 0 and

rank Q = n - 1

for all n. In addition, |β_{n,j}| ≤ ε implies $∥ E (ε) ∥ = O (\sqrt{n} ε) \to 0$. Now, define

Q = U \Sigma {(V_{1}, \dots, V_{n})}^{'} \text{ and } Q + E (ε) = U (ε) \Sigma (ε) {(V_{1} (ε), \dots, V_{n} (ε))}^{'}

(3.13)

It is clear that h_n/||h_n|| = ±V_n, and what is left to show is that V_n(ε) → V_n as n gets larger. To this end, again by the Wielandt-Hoffman Theorem [8], the gap between the smallest singular value and the second smallest singular value of the matrix Q + E(ε) is bounded below by

σ_{n - 1} - σ_{n} \geq \frac{2}{9} a - O (\sqrt{n} ε) \geq \frac{1}{9} a

Now, we apply a version of the Circle Theorem [5] (Eq. 1.2.19)

\begin{array}{l} \sin (∠ (V_{n} (ε), V_{n})) \leq \frac{O (\sqrt{n} ε)}{(σ_{n - 1} - σ_{n}) - O (\sqrt{n} ε)} \\ \leq \frac{O (\sqrt{n} ε)}{\frac{1}{9} a - O (\sqrt{n} ε)} \to 0 \end{array}

(3.14)

as n → ∞. Since ||V_n|| = ||V_n(ε)|| = 1, the conclusion V_n(ε) → V_n = h_n/||h_n|| follows. This completes the proof.

The following result is a direct consequence of the above lemma.

Theorem 3.1

Let ĥ_n = (ĥ(0), …, ĥ(n − 1))′ = ±V_n(ε) be the estimate of $\frac{h_{n}}{∥ h_{n} ∥}$ so that the first non-zero entry is positive and

\hat{G} (z) = \sum_{i = 0}^{n - 1} \hat{h} (i) z^{- i}

Then, as n → ∞

\frac{1}{2 π} \int_{- π}^{π} {∣ G (e^{j ω}) - \hat{G} (e^{j ω}) ∣}^{2} d ω \to 0

Based on these results, we can collect the data set |y(i₁)| ≤ ε, |y(i₂)| ≤ ε, …, |y(i_n)| ≤ ε so that rank Φ(i₁, i₂, …, i_n) ≥ n − 1. Then, the SVD of Φ(i₁, i₂, …, i_n) provides the estimate ĥ_n = V_n(ε), modulo the ± sign. A problem is that only the data at times i₁, i₂, …, i_n are used and all other data are discarded, which is not efficient and in fact is not robust in the presence of noise. A more efficient way is to use all the data with |y(k)| ≤ ε and the corresponding matrix Φ. The analysis discussed before carries over with no or minimal modifications. At the same time, since more data are used, the average effect of the noise is reduced, making the identification algorithm more robust. We are now in a position to introduce the identification algorithm based on point a priori information.

Identification algorithm with the point a priori information f(0) = 0: Consider the system shown in Figure 1 under Assumption (3.1).

Step 1: Collect data u(k)’s and y(k)’s, k = 1, 2, …, N.

Step 2: For each n, construct a submatrix Φ(i₁, i₂, …, i_l) of Φ(1, 2, …, N) by deleting the k-th row if |y(k)| > ε, where $\sqrt{n} ε \to 0$ as n → ∞.

Step 3: Calculate the SVD

Φ (i_{1}, \dots, i_{l}) = U (ε) \Sigma (ε) {(V_{1} (ε), \dots, V_{n} (ε))}^{'} .

Step 4: Define ĥ_n = ±V_n(ε) so that the first non-zero element of ĥ_n is positive.

Step 5: Set $\hat{G} (z) = \sum_{i = 0}^{n - 1} \hat{h} (i) z^{- i}$ .

Then, from the lemma and theorem, for each n, ĥ_n → h_n/||h_n|| as N → ∞. Further, as n gets larger and larger, ĥ_n → h and Ĝ (e^jω ) → G(e^jω ) in the integral least squares sense.
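Steps 1 to 5 can be sketched in a few lines of NumPy. This is a minimal noise-free illustration, not the authors' implementation: it uses the linear part (3.15) and the nonlinearity (3.16) from the simulation below, a 300-tap truncation as a stand-in for the infinite impulse response, and a hypothetical sign rule (`fix_sign`) that treats entries below 10% of the largest magnitude as zero, since "first non-zero entry" is ambiguous for a noisy estimate. The function names and the random seed are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def impulse_response(b, a, length):
    """First `length` taps of B(z^-1)/A(z^-1) by direct recursion (assumes a[0] == 1)."""
    h = np.zeros(length)
    for k in range(length):
        acc = b[k] if k < len(b) else 0.0
        for j in range(1, min(k, len(a) - 1) + 1):
            acc -= a[j] * h[k - j]
        h[k] = acc
    return h

def f(x):
    """Piecewise nonlinearity of (3.16)."""
    return np.where(x <= -0.2, 0.5 * x - 0.2,
                    np.where(x <= 0.8, 1.2 * x, 0.3 * x + 0.5))

def fix_sign(v, rel_tol=0.1):
    """Make the first clearly non-zero entry positive (rel_tol is an assumption)."""
    first = np.flatnonzero(np.abs(v) > rel_tol * np.max(np.abs(v)))[0]
    return v if v[first] > 0 else -v

def identify_point(u, y, n, eps):
    """Steps 2-4: keep rows with |y(k)| <= eps, take the last right singular vector."""
    Phi = sliding_window_view(u, n)[:, ::-1]  # row j is (u(k), ..., u(k-n+1)), k = j+n-1
    keep = np.abs(y[n - 1:]) <= eps
    if keep.sum() < n:
        raise ValueError("too few samples with |y(k)| <= eps")
    _, _, Vt = np.linalg.svd(Phi[keep], full_matrices=False)
    return fix_sign(Vt[-1])

rng = np.random.default_rng(1)
N, n, eps = 3000, 30, 0.1

# Linear part (3.15) written in powers of z^-1, then normalized so that ||h|| = 1.
h = impulse_response([0, 0, 0.7616, 0, 0.6160], [1, 0, 0.223, 0, 0.41], 300)
h /= np.linalg.norm(h)

u = rng.uniform(-1.0, 1.0, N)    # Step 1: iid input
x = np.convolve(u, h)[:N]        # internal signal, unavailable to the identifier
y = f(x)                         # noise-free output

h_hat = identify_point(u, y, n, eps)                      # Steps 2-4
err = np.sum((h_hat - h[:n]) ** 2) + np.sum(h[n:] ** 2)   # ||h_hat - h||^2, zero-padded
print(f"rows kept: {(np.abs(y[n - 1:]) <= eps).sum()}, squared error: {err:.4f}")
```

The vector `h_hat` holds the coefficients of Ĝ(z) in Step 5. The identifier only sees `u` and `y`; the arrays `h` and `x` are used to generate data and to score the result.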

We comment that in the algorithm, the choice of ε is not unique. Small ε throws away all the data which is larger than ε and results in few data to construct the estimate. Thus, it takes a longer time to collect the same number of data useful to construct the estimate for a small ε than a large ε. On the other hand, however, a large ε collects y(k) that is not so small which results in x(k) that is not in the neighborhood of 0 but is mistakenly considered to be near 0 and used to construct the estimate. Clearly, this tends to increase the bias and at the same time, to reduce the variance because more and more data can be used. So the choice of ε is to balance the bias and variance which is reminiscent of the choice of the bandwidth in kernel identification [9,12]. The idea is to use local data near y(k) = 0 to identify the linear part without interference from the unknown nonlinearity. Some guidelines are provided in Section 5.4 of [9]. Of course, preferably, any choice of ε needs to be tested on a fresh data set for validation purpose.

We now provide a numerical simulation. Let the linear part be a 4th-order system

G (z) = \frac{0.7616 z^{2} + 0.6160}{z^{4} + 0.223 z^{2} + 0.41}

(3.15)

and the nonlinear part be a non-continuous, non-symmetric and non-monotonic nonlinearity shown in Figure 2,

y = f (x) = \begin{cases} 0.5 x - 0.2 & x \leq - 0.2 \\ 1.2 x & - 0.2 < x \leq 0.8 \\ 0.3 x + 0.5 & x > 0.8 \end{cases}

(3.16)

The input u(k) is iid and uniformly distributed in [−1, 1], and Gaussian noise is added to the output. Figures 3 and 4 show the estimates ĥ_n and Ĝ(e^jω) of h_n and G(e^jω), respectively, when N = 3,000, ε = 0.1, SNR = 20 dB and n = 30, with the estimation error

{∥ \hat{h} - h ∥}^{2} = \frac{1}{2 π} \int_{- π}^{π} {∣ \hat{G} (e^{j ω}) - G (e^{j ω}) ∣}^{2} d ω = 0.0011

The estimate ĥ is defined as ĥ = (ĥ(0), …, ĥ(n − 1), 0, 0, …)′.

Fig. 4 — Ĝ(*e^jω*)(solid) and G(*e^jω*)(dash-dot).

To demonstrate the performance of the identification, the algorithm has been simulated for different combinations of SNR, data length N, order n and threshold ε. Table 1 shows the estimation error ||ĥ − h||² for various N and SNR values when ε and n are fixed at ε = 0.1 and n = 30. Table 2 shows the estimation error ||ĥ − h||² for various ε and n when N and SNR are fixed at N = 3,000 and SNR = 20 dB. All the results are the averages of 50 Monte Carlo simulations.
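For readers who want to reproduce experiments of this kind, output noise at a prescribed SNR can be generated as sketched below. The text does not define SNR, so this sketch assumes the common convention of 10 log₁₀ of the ratio between the average power of the noise-free output and the average power of the added noise. The function name `add_noise_at_snr` and the placeholder signal are hypothetical.

```python
import numpy as np

def add_noise_at_snr(y, snr_db, rng):
    """Add white Gaussian noise scaled so that the empirical SNR equals snr_db.

    Assumed convention: SNR = 10*log10(mean(y**2) / mean(noise**2)).
    """
    noise = rng.standard_normal(y.shape)
    signal_power = np.mean(y ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(signal_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return y + scale * noise

rng = np.random.default_rng(2)
y_clean = np.sin(0.1 * np.arange(1000))   # placeholder for a noise-free output record
y_noisy = add_noise_at_snr(y_clean, 20.0, rng)

measured = 10.0 * np.log10(np.mean(y_clean ** 2) / np.mean((y_noisy - y_clean) ** 2))
print(f"measured SNR: {measured:.3f} dB")
```

Scaling by the empirical noise power makes each realization hit the target SNR exactly; scaling by the theoretical variance instead would hit it only on average.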

Table 1.

Estimation error vs N and SNR when ε = 0.1 and n = 30.

SNR	10dB	20dB	40dB	∞
N=1,000	0.2552	0.0344	0.0049	0.0027
N=2,000	0.0852	0.0144	0.0022	0.0010
N=3,000	0.0600	0.0087	0.0012	0.0007


Table 2.

Estimation error vs ε and n when N = 3,000 and SNR = 20 dB.

	n=15	n=20	n=30
ε = 0.08	0.0046	0.0069	0.0106
ε = 0.1	0.0038	0.0056	0.0085
ε = 0.12	0.0036	0.0050	0.0082


4 Monotonic nonlinearities

The idea of point a priori information is that, though there is no other information about the nonlinearity, data in the neighborhood of the origin can be used to construct an estimate because the behavior of the nonlinearity around the origin is known. In this section, we extend the idea to a case where no point a priori information is available but the nonlinearity is assumed to be monotonic in some interval. More precisely, it is assumed that

Assumption 4.1

There exists an interval −∞ < f̲ < f̄ < ∞ such that, for f(x) ∈ [f̲, f̄], f(·) is continuous and

f (x_{1}) = f (x_{2}) \Leftrightarrow x_{1} = x_{2}

Again, we comment that the assumption actually contains some global information on the nonlinearity. Let f(x̲) = f̲ and f(x̄) = f̄. Then, the assumption prevents the nonlinearity from taking any value between f̲ and f̄ anywhere outside of the range (x̲, x̄).

Clearly, f(·) is monotonic if y = f(x) ∈ [f̲, f̄].

Now, define

\begin{array}{l} z (i, j) = x (i) - x (j) \\ ψ_{n} (i, j) = φ_{n} (i) - φ_{n} (j), \end{array}

(4.1)

It is easily verified that

z (i, j) = h_{n}^{'} ψ_{n} (i, j) + \sum_{l = n}^{\infty} h (l) (u (i - l) - u (j - l))

(4.2)

The equation is reminiscent of (3.3), which was the key to identification based on point a priori information. Note from the monotonic assumption,

\begin{matrix} y (i) - y (j) = f (x (i)) - f (x (j)) = 0 \Leftrightarrow \\ x (i) = x (j) \Leftrightarrow z (i, j) = 0 \end{matrix}

and

∣ f (x (i)) - f (x (j)) ∣ \leq ε \Leftrightarrow ∣ x (i) - x (j) ∣ \leq ε_{1}

for small ε₁ thanks to the continuity of f, if ε is small enough. Therefore, by re-naming z(i, j) as x(i) and ψ_n(i, j) as φ_n(i), everything developed for point a priori information in the previous section can be carried over here. The following result is a straightforward extension of Theorem 3.1.

Theorem 4.1

Consider the system shown in Figure 1 under Assumption (4.1). Assume that the probability density function of y = f(x) is positive in the interval [f̲, f̄]. Then,

For any n and ε > 0 so that $\sqrt{n} ε \to 0$ as n → ∞, with probability one as N → ∞, there exist two sequences ψ_n(i_l, j_l) = φ_n(i_l) − φ_n(j_l) and |z(i_l, j_l)| = |y(i_l) − y(j_l)| ≤ ε and

$\begin{array}{l} \operatorname{rank} Ψ (i_{1}, j_{1}, \dots, i_{n}, j_{n}) \\ = \operatorname{rank} (\begin{matrix} φ_{n}^{'} (i_{1}) - φ_{n}^{'} (j_{1}) \\ ⋮ \\ φ_{n}^{'} (i_{n}) - φ_{n}^{'} (j_{n}) \end{matrix}) \geq n - 1. \end{array}$ (4.3)

The matrix Ψ(i₁, j₁, …, i_n, j_n) can be written as

$Ψ (i_{1}, j_{1}, \dots, i_{n}, j_{n}) = Q + E (ε)$

for some Q and E(ε), where rank Q = n − 1 independent of ε and ||E(ε)|| → 0 as ε → 0. Further, let

$Ψ (i_{1}, j_{1}, \dots, i_{n}, j_{n}) = U (ε) \Sigma (ε) {(V_{1} (ε), V_{2} (ε), \dots, V_{n} (ε))}^{'}$

be the SVD of Ψ. Then, modulo the ± sign,

$∥ V_{n} (ε) - \frac{h_{n}}{∥ h_{n} ∥} ∥ \to 0 \text{ as } n \to \infty$ (4.4)

or equivalently $\frac{1}{2 π} \int_{- π}^{π} {∣ G (e^{j ω}) - \hat{G} (e^{j ω}) ∣}^{2} d ω \to 0$ .

Similarly, we can define the identification algorithm where the unknown nonlinearity is monotonic in [f̲, f̄].

Identification algorithm with the monotonic assumption: Consider the system in Figure 1 under Assumption (4.1).

Step 1: Collect data u(k)’s and y(k)’s for those y(k) ∈ [f̲, f̄].

Step 2: Sort the collected data in decreasing order y(k₁) ≥ y(k₂) ≥ … ≥ y(k_l).

Step 3: For each n and ε with $\sqrt{n} ε \to 0$, construct z(k_i, k_{i+1}) = y(k_i) − y(k_{i+1}) and ψ_n(k_i, k_{i+1}) = φ_n(k_i) − φ_n(k_{i+1}). Construct a submatrix Ψ_n(i₁, j₁, …, i_l, j_l) of Ψ_n by deleting the q-th row if |z(k_q, k_{q+1})| > ε.

Step 4: Calculate the SVD Ψ_n(i₁, j₁, …, i_l, j_l) = U(ε)Σ(ε)(V₁(ε), …, V_n(ε))′.

Step 5: Define ĥ_n = ±V_n(ε) so that the first non-zero element of ĥ_n is positive.

Step 6: Set $\hat{G} (z) = \sum_{i = 0}^{n - 1} \hat{h} (i) z^{- i}$ .

As before, ĥ_n → h and Ĝ (e^jω) → G(e^jω).
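The monotonic variant differs from the point-information algorithm only in how the rows are formed: outputs inside the monotonic range are sorted, and differences of neighboring regressors replace the regressors themselves. The sketch below is a noise-free illustration under stated assumptions, not the authors' code. It reuses the linear part (3.15) and nonlinearity (3.16) from Section 3, takes the monotonic range to be |y| ≤ 0.7, and uses the same hypothetical sign rule (`fix_sign`, 10% relative tolerance) as an implementation choice.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def impulse_response(b, a, length):
    """First `length` taps of B(z^-1)/A(z^-1) by direct recursion (assumes a[0] == 1)."""
    h = np.zeros(length)
    for k in range(length):
        acc = b[k] if k < len(b) else 0.0
        for j in range(1, min(k, len(a) - 1) + 1):
            acc -= a[j] * h[k - j]
        h[k] = acc
    return h

def f(x):
    """Piecewise nonlinearity of (3.16); monotonic for outputs with |y| <= 0.7."""
    return np.where(x <= -0.2, 0.5 * x - 0.2,
                    np.where(x <= 0.8, 1.2 * x, 0.3 * x + 0.5))

def fix_sign(v, rel_tol=0.1):
    """Make the first clearly non-zero entry positive (rel_tol is an assumption)."""
    first = np.flatnonzero(np.abs(v) > rel_tol * np.max(np.abs(v)))[0]
    return v if v[first] > 0 else -v

def identify_monotonic(u, y, n, eps, f_lo, f_hi):
    Phi = sliding_window_view(u, n)[:, ::-1]   # row j is phi_n(k) for k = j + n - 1
    y_al = y[n - 1:]                           # outputs aligned with the rows of Phi
    idx = np.flatnonzero((y_al >= f_lo) & (y_al <= f_hi))   # Step 1
    idx = idx[np.argsort(-y_al[idx])]                       # Step 2: decreasing order
    z = y_al[idx[:-1]] - y_al[idx[1:]]                      # Step 3: neighbor differences
    Psi = (Phi[idx[:-1]] - Phi[idx[1:]])[np.abs(z) <= eps]
    if Psi.shape[0] < n:
        raise ValueError("too few usable pairs")
    _, _, Vt = np.linalg.svd(Psi, full_matrices=False)      # Step 4
    return fix_sign(Vt[-1])                                 # Step 5

rng = np.random.default_rng(4)
N, n, eps = 2000, 30, 0.1
h = impulse_response([0, 0, 0.7616, 0, 0.6160], [1, 0, 0.223, 0, 0.41], 300)
h /= np.linalg.norm(h)
u = rng.uniform(-1.0, 1.0, N)
y = f(np.convolve(u, h)[:N])                   # noise-free output

h_hat = identify_monotonic(u, y, n, eps, -0.7, 0.7)
err = np.sum((h_hat - h[:n]) ** 2) + np.sum(h[n:] ** 2)
print(f"squared error: {err:.2e}")
```

Because sorted neighbors have nearly equal outputs, almost every pair passes the ε test, so far more rows enter the SVD than in the point-information case.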

We now test the algorithm on the same example (3.15) as in the previous section under the same input, but under the assumption that the nonlinearity is monotonic for |y| ≤ 0.7. Figures 5 and 6 show the estimates ĥ_n and Ĝ(e^jω) of h_n and G(e^jω), respectively, when N = 2,000, ε = 0.1, SNR = 20 dB and n = 30, with the estimation error 0.0041.

Fig. 5 — *ĥ_n* and *h_n*, monotonic assumption.

Fig. 6 — Ĝ(*e^jω*)(solid) and G(*e^jω*)(dash-dot), monotonic assumption.

Again, to demonstrate the performance of the identification algorithm, Table 3 shows the estimation errors for various N and SNR when ε = 0.1 and n = 30, and Table 4 shows the estimation errors for various ε and n when N = 1,000 and SNR = 20 dB. All the results are the averages of 50 Monte Carlo simulations.

Table 3.

Estimation error vs N and SNR when ε = 0.1 and n = 30, monotonic priori information.

SNR	10dB	20dB	40dB	∞
N=500	0.0670	0.0186	0.0020	0.0000021
N=1,000	0.0315	0.0080	0.0009	0.00000026
N=2,000	0.0142	0.0042	0.0005	0.00000003


Table 4.

Estimation error vs ε and n when N = 1,000 and SNR = 20 dB, monotonic a priori information.

	n=15	n=20	n=30
ε = 0.08	0.0037	0.0054	0.0084
ε = 0.1	0.0040	0.0058	0.0086
ε = 0.12	0.0038	0.0051	0.0084


This algorithm seems to outperform the one with point a priori information. One explanation is that this algorithm utilizes the data y ∈ [−0.7, 0.7] while the previous one only uses the data y close to zero. Simply put, more data is available to this algorithm than to the previous one and thus the effect of noise is smaller. We also comment that the nonlinearity is actually non-continuous but the algorithm works anyway. This is because the nonlinearity is non-continuous only at one point for |y| ≤ 0.7. Further, all the data collected on two segments separated by this point will not be used in identification because the difference 0.18 is larger than the threshold ε = 0.1.

5 Quadrant a priori information

In this section, we discuss identification with quadrant or sign a priori information. It is assumed that

Assumption 5.1

sign (x) = sign (f (x)) = sign (y) .

Clearly, the unknown nonlinearity is strictly in the first and third quadrants and no other information is available. The nonlinearity can be non-smooth and non-monotonic. It is important to comment that the results derived in this section are not limited to a priori information of Assumption 5.1 but apply to sign(y) = −sign(x) or other similar a priori information with minimal modifications.

In this section, we make an additional assumption on the linear part.

Assumption 5.2

The order m of the linear part is known

G (z) = \frac{α_{1} z^{m - 1} + α_{2} z^{m - 2} + \dots + α_{m}}{z^{m} + β_{1} z^{m - 1} + \dots + β_{m}}

Obviously, there is no other a priori information and identification has to rely on the knowledge of sign(x(k)) = sign(y(k)). Let (α̂₁, …, α̂_m, β̂₁, …, β̂_m)′ denote an estimate of (α₁, …, α_m, β₁, …, β_m)′ and

\hat{G} (z) = \frac{{\hat{α}}_{1} z^{m - 1} + {\hat{α}}_{2} z^{m - 2} + \dots + {\hat{α}}_{m}}{z^{m} + {\hat{β}}_{1} z^{m - 1} + \dots + {\hat{β}}_{m}}

an estimate of G(z). Because Ĝ (z) = G(z) if

({\hat{α}}_{1}, \dots, {\hat{α}}_{m}, {\hat{β}}_{1}, \dots, {\hat{β}}_{m}) = (α_{1}, \dots, α_{m}, β_{1}, \dots, β_{m})

our approach to find an estimate is by the following minimization

\begin{array}{l} ({\hat{α}}_{1}, \dots, {\hat{α}}_{m}, {\hat{β}}_{1}, \dots, {\hat{β}}_{m}) \\ = \underset{{\bar{α}}_{i}, {\bar{β}}_{j}}{argmin} \sum_{k = 1}^{N} {(sign (y (k)) - sign (\bar{y} (k)))}^{2} \\ = \underset{{\bar{α}}_{i}, {\bar{β}}_{j}}{argmin} \sum_{k = 1}^{N} {(sign (y (k)) - sign (\bar{x} (k)))}^{2} \end{array}

(5.1)

subject to ||ĥ|| = 1, where ĥ is the impulse response of the estimate Ĝ (z) and x̄ (k) is generated by the input and $\bar{G} (z) = \frac{{\bar{α}}_{1} z^{m - 1} + {\bar{α}}_{2} z^{m - 2} + \dots + {\bar{α}}_{m}}{z^{m} + {\bar{β}}_{1} z^{m - 1} + \dots + {\bar{β}}_{m}}$ .
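The objective in (5.1) can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the second-order system, the nonlinearity f(x) = x + x³ and the function names are assumptions chosen only so that sign(y) = sign(x) holds. It also suggests why the constraint ||ĥ|| = 1 is needed: the sign of x̄(k) does not change when all ᾱ_i are multiplied by a positive constant, so the objective alone cannot fix the gain.

```python
import random


def simulate_linear_part(alpha, beta, u):
    """Output of G(z) = (alpha_1 z^{m-1}+...+alpha_m)/(z^m+beta_1 z^{m-1}+...+beta_m).

    Equivalent difference equation with zero initial conditions:
        x(k) = sum_{i=1..m} alpha_i u(k-i) - sum_{i=1..m} beta_i x(k-i)
    """
    m = len(alpha)
    x = []
    for k in range(len(u)):
        acc = 0.0
        for i in range(1, m + 1):
            if k - i >= 0:
                acc += alpha[i - 1] * u[k - i] - beta[i - 1] * x[k - i]
        x.append(acc)
    return x


def sign(v):
    """Return -1, 0 or 1."""
    return (v > 0) - (v < 0)


def sign_mismatch_cost(alpha_bar, beta_bar, u, y):
    """Objective of (5.1): sum over k of (sign(y(k)) - sign(x_bar(k)))^2."""
    x_bar = simulate_linear_part(alpha_bar, beta_bar, u)
    return sum((sign(yk) - sign(xk)) ** 2 for yk, xk in zip(y, x_bar))


# Hypothetical example system (poles at 0.4 and 0.5), not taken from the paper.
alpha_true, beta_true = [1.0, 0.5], [-0.9, 0.2]
rng = random.Random(0)
N = 500
u = [rng.uniform(-1.0, 1.0) for _ in range(N)]
x = simulate_linear_part(alpha_true, beta_true, u)
y = [v + v ** 3 for v in x]  # assumed nonlinearity with sign(y) = sign(x)

cost_true = sign_mismatch_cost(alpha_true, beta_true, u, y)
cost_scaled = sign_mismatch_cost([2.0 * a for a in alpha_true], beta_true, u, y)
cost_flipped = sign_mismatch_cost([-a for a in alpha_true], beta_true, u, y)
```

Here `cost_true` and `cost_scaled` coincide because doubling the numerator only doubles x̄(k), while `cost_flipped` collects a contribution of 4 at every sample where x(k) is nonzero.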

It is clear that $\sum_{k = 1}^{N} {(sign (y (k)) - sign (\hat{x} (k)))}^{2} = 0$ if (α̂₁, …, α̂_m, β̂₁, …, β̂_m) = (α₁, …, α_m, β₁, …, β_m). To guarantee that the optimization (5.1) will produce a correct estimate, what we have to show is that for all (α̂₁, …, α̂_m, β̂₁, …, β̂_m) ≠ (α₁, …, α_m, β₁, …, β_m), $0 < \sum_{k = 1}^{N} {(sign (x (k)) - sign (\hat{x} (k)))}^{2}$ or equivalently,

{(sign (x (k)) - sign (\hat{x} (k)))}^{2} = 4

for some k. The meaning is that the minimization of (5.1) has one and only one solution that is achieved at the true but unknown (α₁, …, α_m, β₁, …, β_m).

Recall h and ĥ are the impulse responses of G(z) and Ĝ (z) respectively. Obviously,

h = \hat{h} \Leftrightarrow ({\hat{α}}_{1}, \dots, {\hat{α}}_{m}, {\hat{β}}_{1}, \dots, {\hat{β}}_{m}) = (α_{1}, \dots, α_{m}, β_{1}, \dots, β_{m})

Now, write

\begin{array}{l} x (k n) = h_{n}^{'} φ_{n} (k n) + \sum_{i = n}^{\infty} h (i) u (k n - i) \\ \hat{x} (k n) = {\hat{h}}_{n}^{'} φ_{n} (k n) + \sum_{i = n}^{\infty} \hat{h} (i) u (k n - i) \end{array}

Before presenting the main result of this section, we make a few observations.

(1) φ_n(kn) and φ_n(jn) are iid if j ≠ k. Also, φ_n(kn) is an n-dimensional random vector that assumes any direction with a positive probability. Moreover, ||φ_n(kn)|| ≥ a/2 with a positive probability.

(2) ||ĥ|| = ||h|| = 1. Thus for any small ε > 0, there is an n₁ > 0 such that for all n ≥ n₁,

$1 \geq \sum_{k = 0}^{n - 1} h {(k)}^{2} \geq 1 - ε, 1 \geq \sum_{k = 0}^{n - 1} \hat{h} {(k)}^{2} \geq 1 - ε$

Further, h_n ≠ ĥ_n if h ≠ ĥ.

(3) For any small ε > 0, there exists n₂ > 0 such that for all n ≥ n₂,

$∣ \sum_{i = n}^{\infty} h (i) u (n - i) ∣ \leq ε, ∣ \sum_{i = n}^{\infty} \hat{h} (i) u (n - i) ∣ \leq ε .$
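The observations on the truncated impulse response can be checked numerically for a stable linear part. The sketch below is only an illustration under assumed values: the second-order system (poles at 0.4 and 0.5) and the helper names are hypothetical, and the infinite impulse response is approximated by its first 500 samples before normalizing to unit norm.

```python
import math


def impulse_response(alpha, beta, length):
    """First `length` samples of the impulse response h of
    G(z) = (alpha_1 z^{m-1}+...+alpha_m)/(z^m+beta_1 z^{m-1}+...+beta_m),
    using h(k) = alpha_k - sum_{i=1..m} beta_i h(k-i), alpha_k = 0 outside 1..m."""
    m = len(alpha)
    h = []
    for k in range(length):
        value = alpha[k - 1] if 1 <= k <= m else 0.0
        for i in range(1, m + 1):
            if k - i >= 0:
                value -= beta[i - 1] * h[k - i]
        h.append(value)
    return h


# Hypothetical stable example system, not taken from the paper.
alpha, beta = [1.0, 0.5], [-0.9, 0.2]
h_long = impulse_response(alpha, beta, 500)  # long horizon stands in for infinity
norm = math.sqrt(sum(v * v for v in h_long))
h = [v / norm for v in h_long]               # enforce ||h|| = 1


def head_energy(n):
    """Energy of the first n samples, i.e. sum_{k=0}^{n-1} h(k)^2."""
    return sum(v * v for v in h[:n])


for n in (5, 10, 20, 30):
    # The second column approaches 1 and the third (tail energy) approaches 0.
    print(n, head_energy(n), 1.0 - head_energy(n))
```

For a stable G(z) the tail energy decays geometrically, which is why a moderate n is enough for the truncated vector h_n to carry almost all of the norm.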

We now state the main results of this section.

Theorem 5.1

Consider the Wiener system in Figure 1 under Assumptions 5.1 and 5.2, and the estimate $\hat{G} (z) = \frac{{\hat{α}}_{1} z^{m - 1} + {\hat{α}}_{2} z^{m - 2} + \dots + {\hat{α}}_{m}}{z^{m} + {\hat{β}}_{1} z^{m - 1} + \dots + {\hat{β}}_{m}}$ derived from the minimization of (5.1). Then, with probability one as N → ∞,

\sum_{k = 1}^{N} {(sign (y (k)) - sign (\hat{x} (k)))}^{2} \geq 4

if (α̂₁, …, α̂_m, β̂₁, …, β̂_m) ≠ (α₁, …, α_m, β₁, …, β_m).

Proof

If (α̂₁, …,α̂_m, β̂₁, …, β̂_m) ≠ (α₁, …, α_m, β₁, …, β_m) or equivalently ĥ ≠ h, the angle θ = ∠(h, ĥ) between h and ĥ is non-zero. There are two cases, 0 < θ < 90° and 90° ≤ θ ≤ 180°. The proof for the second case is similar and we only show the first case.

From the observations, there is a large (possibly unknown) n and a small (possibly unknown) ε > 0 such that

\begin{array}{l} h_{n} \neq {\hat{h}}_{n}, ∥ h_{n} ∥ > 1 - ε, ∥ {\hat{h}}_{n} ∥ > 1 - ε \\ ∣ \sum_{i = n}^{\infty} h (i) u (k n - i) ∣ < ε, ∣ \sum_{i = n}^{\infty} \hat{h} (i) u (k n - i) ∣ < ε \end{array}

and

0 < ε < \frac{a}{2} (1 - ε) ∣ \sin (ξ) ∣, \frac{1}{4} θ \leq ξ \leq \frac{3}{4} θ .

This is because the right hand side goes to a positive value and the middle term goes to zero as ε → 0. In addition, the angle ∠(h_n, ĥ_n) lies in the interval [3θ/4, 5θ/4] for large n because ∠(h_n, ĥ_n) → ∠(h, ĥ). Further, there exists a vector φ_n(kn) with ||φ_n(kn)|| ≥ a/2 as shown in Figure 7.

Now,

\begin{array}{l} \cos (90 ° + \frac{1}{4} θ) = - \sin (\frac{1}{4} θ) < \frac{- 2 ε}{a (1 - ε)}, \\ \cos (90 ° + \frac{3}{4} θ) = - \sin (\frac{3}{4} θ) < \frac{- 2 ε}{a (1 - ε)} \\ \cos (90 ° - \frac{1}{2} θ) = \sin (\frac{1}{2} θ) > \frac{2 ε}{a (1 - ε)} \end{array}

which results in

\begin{array}{l} h_{n}^{'} φ_{n} (k n) = ∥ h_{n} ∥ ∥ φ_{n} (k n) ∥ \cos (∠ (h_{n}, φ_{n} (k n))) < \\ ∥ h_{n} ∥ ∥ φ_{n} (k n) ∥ \frac{- 2 ε}{a (1 - ε)} < - ε \\ {\hat{h}}_{n}^{'} φ_{n} (k n) = ∥ {\hat{h}}_{n} ∥ ∥ φ_{n} (k n) ∥ \cos (∠ ({\hat{h}}_{n}, φ_{n} (k n))) = \\ ∥ {\hat{h}}_{n} ∥ ∥ φ_{n} (k n) ∥ \cos (90 ° - \frac{1}{2} θ) > ε \end{array}

Therefore,

\begin{array}{l} x (k n) = \sum_{i = n}^{\infty} h (i) u (k n - i) + h_{n}^{'} φ_{n} (k n) \\ < ε + h_{n}^{'} φ_{n} (k n) < 0 \\ \hat{x} (k n) = \sum_{i = n}^{\infty} \hat{h} (i) u (k n - i) + {\hat{h}}_{n}^{'} φ_{n} (k n) \\ > - ε + {\hat{h}}_{n}^{'} φ_{n} (k n) > 0 \end{array}

{(sign (x (k n)) - sign (\hat{x} (k n)))}^{2} = 4

(5.2)

By continuity, any vector close enough to φ_n(kn) leads to the same conclusion as (5.2). Again from the observations, the vectors φ_n(kn), k = 1, 2, …, are iid, satisfy ||φ_n(kn)|| ≥ a/2 with a positive probability and assume any direction with a positive probability. Therefore, there is a positive probability for each k that φ_n(kn) produces (5.2). More precisely, for each k, there is a positive probability p > 0 that (sign(x(kn)) − sign(x̂ (kn)))² = 4 if (α̂₁, …, α̂_m, β̂₁, …, β̂_m) ≠ (α₁, …, α_m, β₁, …, β_m). By the Borel-Cantelli Lemma and $\sum_{k = 1}^{\infty} {(1 - p)}^{k} < \infty$ , we conclude that with probability one as N → ∞, there is a k such that (sign(x(kn)) − sign(x̂ (kn)))² = 4. This completes the proof.
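The last step of the proof can be spelled out as follows. This is a sketch in notation that is not in the original: E_k denotes the event that φ_n(kn) falls in the set of vectors producing (5.2), and the E_k are taken to be independent with P(E_k) = p because the φ_n(kn) are iid.

```latex
\begin{aligned}
P\Big(\bigcap_{k=1}^{K} E_k^{c}\Big)
  &= \prod_{k=1}^{K} P\big(E_k^{c}\big) = (1-p)^{K} \longrightarrow 0
  \quad (K \to \infty), \\
\sum_{K=1}^{\infty} (1-p)^{K} &= \frac{1-p}{p} < \infty .
\end{aligned}
```

Since the probabilities of seeing no sign mismatch in the first K blocks are summable, with probability one only finitely many of these "no mismatch so far" events occur, so some k with (sign(x(kn)) − sign(x̂ (kn)))² = 4 eventually appears.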

The result presented above is actually weaker than its counterpart for the FIR case in [2], where not only does the minimization have one and only one global minimum but there are also no other local minima. In other words, the objective function is a monotonic function of the angle between the estimate and the true but unknown system. We conjecture that the same conclusion holds for the IIR case but do not have a proof yet.

Identification algorithm under quadrant a priori information: Consider the system in Figure 1 under Assumptions 5.1 and 5.2.

Step 1: Collect data φ (k) and y(k), k = 1, …, N.

Step 2: Solve the minimization problem (5.1) to find the estimate (α̂₁, …, α̂_m, β̂₁, …, β̂_m).

Step 3: Define

\hat{G} (z) = \frac{{\hat{α}}_{1} z^{m - 1} + {\hat{α}}_{2} z^{m - 2} + \dots + {\hat{α}}_{m}}{z^{m} + {\hat{β}}_{1} z^{m - 1} + \dots + {\hat{β}}_{m}}
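The three steps can be sketched end to end as follows. This is a minimal illustration under loud assumptions, not the authors' procedure: a plain random search is swapped in for the genetic algorithm, the first-order true system and the nonlinearity f(x) = x + x³ are hypothetical, candidate stability is not checked (the sampling ranges are chosen so that m = 1 candidates are stable), and ||ĥ|| = 1 is imposed after the search on a truncated impulse response.

```python
import math
import random


def simulate(alpha, beta, u):
    """x(k) = sum_i alpha_i u(k-i) - sum_i beta_i x(k-i), zero initial conditions."""
    m, x = len(alpha), []
    for k in range(len(u)):
        x.append(sum((alpha[i - 1] * u[k - i] - beta[i - 1] * x[k - i]
                      for i in range(1, m + 1) if k - i >= 0), 0.0))
    return x


def sign(v):
    return (v > 0) - (v < 0)


def cost(alpha, beta, u, y):
    """Objective of (5.1) for one candidate."""
    x_bar = simulate(alpha, beta, u)
    return sum((sign(a) - sign(b)) ** 2 for a, b in zip(y, x_bar))


def random_search(u, y, m, trials, rng):
    """Step 2 with a zero-order random search (stand-in for the genetic algorithm)."""
    best, best_cost = None, float("inf")
    for _ in range(trials):
        alpha = [rng.uniform(-1.0, 1.0) for _ in range(m)]
        beta = [rng.uniform(-0.95, 0.95) for _ in range(m)]
        c = cost(alpha, beta, u, y)
        if c < best_cost:
            best, best_cost = (alpha, beta), c
    return best, best_cost


def normalize(alpha, beta, horizon=200):
    """Rescale the numerator so the truncated impulse response has unit norm."""
    h = simulate(alpha, beta, [1.0] + [0.0] * (horizon - 1))
    norm = math.sqrt(sum(v * v for v in h))
    return [a / norm for a in alpha], beta, [v / norm for v in h]


# Step 1: collect data from a hypothetical first-order system.
rng = random.Random(0)
N = 400
u = [rng.uniform(-1.0, 1.0) for _ in range(N)]
y = [v + v ** 3 for v in simulate([1.0], [-0.5], u)]  # sign(y) = sign(x)

# Step 2: minimize the sign-mismatch objective.
(alpha_hat, beta_hat), best_cost = random_search(u, y, m=1, trials=200, rng=rng)

# Step 3: form the normalized estimate of G(z).
alpha_hat, beta_hat, h_hat = normalize(alpha_hat, beta_hat)
```

Because the objective is piecewise constant in the parameters, gradient-based solvers do not apply directly, which is one reason a zero-order search is used in this sketch.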

We now test the algorithm on the same example (3.15) as in the previous section under the same input but under the assumption sign(y(k)) = sign(x(k)). A genetic algorithm [1] was applied with n = 30 and N = 5,000 and 10,000. The genetic algorithm is a heuristic zero-order iterative search algorithm. The total number of parents in the genetic algorithm was 64. Figures 8 and 9 show the estimates ĥ_n and Ĝ (e^jω) of h_n and G(e^jω), respectively, when SNR = 20 dB. Table 5 shows the estimation errors for various SNR values. All the results are the averages of 50 Monte Carlo simulations.

Fig. 8 — *ĥ_n* and *h_n*, sign a priori information.

Fig. 9 — Ĝ(*e^jω*)(solid) and G(*e^jω*)(dash-dot), sign a priori information.

Table 5.

Estimation error, sign a priori information.

SNR	10dB	20dB	40dB	∞
N=5,000	0.0093	0.0028	0.0005	0.0003
N=10,000	0.0057	0.0014	0.0002	0.0001


6 Concluding remarks

The focus of this paper is to derive identifiability under various minimal a priori information on the unknown nonlinearity. No theoretical results on noise analysis are presented. Noise effects are however extensively tested in numerical simulations. Theoretical study of noise effects will be an interesting research topic.

Our long term goal is to find what constitutes the least amount of a priori information that makes identification of a Wiener system possible. The findings presented in the paper are useful in this regard but there is still a long way to go to find the answer.

Biographies

Er-Wei Bai was educated in Fudan University, Shanghai Jiaotong University, both in Shanghai, China, and the University of California at Berkeley. Dr. Bai is Professor of Electrical and Computer Engineering at the University of Iowa where he teaches and conducts research in identification, control, signal processing and their applications in engineering and medicine.

Dr. Bai is an IEEE Fellow and a recipient of the President’s Award for Teaching Excellence.

John Reyland, Jr. is a Principal Digital Signal Processing Engineer at Rockwell Collins, Inc. in Cedar Rapids, Iowa. He is also a Ph.D. candidate in Electrical and Computer Engineering at the University of Iowa. Mr. Reyland has a B.S.E.E. from Texas A&M University and an M.S.E.E. from George Mason University in Fairfax, Virginia.

Footnotes

^⋆

This paper was not presented at any IFAC meeting. The work was supported in part by NSF ECS-0555394 and NIH/NIBIB EB004287.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

1. Abe M. Comparison of the Convergence of IIR Evolutionary Digital Filters and Other Adaptive Digital Filters on a Multiple-Peak Surface. Proc the Thirty-First Asilomar Conference on Signals, Systems & Computers. 1997;2:1674–1678.
2. Bai EW, Reyland J. Towards identification of Wiener systems with the least amount of a priori information on the nonlinearity. Automatica. 2008;44:910–919. doi: 10.1016/j.automatica.2008.11.020.
3. Bai EW. A blind approach to the Hammerstein-Wiener model identification. Automatica. 2002;38:967–979.
4. Billings SA, Fakhouri SY. Identification of a class of nonlinear systems using correlation analysis. Proc of IEE. 1978;125(7):691–697.
5. Bjorck A. Numerical methods for least squares problems. SIAM publisher; 1996.
6. Hu X, Chen HF. Strong consistence of recursive identification for Wiener systems. Automatica. 2005;41:1905–1916.
7. Crama P, Schoukens J. Initial estimates of Wiener and Hammerstein systems using multisine excitation. IEEE Trans on Instrumentation and Measurement. 2001;50:1791–1795.
8. Golub GH, Van Loan C. Matrix Computations. The Johns Hopkins University Press; Baltimore, Maryland: 1984.
9. Fan J, Yao Q. Nonlinear Time Series. Springer; New York: 2003.
10. Greblicki W. Nonparametric identification of Wiener systems. IEEE Trans on Info Theory. 1992;38:1487–1493.
11. Johnson C, Szulc T. Further lower bounds for the smallest singular values. Linear algebra and its applications. 1998;272:169–179.
12. Nadaraya EA. Nonparametric Estimation of Probability Densities and Regression Curves. Kluwer Academic Pub; Dordrecht, The Netherlands: 1989.
13. Papoulis A, Pillai SU. Probability, Random Variables and Stochastic Processes. 4th ed. McGraw Hill; Boston: 2002.
14. Voros J. Parameter identification of Wiener systems with discontinuous nonlinearities. Systems and Control Letters. 2001;44(5):363–372.
15. Westwick D, Verhaegen M. Identifying MIMO Wiener systems using subspace model identification method. Signal Processing. 1996;52:235–258.
16. Wigren T. Circle criteria in recursive identification. IEEE Trans on Automatic Control. 1997;42:975–979.
17. Wigren T. Recursive prediction error identification using the nonlinear Wiener model. Automatica. 1993;29:1011–1025.
18. Wigren T. Adaptive filtering using quantized output measurements. IEEE Trans on Signal Processing. 1998;46:3423–3426.
19. Zhang Q, Iouditski A, Ljung L. Identification of Wiener system with monotonous nonlinearity. IFAC Symp on System Identification. Newcastle, Australia: 2006. pp. 166–171.
