Computational Intelligence and Neuroscience
2016 Jun 5;2016:1690924. doi: 10.1155/2016/1690924

Neural Net Gains Estimation Based on an Equivalent Model

Karen Alicia Aguilar Cruz 1,*, José de Jesús Medel Juárez 1, José Luis Fernández Muñoz 2, Midory Esmeralda Vigueras Velázquez 1
PMCID: PMC4913025  PMID: 27366146

Abstract

A model of an Equivalent Artificial Neural Net (EANN) describes the set of gains, viewed as the parameters of a layer; this construction is reproducible and applies to any neuron in a neural net (NN). Because the EANN helps to estimate the NN gains, or parameters, we propose two methods to determine them. The first combines fuzzy inference with the traditional Kalman filter, obtaining the equivalent model and estimating, in a fuzzy sense, the gains matrix A and the proper gain K of the traditional filter identification. The second develops a direct estimation in state space, describing an EANN using the expected value and a recursive description of the gains estimation. Finally, both descriptions are compared: the analytical method describes the neural net coefficients in a direct form, whereas the hybrid technique requires selecting from the Knowledge Base (KB) the factors based on the functional error and the reference signal built with the past information of the system.

1. Introduction

There are different techniques for modelling a system to identify the characteristics that make it an excellent approximation. When considering the physical behaviour of the real process, these models usually lead to complex nonlinear systems, which are difficult to analyse and are simplified through relations between their input and output signals, obtaining a Black-Box (BB) description [1].

A system viewed as a BB gives no access to its internal properties, only to its input and output responses, without regard to the internal parameters and their dynamical evolution. However, human experience often supplies the original answer or selection through intuition (if-then, fuzzy-logic inferences), selecting the new parameters. These experiences provide better, tolerable approximations in combination with other theories, for example, Lyapunov, Sliding Modes, or Intelligent Systems.

Artificial Neural Nets (ANN), viewed as mathematical models applied to complex systems [2, 3] and inspired by the operation of biological neurons, generate fast answers. Nevertheless, computer devices and the human brain have different connotations: computational methods are faster because of robust programming rather than chemical reactions under natural conditions. In addition, the human brain is self-programming and adaptable without requiring new programming code, adjusting its required energy levels based on natural instincts and using intuition as a great tool.

An artificial neural model (Figure 1), based on a biological neuron principle and implemented as a computational model, is helpful in prediction, classification, pattern recognition, signal processing, estimation, and control [4]. When more than one neuron is connected, we obtain a MISO (Multiple Inputs and a Single Output) neural net, and, similar to human adaptability, the model adjusts its functional parameters through learning processes, producing different outputs depending on the stimuli.

Figure 1. Basic neuron model.

When analysing a system, its characteristics help us determine the best method according to our requirements, improving the convergence rate. In this paper, we compare a hybrid model and an analytical model. The first combines a fuzzy estimation with the Kalman filter description; the second is an optimal analytical method with systematic evaluation, considering the expected value obtained from the previous information. Both are based on an Equivalent Artificial Neural Net (EANN).

2. Equivalent Neural Net

In the biological sense, for neural-specific tasks, the inputs of the neuron cell must satisfy adequate properties to generate a fire action in the neuron soma and thus an accurate output in the neuron axon; otherwise, the biological system operates with the minimal energy required, which prevents losing connection with other neurons.

In an ANN, we identify three principal sections: input, hidden, and output layers. The input layer is the interaction between the input signals and the first block of weights; its result becomes the input to the following layer. At the very first stage, the designer selects the weights intuitively and adjusts the following ones while looking for the desired response [5]. The hidden layers have sets of inputs and outputs in different stages related through the weights. The output layer represents the convolution, or binary sum, of the last block of weights and its respective data. Figure 2 depicts an example of a typical ANN with input, output, and two hidden layers.

Figure 2. Artificial Neural Net description.

The EANN is a simplified representation of an ANN whose task is to obtain the parameter vector that allows the system to reach the desired reference signal, focusing on the estimation procedure rather than on the internal layers. It considers the total input-output signal relation, trying to reduce unnecessary delays while always keeping the weight interconnection form, achieving the desired response.

As shown in Figure 2, the input data is denoted by the set $\{u_i : i = \overline{1,n}\}$, $n \in \mathbb{Z}^+$, $u \in \mathbb{R}$, and the output data by $\{y_n \in \mathbb{R}\}$, where $n$ represents the number of input elements. In addition, the weights or parameters are $\{W_k^l : k = \overline{1,n}\}$, $n, l \in \mathbb{Z}^+$, $l = \overline{1,L}$, where $k$ indicates the specific parameter number in layer $l$. The admitted layers are within a set of functions $\{f_{i,j}^l\}$, $j = \overline{1,m}$, $m \in \mathbb{Z}^+$, $i = \overline{1,n}$, $l = \overline{1,L}$, interconnecting directionally an original parameter $i$ in layer $l$ to a target parameter $j$ in layer $(l+1)$. Each weight of the input and output layers requires an activation function, and all the hidden layers have proper activation functions connected to other weights to achieve different and specific requirements for each output stage. This description corresponds to Figure 3, where the traditional ANN connections now have simple flow-diagram lines containing activation functions described as $f_i^l$, $i = \overline{1,n}$; $n, l \in \mathbb{Z}^+$, $l = \overline{1,L}$, where $i$ represents the function number and $l$ the layer the function leaves.

Figure 3. Simple description of an Artificial Neural Net through activation functions.

Figure 3 presents the activation functions of the first hidden layer ($l = I$), operating with an accumulative energy $W_i^I$ convolved with an input $u_i$ in agreement with the following equation:

$$f_i^{I} = \begin{cases} W_i^{I} u_i, & \underline{\varpi_i} \le u_i \le \overline{\varpi_i}, \\ 0 & \text{in other cases.} \end{cases} \quad (1)$$

The set of pairs $\{(\underline{\varpi_i}, \overline{\varpi_i}) : i = \overline{1,n}\}$, $n \in \mathbb{Z}^+$, represents the activation limits of $f_i^{I}$. These limits denote the minimum and maximum energy required to excite a neuron for a specific weight $W_i^{I}$, known as fire limits. In the following hidden layer (II), $f_i^{II}$ requires that the set of inputs satisfy the same requirements, now applied to the previous results; that is,

$$f_i^{II} = \begin{cases} W_i^{II}\left(f_1^{I} \circ f_2^{I} \circ \cdots \circ f_n^{I}\right), & \underline{\varpi_i^{II}} \le f_1^{I} \circ f_2^{I} \circ \cdots \circ f_n^{I} \le \overline{\varpi_i^{II}}, \\ 0 & \text{in other cases.} \end{cases} \quad (2)$$

In (2), the binary operator "$\circ$" represents the composition of the involved terms without indicating a particular operation.
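The fire limits in (1) and (2) can be sketched as threshold gates. The following is a minimal sketch, not the paper's implementation: the function names are illustrative, and since the composition operator in (2) is left unspecified, it is assumed here to be a plain sum of the previous layer's outputs.

```python
import numpy as np

def layer_activation(weights, inputs, lower, upper):
    """Eq. (1): the i-th weight fires only when its input lies inside
    the fire limits [lower_i, upper_i]; otherwise the output is 0."""
    active = (inputs >= lower) & (inputs <= upper)
    return np.where(active, weights * inputs, 0.0)

def next_layer_activation(weights, prev_outputs, lower, upper):
    """Eq. (2), assuming the composition of the previous outputs
    is a plain sum (the paper does not fix the operation)."""
    composed = prev_outputs.sum()
    active = (composed >= lower) & (composed <= upper)
    return np.where(active, weights * composed, 0.0)
```

An input outside its fire limits contributes nothing, mirroring the minimal-energy behaviour of the biological neuron described above.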

The equivalent weights sequence allows each input to include the structure of the previous parameters in the final description. Each layer takes part in the following activation function due to the interaction between the new weights and the previous composed output signal. Figure 4 shows the EANN model in the simplified form.

Figure 4. Equivalent Neural Net representation.

According to [6], each neuron output has a function whose parameters are the inputs and weights of the following layer. Equation (3) expresses the influence of the previously mentioned parameters, weights, and inputs on the following layer in a recursive form, where, instead of $y_n$, $S_n^l$ describes the operation $\sum_{i=1}^{n} f_i^l W_i^l u_i$ as the core neuron function:

$$S_n^l = f_n^l W_n^l u_n + M S_{n-1}^l, \quad (3)$$

where $M$ is a proportional constant adjusting the previous layer; therefore, the model converges to the neural net development. At the final layer, we have the convolution $y_n = ((F \circ W) \circ U)_n^l$, which represents the neural net response. For computational applications, this response passes through an activation function, usually the sigmoid function.
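The recursion in (3) accumulates each step's contribution through the constant $M$. A minimal scalar sketch, assuming the per-step products $f_n^l W_n^l u_n$ are already evaluated and passed in as a sequence:

```python
def core_output(products, M, S0=0.0):
    """Eq. (3): S_n = f_n * W_n * u_n + M * S_{n-1}, starting from S0.
    `products` holds the per-step values f_n^l W_n^l u_n."""
    S, trace = S0, []
    for p in products:
        S = p + M * S       # new contribution plus scaled history
        trace.append(S)
    return trace
```

With $|M| < 1$ the influence of old layers decays geometrically, which is what lets the model converge.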

A sophisticated ANN considers the integration of more than one EANN since its description allows using the recursive characteristics. In addition to this, the implementation of EANNs gives the possibility to restrict the number of necessary iterations to reach a reference, which is the remarkable feature in systems where time delays are considerable.

3. Equivalent Neural Net Using ARMA Description

An ARMA(1,1) (Autoregressive Moving Average) model is a tool for obtaining the parameter matrix of a reference system viewed as a MISO BB; its primary structure is specified by (4) and (5), with $n$ denoting the time evolution:

$$S_{n+1}^l = A S_n^l + B_n^l u_n, \quad (4)$$
$$y_n = C S_n^l, \quad (5)$$

where $S_n^l \in \mathbb{R}_{[0,1]}$, $\{u_n\} \subseteq N(\mu_u, \sigma_u^2 < \infty)$, and $y_n \in \mathbb{R}_{[0,1]}$.
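A state-space pair of the form (4)-(5) can be simulated directly. The sketch below assumes scalar state and gains for clarity; the function name is illustrative:

```python
import numpy as np

def simulate_arma(A, B, C, u, S0=0.0):
    """Eqs. (4)-(5): S_{n+1} = A*S_n + B*u_n and y_n = C*S_n."""
    S, ys = S0, []
    for un in u:
        ys.append(C * S)    # Eq. (5): observe the current internal state
        S = A * S + B * un  # Eq. (4): advance the internal state
    return np.array(ys)
```

This plain forward simulation is the reference against which the identification in the following equations can be checked.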

This model has an observable state ($y_n$), an internal state ($S_n^l$), an input signal ($u_n$), gains ($B$, $C$), and an internal gain ($A$). The measurable state (6) in explicit form is a function of its immediate past, the internal gain, and the inputs $\{u_i\}_{i=\overline{1,n}}$. Consider

$$y_n = f\left(y_{n-1}, A, \{u_i\}_{i=\overline{1,n}}\right). \quad (6)$$

In [7], the internal state is described using the traditional Kalman filter (KF), even though the internal gain $A$ and the gain $K_n$ are still unknown. The complexity of the filter increases because, after the identification, the internal gain depends on the error; applying it in (4) gives the observable signal approximation in (5), represented in discrete form in the following equation:

$$S_n^l = A S_{n-1}^l + B_{n-1}^l u_{n-1}. \quad (7)$$

By applying (7) in (5), including the still unknown internal state, we obtain (8):

$$y_n = C A S_{n-1}^l + C B_{n-1}^l u_{n-1}. \quad (8)$$

From (6), expression (9) obtains the lagged internal state as a function of the measurable state and the output perturbations. Consider

$$S_{n-1}^l = C^{+} y_{n-1}. \quad (9)$$

Considering (9) in (8) we determine the output in the following equation:

$$y_n = A y_{n-1} + C B_{n-1}^l u_{n-1}. \quad (10)$$

Equation (11) represents a recursive form of (10) describing the reference system with an innovation process:

$$y_n = A y_{n-1} + \hat{\delta}\hat{w}_n, \quad (11)$$
$$\hat{\delta}\hat{w}_n = C B_{n-1}^l u_{n-1}. \quad (12)$$

In agreement with [8], the gain $K_{n-1}$ with (12) corresponds to $K_{n-1}\hat{\delta}\hat{w}_n \coloneqq \hat{y}_n - \hat{A}_n y_{n-1}$. The hybrid filter (13) considers the fuzzy parameter estimation, the gain description, and the lagged signal:

$$\hat{y}_n = \hat{A}_n y_{n-1} + K_{n-1}\hat{\delta}\hat{w}_n. \quad (13)$$

With the innovation process and the reference system, bounded by the same general Membership Function (MF) [8, 9], it is possible to estimate the explicit matrix parameters and the gain using the inference mechanisms considering the functional results and the noise properties, respectively.
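One step of the hybrid filter (13) is a single multiply-accumulate once $\hat{A}_n$ and $K_{n-1}$ have been chosen. The sketch below deliberately leaves the fuzzy selection outside, passing the chosen values as plain arguments; in the paper these come from the inference mechanisms and the KB, which are not modelled here.

```python
def hybrid_filter_step(y_prev, A_hat, K_prev, innovation):
    """Eq. (13): y_hat_n = A_hat_n * y_{n-1} + K_{n-1} * innovation.
    A_hat and K_prev are assumed already selected from the fuzzy
    Knowledge Base; the fuzzy selection itself is not modelled."""
    return A_hat * y_prev + K_prev * innovation
```

Iterating this step with KB-selected values produces the identified output sequence $\hat{y}_n$.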

4. Fuzzy Gains and Estimation Properties

In the fuzzy sense, [10] presented the parameters obtained by a controller considering a fuzzy function vector for nonlinear systems. The MIMO system first found the linear representation formed by a collection of MISO systems with the same inputs, reducing and simplifying the analysis.

In [11], the hybrid combination required that the identification filter adjust its parameters automatically using fuzzy logic. This adjustment needs the selection of the best values with respect to the inference, minimizing the error convergence by using heuristic techniques or the Least Squares Method (LSM).

The first step in the fuzzy estimation determines the reasoning levels in accordance with the proposed MF, identified through the statistical properties of the reference signal. Triangular, sinusoidal, impulsive, or Gaussian functions, among others, may define the ranges contained in the reference signal classification.

A set of fuzzy (if-then) rules forms a Fuzzy Rule Base (FRB) that interprets the required process conditions. Beforehand, it is necessary to select and introduce the best values into the Knowledge Base (KB) according to the MF, updating the parameters according to the reference model limited by the filter error criteria.

In the fuzzy stage, the fuzzy logic connectors consider the desired signal ($y_n$) and the region level with respect to ($J_n^n$), reducing the inference operational levels and indicators in the MF; selecting the values of the parameters $\hat{A}_n$ from the KB updates the hybrid filtering process. Each fuzzy filtering rule finds specific matrix parameters in each evolution [9, 12].

In the same sense, the hybrid filter considers the basic principles of a conventional Kalman digital filter using the Mean Least Square Criterion (MLSC), described as $J_n^n = \langle e_n, e_n^T \rangle (1/2)$ and, in agreement with [5], in its recursive form:

$$J_n^n = \left(\langle e_n, e_n^T\rangle + J_{n-1}^{n-1}\right)\frac{1}{2} \in \mathbb{R}_{[0,1]}. \quad (14)$$
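Reading (14) as $J_n^n = (\langle e_n, e_n^T\rangle + J_{n-1}^{n-1})/2$, the functional error is a running blend of the new squared error and the previous value. A scalar sketch, assuming errors bounded so that $J$ stays in $[0, 1]$:

```python
def functional_error_step(e_n, J_prev):
    """Eq. (14): J_n = (e_n^2 + J_{n-1}) * 1/2, which remains in
    [0, 1] whenever |e_n| <= 1 and J_prev is in [0, 1]."""
    return 0.5 * (e_n * e_n + J_prev)
```

When the identification error shrinks, the recursion halves the accumulated value at every step, so $J_n^n$ decays toward zero, which is the convergence indicator used in the inference stage.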

According to [9], (14) presents the adequate element describing the optimal matrix parameters.

5. Optimum Coefficient

For an ANN, determining an optimal coefficient vector requires minimizing the error, with the primary objective that the convergence tends to zero. One inconvenience is how long this optimal convergence will take. The control for a recurrent NN described in [13] was made optimal by adding an extra coefficient to compensate for the error within a small bound, in an unknown necessary learning time.

Considering the fact that the last stage of a hybrid filter corresponds to the equivalent neural net from Figure 3, it is possible to determine the optimum parameters for the neural weights obtaining the best output approximation to the reference signal by an analytical process.

Based on BB concepts, the input signals $\{u_i(n)\}_{i=\overline{1,N}}$, represented by the matrix $x_n$ $[N \times 1]$, and the output of the system $y_n$ are the known parameters. In this sense, we need a synthesis process to calculate the matrix values $A$ $[1 \times N]$ representing the weights in the neural layer.

Having $y_n = A x_n$ as an ARMA model and considering the stochastic properties of the process, we use the mathematical expectation in the probabilistic sense to obtain information about the process, so that $\hat{A}_n \coloneqq E[y_n x_n^T]\, E[x_n x_n^T]^{+}$, where the symbols $T$ and $+$ represent the transpose and pseudoinverse operators, respectively.

If $y_n$ is the reference signal that helps us get the parameters, then we apply these values to find the output $y_{iden,n}$; their comparison gives the identification error $e_n \coloneqq y_n - y_{iden,n}$ and its functional error $J_n^n \coloneqq \langle e_n, e_n^T \rangle$, tending to zero because the values considered are optimal.

To demonstrate this, from Figure 4, the output is observed as

$$y_n = w_1^{I} u_1(n) + w_2^{I} u_2(n) + \cdots + w_N^{I} u_N(n). \quad (15)$$

In addition, seeing y n as the reference or target signal defined as

$$A \coloneqq \begin{bmatrix} w_1^{I} & w_2^{I} & \cdots & w_N^{I} \end{bmatrix}_{1\times N}; \qquad x_n^T \coloneqq \begin{bmatrix} u_1(n) & u_2(n) & \cdots & u_N(n) \end{bmatrix}_{1\times N}, \quad (16)$$

we have the following form:

$$y_n = A x_n. \quad (17)$$

Considering that $x_n^T$ is a stochastic input whose components $u_i(n)$, $i = \overline{1,N}$, are distributed as $N(\mu, \sigma^2 < \infty)$, the parameters are represented by $A$ and the output signal by $y_n$; the BB system scheme allows estimating the parameter set through its time evolution in a probabilistic sense. Consider

$$E\left[y_n\, x_n^T\right]_{1\times N} = E\left[A_{1\times N}\, x_{n\,(N\times 1)}\, x_{n\,(1\times N)}^T\right]. \quad (18)$$

Because the weights are constants for an instant of execution time $n$, and considering the mathematical expectation properties, it is possible to obtain the matrix estimation, denoted $\hat{A}_n$ to indicate that this new array value is the matrix estimation. Consider

$$\hat{A}_{n\,(1\times N)} \coloneqq E\left[y_n\, x_n^T\right]\, E\left[x_{n\,(N\times 1)}\, x_{n\,(1\times N)}^T\right]^{+}. \quad (19)$$

For a discrete system (19) with enumerably infinite elements, the mathematical expectation has the following form:

$$\hat{A}_{n\,(1\times N)} \approx \left[\frac{1}{n}\sum_{i=1}^{n} y_i x_i^T\right]_{1\times N} \cdot \left[\frac{1}{n}\sum_{i=1}^{n} x_i x_i^T\right]_{N\times N}^{+}. \quad (20)$$
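The sample-mean form (20) can be computed in a few lines. In the sketch below, the inputs $x_i^T$ are stacked as rows of a matrix `X`, an arrangement chosen here for illustration:

```python
import numpy as np

def estimate_parameters(X, y):
    """Eq. (20): A_hat = [(1/n) sum y_i x_i^T] [(1/n) sum x_i x_i^T]^+.
    X is (n, N) with rows x_i^T; y has length n."""
    n = len(y)
    P = (y[:, None] * X).sum(axis=0) / n  # (1/n) sum y_i x_i^T, shape (N,)
    Q = X.T @ X / n                       # (1/n) sum x_i x_i^T, shape (N, N)
    return P @ np.linalg.pinv(Q)
```

The pseudoinverse keeps the estimate defined even when the input covariance matrix is rank-deficient, which is why (19) and (20) use $+$ rather than an ordinary inverse.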

By replacing $\hat{A}_{n\,(1\times N)}$ in (17), we obtain a new output state of $y_n$, which we call the identification, symbolically described as $y_{iden,n}$; it represents the output including the effects of the estimated weight values.

The difference between the identified signal and the reference signal gives the following identified error:

$$e_n = y_n - y_{iden,n}. \quad (21)$$

In order to express (20) recursively, the first and second terms are replaced with P n and Q n, respectively, defined as follows:

$$P_n = \frac{1}{n}\sum_{i=1}^{n} y_i x_i^T, \quad (22)$$
$$Q_n = \frac{1}{n}\sum_{i=1}^{n} x_i x_i^T. \quad (23)$$

Considering (22) and (23) in (20), (24) and its delayed form (25) for stable conditions are obtained:

$$\hat{A}_n = P_n Q_n^{+}, \quad (24)$$
$$\hat{A}_{n-1} = P_{n-1} Q_{n-1}^{+}. \quad (25)$$

Developing (22) in a recursive manner yields the following equation:

$$P_n = \frac{1}{n}\left[y_n x_n^T + \sum_{i=1}^{n-1} y_i x_i^T\right]. \quad (26)$$

Considering stationary conditions, (22) delayed one step gives

$$P_{n-1} = \frac{1}{n-1}\sum_{i=1}^{n-1} y_i x_i^T. \quad (27)$$

Rewriting (26) in terms of (27), we have (28) and its block diagram representation shown in Figure 5:

$$P_n = \frac{1}{n}\left[y_n x_n^T + (n-1) P_{n-1}\right]. \quad (28)$$

Expanding (28) and ordering with respect to P n−1, we have the following equation:

$$P_n = \frac{n-1}{n} P_{n-1} + \frac{1}{n} y_n x_n^T. \quad (29)$$

Now, applying (29) in (24), we have the following estimation:

$$\hat{A}_n = \frac{n-1}{n} P_{n-1} Q_n^{+} + \frac{1}{n} y_n x_n^T Q_n^{+}. \quad (30)$$

Remembering that, under stationary conditions, (25) is the delayed estimation and applying it in (30) yields the following:

$$\hat{A}_n = \frac{n-1}{n} \hat{A}_{n-1} Q_{n-1} Q_n^{+} + \frac{1}{n} y_n x_n^T Q_n^{+}. \quad (31)$$

Figure 5. Obtainment of $P_n$, recursive.

Using (31) in (24), we obtain the parameter vector in the recursive form (32). The block diagram representing the parameter $\hat{A}_n$ using (31) is shown in Figure 6. Consider

$$\hat{A}_n = \hat{A}_{n-1}\alpha_{n-1} + \beta_{n-1}, \quad (32)$$

where $\alpha_{n-1} = \frac{n-1}{n} Q_{n-1} Q_n^{+}$ and $\beta_{n-1} = \frac{1}{n} y_n x_n^T Q_n^{+}$.

Figure 6. Parameter $\hat{A}_n$ block diagram.

As (31) includes (23) in its description, it is necessary to build its recursive form similar to the obtainment of (28); then we have (33) and its block diagram representation shown in Figure 7. Consider

$$Q_n = \frac{1}{n}\left[x_n x_n^T + (n-1) Q_{n-1}\right]. \quad (33)$$

Figure 7. Obtainment of $Q_n$, recursive.

Finally, replacing (32) in (17), the identified output is the following equation:

$$y_{iden,n} = \hat{A}_n x_n. \quad (34)$$
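The whole analytical loop, (29), (33), (24), and (34), can be sketched as a single pass that updates $P_n$ and $Q_n$ without storing past samples; the function name is illustrative:

```python
import numpy as np

def identify(X, y):
    """Recursions (29) and (33) update P_n and Q_n sample by sample;
    (24) gives A_hat_n = P_n Q_n^+ and (34) the identified output."""
    N = X.shape[1]
    P, Q = np.zeros(N), np.zeros((N, N))
    for n, (x, yn) in enumerate(zip(X, y), start=1):
        P = (n - 1) / n * P + yn * x / n           # Eq. (29)
        Q = (n - 1) / n * Q + np.outer(x, x) / n   # Eq. (33)
    A_hat = P @ np.linalg.pinv(Q)                  # Eq. (24)
    return A_hat, X @ A_hat                        # Eq. (34): y_iden
```

After $n$ samples the recursions reproduce the batch means of (22) and (23) exactly, so the recursive estimate coincides with the direct form (20) while needing only constant memory.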

Figure 8 represents the interaction between the inputs and the resulting error, which has better convergence because the null error is determined, for each instant, by the best parameter values.

Figure 8. Block diagram of the analytical process.

6. Hybrid Mechanism of Inference

Figure 9 provides the block diagram of a hybrid filter that combines fuzzy inferences with the EANN ARMA model description, instead of the logical block of Figure 8, to determine the adequate matrix parameters. The reference model considered is a BB giving the reference signal $y_n$. The distribution curve of this signal denotes the intervals where the MF must lie; then, the degree of membership obtained through Mamdani fuzzy inferences accesses the Knowledge Base (KB), determining the parameters of the model, producing the convergence, and minimizing the error in a distribution sense.

Figure 9. Block diagram of a hybrid filter with fuzzy inference for an EANN model through ARMA description.

7. Results

The performed simulation compares both methods, giving a better idea of how they approximate the reference signal. The reference model output $y_n$ considered nonstationary conditions and noise sequences bounded by a distribution function, with, on average, constant expected value and variance. The signal variations form a periodic signal with smooth random perturbations.

The first part of the simulation considered the hybrid filter, applying inferences to obtain (13) as the output signal identification. Figure 10 shows the fuzzy inference process, where it is possible to identify the functional error given by (14), useful for estimating the coefficients of the ANN. The distribution curves defined the MFs with different operational levels, represented through three and seven MFs corresponding to $y_n$ and $J_n^n$, respectively.

Figure 10. Inference process in the hybrid filter.

These MFs result from proper associated inference mechanisms that select the parameters $\hat{A}_n$ and the gain $K_n$ through the MFs and the KBs, affecting the final identified output $\hat{y}_n$. As an example, Figure 11 presents a three-dimensional KB integrated by the sets gain $\{K_n\}$, reference signal $\{y_n\}$, and functional error $\{J_n^n\}$. This KB helps us determine the gain $K_n$ through the reference and the operational error, considering our expertise. The KB for $\hat{A}_n$ has a similar structure.

Figure 11. Example of a three-dimensional Knowledge Base to obtain the gain ($K_n$) through the reference signal ($y_n$) and the functional error ($J_n^n$).

The analytical method uses the block diagram presented in Figure 8, having a delay in execution time due to the time-state operations but within fewer process stages, because it does not require feedback from the functional error.

Our objective was to determine the internal parameters; Figure 12 compares the reference signal parameters to those estimated with both methods. The polar representation allows observing the components of the parameters, where it can be seen that none of them leaves the unit circle.

Figure 12. Comparison of the internal system parameters in a polar graph: reference signal parameters $A_n$ (blue) and estimated signal parameters ($\hat{A}_n$) through hybrid estimation (red) and analytical estimation (magenta).

When applying the estimation into the hybrid system, the response is as shown in Figure 13, which presents the response following the tendency of the reference.

Figure 13. Comparison of the system response: reference signal (blue) and its hybrid approximation (red).

The analytical method provides the response in Figure 14.

Figure 14. Comparison of the system response: reference signal (blue) and its analytical approximation (magenta).

The previous graphics, Figures 12–14, were obtained considering a reference system with variable parameters and random noise. To better identify how the approximations converge to the reference, Figure 15 presents a graphic segment showing more clearly both approximations to the response of a reference system with variable parameters but without random noise.

Figure 15. Comparison of the system response convergence for a system with variable parameters and no random noise.

From Figure 15, Figure 16 compares the convergence considering the functional error (14) for both methods. In this case, the reference is near zero as a constant value because the estimations are considered optimal.

Figure 16. Convergence of the functional error: comparison between the hybrid (magenta) and analytical (blue) evaluations.

8. Conclusion

An Equivalent Artificial Neural Net (EANN) was considered, describing its parameters through a Black-Box (BB) analysis using two different approximations: hybrid and analytical techniques.

For the fuzzy estimation, the best option was to consider the error properties, and with this method the response signal was adjusted according to the reference. The fuzzy evaluation allowed describing the coefficients and the gain that affect the Kalman filter, improving the identification process in accordance with the Multiple Inputs and Single Output (MISO) model changes under perturbations. The parameter and gain selection, using an intelligent system with classification levels, allowed selecting from the KBs the best coefficients, which positively affected the filter evolution. This method does not give an exact approximation, but it is good enough on average, as shown in Figure 12, and in distribution (Figures 13 and 15), considering that its response converged to a particular region different from zero.

The second method used the analytical approximation, converging at almost all points to the system parameters and the reference (Figures 12 and 14), so that the expected result was a minimum functional error through time. We considered that the null error corresponds to the low-energy limit, which is not zero in neurons, to avoid the total loss of connection. This method had a better approximation to the reference but achieves the minimum error only in the enumerably infinite limit. Even though this estimation does not consider the error feedback as the first method does, its response remains adequate when external perturbations affect the system.

A sophisticated ANN could be represented by the integration of more than one EANN because its recursive description allows considering more than one layer. In addition, the implementation of EANNs gives the possibility of more control over the number of iterations necessary to reach a reference; this is relevant for systems where restrictions on time delays are essential.

Globally, both methods presented good approximations, as shown in Figure 16, with unique characteristics identifying differences between the hybrid and analytical methods.

Acknowledgments

The authors wish to thank the Instituto Politécnico Nacional (IPN) and the Consejo Nacional de Ciencia y Tecnología (CONACYT) for their support while carrying out their research work. The Instituto Politécnico Nacional supported the paper through Project no. SIP20160382.

Competing Interests

The authors declare that there are no competing interests regarding the publication of this paper.

References

1. Romero Ugalde H. M., Carmona J.-C., Reyes-Reyes J., Alvarado V. M., Mantilla J. Computational cost improvement of neural network models in black box nonlinear system identification. Neurocomputing. 2015;166:96–108. doi: 10.1016/j.neucom.2015.04.022.
2. Witters M., Swevers J. Black-box model identification for a continuously variable, electro-hydraulic semi-active damper. Mechanical Systems and Signal Processing. 2010;24(1):4–18. doi: 10.1016/j.ymssp.2009.03.013.
3. Subudhi B., Jena D. A differential evolution based neural network approach to nonlinear system identification. Applied Soft Computing. 2011;11(1):861–871. doi: 10.1016/j.asoc.2010.01.006.
4. Passino K. M., Yurkovich S. Fuzzy Control. New York, NY, USA: Addison-Wesley; 1998.
5. Infante J. C. G., Juárez J. J. M., López P. G. Filtrado digital difuso en tiempo real [Real-time fuzzy digital filtering]. Computación y Sistemas. 2008;11(4):390–401.
6. Huang G.-B., Zhu Q.-Y., Siew C.-K. Real-time learning capability of neural networks. IEEE Transactions on Neural Networks. 2006;17(4):863–878. doi: 10.1109/tnn.2006.875974.
7. Kalman R. E. A new approach to linear filtering and prediction problems. Journal of Basic Engineering. 1960;82(1):35–45. doi: 10.1115/1.3662552.
8. Takagi T., Sugeno M. Fuzzy identification of systems and its applications to modeling and control. IEEE Transactions on Systems, Man and Cybernetics. 1985;15(1):116–132.
9. Medel Juárez J. J., García Infante J. C., Guevara López P. Real-Time Fuzzy Digital Filters (RTFDF) properties for SISO systems. Automatic Control and Computer Sciences. 2008;42(1):26–34. doi: 10.1007/s11950-008-1004-2.
10. Huaguang Z., Cai L., Bien Z. A fuzzy basis function vector-based multivariable adaptive controller for nonlinear systems. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics. 2000;30(1):210–217. doi: 10.1109/3477.826963.
11. Garcia J. C., Medel J. J., Sanchez J. C. Neural fuzzy digital filtering: multivariate identifier filters involving multiple inputs and multiple outputs (MIMO). Ingeniería e Investigación. 2011;31(1):184–192.
12. Dudley R. M. Real Analysis and Probability. Cambridge, UK: Cambridge University Press; 2004.
13. Zhang H., Cui L., Zhang X., Luo Y. Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method. IEEE Transactions on Neural Networks. 2011;22(12):2226–2236. doi: 10.1109/TNN.2011.2168538.

Articles from Computational Intelligence and Neuroscience are provided here courtesy of Wiley
