Manufacturing industry based on dynamic soft sensors in integrated with feature representation and classification using fuzzy logic and deep learning architecture

Shakir Khan; Tamanna Siddiqui; Azrour Mourade; Bayan Ibrahimm Alabduallah; Saad Abdullah Alajlan; Abrar almjally; Bader M Albahlal; Amani Alfaifi

doi:10.1007/s00170-023-11602-y

2023 Jun 6:1–13. Online ahead of print. doi: 10.1007/s00170-023-11602-y

PMCID: PMC10243703 PMID: 37360660

Abstract

Soft sensors are data-driven models that estimate quantities that are either impossible or prohibitively expensive to measure directly. Deep learning (DL) is a relatively new feature representation method for data with complex structures and shows considerable promise for soft sensing of industrial processes. Feature representation is one of the most important aspects of building accurate soft sensors. This research proposes a novel technique for automation in the manufacturing industry, in which dynamic soft sensors are used for feature representation and classification of the data. The input is data collected from virtual sensors together with their automation-based historical data. The data are pre-processed to identify missing values and common problems such as hardware failures, communication errors, incorrect readings, and process working conditions. Feature representation is then carried out using a fuzzy logic-based stacked data-driven auto-encoder (FL_SDDAE): using the fuzzy rules, the features of the input data are identified with respect to general automation problems. The represented features are then classified using a least square error backpropagation neural network (LSEBPNN), in which the mean square error during classification is minimized through the loss function. Experiments on various datasets from automation of the manufacturing industry show that the proposed technique achieves a computational time of 34%, QoS of 64%, RMSE of 41%, MAE of 35%, prediction performance of 94%, and measurement accuracy of 85%.

Keywords: Soft sensors, Deep learning, Automation, FL_SDDAE, Classification, LSEBPNN

Introduction

A soft sensor is a virtual inferential prediction method that uses easily measured variables to predict process variables that are difficult to measure directly because of technological or economic constraints or a complex environment. A soft sensor constructs a regression model between the easily measured variables and the difficult-to-measure variables, which addresses the issue that prevents such measurements from being used as feedback signals in quality control methods [1]. For at least 10 years, there has been a growing trend toward data-driven artificial intelligence (AI) approaches to enhance machines, processes, and products across several industrial domains [2]. In recent years, reducing emissions in response to stricter environmental restrictions has also been a major motivator [3]. However, gathering the data required for such approaches is fraught with difficulties, one of which is the long service life of industrial machinery. Official depreciation estimates range from (rarely) 6 to more than 30 years, depending on the country, type of machinery, and industrial sector [4]. Experience suggests that, particularly in small and medium-sized businesses, resilient equipment can remain in regular use even longer. Soft sensor approaches are increasingly used in industrial processes, and they have become a key emerging trend in both academia and industry [5]. Early researchers proposed model predictive control methods, such as generalized predictive control, dynamic matrix predictive control, and model control, based on model prediction in the industrial production process [6]. However, these soft sensor prediction approaches have several flaws. Artificial neural networks (ANN), rough sets, support vector machines (SVM), and hybrid techniques are some of the data-driven AI and ML methods that have been proposed for cases where key process and quality variables are difficult to measure, supported by advances in DL for soft sensor control and by continuous progress in engineering technology [7].

The contributions of this research are as follows:

To design a novel technique for automation in the manufacturing industry in which dynamic soft sensors are used for feature representation and classification of the data
To collect data from cloud storage and create a virtual sensor dataset based on gear fault detection, spindle fault detection, and bearing fault detection in the automation industry
To represent the features using a fuzzy logic-based stacked data-driven auto-encoder (FL_SDDAE), in which the features of the input data are identified with respect to general automation problems
To classify the features using a least square error backpropagation neural network (LSEBPNN), in which the mean square error during classification is minimized through the loss function
To evaluate the experimental results in terms of QoS, measurement accuracy, RMSE, MAE, prediction performance, and computational time

The paper is organized as follows. Section 2 describes related works. Section 3 gives details of the proposed method. The performance of the proposed method and the results are presented in Section 4. Finally, Section 5 concludes the work.

Related works

DL-based techniques have recently shown strong representation capability and success in a variety of computer science domains, including image processing, computer vision, and NLP [8]. The stacked autoencoder (SAE) [9], deep belief network (DBN) [10], CNN [11], and LSTM [12] are some of the most widely used deep network architectures. Greedy layer-wise unsupervised pre-training and supervised fine-tuning are highly important for DL architectures such as the SAE. The SAE weights obtained in the unsupervised pre-training step are used in the supervised fine-tuning stage, which is more effective than random weight initialization [13]. As a result, various industrial applications of soft sensors based on SAE [14] have been presented. The same authors improved this result significantly by using a TDNN in [15]. The mean error dropped to just 1.14 to 1.32% and 1.65° to 3.08° under the same conditions used in [16, 17]. The type of network used in these two papers therefore had a significant impact on the algorithms' performance. In [18], an RNN is presented that collects information on the air–fuel ratio λ, ignition angle, and turbocharger boost pressure in addition to the rotational speed signal. The focus was on the neural network design, which had a significant impact on the algorithm's performance. To estimate cylinder pressure curves, [19] uses an NN with radial basis functions (RBF) and consequently no recurrence. The authors of [20] present a novel convolutional, BiGRU, and Capsule network-based deep learning model, HCovBi-Caps, to classify hate speech, and the authors of [21] introduce BiCHAT, a novel BiLSTM with deep CNN and hierarchical attention-based deep learning model for tweet representation learning toward hate speech detection. Unlike earlier research, the authors of [15] do not use the raw rotational speed signal; they transform it into the frequency domain and process only the first 20 harmonics, using an RBF network. They also employ the 21st–50th harmonics of the structure-borne sound signal. As a result, the preparation of the given data is the most important aspect of that work. The typical errors for pMax and its position in the crank angle range are 3.4% and 1.5°, respectively. In contrast to the previous studies, [22] uses a multi-layer perceptron (MLP) to predict combustion parameters directly from the crankshaft's rotational speed and acceleration data. The mean error lies between 1.38° and 9.1°, with a range of 4.1 to 8.0%. A deep learning-based R2DCNNMC model is proposed in [23] for detection and classification of COVID-19 from chest X-ray image data. In [24], supervised models based on k-anonymity and l-diversity are used to classify healthcare data while preserving data privacy. Virtualization for dynamics on the cloud for network operation and management is discussed in [25], where the proposed hybrid cloud model ensures the maximum benefit from virtualization.

The effective implementations of SAE-based DL listed above reveal a significant capacity to extract features. Deep structures exceed typical soft-sensing prediction performance thanks to unsupervised layer-wise pre-training and supervised fine-tuning. However, the industrial soft sensors proposed so far are static methods based on the notion of a static, steady-state process, whereas the inherently dynamic nature of industrial processes cannot be neglected. Chemical processes, for example, are highly dynamic, with the current state being linked to earlier ones. As a result, the time-related characteristics of the recorded time-series data are important.

System model

This section discusses the proposed design for automation in the manufacturing industry based on dynamic soft sensors. The data are first processed to identify missing values and common problems such as hardware failures, incorrect readings, communication errors, and process working conditions. Their features are then represented in module 1, and the represented features are classified in module 2 using deep learning techniques. The overall research architecture is given in Fig. 1.
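The pre-processing step described above can be illustrated with a minimal sketch. The paper does not specify how missing or faulty readings are detected or repaired, so the valid range check and the forward-fill repair used here are assumptions, and the function names `flag_bad_readings` and `fill_forward` are hypothetical.

```python
import math


def flag_bad_readings(values, low, high):
    """Return indices of readings that are missing (None or NaN) or outside [low, high]."""
    bad = []
    for i, v in enumerate(values):
        is_missing = v is None or (isinstance(v, float) and math.isnan(v))
        # The range check is only evaluated for readings that are present.
        if is_missing or not (low <= v <= high):
            bad.append(i)
    return bad


def fill_forward(values, bad):
    """Replace flagged readings with the last valid reading (assumed repair rule)."""
    bad_set = set(bad)
    valid = [v for i, v in enumerate(values) if i not in bad_set]
    if not valid:
        raise ValueError("no valid readings to fill from")
    cleaned, last = [], valid[0]  # a flagged first reading takes the first valid value
    for i, v in enumerate(values):
        if i not in bad_set:
            last = v
        cleaned.append(last)
    return cleaned


# Hypothetical temperature trace with a dropped sample, a spike, and a NaN.
readings = [20.1, None, 20.4, 999.0, float("nan"), 20.6]
bad = flag_bad_readings(readings, low=0.0, high=100.0)
cleaned = fill_forward(readings, bad)
```

In practice, the thresholds would come from the known operating range of each sensor, and a gap spanning many samples would likely need a different repair rule than forward filling.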

Fig. 1 — Overall Proposed diagram for virtual sensor-based fault detection in automation industry

Feature representation using fuzzy logic based stacked data driven auto-encoder (FL_SDDAE)

Equation (1) represents the overall input–output transfer function of a general autoencoder (AE) structure. The input $x^{[α]} \in R^{d}$ is supplied to the hidden layer, whose output is used to reconstruct the input as $\hat{x}^{[α]}$ through the output layer $y$, as shown in Eq. (1).

\hat{x}^{[α]} = y_{(W', b')} (h_{(W, b)} (x^{[α]})) \equiv x^{[α]}
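A minimal sketch of the encode and decode mapping of a single autoencoder layer may help make the transfer function concrete. The sigmoid hidden activation, the linear output layer, and the mean squared reconstruction error are assumptions made for illustration; the names `encode`, `decode`, and `reconstruction_error` are hypothetical.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def affine(W, b, v):
    """Compute W v + b for a weight matrix given as a list of rows."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) + b_i
            for row, b_i in zip(W, b)]


def encode(W, b, x):
    """Hidden layer h_(W,b)(x) with a sigmoid activation (assumed)."""
    return [sigmoid(a) for a in affine(W, b, x)]


def decode(W_out, b_out, h):
    """Output layer y_(W',b')(h); a linear output is assumed here."""
    return affine(W_out, b_out, h)


def reconstruction_error(x, x_hat):
    """Mean squared difference between the input and its reconstruction."""
    return sum((a - c) ** 2 for a, c in zip(x, x_hat)) / len(x)


# Toy example: 3 inputs, 2 hidden units, arbitrary illustrative weights.
x = [0.2, -0.4, 0.9]
W = [[0.5, -0.3, 0.1], [0.2, 0.4, -0.6]]
b = [0.0, 0.1]
W_out = [[0.7, -0.2], [0.1, 0.3], [-0.5, 0.8]]
b_out = [0.0, 0.0, 0.0]
x_hat = decode(W_out, b_out, encode(W, b, x))
error = reconstruction_error(x, x_hat)
```

Training an autoencoder amounts to adjusting the weights and biases so that this reconstruction error becomes small over the training set.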

This mapping is also called an encoder or recognition model. The variational parameters ϕ are optimized such that, as shown in Eq. (2):

q_{ϕ} (z \mid x) \approx p_{θ} (z \mid x)

As stated in Eq. (3), the inference model can be any directed graphical model:

q_{ϕ} (z \mid x) = q_{ϕ} (z_{1}, \dots, z_{M} \mid x) = \prod_{j = 1}^{M} q_{ϕ} (z_{j} \mid Pa (z_{j}), x)

In the directed graph, $Pa (z_{j})$ is the set of parent variables of variable $z_{j}$.

\begin{aligned} \log p_{θ} (x) &= E_{q_{ϕ} (z \mid x)} [\log p_{θ} (x)] \\ &= E_{q_{ϕ} (z \mid x)} [\log \frac{p_{θ} (x, z)}{p_{θ} (z \mid x)}] \\ &= E_{q_{ϕ} (z \mid x)} [\log (\frac{p_{θ} (x, z)}{q_{ϕ} (z \mid x)} \frac{q_{ϕ} (z \mid x)}{p_{θ} (z \mid x)})] \\ &= \underbrace{E_{q_{ϕ} (z \mid x)} [\log \frac{p_{θ} (x, z)}{q_{ϕ} (z \mid x)}]}_{= L_{θ, ϕ} (x)} + \underbrace{E_{q_{ϕ} (z \mid x)} [\log \frac{q_{ϕ} (z \mid x)}{p_{θ} (z \mid x)}]}_{= D_{KL} (q_{ϕ} (z \mid x) \| p_{θ} (z \mid x))} \end{aligned}

The second term in Eq. (4) is the Kullback–Leibler (KL) divergence between $q_{ϕ} (z \mid x)$ and $p_{θ} (z \mid x)$, which is non-negative, as stated in Eq. (5):

D_{KL} (q_{ϕ} (z \mid x) \| p_{θ} (z \mid x)) \geq 0

The first term in Eq. (4) is the variational lower bound, commonly known as the ELBO, given in Eq. (6):

L_{θ, ϕ} (x) = E_{q_{ϕ} (z \mid x)} [\log p_{θ} (x, z) - \log q_{ϕ} (z \mid x)]

Because the KL divergence is non-negative, the ELBO is a lower bound on the log-likelihood of the data, as shown in Eq. (7).

L_{θ, ϕ} (x) = \log p_{θ} (x) - D_{KL} (q_{ϕ} (z \mid x) \| p_{θ} (z \mid x)) \leq \log p_{θ} (x)
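The decomposition of the log-likelihood into the ELBO plus a KL term can be checked numerically on a toy problem. The sketch below assumes a discrete latent variable with three states so that all expectations are exact sums; the probability values are arbitrary and the function name `elbo_and_kl` is hypothetical.

```python
import math


def elbo_and_kl(joint, q):
    """joint[z] = p(x, z) for one fixed x; q[z] = q(z|x).

    Returns (log p(x), ELBO, KL(q || p(z|x))).
    """
    evidence = sum(joint)                      # p(x) = sum over z of p(x, z)
    posterior = [p / evidence for p in joint]  # exact posterior p(z|x)
    elbo = sum(qz * (math.log(pz) - math.log(qz))
               for qz, pz in zip(q, joint) if qz > 0)
    kl = sum(qz * (math.log(qz) - math.log(pz))
             for qz, pz in zip(q, posterior) if qz > 0)
    return math.log(evidence), elbo, kl


joint = [0.10, 0.30, 0.20]  # arbitrary values of p(x, z) for z = 0, 1, 2
q = [0.50, 0.25, 0.25]      # an arbitrary approximate posterior
log_px, elbo, kl = elbo_and_kl(joint, q)
```

Here `log_px` equals `elbo + kl` up to rounding, and choosing `q` equal to the exact posterior drives the KL term to zero so that the bound becomes tight.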

\begin{aligned} \nabla_{θ} L_{θ, ϕ} (x) &= \nabla_{θ} E_{q_{ϕ} (z \mid x)} [\log p_{θ} (x, z) - \log q_{ϕ} (z \mid x)] \\ &= E_{q_{ϕ} (z \mid x)} [\nabla_{θ} (\log p_{θ} (x, z) - \log q_{ϕ} (z \mid x))] \\ &\simeq \nabla_{θ} (\log p_{θ} (x, z) - \log q_{ϕ} (z \mid x)) \\ &= \nabla_{θ} \log p_{θ} (x, z) \end{aligned}

Because the ELBO's expectation is taken with respect to $q_{ϕ} (z \mid x)$, which is a function of ϕ, the gradient with respect to ϕ cannot be moved inside the expectation, as shown by Eq. (9):

\nabla_{ϕ} L_{θ, ϕ} (x) = \nabla_{ϕ} E_{q_{ϕ} (z \mid x)} [\log p_{θ} (x, z) - \log q_{ϕ} (z \mid x)] \neq E_{q_{ϕ} (z \mid x)} [\nabla_{ϕ} (\log p_{θ} (x, z) - \log q_{ϕ} (z \mid x))]

In the case of continuous latent variables, a reparameterization approach is applied to compute unbiased estimates of $\nabla_{ϕ} L_{θ, ϕ} (x)$.

An expectation w.r.t. $q_{ϕ} (z \mid x)$ is replaced with one w.r.t. $p (ϵ)$ via the reparameterization given by Eq. (10).

L_{θ, ϕ} (x) = E_{q_{ϕ} (z \mid x)} [\log p_{θ} (x, z) - \log q_{ϕ} (z \mid x)] = E_{p (ϵ)} [\log p_{θ} (x, z) - \log q_{ϕ} (z \mid x)]

ϵ \sim p (ϵ), \quad z = g (ϕ, x, ϵ), \quad \tilde{L}_{θ, ϕ} (x ; ϵ) = \log p_{θ} (x, z) - \log q_{ϕ} (z \mid x)

\begin{aligned} E_{p (ϵ)} [\nabla_{θ, ϕ} \tilde{L}_{θ, ϕ} (x ; ϵ)] &= E_{p (ϵ)} [\nabla_{θ, ϕ} (\log p_{θ} (x, z) - \log q_{ϕ} (z \mid x))] \\ &= \nabla_{θ, ϕ} (E_{p (ϵ)} [\log p_{θ} (x, z) - \log q_{ϕ} (z \mid x)]) \\ &= \nabla_{θ, ϕ} L_{θ, ϕ} (x) \end{aligned}

A simple factorized Gaussian encoder is given by Eq. (12):

q_{ϕ} (z \mid x) = N (z ; μ, diag (σ^{2})), \quad (μ, \log σ) = EncoderNeuralNet_{ϕ} (x), \quad q_{ϕ} (z \mid x) = \prod_{i} q_{ϕ} (z_{i} \mid x) = \prod_{i} N (z_{i} ; μ_{i}, σ_{i}^{2})

z = μ + σ ⊙ ϵ

The log determinant of the Jacobian is given by Eq. (13):

\log d_{ϕ} (x, ϵ) = \log |\det (\frac{\partial z}{\partial ϵ})| = \sum_{i} \log σ_{i}

and the posterior density is given by Eq. (14):

\log q_{ϕ} (z \mid x) = \log p (ϵ) - \log d_{ϕ} (x, ϵ) = \sum_{i} [\log N (ϵ_{i} ; 0, 1) - \log σ_{i}], \quad \text{when } z = g (ϕ, x, ϵ)
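As a hedged illustration of the factorized Gaussian case, the sketch below draws `z` through the reparameterization and evaluates the log posterior density in two ways: through the noise variable and the log-determinant term, and directly from the Gaussian density. The numerical values of `mu` and `log_sigma` stand in for an encoder output and are arbitrary; all function names are hypothetical.

```python
import math
import random


def reparameterize(mu, log_sigma, eps):
    """z = mu + sigma * eps, element-wise, with sigma = exp(log_sigma)."""
    return [m + math.exp(ls) * e for m, ls, e in zip(mu, log_sigma, eps)]


def log_standard_normal(e):
    return -0.5 * (math.log(2.0 * math.pi) + e * e)


def log_q_via_eps(log_sigma, eps):
    """log q(z|x) = sum_i [log N(eps_i; 0, 1) - log sigma_i]."""
    return sum(log_standard_normal(e) - ls for e, ls in zip(eps, log_sigma))


def log_q_direct(z, mu, log_sigma):
    """log q(z|x) evaluated directly as a diagonal Gaussian density."""
    total = 0.0
    for zi, m, ls in zip(z, mu, log_sigma):
        standardized = (zi - m) / math.exp(ls)
        total += -0.5 * math.log(2.0 * math.pi) - ls - 0.5 * standardized ** 2
    return total


rng = random.Random(0)
mu = [0.5, -1.0]         # stand-in for the encoder mean output
log_sigma = [0.1, -0.3]  # stand-in for the encoder log-std output
eps = [rng.gauss(0.0, 1.0) for _ in mu]
z = reparameterize(mu, log_sigma, eps)
```

Both evaluations agree up to rounding, which is why the density can be computed from `eps` alone without inverting the transformation.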

From Eq. (15), with $z - E [z] = L ϵ$, the covariance is:

Σ = E [(z - E [z]) (z - E [z])^{T}] = E [L ϵ (L ϵ)^{T}] = L E [ϵ ϵ^{T}] L^{T} = L L^{T}

Let $G (x) : X \subset R^{n} \to R$ be a function on the compact set $X = [α_{1}, β_{1}] \times \dots \times [α_{n}, β_{n}]$ whose analytic formula is unknown.

Define $N_{j}$ $(j = 1, 2, \dots, n)$ fuzzy sets $A_{j}^{1}, A_{j}^{2}, \dots, A_{j}^{N_{j}}$ in $[α_{j}, β_{j}]$, which are normal, consistent, and complete with triangular MFs $μ_{A_{j}^{1}} (x_{j} ; a_{j}^{1}, b_{j}^{1}, c_{j}^{1}), \dots, μ_{A_{j}^{N_{j}}} (x_{j} ; a_{j}^{N_{j}}, b_{j}^{N_{j}}, c_{j}^{N_{j}})$, and $A_{j}^{1} < A_{j}^{2} < \dots < A_{j}^{N_{j}}$ with $a_{j}^{1} = b_{j}^{1} = α_{j}$ and $b_{j}^{N_{j}} = c_{j}^{N_{j}} = β_{j}$. Define the points $e_{j}^{k}$ as follows:

$e_{1}^{1} = α_{1}, e_{1}^{N_{1}} = β_{1}$, and $e_{1}^{j} = b_{1}^{j}$ for $j = 2, 3, \dots, N_{1} - 1$; $e_{2}^{1} = α_{2}, e_{2}^{N_{2}} = β_{2}$, and $e_{2}^{j} = b_{2}^{j}$ for $j = 2, 3, \dots, N_{2} - 1$; $\dots$; $e_{n}^{1} = α_{n}, e_{n}^{N_{n}} = β_{n}$, and $e_{n}^{j} = b_{n}^{j}$ for $j = 2, 3, \dots, N_{n} - 1$.
Construct $I = N_{1} \times N_{2} \times \dots \times N_{n}$ fuzzy if–then rules in the following form:
$R_{X}^{j_{1} \dots j_{n}}$: IF $x_{1}$ is $A_{1}^{j_{1}}$ and $x_{2}$ is $A_{2}^{j_{2}}$ and … and $x_{n}$ is $A_{n}^{j_{n}}$ THEN $y$ is $B^{j_{1} \dots j_{n}}$, where $j_{1} = 1, 2, \dots, N_{1}, j_{2} = 1, 2, \dots, N_{2}, \dots, j_{n} = 1, 2, \dots, N_{n}$, and the center of the fuzzy set $B^{j_{1} \dots j_{n}}$, denoted by $\bar{y}^{j_{1} \dots j_{n}}$, is chosen as in Eq. (16):
$\bar{y}^{j_{1} \dots j_{n}} = G (e_{1}^{j_{1}}, \dots, e_{n}^{j_{n}})$
The firing strength of rule $i$ is $ϑ_{i} = t (μ_{A_{1}^{j_{1} \dots j_{n}, i}} (x_{1}), μ_{A_{2}^{j_{1} \dots j_{n}, i}} (x_{2}), \dots, μ_{A_{n}^{j_{1} \dots j_{n}, i}} (x_{n}))$.

Therefore, from $μ_{\bar{B}^{i}} (y) = t (ϑ_{i}, μ_{B^{i}} (y)), \forall y \in R$, fuzzy inference produces the output fuzzy set of each rule as $μ_{\bar{B}^{j_{1} \dots j_{n}, i}} (y) = t (ϑ_{i}, μ_{B^{j_{1} \dots j_{n}, i}} (y)), \forall y \in R$, and the rule outputs are combined as $μ_{\bar{B}^{j_{1} \dots j_{n}}} (y) = s (μ_{\bar{B}^{j_{1} \dots j_{n}, 1}} (y), μ_{\bar{B}^{j_{1} \dots j_{n}, 2}} (y), \dots, μ_{\bar{B}^{j_{1} \dots j_{n}, I}} (y))$. Here $a_{j}^{i}$ are parameters that are evaluated by LSM.

μ_{Q_{IM}} (x, y) = \min [μ_{A_{1}} (x), μ_{A_{2}} (y)], \quad Q_{IM} \in X \times Y

μ_{B'} (y) = \max_{\forall i} [\sup_{x \in X} \min (μ_{A'} (x), μ_{A_{1}^{i}} (x_{1}), \dots, μ_{A_{n}^{i}} (x_{n}), μ_{B^{i}} (y))]

μ_{A'} (x) = \begin{cases} 1 & \text{if } x = x^{*} \\ 0 & \text{otherwise} \end{cases}

y^{*} = \frac{\sum_{i = 1}^{I} \bar{y}^{i} w_{i}}{\sum_{i = 1}^{I} w_{i}}

Since the fuzzy sets $A_{j}^{1}, \dots, A_{j}^{N_{j}}$ are complete at every $x \in X$, there exist $j_{1}, j_{2}, \dots, j_{n}$ such that $\min (μ_{A_{1}^{j_{1}}} (x_{1}), μ_{A_{2}^{j_{2}}} (x_{2}), \dots, μ_{A_{n}^{j_{n}}} (x_{n})) \neq 0$. Let $f (x)$ be the fuzzy system in (13) and $G (x)$ be the unknown function in (18). If $G (x)$ is continuously differentiable on $X = [α_{1}, β_{1}] \times [α_{2}, β_{2}] \times \dots \times [α_{n}, β_{n}]$, then:

\| G - f \|_{\infty} \leq \| \frac{\partial G}{\partial x_{1}} \|_{\infty} h_{1} + \| \frac{\partial G}{\partial x_{2}} \|_{\infty} h_{2} + \dots + \| \frac{\partial G}{\partial x_{n}} \|_{\infty} h_{n}

where the infinity norm $\| \cdot \|_{\infty}$ is given as $\| d (x) \|_{\infty} = \sup_{x \in X} |d (x)|$ and $h_{j} = \max_{1 \leq k \leq N_{j} - 1} |e_{j}^{k + 1} - e_{j}^{k}|$ $(j = 1, 2, \dots, n)$. Let $X^{j_{1} \dots j_{n}} = [e_{1}^{j_{1}}, e_{1}^{j_{1} + 1}] \times [e_{2}^{j_{2}}, e_{2}^{j_{2} + 1}] \times \dots \times [e_{n}^{j_{n}}, e_{n}^{j_{n} + 1}]$, where $j_{1} = 1, 2, \dots, N_{1} - 1, j_{2} = 1, 2, \dots, N_{2} - 1, \dots, j_{n} = 1, 2, \dots, N_{n} - 1$. Since $[α_{j}, β_{j}] = [e_{j}^{1}, e_{j}^{2}] \cup [e_{j}^{2}, e_{j}^{3}] \cup \dots \cup [e_{j}^{N_{j} - 1}, e_{j}^{N_{j}}]$ for $j = 1, 2, \dots, n$, the cells $X^{j_{1} \dots j_{n}}$ cover $X$, so every $x \in X$ lies in some cell. For $x$ in such a cell, only the fuzzy sets at the corners of the cell have non-zero membership, and Eq. (19) gives:

f (x) = \frac{\sum_{k_{1} = j_{1}}^{j_{1} + 1} \dots \sum_{k_{n} = j_{n}}^{j_{n} + 1} \bar{y}^{k_{1} \dots k_{n}} \min (μ_{A_{1}^{k_{1}}} (x_{1}), μ_{A_{2}^{k_{2}}} (x_{2}), \dots, μ_{A_{n}^{k_{n}}} (x_{n}))}{\sum_{k_{1} = j_{1}}^{j_{1} + 1} \dots \sum_{k_{n} = j_{n}}^{j_{n} + 1} \min (μ_{A_{1}^{k_{1}}} (x_{1}), μ_{A_{2}^{k_{2}}} (x_{2}), \dots, μ_{A_{n}^{k_{n}}} (x_{n}))}

From (20), (21), (22), we obtain:

f (x) = \sum_{k_{1} = j_{1}}^{j_{1} + 1} \dots \sum_{k_{n} = j_{n}}^{j_{n} + 1} [\frac{\min (μ_{A_{1}^{k_{1}}} (x_{1}), \dots, μ_{A_{n}^{k_{n}}} (x_{n}))}{\sum_{k_{1} = j_{1}}^{j_{1} + 1} \dots \sum_{k_{n} = j_{n}}^{j_{n} + 1} \min (μ_{A_{1}^{k_{1}}} (x_{1}), \dots, μ_{A_{n}^{k_{n}}} (x_{n}))}] G (e_{1}^{k_{1}}, \dots, e_{n}^{k_{n}})

\sum_{k_{1} = j_{1}}^{j_{1} + 1} \dots \sum_{k_{n} = j_{n}}^{j_{n} + 1} [\frac{\min (μ_{A_{1}^{k_{1}}} (x_{1}), \dots, μ_{A_{n}^{k_{n}}} (x_{n}))}{\sum_{k_{1} = j_{1}}^{j_{1} + 1} \dots \sum_{k_{n} = j_{n}}^{j_{n} + 1} \min (μ_{A_{1}^{k_{1}}} (x_{1}), \dots, μ_{A_{n}^{k_{n}}} (x_{n}))}] = 1

\begin{aligned} |G (x) - f (x)| &\leq \sum_{k_{1} = j_{1}}^{j_{1} + 1} \dots \sum_{k_{n} = j_{n}}^{j_{n} + 1} [\frac{\min (μ_{A_{1}^{k_{1}}} (x_{1}), \dots, μ_{A_{n}^{k_{n}}} (x_{n}))}{\sum_{k_{1} = j_{1}}^{j_{1} + 1} \dots \sum_{k_{n} = j_{n}}^{j_{n} + 1} \min (μ_{A_{1}^{k_{1}}} (x_{1}), \dots, μ_{A_{n}^{k_{n}}} (x_{n}))}] |G (x) - G (e_{1}^{k_{1}}, \dots, e_{n}^{k_{n}})| \\ &\leq \max_{k_{1} = j_{1}, j_{1} + 1 ; \dots ; k_{n} = j_{n}, j_{n} + 1} |G (x) - G (e_{1}^{k_{1}}, \dots, e_{n}^{k_{n}})| \end{aligned}

From the Mean Value Theorem, (23) is given as:

|G (x) - f (x)| \leq \max_{k_{1} = j_{1}, j_{1} + 1 ; \dots ; k_{n} = j_{n}, j_{n} + 1} (\| \frac{\partial G}{\partial x_{1}} \|_{\infty} |x_{1} - e_{1}^{k_{1}}| + \| \frac{\partial G}{\partial x_{2}} \|_{\infty} |x_{2} - e_{2}^{k_{2}}| + \dots + \| \frac{\partial G}{\partial x_{n}} \|_{\infty} |x_{n} - e_{n}^{k_{n}}|)

Since $x \in X^{j_{1} \dots j_{n}}$ means that $x_{1} \in [e_{1}^{j_{1}}, e_{1}^{j_{1} + 1}], x_{2} \in [e_{2}^{j_{2}}, e_{2}^{j_{2} + 1}], \dots, x_{n} \in [e_{n}^{j_{n}}, e_{n}^{j_{n} + 1}]$, we have by Eq. (24):

|x_{1} - e_{1}^{k_{1}}| \leq |e_{1}^{j_{1} + 1} - e_{1}^{j_{1}}|, \quad |x_{2} - e_{2}^{k_{2}}| \leq |e_{2}^{j_{2} + 1} - e_{2}^{j_{2}}|, \quad \dots, \quad |x_{n} - e_{n}^{k_{n}}| \leq |e_{n}^{j_{n} + 1} - e_{n}^{j_{n}}| \quad \text{for } k_{1} = j_{1}, j_{1} + 1, \; k_{2} = j_{2}, j_{2} + 1, \; \dots, \; k_{n} = j_{n}, j_{n} + 1

Then, (25) becomes:

|G (x) - f (x)| \leq \| \frac{\partial G}{\partial x_{1}} \|_{\infty} |e_{1}^{j_{1} + 1} - e_{1}^{j_{1}}| + \| \frac{\partial G}{\partial x_{2}} \|_{\infty} |e_{2}^{j_{2} + 1} - e_{2}^{j_{2}}| + \dots + \| \frac{\partial G}{\partial x_{n}} \|_{\infty} |e_{n}^{j_{n} + 1} - e_{n}^{j_{n}}|

Since $\| d (x) \|_{\infty} = \sup_{x \in X} |d (x)|$, we have $\| G - f \|_{\infty} = \sup_{x \in X} |G (x) - f (x)|$, and we get:

\| G - f \|_{\infty} \leq \| \frac{\partial G}{\partial x_{1}} \|_{\infty} \max_{1 \leq j_{1} \leq N_{1} - 1} |e_{1}^{j_{1} + 1} - e_{1}^{j_{1}}| + \dots + \| \frac{\partial G}{\partial x_{n}} \|_{\infty} \max_{1 \leq j_{n} \leq N_{n} - 1} |e_{n}^{j_{n} + 1} - e_{n}^{j_{n}}|

\therefore \| G - f \|_{\infty} \leq \| \frac{\partial G}{\partial x_{1}} \|_{\infty} h_{1} + \| \frac{\partial G}{\partial x_{2}} \|_{\infty} h_{2} + \dots + \| \frac{\partial G}{\partial x_{n}} \|_{\infty} h_{n}

From (26), we conclude that fuzzy systems of this form can approximate $G$ to any desired accuracy. Since $\| \frac{\partial G}{\partial x_{1}} \|_{\infty}, \| \frac{\partial G}{\partial x_{2}} \|_{\infty}, \dots, \| \frac{\partial G}{\partial x_{n}} \|_{\infty}$ are finite numbers, for any given $ε > 0$ we can select $h_{1}, h_{2}, \dots, h_{n}$ small enough such that $\| \frac{\partial G}{\partial x_{1}} \|_{\infty} h_{1} + \| \frac{\partial G}{\partial x_{2}} \|_{\infty} h_{2} + \dots + \| \frac{\partial G}{\partial x_{n}} \|_{\infty} h_{n} < ε$. Hence, from (27):

\sup_{x \in X} |G (x) - f (x)| = \| G - f \|_{\infty} < ε
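The approximation bound can be checked numerically on a small example. The sketch below builds a two-input fuzzy system with evenly spaced triangular membership functions, minimum inference, and a weighted-average output whose rule centers are the values of `G` at the peaks, following the construction in this section. The test function `G`, the domain [0, 1] × [0, 1], the choice of 11 fuzzy sets per input, and all function names are assumptions made for illustration.

```python
import math
from itertools import product


def triangular(x, a, b, c):
    """Triangular MF with feet a and c and peak b (a == b or b == c at the edges)."""
    if x == b:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if b < x < c:
        return (c - x) / (c - b)
    return 0.0


def make_partition(lo, hi, n_sets):
    """Evenly spaced peaks on [lo, hi]; each fuzzy set is (left foot, peak, right foot)."""
    peaks = [lo + (hi - lo) * k / (n_sets - 1) for k in range(n_sets)]
    return [(peaks[max(k - 1, 0)], peaks[k], peaks[min(k + 1, n_sets - 1)])
            for k in range(n_sets)]


def build_fuzzy_system(G, partitions):
    """One rule per combination of fuzzy sets; each rule center is G at the peaks."""
    index_ranges = [range(len(p)) for p in partitions]
    rules = [(idx, G(*[partitions[d][k][1] for d, k in enumerate(idx)]))
             for idx in product(*index_ranges)]

    def f(*x):
        num = den = 0.0
        for idx, y_bar in rules:
            # Minimum inference: firing strength is the smallest membership degree.
            w = min(triangular(x[d], *partitions[d][k]) for d, k in enumerate(idx))
            num += y_bar * w
            den += w
        return num / den

    return f


def G(x1, x2):
    """Illustrative target function; both partial derivatives are bounded by 1 on [0, 1]^2."""
    return math.sin(x1) + 0.5 * x2 ** 2


partitions = [make_partition(0.0, 1.0, 11), make_partition(0.0, 1.0, 11)]
f = build_fuzzy_system(G, partitions)
grid = [k / 40 for k in range(41)]
max_error = max(abs(G(a, b) - f(a, b)) for a in grid for b in grid)
bound = 1.0 * 0.1 + 1.0 * 0.1  # sup|dG/dx1| * h1 + sup|dG/dx2| * h2 with h1 = h2 = 0.1
```

The measured `max_error` on the test grid stays within `bound`, and refining the partitions shrinks the bound proportionally, at the cost of a rule count that grows with the product of the numbers of fuzzy sets.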

We can see from (28) that, to design a fuzzy system with a pre-specified accuracy, we need to know the bounds of the derivatives of $G (x)$ with respect to $x_{1}, x_{2}, \dots, x_{n}$, that is:

\| \frac{\partial G}{\partial x_{1}} \|_{\infty}, \| \frac{\partial G}{\partial x_{2}} \|_{\infty}, \dots, \| \frac{\partial G}{\partial x_{n}} \|_{\infty}

Select a fuzzy system with a MIS, an SF, a CAD, and a triangular MF, which then gives Eq. (29).

f (x) = \frac{\sum_{i = 1}^{I} \bar{y}^{i} (\min_{\forall j} μ_{A_{j}^{i}} (x_{j}))}{\sum_{i = 1}^{I} (\min_{\forall j} μ_{A_{j}^{i}} (x_{j}))} = \frac{\sum_{i = 1}^{I} \bar{y}^{i} [\min_{\forall j} (\max (\min (\frac{x_{j} - a_{j}^{i}}{b_{j}^{i} - a_{j}^{i}}, \frac{c_{j}^{i} - x_{j}}{c_{j}^{i} - b_{j}^{i}}), 0))]}{\sum_{i = 1}^{I} [\min_{\forall j} (\max (\min (\frac{x_{j} - a_{j}^{i}}{b_{j}^{i} - a_{j}^{i}}, \frac{c_{j}^{i} - x_{j}}{c_{j}^{i} - b_{j}^{i}}), 0))]}

Using more rules increases the number of parameters and the amount of computation, but gives better accuracy. When the initial parameters $\bar{y}^{i} (0)$, $a_{j}^{i} (0)$, $b_{j}^{i} (0)$, $c_{j}^{i} (0)$ are specified, the fuzzy system is given by Eq. (30).

f (x) = \frac{\sum_{j_{1} = 1}^{N_{1}} \dots \sum_{j_{n} = 1}^{N_{n}} \bar{y}^{j_{1} \dots j_{n}} (0) [\min_{\forall k} (\max (\min (\frac{x_{k 0}^{p} - a_{k}^{j_{1} \dots j_{n}} (0)}{b_{k}^{j_{1} \dots j_{n}} (0) - a_{k}^{j_{1} \dots j_{n}} (0)}, \frac{c_{k}^{j_{1} \dots j_{n}} (0) - x_{k 0}^{p}}{c_{k}^{j_{1} \dots j_{n}} (0) - b_{k}^{j_{1} \dots j_{n}} (0)}), 0))]}{\sum_{j_{1} = 1}^{N_{1}} \dots \sum_{j_{n} = 1}^{N_{n}} [\min_{\forall k} (\max (\min (\frac{x_{k 0}^{p} - a_{k}^{j_{1} \dots j_{n}} (0)}{b_{k}^{j_{1} \dots j_{n}} (0) - a_{k}^{j_{1} \dots j_{n}} (0)}, \frac{c_{k}^{j_{1} \dots j_{n}} (0) - x_{k 0}^{p}}{c_{k}^{j_{1} \dots j_{n}} (0) - b_{k}^{j_{1} \dots j_{n}} (0)}), 0))]}

For a sigmoid activation function, the hidden representations are given by Eq. (31):

\begin{aligned} h_{l}^{[γ] (t)} &= \frac{1}{1 + \exp (- (\sum_{u = 1}^{d'} w'_{l, u} x_{φ (u)}^{(t)} + b_{l}^{[γ] (t)}))}, \quad w'_{l, u} = \sum_{f = 1}^{N_{f}} \sum_{p = 1}^{P} \sum_{q = 1}^{Q} K_{p, q}^{f} w_{l, u}^{[γ]} \\ Y_{u}^{(t)} &\equiv Y_{i, j}^{(t)} = \sum_{f = 1}^{N_{f}} \sum_{p = 1}^{P} \sum_{q = 1}^{Q} K_{p, q}^{f} x_{φ (u)}^{(t)} \\ h_{l}^{[γ] (t)} &= σ (\sum_{u = 1}^{d'} w_{l, u}^{[γ]} Y_{u}^{(t)} + b_{l}^{[γ] (t)}), \quad l \in \{1, \dots, s\} \\ y_{k}^{T} &= Ψ_{k}^{T} (h^{[γ] (1)}, \dots, h^{[γ] (T)}), \quad k \in \{1, \dots, r\} \\ h_{l}^{[ρ]} &= σ (\sum_{k = 1}^{r} w_{l, k}^{[ρ]} y_{k}^{T} + b_{l}^{[ρ]}), \quad l \in \{1, \dots, r'\} \end{aligned}

Thus, if we consider $\bar{X} = (0, \dots, 0)'$ and $b_{l}^{[γ] (t)} = 0 \; \forall t \in \{1, \dots, T\}$, the first-order Taylor series expansion of $h_{l}^{[γ] (t)}$ is given by Eq. (32), where the factor $\frac{1}{4}$ is the slope of the sigmoid at zero:

h_{l}^{[γ] (t)} \approx h_{l}^{[γ] (t)} (\bar{X}) + \nabla h_{l}^{[γ] (t)} (\bar{X}) X^{(t)} = \frac{1}{2} + \sum_{u = 1}^{d'} \frac{\partial h_{l}^{[γ] (t)} (\bar{X})}{\partial x_{φ (u)}^{(t)}} x_{φ (u)}^{(t)} = \frac{1}{2} + \frac{1}{4} \sum_{u = 1}^{d'} w'_{l, u} x_{φ (u)}^{(t)}
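The linearization of the sigmoid hidden unit about a zero input can be checked with a short sketch. Since the derivative of the sigmoid at zero is 1/4, the first-order expansion of a unit with zero bias is 1/2 plus one quarter of the weighted input sum. The weights and inputs below are arbitrary small values chosen for illustration, and the function names are hypothetical.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def hidden_exact(weights, inputs):
    """Exact sigmoid hidden activation with zero bias."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)))


def hidden_linearized(weights, inputs):
    """First-order Taylor expansion about a zero input: 1/2 + (1/4) * sum(w * x)."""
    return 0.5 + 0.25 * sum(w * x for w, x in zip(weights, inputs))


weights = [0.3, -0.2]
inputs = [0.10, 0.05]
exact = hidden_exact(weights, inputs)
approx = hidden_linearized(weights, inputs)
gap = abs(exact - approx)
```

The gap is tiny for inputs near zero and grows as the weighted sum moves away from zero, so the linearized form is only a local description of the unit.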

with $w'_{l, u}$ given by Eq. (33), and $X^{(t)} = (x_{φ (1)}^{(t)}, \dots, x_{φ (d')}^{(t)})'$ being a column vector of the input at time $t$. Let $H = (h^{[γ] (1)}, \dots, h^{[γ] (T)})'$; for $X = \bar{X}$, then:

\bar{H} = H (\bar{X}) = \{\{\frac{1}{2}\}^{s}, \dots, \{\frac{1}{2}\}^{s}\}^{T}

where $s$ is the number of hidden neurons. The Taylor series expansion of $Ψ_{k}^{T}$ is given by Eq. (34):

y_{k}^{T} = Ψ_{k}^{T} (H) \approx Ψ_{k}^{T} (\bar{H}) + \nabla Ψ_{k}^{T} (\bar{H}) (H - \bar{H}) = Ψ_{k}^{T} (\bar{H}) + \sum_{t = 1}^{T} \sum_{u = 1}^{s} \frac{\partial Ψ_{k}^{T} (\bar{H})}{\partial h_{u}^{[γ] (t)}} (h_{u}^{[γ] (t)} - \frac{1}{2})

By replacing $h_{u}^{[γ] (t)}$ of Eq. (35) with $h_{l}^{[γ] (t)}$ of Eq. (36):

Ψ_{k}^{T} (H) \approx Ψ_{k}^{T} (\bar{H}) + \frac{1}{4} \sum_{t = 1}^{T} \sum_{u = 1}^{s} \sum_{ν = 1}^{d'} \frac{\partial Ψ_{k}^{T} (\bar{H})}{\partial h_{u}^{[γ] (t)}} w'_{l, ν} x_{φ (ν)}^{(t)}

Finally, by substituting:

h_{l}^{[ρ]} = σ (\sum_{k = 1}^{r} w_{l, k}^{[ρ]} [Ψ_{k}^{T} (\bar{H}) + \frac{1}{4} \sum_{t = 1}^{T} \sum_{u = 1}^{s} \sum_{ν = 1}^{d'} \frac{\partial Ψ_{k}^{T} (\bar{H})}{\partial h_{u}^{[γ] (t)}} w'_{l, ν} x_{φ (ν)}^{(t)}] + b_{l}^{[ρ]})

h_{l}^{[ρ]} = σ (\sum_{k = 1}^{r} w_{l, k}^{[ρ]} Ψ_{k}^{T} (\bar{H}) + \frac{1}{4} \sum_{k = 1}^{r} (\sum_{ν = 1}^{d'} \underbrace{\sum_{u = 1}^{s} \frac{\partial Ψ_{k}^{T} (\bar{H})}{\partial h_{u}^{[γ] (1)}} w'_{l, ν} w_{l, k}^{[ρ]}}_{w''^{(1)}_{l, ν}} x_{φ (ν)}^{(1)} + \dots + \sum_{ν = 1}^{d'} \underbrace{\sum_{u = 1}^{s} \frac{\partial Ψ_{k}^{T} (\bar{H})}{\partial h_{u}^{[γ] (T)}} w'_{l, ν} w_{l, k}^{[ρ]}}_{w''^{(T)}_{l, ν}} x_{φ (ν)}^{(T)}) + b_{l}^{[ρ]})

w''^{(t)}_{l, ν} = \sum_{u = 1}^{s} \sum_{f = 1}^{N_{f}} \sum_{p = 1}^{P} \sum_{q = 1}^{Q} \frac{\partial Ψ_{k}^{T} (\bar{H})}{\partial h_{u}^{[γ] (t)}} K_{p, q}^{f} w_{l, ν}^{[γ]} w_{l, k}^{[ρ]}

The derived features $w''^{(t)}_{l, ν}$, through summations over the indexes $f$, $p$, and $q$, combine the features $w_{l, ν}^{[γ]}$ and $w_{l, k}^{[ρ]}$ extracted from both fuzzy-based SAEs and give a compact representation of the input over time.

Least square error back propagation neural network (LSEBPNN)

Let the training set in a C-class problem contain the vector pairs $\{(x_{1}, y_{1}), (x_{2}, y_{2}), \dots, (x_{P}, y_{P})\}$, where $x_{p} \in R^{N}$ refers to the pth input pattern and $y_{p} \in \{t_{c} \mid c = 1, 2, \dots, C ; t_{c} \in R^{C}\}$ refers to the target output of the network corresponding to this input.

All weights and bias terms are included in the adaptive parameters of the LSEBPNN. The main aim of the training phase is to establish the weights and bias terms that minimize the difference between the network output and the target output. This difference is referred to as the network's training error. The MSE for the pth input pattern in the traditional BP technique is $E_{p} = \frac{1}{2} \sum_{k = 1}^{C} (t_{pk} - o_{pk}^{o})^{2}$. In the fuzzy setting, an input pattern can have several target values: any input pattern can take any target value with some membership value. In other words, the training problem can be thought of as a fuzzy constraint satisfaction problem. The suggested network modifies its parameters throughout the training phase so that these constraints are satisfied as well as possible. The constraints for the pth input pattern are stated mathematically as the fuzzy MSE term, which is given by Eq. (38):

E_{p}^{f} = \frac{1}{2} \sum_{k = 1}^{C} \sum_{c = 1}^{C} μ_{c}^{q} (x_{p}) (t_{ck} - o_{pk}^{o})^{2}

The learning laws for the network are derived using the same approach as in the traditional BP technique. Suppose that the weight update, $Δ w$, happens after each input pattern has been presented. Assuming that all weight changes in the network are made with the same learning-rate parameter $η$, the weight changes applied to the weights $w_{kj}^{o}$ and $w_{ji}^{h}$ are determined according to the gradient-descent rules by Eqs. (39) and (40):

Δ w_{kj}^{o} = - η \frac{\partial E_{p}^{f}}{\partial w_{kj}^{o}} \quad \text{and} \quad Δ w_{ji}^{h} = - η \frac{\partial E_{p}^{f}}{\partial w_{ji}^{h}}

Δ w_{kj}^{o} = η [μ_{k}^{q} (x_{p}) - \sum_{c = 1}^{C} μ_{c}^{q} (x_{p}) o_{pk}^{o}] o_{pk}^{o} (1 - o_{pk}^{o}) o_{pj}^{h} = η δ_{pk}^{o} o_{pj}^{h}

where, by Eq. (41),

δ_{pk}^{o} = [μ_{k}^{q} (x_{p}) - \sum_{c = 1}^{C} μ_{c}^{q} (x_{p}) o_{pk}^{o}] o_{pk}^{o} (1 - o_{pk}^{o})

Again, from Eq. (42),

Δ w_{ji}^{h} = η {f_{j}^{h}}' (net_{pj}^{h}) x_{pi} \sum_{k = 1}^{C} [μ_{k}^{q} (x_{p}) - \sum_{c = 1}^{C} μ_{c}^{q} (x_{p}) o_{pk}^{o}] o_{pk}^{o} (1 - o_{pk}^{o}) w_{kj}^{o} = η {f_{j}^{h}}' (net_{pj}^{h}) x_{pi} \sum_{k = 1}^{C} δ_{pk}^{o} w_{kj}^{o} = η δ_{pj}^{h} x_{pi},

where, by Eq. (43),

δ_{pj}^{h} = {f_{j}^{h}}' (net_{pj}^{h}) \sum_{k = 1}^{C} δ_{pk}^{o} w_{kj}^{o} .
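The output-layer update rule can be checked against a numerical gradient of the fuzzy MSE. The sketch below assumes sigmoid output units, one-hot class targets (the target for output k under class c is 1 if c equals k and 0 otherwise), and membership values that are already raised to the power q; these choices and all function names are assumptions for illustration.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def outputs(W, hidden):
    """Output activations o_k = sigmoid(sum_j w_kj * o_j^h)."""
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in W]


def fuzzy_mse(W, hidden, mu):
    """E = 1/2 * sum_k sum_c mu_c * (t_ck - o_k)^2 with one-hot t_ck (assumed)."""
    o = outputs(W, hidden)
    n_classes = len(mu)
    return 0.5 * sum(mu[c] * ((1.0 if c == k else 0.0) - o[k]) ** 2
                     for k in range(n_classes) for c in range(n_classes))


def output_deltas(W, hidden, mu):
    """delta_k = [mu_k - (sum_c mu_c) * o_k] * o_k * (1 - o_k)."""
    o = outputs(W, hidden)
    total_membership = sum(mu)
    return [(mu[k] - total_membership * o[k]) * o[k] * (1.0 - o[k])
            for k in range(len(mu))]


def output_weight_updates(W, hidden, mu, eta):
    """Gradient-descent step: delta_w_kj = eta * delta_k * o_j^h."""
    deltas = output_deltas(W, hidden, mu)
    return [[eta * d * h for h in hidden] for d in deltas]


W = [[0.2, -0.1], [0.4, 0.3]]  # output weights, one row per class
hidden = [0.6, 0.9]            # hidden-layer activations for one pattern
mu = [0.7, 0.3]                # class memberships of the pattern
updates = output_weight_updates(W, hidden, mu, eta=1.0)
```

With crisp memberships such as `[1.0, 0.0]`, the deltas reduce to the usual target-minus-output form of standard backpropagation.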

In many circumstances, the traditional BP technique may not converge quickly when classes overlap. This is because ambiguous vectors are assigned full weight in one class. In the suggested version, the error to be back-propagated is given more weight for nodes with higher membership values.

The purpose of the learning algorithm is to reduce the squared error cost function, which is given by Eq. (44):

j_{i}^{(s)} = \frac{1}{2} \sum_{q = 1}^{m} (d_{i, q}^{(s)} - v_{i}^{(s)})^{2}

where $m$ is the total number of vectors in the training data set. Writing the output in terms of the weight vector gives Eq. (45):

j_{i}^{(s)} = \frac{1}{2} \sum_{q = 1}^{m} (d_{i, q}^{(s)} - w_{i}^{(s) T} x_{out, q}^{(s - 1)})^{2}

Take the partial derivative with respect to $w_{i}^{(s)}$ and set it to zero to determine the weight vector that minimizes the cost function, as given by Eq. (46).

\frac{\partial j_{i}^{(s)}}{\partial w_{i}^{(s)}} = \sum_{q = 1}^{m} (- d_{i, q}^{(s)} x_{out, q}^{(s - 1)} + x_{out, q}^{(s - 1)} x_{out, q}^{(s - 1) T} w_{i}^{(s)}) = 0

c^{(s)} = \sum_{q = 1}^{m} x_{out, q}^{(s - 1)} x_{out, q}^{(s - 1) T}, \quad p_{i}^{(s)} = \sum_{q = 1}^{m} d_{i, q}^{(s)} x_{out, q}^{(s - 1)}

In vector-matrix form, Eq. (48) is rearranged as

c^{(s)} w_{i}^{(s)} = p_{i}^{(s)}

where $w_{i}^{(s)}$ is the weight vector of the ith linear combiner in the sth layer. Eq. (49) gives the deterministic normal equation:

w_{i}^{(s)} = [c^{(s)}]^{- 1} p_{i}^{(s)}
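The normal equation can be illustrated with a small pure-Python sketch that accumulates the correlation matrix and cross-correlation vector from the previous layer's outputs and solves the resulting linear system. Gaussian elimination is used here instead of an explicit matrix inverse, which is a numerical convenience and not something the paper specifies; the data and function names are hypothetical.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - known) / M[r][r]
    return x


def normal_equation_weights(X, d):
    """Weights minimizing 1/2 * sum_q (d_q - w^T x_q)^2 via C w = p."""
    n = len(X[0])
    C = [[sum(x[i] * x[j] for x in X) for j in range(n)] for i in range(n)]
    p = [sum(d_q * x[i] for x, d_q in zip(X, d)) for i in range(n)]
    return solve_linear_system(C, p)


# Hypothetical previous-layer outputs (one row per training vector) and
# desired combiner outputs generated from the weights [2, -1].
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
d = [2.0, -1.0, 1.0, 3.0]
w = normal_equation_weights(X, d)
```

Because the desired outputs in this toy case are exactly linear in the inputs, the recovered weights match the generating weights up to rounding; with noisy targets the same routine returns the least-squares fit.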

The performance index is minimised by taking its partial derivative with respect to $w_{i}^{(k)} (n)$ and setting it equal to zero, as in Eq. (50):

\begin{aligned} \frac{\partial J (n)}{\partial w_{i}^{(k)} (n)} &= 2 \sum_{t = 1}^{n} λ^{n - t} \sum_{j = 1}^{N_{L}} [ε_{j, R}^{(L)} (t) \frac{\partial ε_{j, R}^{(L)} (t)}{\partial w_{i}^{(k)} (n)} + ε_{j, I}^{(L)} (t) \frac{\partial ε_{j, I}^{(L)} (t)}{\partial w_{i}^{(k)} (n)}] \\ &= - 2 \sum_{t = 1}^{n} λ^{n - t} \sum_{j = 1}^{N_{L}} [ζ_{j, R}^{(L)} (t) ε_{j, R}^{(L)} (t) \frac{\partial y_{j, R}^{(L)} (t)}{\partial w_{i}^{(k)} (n)} + ζ_{j, I}^{(L)} (t) ε_{j, I}^{(L)} (t) \frac{\partial y_{j, I}^{(L)} (t)}{\partial w_{i}^{(k)} (n)}] = 0 \end{aligned}

Equations (51) and (52) are then obtained as follows:

\sum_{t = 1}^{n} λ^{n - t} [\{ψ_{i, R}^{(k)} (t) - y_{i, R}^{(k)} (t)\} ζ_{i, R}^{(k)} (t) f' (s_{i, R}^{(k)} (t)) + \{ψ_{i, I}^{(k)} (t) - y_{i, I}^{(k)} (t)\} ζ_{i, I}^{(k)} (t) f' (s_{i, I}^{(k)} (t))] \times x^{(k) *} (t) = 0

r_{i}^{(k)} (n) = R_{i}^{(k)} (n) w_{i}^{(k)} (n)

where, by Eq. (53),

\begin{aligned} r_{i}^{(k)} (n) &= \sum_{t = 1}^{n} λ^{n - t} [ζ_{i, R}^{(k)} (t) ψ_{i, R}^{(k)} (t) f' (s_{i, R}^{(k)} (n)) + ȷ ζ_{i, I}^{(k)} (t) ψ_{i, I}^{(k)} (t) f' (s_{i, I}^{(k)} (n))] \times x^{(k) *} (t) \\ R_{i}^{(k)} (n) &= \sum_{t = 1}^{n} λ^{n - t} x^{(k) *} (t) \times [ζ_{i, R}^{(k)} (t) y_{i, R}^{(k)} (t) f' (s_{i, R}^{(k)} (t)) + ȷ ζ_{i, I}^{(k)} (t) y_{i, I}^{(k)} (t) f' (s_{i, I}^{(k)} (t))] \times s_{i}^{(k) - 1} (t) x^{(k) T} (t) \end{aligned}

Now, define a matrix operation for simplicity $A ⊙ B ≐ A_{R} B_{R} + ȷ A_{I} B_{I}$ . The flow chart for LSEBPNN is represented in Fig. 2.
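Two ingredients of the derivation above lend themselves to a short numerical sketch: the ⊙ operation and the exponentially weighted sums with forgetting factor λ. The code below is a simplified illustration under stated assumptions, not the authors' algorithm: `odot` applies the definition element-wise, and `weighted_normal_solution` is a real-valued, linear analogue of r(n) = R(n) w(n) that omits the ζ and f′ factors. The function names and the value λ = 0.98 are hypothetical choices.

```python
import numpy as np

def odot(a, b):
    """A ⊙ B = A_R B_R + j A_I B_I, applied element-wise."""
    a = np.asarray(a, dtype=complex)
    b = np.asarray(b, dtype=complex)
    return a.real * b.real + 1j * (a.imag * b.imag)

def weighted_normal_solution(x, d, lam=0.98):
    """Real-valued, linear analogue of r(n) = R(n) w(n) (simplified sketch).

    x   : (n, p) array, row t-1 holds the input x(t) for t = 1..n
    d   : (n,) array of desired outputs
    lam : forgetting factor in (0, 1]
    """
    n = x.shape[0]
    weights = lam ** (n - 1 - np.arange(n))   # lam^(n - t) for t = 1..n
    xw = x * weights[:, None]
    R = xw.T @ x                              # sum of lam^(n-t) x(t) x(t)^T
    r = xw.T @ d                              # sum of lam^(n-t) d(t) x(t)
    return np.linalg.solve(R, r)

# Assumed synthetic data, for illustration only.
rng = np.random.default_rng(1)
x = rng.normal(size=(40, 2))
w_true = np.array([1.5, -0.5])
w_hat = weighted_normal_solution(x, x @ w_true)
z = odot(1 + 2j, 3 + 4j)
```

With λ < 1, recent samples contribute more to R and r than older ones, which is what lets the recursive form track slowly changing conditions.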

Performance analysis

The proposed method was implemented as a prototype software system in Python 3.7 to evaluate its potential contribution to future real-world applications. The hardware used was an Intel i7 processor (Intel(R) Core(TM) i7-3770 CPU @ 3.40 GHz 3.80 GHz; Intel, Santa Clara, CA, USA) with 8 GB of RAM (Samsung, Seoul, Korea). The system was hosted and tested on Microsoft Windows 10.

Table 1 shows a comparative analysis of the proposed and existing techniques for various fault situations. The fault situations were detected using virtual sensor-based datasets from the automation industry. The parametric analysis was carried out in terms of QoS, measurement accuracy, RMSE, MAE, prediction performance, and computational time.

Table 1. Comparative analysis of the proposed and existing techniques for various fault situations (all values in %)

Dataset	Technique	Computational time	QoS	RMSE	MAE	Prediction performance	Measurement accuracy
Spindle-based dataset	CNN	41	59	47	43	91	76
Spindle-based dataset	RBF	36	61	45	40	93	79
Spindle-based dataset	FL_SDDAE-LSEBPNN	34	64	41	35	94	85
Gear-based dataset	CNN	50	62	51	51	73	79
Gear-based dataset	RBF	46	63	48	45	76	81
Gear-based dataset	FL_SDDAE-LSEBPNN	43	67	43	39	79	85
Bearing-based dataset	CNN	59	63	53	49	79	73
Bearing-based dataset	RBF	53	65	49	45	83	77
Bearing-based dataset	FL_SDDAE-LSEBPNN	49	68	45	41	86	84
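
Two of the metrics in Table 1, RMSE and MAE, have standard definitions that can be sketched in a few lines. The paper does not state how these quantities were normalised to percentages, so the following shows only the conventional unnormalised forms on made-up values; the function names and sample arrays are illustrative assumptions.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error: sqrt(mean((y_true - y_pred)^2))."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error: mean(|y_true - y_pred|)."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(err)))

# Made-up values: three exact predictions and one that is off by 4.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 8.0]
rmse_value = rmse(y_true, y_pred)
mae_value = mae(y_true, y_pred)
```

In this example a single large error gives an RMSE (2.0) twice the MAE (1.0), which illustrates why RMSE penalises occasional large deviations more heavily than MAE does.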


Figures 3, 4, and 5 show the comparative analysis for the virtual sensor-based datasets from the automation industry. The datasets, collected from the cloud, comprise spindle, gear, and bearing fault detection data. For the spindle fault detection data, the proposed technique obtained computational time of 34%, QoS of 64%, RMSE of 41%, MAE of 35%, prediction performance of 94%, and measurement accuracy of 85%. For the gear fault detection data, it obtained computational time of 43%, QoS of 67%, RMSE of 43%, MAE of 39%, prediction performance of 79%, and measurement accuracy of 85%. For the bearing fault detection data, it obtained computational time of 49%, QoS of 68%, RMSE of 45%, MAE of 41%, prediction performance of 86%, and measurement accuracy of 84%. Across all three datasets, the proposed technique obtained the best value on every reported metric: the lowest computational time, RMSE, and MAE, and the highest QoS, prediction performance, and measurement accuracy.
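The comparison in Table 1 can be checked with simple arithmetic. The sketch below transcribes the table values and computes, for each dataset, the margin in percentage points between the proposed technique and the better of the two baselines. It assumes that lower is better for computational time, RMSE, and MAE and that higher is better for the other metrics; the paper does not state these directions explicitly, and the helper name `margin_over_best_baseline` is hypothetical.

```python
# Column order of the tuples below; values transcribed from Table 1 (in %).
METRICS = ("time", "qos", "rmse", "mae", "prediction", "accuracy")
LOWER_IS_BETTER = {"time", "rmse", "mae"}   # assumed direction of each metric
PROPOSED = "FL_SDDAE-LSEBPNN"

TABLE_1 = {
    "spindle": {"CNN": (41, 59, 47, 43, 91, 76),
                "RBF": (36, 61, 45, 40, 93, 79),
                PROPOSED: (34, 64, 41, 35, 94, 85)},
    "gear":    {"CNN": (50, 62, 51, 51, 73, 79),
                "RBF": (46, 63, 48, 45, 76, 81),
                PROPOSED: (43, 67, 43, 39, 79, 85)},
    "bearing": {"CNN": (59, 63, 53, 49, 79, 73),
                "RBF": (53, 65, 49, 45, 83, 77),
                PROPOSED: (49, 68, 45, 41, 86, 84)},
}

def margin_over_best_baseline(dataset):
    """Percentage-point margin of the proposed method over the better baseline."""
    rows = TABLE_1[dataset]
    gains = {}
    for k, metric in enumerate(METRICS):
        baselines = [rows[name][k] for name in rows if name != PROPOSED]
        ours = rows[PROPOSED][k]
        if metric in LOWER_IS_BETTER:
            gains[metric] = min(baselines) - ours
        else:
            gains[metric] = ours - max(baselines)
    return gains

for name in TABLE_1:
    print(name, margin_over_best_baseline(name))
```

A positive margin means the proposed technique is ahead of both CNN and RBF on that metric. Under the stated assumptions every margin is positive, with the smallest being one percentage point (prediction performance on the spindle dataset).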

Fig. 3 — Comparative analysis of spindle-based dataset in terms of a computational time, b QoS, c RMSE, d MAE, e prediction performance, f measurement accuracy

Fig. 4 — Comparative analysis of gear-based dataset in terms of a computational time, b QoS, c RMSE, d MAE, e prediction performance, f measurement accuracy

Fig. 5 — Comparative analysis of bearing-based dataset in terms of a computational time, b QoS, c RMSE, d MAE, e prediction performance, f measurement accuracy

The fundamental challenge in working with soft sensor principles is a limited understanding owing to their novelty and, as a result, a lack of standard mathematical descriptions or structures. On the other hand, this leaves more room for creative design. In general, large volumes of statistical data are required for calculations when working with soft sensors. It is vital to have a thorough understanding of the controlled process's principles, its physical characteristics, and the relationships among its parameters.

Conclusion

This research proposes a novel technique for virtual soft sensor-based fault detection in the automation industry using deep learning integrated with a cloud module. The aim is to design novel techniques for automation of the manufacturing industry in which dynamic soft sensors are used for feature representation and classification of the data. The data were collected from cloud storage and used to create virtual sensor datasets for gear fault detection, spindle fault detection, and bearing fault detection in the automation industry. Features are represented using a fuzzy logic-based stacked data-driven auto-encoder (FL_SDDAE), in which the features of the input data are identified with general automation problems. The features are then classified using a least square error back propagation neural network (LSEBPNN), in which the mean square error during classification is minimized through the loss function. In the experiments, the proposed technique obtained computational time of 34%, QoS of 64%, RMSE of 41%, MAE of 35%, prediction performance of 94%, and measurement accuracy of 85%. Several issues remain open. One is that predictive control of nonlinear systems cannot yet be solved successfully. Another is that the stability and resilience of multivariable predictive control algorithms must be addressed, and accurate principle models for complex systems are extremely difficult to construct. Despite the contributions made so far, there is still room for future work. Targeted-output regularizers on the loss function could extract even better features, improving the proposed work. Another direction is to apply approaches to the unsupervised pre-training stage to identify dynamics-related features. In addition, the proposed method was applied to industrial study cases; developing a soft sensor for a real-world industrial scenario could be challenging, because non-linearities, abnormalities, and highly complex environments must all be taken into account. The industrial study cases have nevertheless proven suitable and are widely used in the implementation and evaluation of models, and they serve as the foundation for many contributions in this field of research.

Funding

The authors extend their appreciation to the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) for funding and supporting this work through Research Partnership Program no. RP-21-07-06. The authors acknowledge the support from Princess Nourah Bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R440), Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia.

Declarations

Ethical approval

This article does not contain any studies with animals performed by any of the authors.

Conflict of interest

The authors declare no competing interests.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Shakir Khan, Email: sgkhan@imamu.edu.sa.

Tamanna Siddiqui, Email: tsiddiqui.cs@amu.ac.in.

Azrour Mourade, Email: mo.azrour@umi.ac.ma.

Bayan Ibrahimm Alabduallah, Email: Bialabdullah@pnu.edu.sa.

Saad Abdullah Alajlan, Email: saalajlan@imamu.edu.sa.

Bader M. Albahlal, Email: bmalbahlal@imamu.edu.sa.

Amani Alfaifi, Email: Amani0004@hotmail.com.

References

1. Savytskyi O, Tymoshenko M, Hramm O, Romanov S (2020) Application of soft sensors in the automated process control of different industries. In E3S Web of Conferences (Vol. 166, p 05003). EDP Sciences. 10.1051/e3sconf/202016605003
2. Sun Q, Ge Z (2021) A survey on deep learning for data-driven soft sensors. IEEE Transactions on Industrial Informatics 17(9):5853–5866
3. Andronie M, Lăzăroiu G, Iatagan M, Uță C, Ștefănescu R, Cocoșatu M. Artificial intelligence-based decision-making algorithms, internet of things sensing networks, and deep learning-assisted smart process management in cyber-physical production systems. Electronics. 2021;10(20):2497. doi: 10.3390/electronics10202497.
4. Kovacova M, Lăzăroiu G. Sustainable organizational performance, cyber-physical production networks, and deep learning-assisted smart process planning in Industry 4.0-based manufacturing systems. Econ Manag Fin Markets. 2021;16(3):41–54.
5. Bibi R, Saeed Y, Zeb A, Ghazal TM, Rahman T, Said RA, ... Khan MA (2021) Edge AI-based automated detection and classification of road anomalies in VANET using deep learning. Comput Intell Neurosci 2021:1–16
6. Valaskova K, Ward P, Svabova L. Deep learning-assisted smart process planning, cognitive automation, and industrial big data analytics in sustainable cyber-physical production systems. J Self-Governance Manag Econ. 2021;9(2):9–20. doi: 10.22381/jsme9220211.
7. Hndoosh RW, Kumar S, Saroa MS. Fuzzy mathematical models for the analysis of fuzzy systems with application to liver disorders. IOSR J Comput Eng. 2014;16(5):71–85. doi: 10.9790/0661-16577185.
8. Kingma DP, Welling M (2019) An introduction to variational autoencoders. Foundations and Trends® in Machine Learning 12(4):307–392
9. D'Angelo G, Palmieri F. A stacked autoencoder-based convolutional and recurrent deep neural network for detecting cyberattacks in interconnected power control systems. Int J Intell Syst. 2021;36(12):7080–7102. doi: 10.1002/int.22581.
10. Abrar S, Zerguine A, Bettayeb M. Recursive least-squares backpropagation algorithm for stop-and-go decision-directed blind equalization. IEEE Trans Neural Netw. 2002;13(6):1472–1481. doi: 10.1109/TNN.2002.804282.
11. Bargellesi N, Beghi A, Rampazzo M, Susto GA (2021) AutoSS: a deep learning-based soft sensor for handling time-series input data. IEEE Robot Autom Lett 6(3):6100–6107
12. Thomopoulos SC (2021) Risk assessment and automated anomaly detection using a deep learning architecture. IntechOpen. 10.5772/intechopen.96209
13. Senthilkumar P, Rajesh K (2021) Design of a model based engineering deep learning scheduler in cloud computing environment using Industrial Internet of Things (IIOT). J Ambient Intell Humanized Comput 1–9
14. Yao L, Ge Z. Dynamic features incorporated locally weighted deep learning model for soft sensor development. IEEE Trans Instrum Meas. 2021;70:1–11.
15. Moreira de Lima JM, Ugulino de Araújo FM. Industrial semi-supervised dynamic soft-sensor modeling approach based on deep relevant representation learning. Sensors. 2021;21(10):3430. doi: 10.3390/s21103430.
16. Sahoo SR, Gupta BB. Multiple features based approach for automatic fake news detection on social networks using deep learning. Appl Soft Comput. 2021;100:106983. doi: 10.1016/j.asoc.2020.106983.
17. Fazil M, Khan S, Albahlal BM, Alotaibi RM, Siddiqui T, Shah MA. Attentional multi-channel convolution with bidirectional LSTM cell toward hate speech prediction. IEEE Access. 2023;11:16801–16811. doi: 10.1109/ACCESS.2023.3246388.
18. Zhang J, Gao RX. Deep learning-driven data curation and model interpretation for smart manufacturing. Chin J Mech Eng. 2021;34(1):1–21. doi: 10.1186/s10033-021-00587-y.
19. Xia K, Sacco C, Kirkpatrick M, Saidy C, Nguyen L, Kircaliali A, Harik R. A digital twin to train deep reinforcement learning agent for smart manufacturing plants: environment, interfaces and intelligence. J Manuf Syst. 2021;58:210–230. doi: 10.1016/j.jmsy.2020.06.012.
20. Khan S, et al. HCovBi-caps: hate speech detection using convolutional and Bi-directional gated recurrent unit with Capsule network. IEEE Access. 2022;10:7881–7894. doi: 10.1109/ACCESS.2022.3143799.
21. Khan S, et al. BiCHAT: BiLSTM with deep CNN and hierarchical attention for hate speech detection. J King Saud Univ-Comput Inform Sci. 2022;34(7):4335–4344.
22. Maschler B, Ganssloser S, Hablizel A, Weyrich M. Deep learning based soft sensors for industrial machinery. Procedia CIRP. 2021;99:662–667. doi: 10.1016/j.procir.2021.03.115.
23. Khan S, Saravanan V, Lakshmi TJ, Deb N, Othman NA. Privacy protection of healthcare data over social networks using machine learning algorithms. Comput Intell Neurosci. 2022;2022(9985933):8. doi: 10.1155/2022/9985933. [Retracted]
24. Haq AU, Li JP, Ahmad S, Khan S, Alshara MA, Alotaibi RM. Diagnostic approach for accurate diagnosis of COVID-19 employing deep learning and transfer learning techniques through chest X-ray images clinical data in E-healthcare. Sensors. 2021;21(24):8219. doi: 10.3390/s21248219.
25. Khan S, Al-Mogren AS, AlAjmi MF (2015) Using cloud computing to improve network operations and management. 2015 5th National Symposium on Information Technology: Towards New Smart World (NSITNSW), Riyadh, Saudi Arabia, 2015, pp. 1–6. 10.1109/NSITNSW.2015.7176418
