A Combined Method for MEMS Gyroscope Error Compensation Using a Long Short-Term Memory Network and Kalman Filter in Random Vibration Environments

Chenhao Zhu; Sheng Cai; Yifan Yang; Wei Xu; Honghai Shen; Hairong Chu

2021 Feb 8;21(4):1181. doi: 10.3390/s21041181

Editor: Stefano Mariani

PMCID: PMC7914848 PMID: 33567557

Abstract

In applications such as carrier attitude control and mobile device navigation, a micro-electro-mechanical-system (MEMS) gyroscope is inevitably subjected to random vibration, which significantly degrades its performance. To address this degradation, this paper proposes a combined method of a long short-term memory (LSTM) network and a Kalman filter (KF) for error compensation, in which the Kalman filter parameters are iteratively optimized using the Kalman smoother and the expectation-maximization (EM) algorithm. To verify the effectiveness of the proposed method, we performed a linear random vibration test to acquire MEMS gyroscope data. We then analyze the effects of the input data step size and the network topology on gyroscope error compensation performance. Furthermore, the autoregressive moving average-Kalman filter (ARMA-KF) model, which is commonly used in gyroscope error compensation, was also combined with the LSTM network as a comparison method. The results show that, for the x-axis data, the proposed combined method reduces the standard deviation (STD) by 51.58% and 31.92% compared to the bidirectional LSTM (BiLSTM) network and the EM-KF method, respectively. For the z-axis data, the proposed combined method reduces the standard deviation by 29.19% and 12.75% compared to the BiLSTM network and the EM-KF method, respectively. For the x-axis and z-axis data, the proposed combined method reduces the standard deviation by 46.54% and 22.30%, respectively, compared to the BiLSTM-ARMA-KF method, and its output is smoother, which supports the effectiveness of the proposed method.

Keywords: MEMS gyroscope, random vibration environments, long short-term memory network, Kalman filter, expectation-maximization algorithm

1. Introduction

Fiber optic gyroscopes and laser gyroscopes have excellent performance, but they are too large and expensive for portable devices [1,2]. Micro-electro-mechanical-system (MEMS) gyroscopes have, in recent years, been used in low-cost inertial navigation systems (INS) due to their small size and low cost. However, the MEMS gyroscope has a significant error due to the manufacturing technology and structural composition [3,4]. The error of the MEMS gyroscope can be divided into deterministic error and random error. The deterministic error mainly refers to perturbation errors such as zero offsets and the scale factor, which can be corrected by a calibration test [5,6]. Random error refers to the random drift caused by uncertain factors, usually determined by the device’s accuracy level [7], with no precise repeatability. Therefore, it is difficult to accurately compensate for random error, which hinders the further improvement of MEMS gyroscope performance.

In MEMS gyroscope error compensation research, the MEMS gyroscope data are generally treated as time-series data. Scholars have proposed methods such as the autoregressive moving average (ARMA) model, the Allan variance (AV), the wavelet threshold (WT), the support vector machine (SVM), and the artificial neural network (ANN), and all of them have achieved excellent results [7,8,9,10,11,12,13,14]. Recently, various variants of the recurrent neural network (RNN), which has strong processing power for time-series data, have been shown to be superior to traditional methods in the research of error compensation in MEMS gyroscopes [15,16,17,18].

However, most of the research mentioned above acquired data by placing the MEMS gyroscope in a static environment. In practical applications, the MEMS gyroscope is inevitably affected by random vibration [19]. In random vibration environments, the MEMS gyroscope is disturbed by both internal device noise and external vibration noise [20], which dramatically degrades its performance. The degradation of performance in vibrating environments is a fatal problem for MEMS gyroscopes [21,22], so it is essential to research error compensation methods for random vibration environments.

Most of the current research on improving the performance of the MEMS gyroscope in random vibration environments fixes the MEMS gyroscope on a vibration isolation platform [23,24,25]. However, this kind of method is not universal [26]. There is not much research based on time-series models: the windowed measurement error covariance (WMEC) method has been applied to compensate for the effects of vibration environments [27], singular spectrum analysis (SSA) was proposed to remove the low-frequency vibration noise perturbations of MEMS accelerometers [28], and a third-order autoregressive (AR) model was used to design a Kalman filter that compensates for the MEMS gyroscope's attitude angle error caused by random vibration [29].

Considering the dramatic perturbation of the MEMS gyroscope in random vibration environments, in this paper, a combined method of a long short-term memory (LSTM) network and Kalman filter is proposed for error compensation, with the Kalman smoother and expectation-maximization (EM) algorithm to dynamically adjust the predicted values of the LSTM network to improve the performance in error compensation. The main contributions of this paper are as follows:

(1) The combination of the LSTM network and the Kalman filter is applied to MEMS gyroscope error compensation in random vibration environments;
(2) The proper input data step and network topology are explored, and the error compensation performance of the bidirectional LSTM (BiLSTM) network is compared with that of other recurrent neural network (RNN) variants;
(3) In designing the Kalman filter, the EM algorithm is used to estimate the parameters. It is compared with the ARMA model, a parameter estimation method commonly used in research on MEMS gyroscope error compensation.

The remainder of this paper is organized as follows: (1) Section 2 introduces the methods, including the BiLSTM network, Kalman filter, ARMA-KF model, and EM-KF model, and illustrates the proposed method; (2) Section 3 presents the experiment, results, and comparisons; and (3) the remaining sections are the conclusion, appendix, and references.

2. Method

2.1. Multi-Layer BiLSTM Network and Kalman Filter

The long short-term memory network is a variant of the recurrent neural network used to solve the gradient vanishing or gradient explosion problem of RNNs [30,31]. A detailed description of LSTM units can be found in references [15,16,17,18].

The basic LSTM network only considers the historical and current inputs and ignores future inputs [32]. The BiLSTM network therefore also processes the sequence in reverse, superimposes the forward and reverse information flows, and uses the inputs both before and after the current time to improve the error compensation performance. In addition, the previous hidden layer's output is used as the input of the following layer to extract deeper features of the time-series data, thus enhancing the model's nonlinear fitting ability. The multi-layer BiLSTM network information flow is shown in Figure 1.

Figure 1. Information flow of the multi-layer bidirectional long short-term memory (BiLSTM) network.

The input gate $i$, forget gate $f$, output gate $o$, and candidate cell state $\tilde{c}$ of Layer $n$ of the BiLSTM network at time $t$ are computed from the output of the layer below and the layer's own hidden state at the previous time step, $h_{t - 1}^{(n)}$:

[\begin{matrix} i_{t}^{(n)} \\ f_{t}^{(n)} \\ o_{t}^{(n)} \\ {\tilde{c}}_{t}^{(n)} \end{matrix}] = [\begin{matrix} σ \\ σ \\ σ \\ \tanh \end{matrix}] ([\begin{matrix} W_{i, x}^{(n)} & W_{i, h}^{(n)} \\ W_{f, x}^{(n)} & W_{f, h}^{(n)} \\ W_{o, x}^{(n)} & W_{o, h}^{(n)} \\ W_{\tilde{c}, x}^{(n)} & W_{\tilde{c}, h}^{(n)} \end{matrix}] [\begin{matrix} h_{t}^{(n - 1)} \\ h_{t - 1}^{(n)} \end{matrix}])

(1)

where $h_{t}^{(n - 1)}$ is the hidden state of Layer $n - 1$ at time $t$. Each hidden state is the superposition of the forward and reverse hidden states:

h_{t}^{(n)} = {\vec{h}}_{t}^{(n)} \oplus {\overset{\leftarrow}{h}}_{t}^{(n)}

(2)
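As a rough illustration of Equations (1) and (2), the sketch below runs one scalar LSTM cell forward and backward over a short sequence and joins the two hidden states at each time step. It is only a sketch under several assumptions that are not taken from the paper: $\oplus$ is read as concatenation (an element-wise sum is another common reading), the hidden size is 1, bias terms are omitted as in Equation (1), only a single layer is shown, and the weights are arbitrary hypothetical values rather than trained ones.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_pass(inputs, w):
    """Run a scalar LSTM cell over `inputs` and return the hidden state at every step.

    `w` maps each gate name to (weight on the input, weight on the previous hidden state).
    """
    h, c = 0.0, 0.0
    hidden = []
    for x in inputs:
        i = sigmoid(w["i"][0] * x + w["i"][1] * h)          # input gate
        f = sigmoid(w["f"][0] * x + w["f"][1] * h)          # forget gate
        o = sigmoid(w["o"][0] * x + w["o"][1] * h)          # output gate
        c_tilde = math.tanh(w["c"][0] * x + w["c"][1] * h)  # candidate cell state
        c = f * c + i * c_tilde
        h = o * math.tanh(c)
        hidden.append(h)
    return hidden


def bilstm_layer(inputs, w_fwd, w_bwd):
    """Pair the forward and reverse hidden states at each time step (Equation (2))."""
    fwd = lstm_pass(inputs, w_fwd)
    bwd = lstm_pass(inputs[::-1], w_bwd)[::-1]  # reverse pass, re-aligned in time
    return [(hf, hb) for hf, hb in zip(fwd, bwd)]


# Hypothetical weights and inputs, for illustration only.
weights = {"i": (0.5, 0.1), "f": (0.4, 0.2), "o": (0.3, 0.1), "c": (0.8, -0.2)}
sequence = [0.1, -0.2, 0.05, 0.3]
states = bilstm_layer(sequence, weights, weights)
```

Each entry of `states` holds the forward and reverse hidden state for one time step, so the reverse component at time $t$ already depends on inputs that come after $t$.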

The Kalman filter is an optimal state estimation method that can be applied to dynamic systems with random disturbances. It estimates the system state from discrete measurements that contain noise [33,34]. Suppose the state–space model is built as:

x_{k} = Φ x_{k - 1} + Γ ω_{k - 1}

(3)

y_{k} = H x_{k} + v_{k}

(4)

where $Φ$ is the system state transition matrix, $Γ$ is the system noise-driven matrix, $H$ is the measurement matrix, $x_{k}$ is the system state vector (its estimate is denoted by ${\hat{x}}_{k}$), $y_{k}$ is the measurement vector, $ω_{k}$ is the system noise vector, and $v_{k}$ is the measurement noise vector.

In the Kalman filter, the system noise and the measurement noise are assumed to be normally distributed, such that [35]:

ω_{k} \sim N (0, Q)

(5)

v_{k} \sim N (0, R)

(6)

where $Q$ is the covariance matrix of the system noise and $R$ is the covariance matrix of the measurement noise.

The Kalman filter consists of two stages. In the prediction stage, the current system state vector is predicted from the system state vector at the previous time, such that:

{\hat{x}}_{k / k - 1} = Φ {\hat{x}}_{k - 1}

(7)

P_{k / k - 1} = Φ P_{k - 1} Φ^{T} + Γ Q Γ^{T}

(8)

where ${\hat{x}}_{k / k - 1}$ is the predicted value of the system state vector and $P_{k / k - 1}$ is the predicted covariance matrix of the system state vector.

In the update stage of the Kalman filter, the predicted system state vector is corrected by using the measurement vector, such that:

K_{k} = P_{k / k - 1} H^{T} {(H P_{k / k - 1} H^{T} + R)}^{- 1}

(9)

{\hat{x}}_{k} = {\hat{x}}_{k / k - 1} + K_{k} (y_{k} - H {\hat{x}}_{k / k - 1})

(10)

P_{k} = (I - K_{k} H) P_{k / k - 1}

(11)

where $K_{k}$ is the Kalman filter gain matrix and $P_{k}$ is the updated covariance matrix of the system state vector.
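The prediction and update stages of Equations (7)–(11) can be sketched for the scalar case as follows. This is a minimal illustration, not the authors' implementation: the function name `kalman_filter`, the initial values `x0` and `p0`, and the parameter values in the usage lines are all assumptions chosen for the example.

```python
def kalman_filter(measurements, phi, gamma, h, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter following Equations (7)-(11).

    Returns the filtered states, filtered variances, predicted states,
    and predicted variances, one entry per measurement.
    """
    x, p = x0, p0
    x_filt, p_filt, x_pred, p_pred = [], [], [], []
    for y in measurements:
        # Prediction stage: Equations (7) and (8)
        x_prior = phi * x
        p_prior = phi * p * phi + gamma * q * gamma
        # Update stage: Equations (9), (10), and (11)
        k = p_prior * h / (h * p_prior * h + r)
        x = x_prior + k * (y - h * x_prior)
        p = (1.0 - k * h) * p_prior
        x_pred.append(x_prior)
        p_pred.append(p_prior)
        x_filt.append(x)
        p_filt.append(p)
    return x_filt, p_filt, x_pred, p_pred


# Hypothetical usage: a constant signal observed 50 times.
x_filt, p_filt, x_pred, p_pred = kalman_filter(
    [1.0] * 50, phi=1.0, gamma=1.0, h=1.0, q=1e-4, r=0.1
)
```

With these assumed settings the filtered state moves from the initial guess toward the constant measurement while the filtered variance shrinks.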

2.2. Kalman Filter Design with ARMA Model

The autoregressive moving average model is the most widespread model used in time-series analysis, and it is derived and developed on the basis of the linear regression model. The ARMA model can be described as [36]:

x_{t} = \sum_{i = 1}^{p} φ_{i} x_{t - i} - \sum_{j = 1}^{q} θ_{j} ε_{t - j} + ε_{t}, ε_{t} \sim N (0, δ_{ε}^{2})

(12)

where $p$ and $q$ are the orders of the ARMA model; $φ_{i}$ and $θ_{j}$ are coefficients that satisfy the stationarity and invertibility conditions, respectively [37]; and $ε_{t}$ is white noise, an uncorrelated random variable with zero mean and constant variance. The model expresses that the value of the stochastic process $\{x_{t}\}$ at time $t$ is correlated with the previous $p$ values and the previous $q$ white-noise terms.
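To make the sign convention of Equation (12) concrete, the following sketch generates a series from given coefficients and a given noise sequence. The function name `arma_series`, the coefficient values, and the noise level are hypothetical, and values before the start of the series are assumed to be zero.

```python
import random


def arma_series(phi, theta, noise):
    """Generate x_t = sum_i phi_i x_{t-i} - sum_j theta_j eps_{t-j} + eps_t (Equation (12)).

    Terms that would refer to times before the first sample are treated as zero.
    """
    x = []
    for t, eps_t in enumerate(noise):
        ar = sum(phi[i] * x[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[j] * noise[t - 1 - j] for j in range(len(theta)) if t - 1 - j >= 0)
        x.append(ar - ma + eps_t)
    return x


# Hypothetical ARMA(1,1) example driven by Gaussian white noise.
random.seed(0)
noise = [random.gauss(0.0, 0.1) for _ in range(200)]
series = arma_series([0.8], [0.3], noise)
```

Note that the moving-average terms enter with a minus sign, as written in Equation (12); some software packages use the opposite sign for $θ_{j}$.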

The steps to design a Kalman filter using the ARMA model are as follows: (1) test the stationarity and normality of the measurement data, (2) determine the model type according to the autocorrelation function and partial autocorrelation function, (3) determine the order and parameters of the model according to the Akaike information criterion (AIC) [38], and (4) perform adaptive testing of the designed model.
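Step (3) amounts to picking the $(p, q)$ pair with the smallest AIC among the fitted candidates. The sketch below shows that selection on a grid of AIC values copied from the first AIC table listed later in this document; the helper name `best_order` is hypothetical, and fitting the ARMA models themselves is outside the scope of the example.

```python
# AIC values indexed as aic[q][p]; None marks the undefined (p = 0, q = 0) model.
aic = [
    [None, -4350.7006, -5373.4026, -5483.9134],
    [-2378.8661, -5480.4267, -5503.6485, -5503.2634],
    [-3784.2642, -5501.9738, -5502.2446, -5501.9462],
    [-4524.5464, -5504.1306, -5502.9798, -5514.9860],
]


def best_order(aic_grid):
    """Return (p, q, aic) for the candidate model with the smallest AIC."""
    candidates = [
        (value, p, q)
        for q, row in enumerate(aic_grid)
        for p, value in enumerate(row)
        if value is not None
    ]
    value, p, q = min(candidates)
    return p, q, value


p, q, value = best_order(aic)
print(f"Selected ARMA({p},{q})")
```

For this grid the minimum lies at $p = 3$, $q = 3$.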

2.3. Kalman Filter Design with EM Algorithm

The expectation-maximization algorithm is an iterative method proposed by Shumway and Stoffer to compute maximum likelihood estimates based on incomplete data [39]. It is convergent and can identify parameters and states in the model [40]. Andrieu and Doucet introduced the EM algorithm for parameter estimation in linear state–space models [41]. The linear Gaussian state–space model used for the EM algorithm can be expressed as follows [42]:

x_{k} = Φ x_{k - 1} + ω_{k - 1}

(13)

y_{k} = H x_{k} + υ_{k}

(14)

The conditional probability densities of the state equation and the measurement equation are obtained from Equations (13) and (14), respectively:

p (x_{k} | x_{k - 1}) = \exp \{- \frac{1}{2} {[x_{k} - Φ x_{k - 1}]}^{T} Q^{- 1} [x_{k} - Φ x_{k - 1}]\} {(2 π)}^{- \frac{n}{2}} {| Q |}^{- \frac{1}{2}}

(15)

p (y_{k} | x_{k}) = \exp \{- \frac{1}{2} {[y_{k} - H x_{k}]}^{T} R^{- 1} [y_{k} - H x_{k}]\} {(2 π)}^{- \frac{m}{2}} {| R |}^{- \frac{1}{2}}

(16)

Both the measurement likelihood and the state evolution are assumed to be Gaussian, so the joint likelihood of the measurement data and the system state data is given by [43]:

p (Y, X | Θ) = p (x_{1}) \prod_{k = 1}^{N} p (y_{k} | x_{k}) \prod_{k = 2}^{N} p (x_{k} | x_{k - 1})

(17)

where $Y$ is the measurement data, $X$ is the unknown system state data, and $Θ$ is the parameter set of linear Gaussian state–space model. $Θ$ can be represented as follows:

Θ = \{Φ, H, Q, R\}

(18)

By taking the log of the likelihood we arrive at the following formula:

\ln p (Y, X | Θ) = - \frac{1}{2} \sum_{k = 2}^{N} \{\ln | Q | + {[x_{k} - Φ x_{k - 1}]}^{T} Q^{- 1} [x_{k} - Φ x_{k - 1}]\} - \frac{1}{2} \sum_{k = 1}^{N} \{\ln | R | + {[y_{k} - H x_{k}]}^{T} R^{- 1} [y_{k} - H x_{k}]\}

(19)
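For a scalar model, Equation (19) reduces to two sums of squared residuals weighted by $Q$ and $R$. The sketch below evaluates it for given state and measurement sequences; the function name `complete_log_likelihood` and the numbers in the usage line are hypothetical. Like Equation (19), it leaves out the constant terms and the initial-state term.

```python
import math


def complete_log_likelihood(x, y, phi, h, q, r):
    """Scalar form of Equation (19) for states x[0..N-1] and measurements y[0..N-1]."""
    # State-transition term: sum over k = 2..N (indices 1..N-1 here)
    state_term = sum(
        math.log(q) + (x[k] - phi * x[k - 1]) ** 2 / q for k in range(1, len(x))
    )
    # Measurement term: sum over k = 1..N (indices 0..N-1 here)
    meas_term = sum(
        math.log(r) + (y[k] - h * x[k]) ** 2 / r for k in range(len(x))
    )
    return -0.5 * state_term - 0.5 * meas_term


# Hypothetical two-sample example: one unit state jump, perfect measurements.
value = complete_log_likelihood([0.0, 1.0], [0.0, 1.0], phi=1.0, h=1.0, q=1.0, r=1.0)
```

In the EM algorithm the states are not observed, so this quantity is never evaluated directly; its conditional expectation given the measurements is used instead.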

Based on the maximum likelihood method, the linear Gaussian state–space model can be identified through an EM algorithm [42]. The algorithm alternates between two steps: the E-step (expectation) and the M-step (maximization) [44]. In general, the likelihood density function based on the measurement data, denoted by $p (Θ | Y)$, is called the posterior distribution of the measurement. The EM algorithm aims to find the $Θ$ that maximizes $p (Θ | Y)$. $Θ_{i}$ denotes the parameter estimate at the beginning of the $i$-th iteration.

In the E-step, the conditional expectation of $\ln p (Y, X | Θ)$ with respect to $X$, given $Y$ and the current estimate $Θ_{i}$, is calculated such that:

Ω (Θ | Θ_{i}, Y) ≙ E_{X} \{\ln p (Y, X | Θ) | Θ_{i}, Y\} = \int [\ln p (Y, X | Θ)] p (X | Θ_{i}, Y) d X

(20)

In the M-step, $Ω (Θ | Θ_{i}, Y)$ is maximized over $Θ$ to find $Θ_{i + 1}$ such that:

Θ_{i + 1} = \arg \max_{Θ} [Ω (Θ | Θ_{i}, Y)]

(21)

The E-step and the M-step are iterated until:

‖ L (Θ_{i + 1}) - L (Θ_{i}) ‖ < τ

(22)

where $τ$ is a predefined threshold. Once Equation (22) holds, the convergence criterion is satisfied and the iteration stops. The specific process of designing a Kalman filter using the EM algorithm is as follows:

The value of $Ω (Θ | Θ_{i}, Y)$ depends on the following conditional expectations [45]:

E_{X} (x_{k} | Y) = {\hat{x}}_{k | N}

(23)

E_{X} (x_{k} x_{k - 1}^{T} | Y) = P_{k, k - 1 | N} + {\hat{x}}_{k | N} {\hat{x}}_{k - 1 | N}^{T}

(24)

E_{X} (x_{k} x_{k}^{T} | Y) = P_{k | N} + {\hat{x}}_{k | N} {\hat{x}}_{k | N}^{T}

(25)

where ${\hat{x}}_{k | N}$ is the smoothed value of the system state vector and $P_{k | N}$ is the smoothed covariance matrix of the system state vector. $P_{k, k - 1 | N}$ is the smoothed lag-one covariance matrix, which is computed from the filtered quantities as follows:

P_{k, k - 1 | k} = (I - K_{k} H) Φ P_{k - 1}

(26)

P_{k, k - 1 | N} = P_{k, k - 1 | k} + [P_{k | N} - P_{k}] P_{k | k}^{- 1} P_{k, k - 1 | k}

(27)

${\hat{x}}_{k | N}$ and $P_{k | N}$ can be obtained by smoothing the outputs of Kalman filter using backward-pass methods such as the Rauch–Tung–Striebel (RTS) smoother [46]. This method is summarized in the following equations:

J_{k - 1} = P_{k - 1} Φ^{T} P_{k | k - 1}^{- 1}

(28)

{\hat{x}}_{k - 1 | N} = {\hat{x}}_{k - 1} + J_{k - 1} ({\hat{x}}_{k | N} - {\hat{x}}_{k / k - 1})

(29)

P_{k - 1 | N} = P_{k - 1} + J_{k - 1} (P_{k | N} - P_{k | k - 1}) J_{k - 1}^{T}

(30)
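The backward pass can be sketched for the scalar case as follows. The function takes the outputs of a forward Kalman filter as plain lists; the name `rts_smoother` and the numbers in the usage lines are hypothetical. The variance recursion is written in the standard RTS form, $P_{k - 1 | N} = P_{k - 1} + J_{k - 1} (P_{k | N} - P_{k | k - 1}) J_{k - 1}^{T}$, in which the correction is added and is non-positive whenever smoothing reduces the variance at step $k$.

```python
def rts_smoother(x_filt, p_filt, x_pred, p_pred, phi):
    """Scalar Rauch-Tung-Striebel backward pass.

    x_filt[k], p_filt[k]: filtered mean and variance at step k.
    x_pred[k], p_pred[k]: prediction of step k made from step k - 1.
    """
    n = len(x_filt)
    x_smooth = list(x_filt)  # the last smoothed value equals the last filtered value
    p_smooth = list(p_filt)
    for k in range(n - 1, 0, -1):
        j = p_filt[k - 1] * phi / p_pred[k]                                   # Equation (28)
        x_smooth[k - 1] = x_filt[k - 1] + j * (x_smooth[k] - x_pred[k])       # Equation (29)
        p_smooth[k - 1] = p_filt[k - 1] + j * (p_smooth[k] - p_pred[k]) * j   # Equation (30), standard sign
    return x_smooth, p_smooth


# Hypothetical two-step filter output (phi = 1).
xs, ps = rts_smoother(
    x_filt=[0.5, 0.8], p_filt=[0.5, 0.4], x_pred=[0.0, 0.5], p_pred=[1.0, 0.6], phi=1.0
)
```

In this example the smoothed variance at the first step is smaller than the filtered one, because the second measurement also carries information about the first state.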

Then, the model parameters are re-estimated by maximizing $Ω (Θ | Θ_{i}, Y)$ over $Θ$: the partial derivatives of $Ω (Θ | Θ_{i}, Y)$ with respect to each parameter are set to zero. Solving these equations yields the updated parameters (in the $i$-th iteration) as follows:

\frac{\partial L (Θ)}{\partial Φ} = - \sum_{k = 2}^{N} Q^{- 1} (P_{k, k - 1 | N} + {\hat{x}}_{k | N} {\hat{x}}_{k - 1 | N}^{T}) + \sum_{k = 2}^{N} Q^{- 1} Φ (P_{k - 1 | N} + {\hat{x}}_{k - 1 | N} {\hat{x}}_{k - 1 | N}^{T}) = 0

(31)

Φ_{i + 1} = [\sum_{k = 2}^{N} (P_{k, k - 1 | N} + {\hat{x}}_{k | N} {\hat{x}}_{k - 1 | N}^{T})] {[\sum_{k = 2}^{N} (P_{k - 1 | N} + {\hat{x}}_{k - 1 | N} {\hat{x}}_{k - 1 | N}^{T})]}^{- 1}

(32)

\frac{\partial L (Θ)}{\partial H} = - \sum_{k = 1}^{N} R^{- 1} y_{k} {\hat{x}}_{k | N}^{T} + \sum_{k = 1}^{N} R^{- 1} H (P_{k | N} + {\hat{x}}_{k | N} {\hat{x}}_{k | N}^{T}) = 0

(33)

H_{i + 1} = (\sum_{k = 1}^{N} y_{k} {\hat{x}}_{k | N}^{T}) {[\sum_{k = 1}^{N} (P_{k | N} + {\hat{x}}_{k | N} {\hat{x}}_{k | N}^{T})]}^{- 1}

(34)

\frac{\partial L (Θ)}{\partial Q^{- 1}} = \frac{N}{2} Q - \frac{1}{2} \sum_{k = 1}^{N} (P_{k | N} + {\hat{x}}_{k | N} {\hat{x}}_{k | N}^{T}) + Φ [\frac{1}{N} \sum_{k = 2}^{N} (P_{k, k - 1 | N} + {\hat{x}}_{k | N} {\hat{x}}_{k - 1 | N}^{T})] = 0

(35)

Q_{i + 1} = \frac{1}{N} (\sum_{k = 1}^{N} (P_{k | N} + {\hat{x}}_{k | N} {\hat{x}}_{k | N}^{T}) - Φ_{i + 1} \sum_{k = 2}^{N} (P_{k, k - 1 | N} + {\hat{x}}_{k | N} {\hat{x}}_{k - 1 | N}^{T}))

(36)

\frac{\partial L (Θ)}{\partial R^{- 1}} = \frac{N + 1}{2} R - \sum_{k = 1}^{N} (\frac{1}{2} y_{k} y_{k}^{T} - H {\hat{x}}_{k | N} y_{k}^{T} + \frac{1}{2} H (P_{k | N} + {\hat{x}}_{k | N} {\hat{x}}_{k | N}^{T}) H^{T}) = 0

(37)

R_{i + 1} = \frac{1}{N + 1} (\sum_{k = 1}^{N} y_{k} y_{k}^{T}) - H_{i + 1} {(\frac{1}{N + 1} \sum_{k = 1}^{N} y_{k} {\hat{x}}_{k | N}^{T})}^{T}

(38)
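For a scalar state and a scalar measurement, the parameter updates of Equations (32), (34), (36), and (38) become ratios of a few sums. The sketch below computes them from smoothed quantities supplied as lists. It is an illustration under stated assumptions: the function name `m_step` and the toy inputs are hypothetical, the cross moment is taken from Equation (24), and the normalizations $1 / N$ and $1 / (N + 1)$ are kept as printed in Equations (36) and (38), although other references normalize differently.

```python
def m_step(y, x_s, p_s, p_cross):
    """Scalar M-step.

    y: measurements; x_s, p_s: smoothed means and variances;
    p_cross[k]: smoothed lag-one covariance between steps k and k - 1 (p_cross[0] is unused).
    """
    n = len(y)
    # Sums of the conditional expectations in Equations (24) and (25)
    s_cross = sum(p_cross[k] + x_s[k] * x_s[k - 1] for k in range(1, n))
    s_prev = sum(p_s[k - 1] + x_s[k - 1] ** 2 for k in range(1, n))
    s_all = sum(p_s[k] + x_s[k] ** 2 for k in range(n))
    s_yx = sum(y[k] * x_s[k] for k in range(n))
    s_yy = sum(yk * yk for yk in y)

    phi = s_cross / s_prev            # Equation (32)
    h = s_yx / s_all                  # Equation (34)
    q = (s_all - phi * s_cross) / n   # Equation (36)
    r = (s_yy - h * s_yx) / (n + 1)   # Equation (38)
    return phi, h, q, r


# Hypothetical noise-free toy data: the state halves each step and y = 2 x.
phi, h, q, r = m_step(
    y=[2.0, 1.0, 0.5], x_s=[1.0, 0.5, 0.25], p_s=[0.0, 0.0, 0.0], p_cross=[0.0, 0.0, 0.0]
)
```

On this toy input the update recovers the transition coefficient 0.5 and the measurement coefficient 2, and the measurement-noise variance is zero because the data were generated without noise.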

The proposed combination of the LSTM network and the EM-based Kalman filter is illustrated in Figure 2.

Figure 2. Illustration of the proposed method.

3. Experiments and Results

In this section, the designed experiments and the analysis of the results are presented to verify the effectiveness of the proposed method.

3.1. Data Acquisition

The MSI320H MEMS inertial measurement unit (IMU) was used for the experiments. It consists of a three-axis MEMS gyroscope and a three-axis MEMS accelerometer. A photograph of the MSI320H and its gyroscope specifications are shown in Figure 3a and Table 1, respectively. The MSI320H was fixed on the vibration table, which is shown in Figure 3b. The data acquisition procedure is shown in Figure 3c. Data from the MSI320H were sent to the xPC via the RS422 communication interface at a bit rate of 921,600 bps. The xPC decoded the gyroscope data and sent them to the host computer via a network cable. The MSI320H was powered on and warmed up at room temperature for 20 minutes before the linear vibration experiments were performed. The vibration table vibrates along the y-axis of the gyroscope, and the power spectral density (PSD) of the linear random vibration loads is shown in Figure 3d.

Figure 3. Experimental system. (a) MSI320H inertial measurement unit, (b) vibration table, (c) data acquisition procedure, and (d) power spectral density of linear random vibration loads.

Table 1. Specifications of the MSI320H gyroscope.

GYRO	Input range	±1800°/s
	Bias instability (Allan variance)	36°/h
	Angular random walk (Allan variance)	0.4°/√h
	Bandwidth (−3 dB)	≥220 Hz
GENERAL	Sample rate	100–1000 Hz
	Weight	≤25 g
	Supply voltage	5.0 ± 0.5 V
	RS422 transmission bit rate	921,600 bps
	Mechanical shock, any direction	≥20,000 g

The output dimension of dense layer	1
Activation function of dense layer	ReLU
Dropout rate	0.5
Batch size	256
Training epoch	50
Learning rate	0.001

Number of Hidden Layers	Number of Hidden Units	Input Data Step	STD (°/s)	Time/Epoch
10	64	5	0.1551	23 s
10	64	10	0.1481	38 s
10	64	15	0.1483	60 s
10	64	20	0.1346	82 s
10	64	25	0.1368	98 s
10	64	30	0.1501	115 s

Number of Hidden Layers	Number of Hidden Units	Input Data Step	STD (°/s)	Time/Epoch
10	8	20	0.1504	82 s
10	16	20	0.1459	82 s
10	32	20	0.1559	81 s
10	64	20	0.1346	82 s
10	128	20	0.1326	81 s
10	256	20	0.1468	88 s

Number of Hidden Layers	Number of Hidden Units	Input Data Step	STD (°/s)	Time/Epoch
1	128	20	0.1493	10 s
2	128	20	0.1513	19 s
3	128	20	0.1597	27 s
4	128	20	0.1557	34 s
5	128	20	0.1658	42 s
6	128	20	0.1505	50 s
7	128	20	0.1470	58 s
8	128	20	0.1459	66 s
9	128	20	0.1542	75 s
10	128	20	0.1326	81 s
11	128	20	0.1405	90 s
12	128	20	0.1393	98 s

x-axis	STD (°/s)	STD as percentage of raw-data STD
Raw data	0.2493	$-$
BiLSTM	0.1326	53.19%
LSTM	0.1543	61.89%
BiGRU	0.1501	60.21%
GRU	0.1604	64.34%

z-axis	STD (°/s)	STD as percentage of raw-data STD
Raw data	0.2400	$-$
BiLSTM	0.1353	56.38%
LSTM	0.1550	64.58%
BiGRU	0.1504	62.67%
GRU	0.1574	65.58%
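The percentage column in the two tables above is consistent with the ratio of each network's output STD to the raw-data STD on the same axis, so a smaller percentage means stronger noise suppression. The short check below reproduces that arithmetic from the tabulated values; the function names `relative_std` and `reduction` are hypothetical.

```python
# STD values (deg/s) copied from the x-axis and z-axis tables above.
raw_std = {"x": 0.2493, "z": 0.2400}
method_std = {
    "x": {"BiLSTM": 0.1326, "LSTM": 0.1543, "BiGRU": 0.1501, "GRU": 0.1604},
    "z": {"BiLSTM": 0.1353, "LSTM": 0.1550, "BiGRU": 0.1504, "GRU": 0.1574},
}


def relative_std(axis, method):
    """Output STD as a percentage of the raw-data STD on the same axis."""
    return 100.0 * method_std[axis][method] / raw_std[axis]


def reduction(axis, method):
    """Percentage by which the STD is reduced relative to the raw data."""
    return 100.0 - relative_std(axis, method)


for axis in ("x", "z"):
    for method in method_std[axis]:
        print(f"{axis}-axis {method}: {relative_std(axis, method):.2f}% of raw STD")
```

For example, the x-axis BiLSTM entry is 0.1326 / 0.2493, which is about 53.19% of the raw STD and corresponds to a reduction of about 46.81%.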

	$Φ$	$Γ$	$H$	$Q$	$R$
x-axis raw data	$[\begin{matrix} - 0.3126 & 0.8168 & 0.1520 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{matrix}]$	$[\begin{matrix} 1 & 0.6885 & - 0.3241 & - 0.0366 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{matrix}]$	$[\begin{matrix} 1 & 0 & 0 & 0 \end{matrix}]$	$[\begin{matrix} 0.0370 & 0 & 0 & 0 \\ 0 & 0.0370 & 0 & 0 \\ 0 & 0 & 0.0370 & 0 \\ 0 & 0 & 0 & 0.0370 \end{matrix}]$	$0.0604$
x-axis BiLSTM	$[\begin{matrix} 0.8505 & 0.8891 & - 0.7745 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{matrix}]$	$[\begin{matrix} 1 & 0.1354 & - 0.8476 & - 0.0815 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{matrix}]$	$[\begin{matrix} 1 & 0 & 0 & 0 \end{matrix}]$	$[\begin{matrix} 0.0036 & 0 & 0 & 0 \\ 0 & 0.0036 & 0 & 0 \\ 0 & 0 & 0.0036 & 0 \\ 0 & 0 & 0 & 0.0036 \end{matrix}]$	$0.0175$
z-axis raw data	$[\begin{matrix} 0.8593 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{matrix}]$	$[\begin{matrix} 1 & - 0.5191 & 0.0328 & 0.0236 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{matrix}]$	$[\begin{matrix} 1 & 0 & 0 & 0 \end{matrix}]$	$[\begin{matrix} 0.0405 & 0 & 0 & 0 \\ 0 & 0.0405 & 0 & 0 \\ 0 & 0 & 0.0405 & 0 \\ 0 & 0 & 0 & 0.0405 \end{matrix}]$	$0.0596$
z-axis BiLSTM	$[\begin{matrix} 2.0000 & - 1.2268 & 0.2031 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{matrix}]$	$[\begin{matrix} 1 & - 1.0446 & 0.0809 & 0.1086 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{matrix}]$	$[\begin{matrix} 1 & 0 & 0 & 0 \end{matrix}]$	$[\begin{matrix} 0.0043 & 0 & 0 & 0 \\ 0 & 0.0043 & 0 & 0 \\ 0 & 0 & 0.0043 & 0 \\ 0 & 0 & 0 & 0.0043 \end{matrix}]$	$0.0183$

	$Φ$	$H$	$Q$	$R$
x-axis raw data	0.9723	0.1350	0.0016	0.1181
x-axis BiLSTM	0.9738	0.3422	0.0016	0.2852
z-axis raw data	0.9471	0.1327	0.0065	0.1155
z-axis BiLSTM	0.9530	0.2983	0.0044	0.2430

$σ_{mean} = 0.0604$, $n_{1} = 8$, $n_{2} = 12$, $r = 14$, significance level $α = 0.05$, confidence interval $[6, 16]$.
	$X_{1}$	$X_{2}$	$X_{3}$	$X_{4}$	$X_{5}$	$X_{6}$	$X_{7}$	$X_{8}$	$X_{9}$	$X_{10}$
$σ_{i}$	0.0615	0.0583	0.0516	0.0621	0.0664	0.0572	0.0640	0.0545	0.0571	0.0628
$σ_{a i}$	0.0011	−0.0021	−0.0088	0.0017	0.0060	−0.0032	0.0036	−0.0059	−0.0033	0.0024
r	1	2		3		4	5	6		7
	$X_{11}$	$X_{12}$	$X_{13}$	$X_{14}$	$X_{15}$	$X_{16}$	$X_{17}$	$X_{18}$	$X_{19}$	$X_{20}$
$σ_{i}$	0.0601	0.0602	0.0565	0.0743	0.0585	0.0655	0.0532	0.0693	0.0588	0.0557
$σ_{a i}$	−0.0003	−0.0002	−0.0039	0.0139	−0.0019	0.0051	−0.0072	0.0089	−0.0016	−0.0047
$r$	8			9	10	11	12	13	14

	x-axis Raw Data	x-axis BiLSTM Network Results	z-axis Raw Data	z-axis BiLSTM Network Results
Skewness ξ	2.8329	2.7283	2.7294	2.8156
Kurtosis υ	0.0177	−0.1077	−0.0631	−0.0170

	p = 0	p = 1	p = 2	p = 3
q = 0	$-$	−4350.7006	−5373.4026	−5483.9134
q = 1	−2378.8661	−5480.4267	−5503.6485	−5503.2634
q = 2	−3784.2642	−5501.9738	−5502.2446	−5501.9462
q = 3	−4524.5464	−5504.1306	−5502.9798	−5514.9860

	p = 0	p = 1	p = 2	p = 3
q = 0	$-$	−33245.8540	−33390.6679	−33390.1658
q = 1	−24407.6844	−33392.3127	−33390.3534	−33387.4822
q = 2	−28837.8009	−33390.3543	−33391.2919	−33390.0297
q = 3	−30742.3669	−33388.3649	−33390.0548	−33456.4704

	p = 0	p = 1	p = 2	p = 3
q = 0	$-$	−3384.5655	−4241.9726	−4367.2696
q = 1	−1992.4063	−4410.0138	−4414.9553	−4417.2529
q = 2	−3107.6730	−4414.3979	−4414.5702	−4415.3240
q = 3	−3612.4223	−4417.6140	−4415.7666	−4414.6351

	p = 0	p = 1	p = 2	p = 3
q = 0	$-$	−31059.1193	−31186.5670	−31212.5235
q = 1	−23478.3934	−31198.8844	−31202.3767	−31212.2053
q = 2	−27430.5011	−31206.1886	−31221.9048	−31228.0632
q = 3	−29112.3646	−31212.9557	−31211.5881	−31262.0166
