PLOS One. 2024 Mar 7;19(3):e0298524. doi: 10.1371/journal.pone.0298524

Prediction model of land surface settlement deformation based on improved LSTM method: CEEMDAN-ICA-AM-LSTM (CIAL) prediction model

Shengchao Zhu 1, Yongjun Qin 1,2,*, Xin Meng 1, Liangfu Xie 1,2, Yongkang Zhang 3, Yangchun Yuan 1
Editor: Muhammad Usman Tariq
PMCID: PMC10919871  PMID: 38452152

Abstract

The uneven settlement of the surrounding ground surface caused by subway construction is not only complicated but also liable to cause casualties and property damage, so a timely understanding of ground settlement deformation during subway excavation, and its prediction in real time, is of practical significance. Because of the complex nonlinear relationship between subway settlement deformation and its numerous influencing factors, as well as time lag effects in the process, the performance and accuracy of traditional prediction methods no longer meet industry demands. Therefore, this paper proposes a surface settlement deformation prediction model that combines noise reduction and an attention mechanism (AM) with a long short-term memory (LSTM) network. Complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and independent component analysis (ICA) are used to denoise the original input data, which are then combined with AM and LSTM for prediction, yielding the CEEMDAN-ICA-AM-LSTM (CIAL) prediction model. Analysis of settlement monitoring data from a construction site of Urumqi Rail Transit Line 1 shows that the proposed model is more effective and applicable than multiple alternative models for predicting surface settlement deformation. The RMSE, MAE, and MAPE values of the CIAL model are 0.041 mm, 0.033 mm, and 0.384%, respectively; its R2 is the largest; its prediction effect is the best; its prediction accuracy is the highest; and its reliability is good. The new method is effective for monitoring the safety of surface settlement deformation.

1 Introduction

With the growth in population and increasing urbanization, the development and use of urban underground space are rapidly increasing. Due to its geological complexity, lengthy construction cycle, variety of influencing factors, and high professionalism requirements, this type of construction involves high risks. Therefore, the accurate prediction and effective control of surface settlement is crucial.

Scholars at home and abroad have achieved remarkable results in research on ground settlement deformation caused by subway excavation. Traditionally, theoretical analysis and calculation, empirical formula fitting, numerical simulation, and physical model experimentation have been employed. In recent years, artificial intelligence algorithms, including BP neural network [1], recurrent neural network (RNN) [2], support vector machines [3,4], grey prediction model [5], and random forest method [6], have been widely used in various fields due to their impressive speed and accuracy. Backpropagation neural network (BPNN) [7], as a representative of the prediction algorithm of surface subsidence, has become mainstream.

The current improvement methods involve two primary considerations. First, geotechnical engineering involves significant uncertainty and fuzziness [8]. The relationship between surface settlement deformation and its numerous influencing factors is a complex, long-term nonlinear one that cannot be described by simple functions. Second, the construction process data may be affected by external factors and human involvement; nonetheless, several methods do not eliminate such interference, so noise reduction preprocessing of the resulting data is required before analysis. Therefore, a single algorithm is no longer sufficient to meet requirements [9], hence the emergence of combined algorithms.

Moghaddasi et al. [10] utilized an Independent Component Analysis-Artificial Neural Network (ICA-ANN) model to predict maximum surface settlement (MSS) with enhanced reliability. The ICA was employed to optimize the ANN and determine the most desirable weights and biases of the neural network layers for more accurate MSS prediction while avoiding entrapment in local optima. Han et al. [11] established the Simulated Annealing-Regularized Extreme Learning Machine (SA-RELM) model, which combines the simulated annealing algorithm and the regularized extreme learning machine, to enhance the precision of multifactor prediction. However, these two methods have the disadvantage of few input parameters. Xiao et al. [12] established an AdaBoost gated recurrent unit (GRU) prediction model to improve fitting ability, but they did not consider the impact of human interference. Kim et al. [13] utilized extreme gradient boosting (XGB) integration to learn the hyperparameters of the ML algorithm, which achieved optimal prediction performance, strong search ability, easy realization, and superior prediction results. Many scholars have attempted to optimize LSTM models using various methods. Alotaibi et al. [14] utilized convolutional neural networks (CNN) to optimize LSTM models to mitigate the low-learning-rate problem. Wang et al. [15] developed a dual contouring-long short-term memory (DC-LSTM) model that can achieve multistep prediction of trends in time series. Li et al. [16] utilized an Adam-optimized LSTM network to effectively improve prediction accuracy and training speed. Qian Jiangu et al. [17] proposed a wavelet-optimized long short-term memory-autoregressive moving average (LSTM-ARMA) model for ground settlement analysis of ultradeep foundation pits, which considered the noise factor and predicted the trend and noise terms separately to reduce the prediction error; however, the influence of the input factor weights was ignored. Khataei et al. [18] utilized the spotted hyena optimizer (SHO) algorithm to optimize an LSTM network for multilabel text classification, and the accuracy was significantly higher than that of other models. Yang et al. [19] established an LSTM model with an attention mechanism to predict the deformation of concrete dams, which effectively considered the influence of significant factors on deformation and temporal variation and improved the prediction accuracy.

The following two main problems have not been addressed in the existing research. First, subsidence monitoring data contain noise pollution because of various factors, such as instrument error, manual error, and the surrounding environment. Second, the prediction process cannot simultaneously consider the variation in the time series and the influence of the characteristics of the input data and the training speed.

To overcome the above problems and obtain subway settlement prediction results faster and more accurately, this paper takes Urumqi Rail Transit Line 1 as an example and combines the field settlement monitoring data. Its main contents are as follows.

1. A subway settlement deformation noise reduction method that fully integrates complete ensemble empirical mode decomposition with adaptive noise and independent component analysis (ICA) is proposed. The CEEMDAN algorithm is implemented as a dimension-raising tool to decompose multidimensional signals, the ICA algorithm functions as a dimension-reducing tool to separate the noise-reduced signals, and the combined algorithm smooths the errors in the monitoring process and reduces the impact of noise on the prediction.

2. After a series of developmental steps, the attention mechanism can determine the weights assigned to the input factors through the relationship between input and output. This paper combines an attention mechanism with an LSTM network to construct the AM-LSTM surface settlement deformation prediction model and adds the computationally efficient Adam optimization algorithm to improve prediction speed and accuracy. The model can not only account for long-term dependence in prediction but also capture the influencing characteristics in time while assigning appropriate weights. Compared with several other models, this model has the highest prediction accuracy and the best effect, and thus it can serve as a reference for the prevention and control of ground settlement deformation caused by subway excavation.

2 Theoretical basis

2.1 CEEMDAN algorithm

To solve the modal aliasing and end effects caused by EMD signal decomposition and the residual white noise problem of the EEMD and CEEMD algorithms, Torres [20] proposed CEEMDAN. Colominas et al. [21] compared EEMD with CEEMDAN and found that only the latter recovered completeness. Zhao et al. [22] found that CEEMDAN both overcame the modal aliasing problem caused by EMD and reduced the residual noise problem caused by EEMD.

The CEEMDAN method differs from other algorithms in that white noise is not simply added to the original signal once; instead, adaptive noise is added repeatedly during the EEMD-style decomposition. The first-order component IMF1 of the decomposition is obtained as the overall average of the first eigenmode components over the noise realizations. White noise is then added to the residual to find the next eigenmode component, and all eigenmode components are found iteratively. The original signal is X(t), the Gaussian white noise is N_f(t), and the number of times white noise is added is f.

The specific decomposition steps of the CEEMDAN algorithm are as follows.

1. Repeatedly add the adaptive noise N_f(t) to the original signal X(t) to obtain the noisy signals X′(t) = X(t) + N_f(t), and then conduct EMD decomposition on each noisy copy to obtain its first-order eigenmode component IMF1_j;

2. The overall average of these components gives the first-order eigenmode component IMF1 of the X(t) decomposition, after which the remainder R1(t) is obtained.

IMF1 = (1/f) Σ_{j=1}^{f} IMF1_j, (1)
R1(t) = X(t) − IMF1. (2)

3. A new decomposition signal X1(t) is constructed, and noise is added to obtain X1′(t). The second-order intrinsic mode component IMF2 is then obtained via EMD decomposition.

X1(t) = R1(t), (3)
X1′(t) = X1(t) + N_f(t). (4)

4. The overall average of these components gives the second-order eigenmode component IMF2 of the X(t) decomposition, and the remaining term R2(t) is obtained.

IMF2 = (1/f) Σ_{j=1}^{f} IMF2_j, (5)
R2(t) = X1(t) − IMF2. (6)

5. Repeat steps (3) and (4) until the residual can no longer be decomposed, yielding all the eigenmodal components IMFi.
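As an illustration only, the ensemble-averaging structure of the steps above can be sketched in Python. A full EMD sifting routine is beyond the scope of this sketch, so a moving-average detrend stands in for the extraction of a first IMF; the function names and parameters are hypothetical.

```python
import numpy as np

def first_imf(x, win=5):
    """Stand-in for EMD's first IMF: the signal minus a moving-average
    trend. (A real implementation would use full EMD sifting.)"""
    kernel = np.ones(win) / win
    trend = np.convolve(x, kernel, mode="same")
    return x - trend

def ceemdan_sketch(x, n_noise=50, noise_std=0.2, n_modes=3, seed=0):
    """Simplified CEEMDAN loop: average the first IMF over many
    noise-perturbed copies (steps 1-2), subtract it, and repeat on the
    residual (steps 3-5)."""
    rng = np.random.default_rng(seed)
    residual = x.astype(float).copy()
    imfs = []
    for _ in range(n_modes):
        # steps 1-2: ensemble-average the first IMF of noisy copies
        imf = np.mean(
            [first_imf(residual + noise_std * rng.standard_normal(len(x)))
             for _ in range(n_noise)], axis=0)
        imfs.append(imf)
        residual = residual - imf   # steps 3-4: form the new residual
    return np.array(imfs), residual

t = np.linspace(0, 1, 200)
x = np.sin(2 * np.pi * 3 * t) + 0.1 * np.random.default_rng(1).standard_normal(200)
imfs, res = ceemdan_sketch(x)
# by construction, the IMFs plus the residual reconstruct the signal
assert np.allclose(imfs.sum(axis=0) + res, x)
```

Note the decomposition is exactly additive: summing the extracted components and the final residual recovers the original signal, which is the completeness property CEEMDAN restores over EEMD.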

2.2 ICA algorithm

ICA is a method of separating independent source signals by applying statistical principles to the original observed signals. It plays a prominent role in blind source separation [23,24], feature recognition [25], and signal separation [26]. The source signal S(t) = {S1(t), S2(t), …, Sn(t)} is estimated from the known mixed signal X(t) = {X1(t), X2(t), …, Xn(t)}, and a certain linear relationship between X(t) and S(t) exists, which can be expressed as

X(t)=AS(t). (7)

The principle underlying the ICA algorithm is that, for statistically independent source signals S(t), the actual observed signal X(t) = A S(t) is preprocessed, the mixing matrix A is estimated, and its inverse (the unmixing matrix W) is used to recover the sources, as detailed in the process shown in Fig 1.

Fig 1. The model of ICA algorithm.

Fig 1

Among the numerous ICA algorithms, the Fast-ICA algorithm, which is based on negative entropy maximization, is a popular choice for signal processing owing to its computational simplicity, fast and robust convergence, and good separation effect. The implementation steps of the Fast-ICA algorithm are as follows:

1. Transform the original observed signal into a matrix X(t) with n rows and m columns;

2. Decentralize the X(t) data;

X̄_j = X_j − (1/m) Σ_{i=1}^{m} X_j(i),  j = 1, 2, 3, …, n. (8)

3. Whiten the data after centralization;

① First, compute the covariance matrix C_X = (1/m)XXᵀ of the data X(t).

② Solve the eigenvalue problem C_X V = VΛ,

where V = [v1, v2, …, vn] is the matrix whose columns are the eigenvectors, and Λ = diag(λ1, λ2, …, λn) is the diagonal matrix of eigenvalues of the covariance matrix C_X.

③ The final result after whitening of the original observed data signal is obtained as follows.

X_w(n×m) = Λ^(−1/2)(n×n) Vᵀ(n×n) X(n×m) (9)

4. Set the learning rate parameter α;

5. Solve for the unmixing matrix W at moment i;

6. Solve the estimated independent component signal S_{n×1}(i) = W_{n×n} X_{n×1}(i) at moment i;

7. Repeat steps (3)–(6) to obtain the independent component signals Sn×m = [s(1), s(2),…,s(m)] for all moments.
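The steps above can be condensed into a minimal numpy sketch: centering, whitening per Eq (9), and the negentropy-based fixed-point iteration with a tanh nonlinearity and symmetric decorrelation. All names are illustrative; a production system would typically use a vetted implementation such as scikit-learn's FastICA.

```python
import numpy as np

def fast_ica(X, n_iter=200, seed=0):
    """Fast-ICA sketch: center, whiten, then run the tanh fixed-point
    iteration with symmetric decorrelation to estimate W and S."""
    n, m = X.shape
    X = X - X.mean(axis=1, keepdims=True)            # step 2: centering
    C = X @ X.T / m                                  # step 3 ①: covariance
    lam, V = np.linalg.eigh(C)                       # step 3 ②: eigenproblem
    Xw = np.diag(lam ** -0.5) @ V.T @ X              # step 3 ③: whitening, Eq (9)
    W = np.random.default_rng(seed).standard_normal((n, n))
    for _ in range(n_iter):
        U = W @ Xw
        G, Gp = np.tanh(U), 1.0 - np.tanh(U) ** 2    # g(u) and g'(u)
        W = G @ Xw.T / m - np.diag(Gp.mean(axis=1)) @ W
        # symmetric decorrelation: W <- (W W^T)^(-1/2) W
        s, Q = np.linalg.eigh(W @ W.T)
        W = Q @ np.diag(s ** -0.5) @ Q.T @ W
    return W @ Xw, W                                 # estimated sources, unmixing matrix

# demo: mix two independent sources and recover them
t = np.linspace(0, 1, 2000)
S = np.vstack([np.sign(np.sin(2 * np.pi * 5 * t)),   # square wave
               np.sin(2 * np.pi * 3 * t)])           # sine wave
A = np.array([[0.7, 0.3], [0.4, 0.6]])               # mixing matrix
S_est, W = fast_ica(A @ S)
```

As the text notes, ICA recovers the sources only up to order, sign, and amplitude, which is why the MDP-based revision of Section 2.3 is needed downstream.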

2.3 CEEMDAN-ICA noise reduction model

The CEEMDAN algorithm is suitable for feature extraction and signal noise reduction in the time-domain analysis of nonstationary, nonlinear data [27–29] but still suffers from residual noise, some loss of the effective signal, and slow iteration speed [30,31]. The ICA algorithm can improve the accuracy and speed of signal separation, and CEEMDAN-ICA joint noise reduction combines the advantages of both methods while avoiding their weaknesses. It has been widely applied, including to electromagnetic signals [32], EEG signals [33,34], and LIDAR systems [35], but has rarely been used for noise reduction of surface settlement deformation data.

In principle, the CEEMDAN algorithm first decomposes the initial observation signal into a series of intrinsic mode components. The IMF components dominated by noise are identified according to a correlation criterion, and their sum, the virtual noise component NIMF, forms a dual-channel input together with the original observation signal. The Fast-ICA algorithm is then used to separate and reconstruct the signal, finally yielding the effective signal after noise reduction. The steps are as follows, and the process is detailed in Fig 2.

Fig 2. CEEMDAN-ICA noise reduction principle.

Fig 2

1. CEEMDAN decomposition of the initial observed signal X(t) is performed to obtain its eigenmodal components IMF1, IMF2, …, IMFn.

2. The correlation between the original observation signal X(t) and each IMF is calculated using the Spearman correlation coefficient. The principle is shown in Formula (10), and its value lies in [−1, 1].

ρ(X, IMF_n) = Σ_{i=1}^{n} (X(t) − X̄)(IMF_n(t) − IMF̄_n) / √[Σ_{i=1}^{n} (X(t) − X̄)² · Σ_{i=1}^{n} (IMF_n(t) − IMF̄_n)²] (10)

where n is the sample size, ρ is the correlation coefficient, X(t) is the initial observed signal, X̄ is its mean, IMF_n(t) is the IMF component, and IMF̄_n is the mean of the IMF component.

3. The virtual noise component NIMF is obtained as the sum of the first K IMF components, where K is the index at which the correlation coefficient reaches its first minimal value [36].

4. The NIMF signal and original observation signal X(t) are used as inputs of the Fast-ICA algorithm for signal noise separation, and source signals S1 and S2 are output.

5. Applying the MDP criterion [37,38], the independent components estimated by the ICA algorithm are revised to correct the incomplete recovery of the source signal amplitude, the uncertainty of the source signal order, and the sign ambiguity of the source signal, yielding the effective noise reduction signal SX and the noise signal SN.
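The correlation-based selection of noise IMFs (steps 2-3) can be sketched as follows, assuming the IMFs are already available as rows of an array. The local-minimum search and the synthetic demonstration data are illustrative, not the monitoring data of this paper.

```python
import numpy as np

def spearman(a, b):
    """Spearman correlation: the Pearson correlation of the ranks (Eq 10)."""
    ra, rb = np.argsort(np.argsort(a)), np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]

def select_noise_imfs(x, imfs):
    """Steps 2-3: correlate each IMF with the signal; the first local
    minimum of the correlation marks the noise/signal boundary, and the
    first K IMFs are summed into the virtual noise component NIMF."""
    rho = np.array([spearman(x, imf) for imf in imfs])
    k = len(rho) - 1
    for i in range(1, len(rho) - 1):
        if rho[i] < rho[i - 1] and rho[i] < rho[i + 1]:
            k = i
            break
    return imfs[:k + 1].sum(axis=0), rho

# synthetic demo: two noise-like IMFs followed by signal-like components
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 300)
imfs = np.vstack([0.5 * rng.standard_normal(300),   # noise-like IMF1
                  0.1 * rng.standard_normal(300),   # noise-like IMF2
                  np.sin(2 * np.pi * 2 * t),        # oscillatory component
                  2.0 * t])                         # trend component
x = imfs.sum(axis=0)
nimf, rho = select_noise_imfs(x, imfs)
```

Here the correlation dips at the second IMF, so the first two components are summed into NIMF, mirroring the IMF1-IMF3 selection reported for DB-37-03 in Section 3.3.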

2.4 LSTM neural network

Gers et al. [39] conducted a review indicating that the standard LSTM algorithm outperforms other RNN algorithms on illustrative benchmark problems. The LSTM network structure is an improved RNN that introduces the idea of gating. It manages information through forget, input, and output gates, which add, remove, retain, and transmit the state and information of each unit. The four neural network layers interact in a special way that effectively avoids the long-term dependency problem of RNNs. The LSTM model first filters the subway settlement deformation input variables from the memory cell through the forgetting gate, the input gate identifies important information, and finally, the output gate outputs the subway settlement deformation data and transmits it to the next LSTM structural unit. The modeling process inside each LSTM structural unit is as follows, and the process is detailed in Fig 3.

Fig 3. LSTM unit structure [40].

Fig 3

1. Forgetting gate: receives the output information h_{t−1} of the previous moment and the input information x_t of the current moment, applies the sigmoid function to their combination, and multiplies the result element-wise with the previous cell state to retain the desired content.

f_t = σ(W_f·[h_{t−1}, x_t] + b_f). (11)

2. Input gate: determines which of the information retained by the forgetting gate needs to be updated and stored in the candidate cell state C̃_t, filters the information, and updates the cell state; the sigmoid activation function maps its output to the [0, 1] interval.

i_t = σ(W_i·[h_{t−1}, x_t] + b_i), (12)
C̃_t = tanh(W_C·[h_{t−1}, x_t] + b_C), (13)
C_t = f_t * C_{t−1} + i_t * C̃_t. (14)

3. Output gate: determines which parts of h_{t−1} and x_t will be output. The cell state C_t is compressed to the (−1, 1) interval by the tanh activation function, and the hidden state h_t at the current moment is obtained as the output through the output gate.

P_t = σ(W_o·[h_{t−1}, x_t] + b_o), (15)
h_t = P_t * tanh(C_t), (16)

where · represents the matrix product and * denotes the Hadamard product, that is, the element-wise product.

2.5 Attention mechanism

The attention mechanism can assign different weights from a large amount of information depending on the characteristics of each layer of the network to highlight the more important information and constrain the less important information [41]. The specific calculation process of the attention mechanism can be divided into three steps: first, the correlation or similarity between query and key is calculated according to their vector dot product; second, the Softmax function is introduced to normalize the weights obtained in the first stage, and the internal mechanism of the Softmax function is used to highlight the key information; finally, the attention value can be obtained by weighting the sum of the weight and its corresponding value.

Attention(q, k, v) = Softmax(qkᵀ/√d_f)·v. (17)

Here, q (query) is the query vector used to perform the lookup; k (key) is the key vector serving as an index; v (value) is the value vector carrying the content to be aggregated; and d_f is the feature dimension of the input information, used to enhance the numerical stability of the Softmax calculation. The calculation of q, k, and v is as follows:

q=WqX, (18)
k=WkX, (19)
v=WvX. (20)

The monitoring data X are linearly transformed by multiplying by W_q, W_k, and W_v, respectively, to obtain q, k, and v. W_q, W_k, and W_v must be specified for the calculation and tuned continuously according to the results. Different weight assignments give different results, which in turn affect the prediction effect, so the best combination should be selected.
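A compact numpy sketch of Eqs (17)-(20) follows. Note that the projections are written here as right-multiplications X·W rather than W·X, which is an equivalent convention choice; the dimensions are illustrative.

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(X, Wq, Wk, Wv):
    """Scaled dot-product attention, Eqs (17)-(20): project the input to
    q, k, v, then weight the values by Softmax(q k^T / sqrt(d_f))."""
    q, k, v = X @ Wq, X @ Wk, X @ Wv          # Eqs (18)-(20)
    d_f = q.shape[-1]                          # feature dimension
    scores = softmax(q @ k.T / np.sqrt(d_f))  # normalized weights, rows sum to 1
    return scores @ v                          # weighted sum, Eq (17)

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 8))               # 6 time steps, 8 features
Wq, Wk, Wv = (rng.standard_normal((8, 4)) for _ in range(3))
out = attention(X, Wq, Wk, Wv)
```

Because each row of the Softmax output sums to 1, the result at each time step is a convex combination of the value vectors, which is precisely the reweighting of input features the text describes.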

2.6 AM-LSTM prediction model

Zheng et al. [5] illustrated the rationale behind the excellent performance of the attention mechanism in learning long-term dependencies by studying the memory properties of LSTM networks equipped with attention; the results revealed that the memory decay of the attention-augmented LSTM is significantly slower than that of the plain LSTM. Moreover, the LSTM model containing the attention mechanism not only performs better in time series prediction but also maintains long-term memory while significantly lowering the training time. These findings were subsequently validated by Chen and Zhang [31].

The structure of an AM-LSTM model in this paper can be divided into five parts: input, LSTM, attention, fully connected, and output layers. The role of the LSTM layer is to sense the state, memorize the information, and perform feature learning; the attention layer redistributes different features and highlights the key information, and the fully connected layer performs local feature integration to achieve the final prediction.

Input layer: The signal SX after noise reduction is normalized to obtain I(t) and input to the model. The normalization process aims to speed up the convergence and improve the accuracy of the prediction model. The data are scaled and linearly transformed through min-max normalization so that the result is between 0 and 1. The principle formula is as follows.

I(t) = (S_X − S_min)/(S_max − S_min) (21)

where I(t) is the data after data normalization, Smax is the maximum value of the data SX after noise reduction, and Smin is the minimum value of the data SX after noise reduction.
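The min-max normalization of Eq (21) and its inverse (used later at the output layer) can be sketched as follows; the helper names are illustrative.

```python
import numpy as np

def minmax_normalize(s):
    """Min-max scaling to [0, 1], Eq (21); also returns the (min, max)
    pair needed to invert the transform at the output layer."""
    s_min, s_max = s.min(), s.max()
    return (s - s_min) / (s_max - s_min), (s_min, s_max)

def minmax_denormalize(y, bounds):
    """Inverse of Eq (21), recovering predictions in the original units."""
    s_min, s_max = bounds
    return y * (s_max - s_min) + s_min

# toy settlement values (mm), in the spirit of the S_X series
s = np.array([-10.4, -8.1, -5.0, -3.7, 0.0])
i_t, bounds = minmax_normalize(s)
assert np.allclose(minmax_denormalize(i_t, bounds), s)  # exact round trip
```

Keeping the (min, max) pair alongside the scaled series is what makes the denormalization step of the output layer possible.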

LSTM layer: The LSTM layer is used to learn the input sequence I(t) to get the output sequence L(t) of the LSTM at moment t.

Attention layer: The output vector L(t) of the LSTM layer is used as the input of the attention layer, and the feature vectors are weighted and summed to highlight the key information to obtain the output sequence αt of attention at moment t.

Fully connected layer: The output result αt of the attention layer is used as the input of the fully connected layer, the normalized data result of each input data is introduced to train the model, and the normalized prediction value Y(t) of the model output is obtained.

Output layer: The output prediction value Y(t) of the fully connected layer is denormalized to obtain the final prediction result P(t), and the process is detailed in Fig 4.

Fig 4. Attention–LSTM model diagram.

Fig 4

3 Process and analysis

3.1 Dataset acquisition

The data in this paper were obtained from the Wangjialiang Station Project of Urumqi Rail Transit Line 1. The station body is an 11 m platform island-type underground excavation station with two underground layers. The standard section adopts a box frame structure with a width of 20.1 m and a height of 15.29 m. During the excavation of the foundation pit, the main strata are the pebble layer, silty clay, mudstone, and sandstone, among others, and bedrock fissure water is present, which makes construction challenging.

The layout of the surface settlement measuring points of Wangjialiang Station is shown in Fig 5(A) and 5(B). At present, the settlement of each measuring point has converged, the settlement rate is generally less than ±2.00 mm/d, and the cumulative settlement is between −21.00 and 6.00 mm. The prediction model in this paper uses the 201 phases of settlement monitoring data from test point DB-37-03 to establish the data set. The ratio of the training set to the test set is approximately 5:1: the first 176 phases of data are used as the training set of the prediction model, whereas the remaining phases are used as the test set to verify the feasibility of the proposed prediction model. The longitudinal geological profile at the location of test point DB-37-03 is shown in Fig 5(C).

Fig 5. Wangjialiang station diagram.

Fig 5

(a) Urumqi rail transit line 1 project line plane diagram; (b) Wangjialiang station surface subsidence measuring point layout; (c) Geological profile.

3.2 Evaluating indicator

To verify the effectiveness of the algorithm, the correlation coefficient and root mean square error (RMSE) were used to evaluate the noise reduction effect, while RMSE, mean absolute error (MAE), mean absolute percentage error (MAPE), and the coefficient of determination (R2) were used as the evaluation indicators of the CEEMDAN-ICA-AM-LSTM prediction model. RMSE measures the deviation between the noise-reduced data and the original data, and between the predicted and observed values, reflecting the accuracy of the prediction. MAE measures the average prediction error; MAPE reflects the degree of error fluctuation in prediction and can be used to evaluate the stability of the model; and R2 reflects how well the predicted data fit the original measured data. For noise reduction, a larger correlation coefficient and a smaller RMSE indicate a better effect; for prediction, smaller RMSE, MAE, and MAPE values and a larger R2 indicate a better effect.

RMSE = √[(1/n) Σ_{t=1}^{n} (P(t) − X(t))²] (22)
MAE = (1/n) Σ_{t=1}^{n} |P(t) − X(t)| (23)
MAPE = (100%/n) Σ_{t=1}^{n} |(P(t) − X(t))/X(t)| (24)
R² = 1 − [Σ_{t=1}^{n} (P(t) − X(t))²] / [Σ_{t=1}^{n} (X(t) − X̄)²] (25)

where n is the number of data samples, P(t) is the predicted data, X(t) is the measured data, and X¯ is the average of the measured data X(t).
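The four indicators of Eqs (22)-(25) translate directly into numpy. One caveat worth making explicit: MAPE is undefined wherever the measured value is zero (e.g., the maximum value of 0 in Table 1), so such points must be excluded in practice.

```python
import numpy as np

def metrics(p, x):
    """Evaluation indicators of Eqs (22)-(25): RMSE, MAE, MAPE (%), R2."""
    rmse = np.sqrt(np.mean((p - x) ** 2))                 # Eq (22)
    mae = np.mean(np.abs(p - x))                          # Eq (23)
    mape = 100.0 * np.mean(np.abs((p - x) / x))           # Eq (24); x must be nonzero
    r2 = 1.0 - np.sum((p - x) ** 2) / np.sum((x - x.mean()) ** 2)  # Eq (25)
    return rmse, mae, mape, r2

# toy measured and predicted settlement values (mm)
x = np.array([-3.0, -4.0, -5.0, -6.0])
p = np.array([-3.1, -3.9, -5.2, -5.8])
rmse, mae, mape, r2 = metrics(p, x)
```

For these toy values the residuals are (−0.1, 0.1, −0.2, 0.2), giving MAE = 0.15 mm and R² = 0.98, which illustrates how small residuals translate into the indicator scale used in Tables 2 and 3.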

3.3 Data noise reduction process

The original observation signal of measuring point DB-37-03 collected at Wangjialiang Station was decomposed using the CEEMDAN algorithm into five eigenmode components (IMF1–IMF5) and one residual component (IMF6), as shown in Fig 6.

Fig 6. IMF components of the DB-37-03 signal decomposition at Wangjialiang station.

Fig 6

Then, according to the principle of Spearman's correlation coefficient, the correlation coefficient of each IMF component with the original observed data signal was calculated, as shown in Fig 7. The first minimal value appears at the third eigenmode component IMF3, so the first three IMF components can be judged as noise components and reconstructed to obtain the virtual noise component NIMF.

Fig 7. Correlation coefficients between the IMF components and the original signal.

Fig 7

Next, the virtual noise component NIMF signal and the original observation data of Wangjialiang station are used as the dual input channels of the Fast-ICA algorithm, and the source signals S1 and S2 are output (Fig 8). According to the Spearman correlation coefficient principle, the correlation coefficient between signal S1 and the original observation signal is 0.9995; the correlation is very high, so S1 can be judged as the noise reduction signal. However, the amplitude and phase of signal S1 cannot be fully recovered, and revision by applying the MDP criterion is necessary to finally obtain the effective noise reduction signal SX (Fig 9) and the noise signal SN (Fig 10).

Fig 8. Source signals.

Fig 8

Fig 9. Denoising signal SX.

Fig 9

Fig 10. Noise signal SN.

Fig 10

Compared with the original observation data X(t) of measuring point DB-37-03 at Wangjialiang Station, the observation data SX after noise reduction by CEEMDAN-ICA model filtering have less noise and a smoothed curve trend. Meanwhile, the evaluation indicators are used for quantitative comparison. See Table 1 for details. The median of the data after noise reduction is largely unchanged compared to the original observed data, the minimum value becomes larger, and the variance exhibits a significant downward trend, decreasing by 0.139. The results reveal that the CEEMDAN-ICA noise reduction model can reduce the adverse effects of noise while ensuring that the overall trend characteristics of the original observed data remain unchanged.

Table 1. Analysis indexes before and after noise reduction.

Analysis indicators X(t)/mm SX/mm
Maximum value 0 0
Minimum value -10.400 -10.226
Median -3.735 -3.738
Square difference 2.733 2.705
Variance 9.720 9.581

To further illustrate the effectiveness of the method in this paper, data from the same stations and measurement points in Jiaqi Zhang's paper were selected to compare it with the improved wavelet noise reduction method. Table 2 indicates that the RMSE values of the observed data after noise reduction by the CEEMDAN-ICA model are markedly lower, indicating that the retained component contains more features of the original signal and the noise reduction effect is pronounced, which confirms the effectiveness and feasibility of the CEEMDAN-ICA noise reduction model for metro deformation monitoring data.

Table 2. RMSE comparison of noise reduction models at different sites.

Site name Wavelet noise reduction Method of this article
Nanhu North Road station 0.2759 0.0612
Wangjialiang station 0.2785 0.1038
Santunbei—Xinjiang University 0.1161 0.0636
Nanhu North Road—Wangjialiang 0.2603 0.2427

3.4 Model training

To verify the validity and feasibility of the prediction models, the prediction effects of the LSTM, AM-LSTM (hereinafter referred to as AL), CEEMDAN-ICA-LSTM (hereinafter referred to as CIL), and CEEMDAN-ICA-AM-LSTM (hereinafter referred to as CIAL) prediction models were analyzed in a cross-sectional comparison using the same data set. After pre-debugging, the number of training epochs for all four models was 700, the training (input) step length was 5, the prediction step was 1, the selected optimizer was the Adam algorithm, and the loss function of the prediction models was the RMSE loss to facilitate gradient descent and convergence. The specific process can be seen in Fig 11.
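The sliding-window sample construction implied by this configuration (window length 5, prediction step 1) can be sketched as follows; the helper name and the toy series are illustrative.

```python
import numpy as np

def make_windows(series, n_steps=5, horizon=1):
    """Build supervised samples: n_steps past values predict the value
    `horizon` steps ahead (training step 5, prediction step 1 above)."""
    X, y = [], []
    for i in range(len(series) - n_steps - horizon + 1):
        X.append(series[i:i + n_steps])           # input window
        y.append(series[i + n_steps + horizon - 1])  # prediction target
    return np.array(X), np.array(y)

series = np.arange(12, dtype=float)      # toy stand-in for the S_X series
X, y = make_windows(series)
# 12 points with a 5-step window and 1-step horizon give 7 samples
assert X.shape == (7, 5) and y.shape == (7,)
assert np.allclose(y, series[5:])
```

Applied to the 201-phase series, the same construction yields the training and test samples consumed by all four models, which keeps the cross-sectional comparison fair.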

Fig 11. The training specific process and methods of the model.

Fig 11

After determining the parameters of the LSTM, AL, CIL, and CIAL prediction models, each model was trained iteratively on the training set, and the prediction results were compared and analyzed on the test set; the results are shown in Fig 12.

Fig 12. Model prediction results.

Fig 12

4 Result and discussion

4.1 Performance comparison

The prediction performance of the four prediction models, LSTM, AL, CIL, and CIAL, on the test set of Wangjialiang station is listed in Table 3.

Table 3. Four model prediction evaluation indicators.

Predictive Models RMSE (mm) MAE (mm) R2 MAPE (%)
LSTM 0.139 0.111 0.798 1.319
AL 0.065 0.061 0.955 0.729
CIL 0.097 0.079 0.903 0.938
CIAL 0.041 0.033 0.983 0.384

Compared with the LSTM prediction model, the RMSE value of the AL model decreased from 0.139 to 0.065 (52.98%), the MAE value decreased from 0.111 to 0.061 (45.27%), the R2 value increased from 0.798 to 0.955 (19.67%), and the MAPE value decreased from 1.319% to 0.729% (44.73%). Compared with the CIL prediction model, the RMSE value of the CIAL model decreased from 0.097 to 0.041 (58.03%), the MAE value decreased from 0.079 to 0.033 (58.30%), the R2 value increased from 0.903 to 0.983 (8.86%), and the MAPE value decreased from 0.938% to 0.384% (59.06%). Comparing the two groups of models reveals that the prediction ability and effect of the models with the attention mechanism are significantly better than those of the models without it, indicating that the attention mechanism can effectively improve the prediction accuracy of the model.

Compared with the LSTM prediction model, the CIL prediction model shows a 30.63% decrease in RMSE from 0.139 to 0.097, a 28.85% decrease in MAE from 0.111 to 0.079, a 13.10% increase in R2 from 0.798 to 0.903, and a 28.89% decrease in MAPE from 1.319% to 0.938%. Compared with the AL prediction model, the RMSE value of the CIAL model decreased by 38.07% from 0.065 to 0.041, the MAE value decreased by 45.80% from 0.061 to 0.033, the R2 value increased by 8.89% from 0.955 to 0.983, and the MAPE value decreased by 47.32% from 0.729% to 0.384%. Comparing the two groups of models reveals that the prediction ability and effect of the models using filtered, noise-reduced data are significantly better than those of the models using the original data. Combined with the visualization results in Fig 13, this suggests that the random noise and monitoring anomalies contained in the original metro deformation monitoring sequence interfere with the robustness of the prediction model to a certain extent and affect its prediction accuracy. It also indicates that the CEEMDAN-ICA noise reduction algorithm can effectively smooth the nonlinear and nonstationary subway settlement data, which effectively improves the prediction accuracy of the model.

Fig 13. Comparison of prediction results.

Fig 13

(a) LSTM model prediction results; (b) AL model prediction results; (c) CIL model prediction results; (d) CIAL model prediction results.

Table 3 and Fig 13 reveal that all four prediction models achieve acceptable coefficients of determination R2, and CIAL has the best prediction performance. Compared with the other three prediction models, the CIAL prediction model has the smallest residuals; the smallest RMSE, MAE, and MAPE values; and the largest R2. Every index is the best, and the prediction accuracy is optimal.

4.2 Robustness testing

Junqi Yu et al. [42] used absolute error comparison analysis to verify that the TSA-RBF-LSTM model has good stability. In this paper, the absolute error AE = |predicted value − observed value| was computed for the four prediction models and plotted as a box plot. Fig 14 shows that the CIAL prediction model has the smallest error range, with median and mean values closest to zero among the four models, indicating that it keeps the absolute error of the prediction results within a small interval. These results demonstrate that the model is robust and has strong prediction performance.
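The robustness check described above can be sketched as follows (an illustrative NumPy helper, not the paper's code): compute AE for each model's test-set predictions and summarize it with the box-plot statistics shown in Fig 14.

```python
import numpy as np

def abs_error_summary(y_pred, y_obs):
    """Box-plot statistics of the absolute error AE = |predicted - observed|.

    A model is more robust when the median and mean are close to zero and
    the interquartile range (q1..q3) is narrow.
    """
    ae = np.abs(np.asarray(y_pred, float) - np.asarray(y_obs, float))
    q1, med, q3 = np.percentile(ae, [25, 50, 75])
    return {"min": ae.min(), "q1": q1, "median": med,
            "q3": q3, "max": ae.max(), "mean": ae.mean()}
```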

Fig 14. Absolute error diagram.


4.3 Evaluation of fit

The distributions of the four models (LSTM, AL, CIL, and CIAL) on the test set in Fig 15 show that the predicted points lie closely on both sides of the line y = x, with correlation coefficients above 0.79, and that no overfitting or underfitting occurred. Among all the networks, however, the CIAL prediction model deviates least from the true subway settlement values and performs best, confirming that it has better prediction accuracy and performance than the other models.
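One way to quantify the agreement with the line y = x described above is to report the correlation coefficient together with the mean signed deviation from that line; this is an assumed sketch, as the paper does not spell out its exact computation:

```python
import numpy as np

def fit_diagnostics(y_obs, y_pred):
    """Correlation with the observations and mean signed offset from y = x."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    r = np.corrcoef(y_obs, y_pred)[0, 1]   # Pearson correlation coefficient
    bias = np.mean(y_pred - y_obs)          # systematic offset from the y = x line
    return r, bias
```

A correlation near 1 with bias near 0 corresponds to points scattered tightly around y = x, as in the CIAL panel of Fig 15.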

Fig 15. Comparison of predicted and observed values of four models.


5 Conclusion

This paper combines complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), independent component analysis (ICA), LSTM, and an attention mechanism to propose an effective joint noise reduction model and an accurate dynamic prediction model. Taking Wangjialiang Station of Urumqi Metro Line 1 as an example, different prediction models are compared and the following conclusions drawn.

First, the prediction performances of the four models (LSTM, AL, CIL, and CIAL) were comprehensively compared using four evaluation indexes: RMSE, MAE, R2, and MAPE. On the test set, the prediction accuracy ranked, from best to worst, CIAL, AL, CIL, and LSTM, with CIAL performing best (RMSE of 0.041 mm, MAE of 0.033 mm, R2 of 0.983, and MAPE of 0.384%) and achieving dynamic prediction of the settlement data. Using this model, construction site data can be fed back in advance so that timely optimization and adjustment can be made in the subsequent construction process, providing a reference for similar projects.

Second, compared with predictions obtained from data without noise reduction, the CEEMDAN-ICA noise reduction model reduces the externally induced error between the real and observed subway settlement deformation data, weakens interference from the external environment and human factors, and ensures the validity and authenticity of the data. Accordingly, the RMSE, MAE, R2, and MAPE values all indicate improved prediction performance and accuracy.

Third, the addition of the attention mechanism overcomes a shortcoming of the traditional LSTM training process, in which critical information becomes unavailable over long sequences, and increases the attention paid to the important factors associated with surface subsidence and deformation. The addition of the Adam optimization algorithm effectively speeds up model training. The RMSE, MAE, R2, and MAPE results show that including the attention mechanism improves all of these measures, with good reliability and stability being achieved.

The CIAL model can effectively mitigate environmental disturbances and has good reliability and stability. However, the data pre-processing required by this model is still quite complex, and a simpler, more reliable model should be studied. In the future, research into further optimization algorithms to improve settlement prediction accuracy would support novel modeling and simulation techniques and provide a safer subway settlement monitoring platform.

Supporting information

S1 Dataset. Original data.

(XLSX)

pone.0298524.s001.xlsx (54.4KB, xlsx)
S2 Dataset. Predicted results.

(XLSX)

pone.0298524.s002.xlsx (18.9KB, xlsx)
S1 Code. The code for CEEMDAN.

(PY)

S2 Code. The code for FastICA.

(PY)

pone.0298524.s004.py (4.5KB, py)
S3 Code. The code for AL.

(PY)

pone.0298524.s005.py (5.2KB, py)
S4 Code. The code for LSTM.

(PY)

pone.0298524.s006.py (4.9KB, py)
S5 Code. The code for evaluation.

(PY)

pone.0298524.s007.py (5.2KB, py)

Acknowledgments

We are grateful to students Meng Xin and Yuanyang Chun for help, to Professors Qin Yongjun and Xie Liangfu for guidance, and to Dr. Hua for improving the language.

Data Availability

All relevant data are within the manuscript and its Supporting Information files.

Funding Statement

This work was supported by the Natural Science Foundation of Xinjiang Autonomous Region of China [grant number 2021D01C073]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Zhang J, Qin Y, Xie L, editors. Predicting the settlement of Urumqi subway based on wavelet denoising and BP neural network. IOP Conference Series: Earth and Environmental Science; 2020: IOP Publishing. [Google Scholar]
  • 2.Kumar S, Kumar D, Donta PK, Amgoth T. Land subsidence prediction using recurrent neural networks. Stochastic Environmental Research and Risk Assessment. 2022;36(2):373–88. [Google Scholar]
  • 3.Song Z, Liu S, Jiang M, Yao S, Sun L. Research on the Settlement Prediction Model of Foundation Pit Based on the Improved PSO-SVM Model. Scientific Programming. 2022;2022:1–9. doi: 10.1155/2022/1921378 [DOI] [Google Scholar]
  • 4.Tang JC, Peng L, Chen Z. A Computational Approach of Displacement Prediction in an Engineering Project. Journal of Physics: Conference Series. 2022;2218(1). doi: 10.1088/1742-6596/2218/1/012042 [DOI] [Google Scholar]
  • 5.Zheng W, Zhao P, Huang K, Chen G. Understanding the Property of Long Term Memory for the LSTM with Attention Mechanism. Proceedings of the 30th ACM International Conference on Information & Knowledge Management2021. p. 2708–17. [Google Scholar]
  • 6.Ling X, Kong X, Tang L, Zhao Y, Tang W, Zhang Y. Predicting earth pressure balance (EPB) shield tunneling-induced ground settlement in compound strata using random forest. Transportation Geotechnics. 2022;35. doi: 10.1016/j.trgeo.2022.100771 [DOI] [Google Scholar]
  • 7.Ye X-W, Jin T, Chen Y-M. Machine learning-based forecasting of soil settlement induced by shield tunneling construction. Tunnelling and Underground Space Technology. 2022;124. doi: 10.1016/j.tust.2022.104452 [DOI] [Google Scholar]
  • 8.Phoon K-K, Ching J, Shuku T. Challenges in data-driven site characterization. Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards. 2022;16(1):114–26. [Google Scholar]
  • 9.Cao Y, Zhou X, Yan K. Deep learning neural network model for tunnel ground surface settlement prediction based on sensor data. Mathematical Problems in Engineering. 2021;2021. [Google Scholar]
  • 10.Moghaddasi MR, Noorian-Bidgoli M. ICA-ANN, ANN and multiple regression models for prediction of surface settlement caused by tunneling. Tunnelling and Underground Space Technology. 2018;79:197–209. doi: 10.1016/j.tust.2018.04.016 [DOI] [Google Scholar]
  • 11.Han Y, Wang Y, Liu C, Hu X, Du L. Application of regularized ELM optimized by sine algorithm in prediction of ground settlement around foundation pit. Environmental Earth Sciences. 2022;81(16). doi: 10.1007/s12665-022-10542-2 [DOI] [Google Scholar]
  • 12.Xiao H, Chen Z, Cao R, Cao Y, Zhao L, Zhao Y. Prediction of shield machine posture using the GRU algorithm with adaptive boosting: A case study of Chengdu Subway project. Transportation Geotechnics. 2022;37. doi: 10.1016/j.trgeo.2022.100837 [DOI] [Google Scholar]
  • 13.Kim D, Kwon K, Pham K, Oh J-Y, Choi H. Surface settlement prediction for urban tunneling using machine learning algorithms with Bayesian optimization. Automation in Construction. 2022;140. doi: 10.1016/j.autcon.2022.104331 [DOI] [Google Scholar]
  • 14.Alotaibi FM, Asghar MZ, Ahmad S. A hybrid CNN-LSTM model for psychopathic class detection from tweeter users. Cognitive Computation. 2021;13(3):709–23. [Google Scholar]
  • 15.Wang R, Peng C, Gao J, Gao Z, Jiang H. A dilated convolution network-based LSTM model for multi-step prediction of chaotic time series. Computational and Applied Mathematics. 2020;39(1):1–22. [Google Scholar]
  • 16.Li H, Zhao Z, Du X. Research and Application of Deformation Prediction Model for Deep Foundation Pit Based on LSTM. Wireless Communications and Mobile Computing. 2022;2022. [Google Scholar]
  • 17.Jiangu Q, Anhai W, Jun J, Long C, Wei X. Prediction for Nonlinear Time Series of Geotechnical Engineering Based on Wavelet-Optimized LSTM-ARMA Model. Journal of Tongji University (Natural Science Edition). 2021. [Google Scholar]
  • 18.Khataei Maragheh H, Gharehchopogh FS, Majidzadeh K, Sangar AB. A new hybrid based on long Short-term memory network with spotted Hyena optimization algorithm for multi-label text classification. Mathematics. 2022;10(3):488. [Google Scholar]
  • 19.Yang D, Gu C, Zhu Y, Dai B, Zhang K, Zhang Z, et al. A concrete dam deformation prediction method based on LSTM with attention mechanism. IEEE Access. 2020;8:185177–86. [Google Scholar]
  • 20.Torres ME, Colominas MA, Schlotthauer G, Flandrin P, editors. A complete ensemble empirical mode decomposition with adaptive noise. 2011 IEEE international conference on acoustics, speech and signal processing (ICASSP); 2011: IEEE. [Google Scholar]
  • 21.Colominas MA, Schlotthauer G, Torres ME, Flandrin P. Noise-Assisted Emd Methods in Action. Advances in Adaptive Data Analysis. 2013;04(04). doi: 10.1142/s1793536912500252 [DOI] [Google Scholar]
  • 22.Zhao D, Li K, Li H. A New Method for Separating EMI Signal Based on CEEMDAN and ICA. Neural Processing Letters. 2021;53(3):2243–59. doi: 10.1007/s11063-021-10432-x [DOI] [Google Scholar]
  • 23.Artoni F, Delorme A, Makeig S. Applying dimension reduction to EEG data by Principal Component Analysis reduces the quality of its subsequent Independent Component decomposition. Neuroimage. 2018;175:176–87. Epub 2018/03/13. doi: 10.1016/j.neuroimage.2018.03.016 ; PubMed Central PMCID: PMC6650744. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Barborica A, Mindruta I, Sheybani L, Spinelli L, Oane I, Pistol C, et al. Extracting seizure onset from surface EEG with independent component analysis: Insights from simultaneous scalp and intracerebral EEG. Neuroimage Clin. 2021;32:102838. Epub 2021/10/09. doi: 10.1016/j.nicl.2021.102838 ; PubMed Central PMCID: PMC8503578. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Shi T, Fukuda Y, Doi K, Okuno Ji Extraction of GRACE/GRACE-FO observed mass change patterns across Antarctica via independent component analysis (ICA). Geophysical Journal International. 2022;229(3):1914–26. doi: 10.1093/gji/ggac033 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Captier N, Merlevede J, Molkenov A, Seisenova A, Zhubanchaliyev A, Nazarov PV, et al. BIODICA: a computational environment for Independent Component Analysis of omics data. Bioinformatics. 2022;38(10):2963–4. doi: 10.1093/bioinformatics/btac204 [DOI] [PubMed] [Google Scholar]
  • 27.Gao B, Huang X, Shi J, Tai Y, Zhang J. Hourly forecasting of solar irradiance based on CEEMDAN and multi-strategy CNN-LSTM neural networks. Renewable Energy. 2020;162:1665–83. doi: 10.1016/j.renene.2020.09.141 [DOI] [Google Scholar]
  • 28.Li S, Cai M, Han M, Dai Z. Noise Reduction Based on CEEMDAN-ICA and Cross-Spectral Analysis for Leak Location in Water-supply Pipelines. IEEE Sensors Journal. 2022. [Google Scholar]
  • 29.Lin G, Lin A, Cao J. Multidimensional KNN algorithm based on EEMD and complexity measures in financial time series forecasting. Expert Systems with Applications. 2021;168. doi: 10.1016/j.eswa.2020.114443 [DOI] [Google Scholar]
  • 30.Gao Y, Hang Y, Yang M. A cooling load prediction method using improved CEEMDAN and Markov Chains correction. Journal of Building Engineering. 2021;42. doi: 10.1016/j.jobe.2021.103041 [DOI] [Google Scholar]
  • 31.Zhang Y, Chen Y. Application of hybrid model based on CEEMDAN, SVD, PSO to wind energy prediction. Environ Sci Pollut Res Int. 2022;29(15):22661–74. Epub 2021/11/20. doi: 10.1007/s11356-021-16997-3 . [DOI] [PubMed] [Google Scholar]
  • 32.Luo Z, Yan Z, Fu W. Electroencephalogram artifact filtering method of single channel EEG based on CEEMDAN-ICA. Chin J Sens Actuators. 2018;31(8):1211–6. [Google Scholar]
  • 33.Li Q, Wei L, Xu Y, Yang B. Ocular Artifact Removal Algorithm of Single Channel EEG Based on CEEMDAN-ICA-WTD. 2021 IEEE 6th International Conference on Signal and Image Processing (ICSIP) 2021. p. 451–5. [Google Scholar]
  • 34.Longxin Z, Minmin M, Wenjun H. Research on Removing Ocular Artifacts from Multi-Channel EEG signals. 2021 7th International Conference on Computer and Communications (ICCC) 2021. p. 2280–6. [Google Scholar]
  • 35.Lin X, Yang S, Liao Y. Backward scattering suppression in an underwater LiDAR signal processing based on CEEMDAN-fast ICA algorithm. Optics Express. 2022;30(13):23270–83. doi: 10.1364/OE.461007 [DOI] [PubMed] [Google Scholar]
  • 36.Liu D, Deng A, Liu Z. De-noising method for fault acoustic emission signals based on the EMD and correlation coefficient. J Vib Shock. 2017;36(19):71–7. [Google Scholar]
  • 37.Matsuoka K, editor Minimal distortion principle for blind source separation. Proceedings of the 41st SICE Annual Conference SICE 2002; 2002: IEEE. [Google Scholar]
  • 38.Xu C, Fan Q. Denoising Method for Deformation Monitoring Data Based on ICEEMD-ICA and MDP Principle. Geomatics and Information Science of Wuhan University. 2021;46(11):1658–65. CSCD:7112193. [Google Scholar]
  • 39.Felix AG, Jürgen S, Fred C. Learning to forget: Continual prediction with LSTM. Neural computation. 2000;12(10):2451–71. doi: 10.1162/089976600300015015 [DOI] [PubMed] [Google Scholar]
  • 40.Beibei Y, Kunlong Y, Juan D. A model for predicting landslide displacement based on time series and long and short term memory neural network. Chinese Journal of Rock Mechanics and Engineering. 2018;37(10):2334–43. doi: 10.13722/j.cnki.jrme.2018.0468 [DOI] [Google Scholar]
  • 41.Liu J, Liu J, Luo X. Research progress in attention mechanism in deep learning. Chinese Journal of Engineering. 2021;43(11):1499–511. CSCD:7085612. [Google Scholar]
  • 42.Yu J, Yang S, Zhao A, Gao Z. Hybrid prediction model of building energy consumption based on neural network. Journal of Zhejiang University Engineering Science. 2022;56(6):1220–31. CSCD:7244492. [Google Scholar]

Decision Letter 0

Muhammad Usman Tariq

21 Dec 2023

PONE-D-23-35669

Prediction model of land surface settlement deformation based on improved LSTM method: CEEMDAN-ICA-AM-LSTM (CIAL) prediction model

PLOS ONE

Dear Dr. Qin,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Feb 04 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Dr. Muhammad Usman Tariq

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, all author-generated code must be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

3. Note from Emily Chenette, Editor in Chief of PLOS ONE, and Iain Hrynaszkiewicz, Director of Open Research Solutions at PLOS: Did you know that depositing data in a repository is associated with up to a 25% citation advantage (https://doi.org/10.1371/journal.pone.0230416)? If you’ve not already done so, consider depositing your raw data in a repository to ensure your work is read, appreciated and cited by the largest possible audience. You’ll also earn an Accessible Data icon on your published paper if you deposit your data in any participating repository (https://plos.org/open-science/open-data/#accessible-data).

4. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections do not match. 

When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

5. Thank you for stating the following financial disclosure: 

"This work was supported by the Natural Science Foundation of Xinjiang Autonomous Region of China.[grant number 2021D01C073]" 

Please state what role the funders took in the study.  If the funders had no role, please state: ""The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."" 

If this statement is not correct you must amend it as needed. 

Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf.

6. Thank you for stating the following in the Acknowledgments Section of your manuscript: 

"We are grateful to the National Natural Science Foundation of China for support, to students Meng Xin and Yuanyang Chun for help, to Professors Qin Yongjun and Xie Liangfu for guidance, and to Dr. Hua for improving the language."

We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form. 

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows: 

"This work was supported by the Natural Science Foundation of Xinjiang Autonomous Region of China.[grant number 2021D01C073]"

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

7. When completing the data availability statement of the submission form, you indicated that you will make your data available on acceptance. We strongly recommend all authors decide on a data sharing plan before acceptance, as the process can be lengthy and hold up publication timelines. Please note that, though access restrictions are acceptable now, your entire data will need to be made freely accessible if your manuscript is accepted for publication. This policy applies to all data except where public deposition would breach compliance with the protocol approved by your research ethics board. If you are unable to adhere to our open data policy, please kindly revise your statement to explain your reasoning and we will seek the editor's input on an exemption. Please be assured that, once you have provided your new statement, the assessment of your exemption will not hold up the peer review process.

8. PLOS requires an ORCID iD for the corresponding author in Editorial Manager on papers submitted after December 6th, 2016. Please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field. This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager. Please see the following video for instructions on linking an ORCID iD to your Editorial Manager account: https://www.youtube.com/watch?v=_xcclfuvtxQ

9. Please amend either the title on the online submission form (via Edit Submission) or the title in the manuscript so that they are identical.

10. Please ensure that you refer to Figure 1, 6, 7, 8, 9, 10, 13 and 14 in your text as, if accepted, production will need this reference to link the reader to the figure.

11. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Additional Editor Comments:

==============================

ACADEMIC EDITOR:

  • Improve the structure of the manuscript

  • Recheck all equations

  • Explain evaluating indicators and the reason to use them

  • Provide more clarity on model training. How dataset was organised and what approach was followed. Provide the flowchart

  • While conclusion is provided, also provide managerial implications, recommendations, and limitations

==============================


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: 1. This study is very significant and the finding is also very impressive but there are fews errors that you need to fix:

i. Error at line 55 "Moghaddasi et al.Error! Reference source not found.[10] utilized", line 137, line 304, line 309, line 314, line 386, and line 394

ii. Avoid to use number at you conclusion such as 1, 2,3. Replace it with for example "First, second, third" and so on

Reviewer #2: In Section 4.1 Performance comparison:

There is no baseline or featureless model.

For regression tasks, a baseline model always predicts the mean of the training set ignoring any input features.

In Section 3.2 Evaluating Indicator Section:

RMSE, MAPE and the other indicators mentioned there are known widely in this field.

There is no need to explain in detail and write formulas.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Daniel Agyapong

**********

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2024 Mar 7;19(3):e0298524. doi: 10.1371/journal.pone.0298524.r002

Author response to Decision Letter 0


23 Jan 2024

Academic editor:

1. Improve the structure of the manuscript.

Thank the editor for the valuable comments on the article. We have refined the structure of the article in the style of PLoS One.

2. Recheck all equations.

Thank you for your reminding. We've checked all the equations and made sure they're correct.

3. Explain evaluating indicators and the reason to use them.

Thank the editor for the valuable comments. We have already discussed the evaluating indicator in Section 3.2 and explained their meaning with equations.

4. Provide more clarity on model training. How dataset was organized and what approach was followed. Provide the flowchart.

Thank the editor for the valuable comments. The training process and specific methods of the model have been provided in the paper, as shown in Fig 11. (Line 353)

5. While conclusion is provided, also provide managerial implications, recommendations, and limitations.

Your suggestion really means a lot to us. The managerial implications, recommendations, and limitations have been provided after the conclusion. (Line 442~446)

Reviewer #1:

1. Error at line 55 "Moghaddasi et al.Error! Reference source not found.[10] utilized", line 137, line 304, line 309, line 314, line 386, and line 394.

Thanks for your careful checks. We have reinserted the reference [10] (Line 55) to ensure it is discovered. We have fixed the errors in line 137, line 309, line 314, line 386 and line 394. We apologize for not finding an error in line 304.

2. Avoid to use number at you conclusion such as 1,2,3. Replace it with for example "First, second, third" and so on.

Thanks for your suggestions. We have replaced the number with "First, second, third" and so on. (Line 424,431,436)

Reviewer #2:

1. In Section 4.1 Performance comparison:

There is no baseline or featureless model.

For regression tasks, a baseline model always predicts the mean of the training set ignoring any input features.

Thank the reviewer for the valuable comments. The data obtained in the experiment already possesses temporal characteristics, and since our model does not contain other features, we believe that there is no need to compare it with a featureless model.

2. In Section 3.2 Evaluating Indicator Section:

RMSE, MAPE and the other indicators mentioned there are known widely in this field.

There is no need to explain in detail and write formulas.

Thank you again for your positive comments and valuable suggestions to improve the quality of our manuscript. The description of the evaluation indicators in Section 3.2 is provided to account for the specific indicators used in this paper to assess the model, hence the need for explanation.

Attachment

Submitted filename: Responses to Reviewers.doc

pone.0298524.s008.doc (83KB, doc)

Decision Letter 1

Muhammad Usman Tariq

26 Jan 2024

Prediction model of land surface settlement deformation based on improved LSTM method: CEEMDAN-ICA-AM-LSTM (CIAL) prediction model

PONE-D-23-35669R1

Dear Dr. Qin,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Dr. Muhammad Usman Tariq

Academic Editor

PLOS ONE

Acceptance letter

Muhammad Usman Tariq

27 Feb 2024

PONE-D-23-35669R1

PLOS ONE

Dear Dr. Qin,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Muhammad Usman Tariq

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Dataset. Original data.

    (XLSX)

    pone.0298524.s001.xlsx (54.4KB, xlsx)
    S2 Dataset. Predicted results.

    (XLSX)

    pone.0298524.s002.xlsx (18.9KB, xlsx)
    S1 Code. The code for CEEMDAN.

    (PY)

    S2 Code. The code for FastICA.

    (PY)

    pone.0298524.s004.py (4.5KB, py)
    S3 Code. The code for AL.

    (PY)

    pone.0298524.s005.py (5.2KB, py)
    S4 Code. The code for LSTM.

    (PY)

    pone.0298524.s006.py (4.9KB, py)
    S5 Code. The code for evaluating.

    (PY)

    pone.0298524.s007.py (5.2KB, py)

    Data Availability Statement

    All relevant data are within the manuscript and its Supporting Information files.

