Abstract
To address the demand for high-precision deformation monitoring during mine exploitation, this paper proposes a high-precision mine deformation prediction method based on stacking ensemble learning, which leverages Global Navigation Satellite System Real-Time Kinematic (GNSS RTK) data. First, a fusion filtering preprocessing module integrating median filtering, Butterworth filtering, Savitzky-Golay filtering, and the Adaptive Kalman Filter (AKF) is established to suppress various types of noise in the original data. Second, a cumulative deformation time series is constructed and decomposed into trend, seasonal, and residual components. Concurrently, the deformation rate and acceleration are calculated, with all these variables jointly serving as input features for the deformation prediction model. Finally, a stacking ensemble module is developed by integrating three time series models and four machine learning models, where Elastic Net Regression (ENR) is employed as the meta-model to realize dynamic weight optimization, thereby enabling high-precision prediction of cumulative deformation. Experimental results demonstrate that the fusion filtering preprocessing significantly improves the quality of the original data. Additionally, under different prediction time scales of the measured dataset, the Root Mean Squared Error (RMSE) of predictions generated by the stacking ensemble module can be maintained below 0.3 mm, and the module exhibits excellent trend consistency and response capability. In summary, the proposed method provides efficient and reliable technical support for the active prevention and control of mine deformation disasters.
Keywords: Mine safety, GNSS RTK, Fused filtering, Deformation prediction, Stacking ensemble learning
Subject terms: Engineering, Mathematics and computing
Introduction
Mine safety production constitutes a critical component of the stable development of the national economy. In recent years, mining-induced disasters such as landslides, collapses, cracks, and rockbursts have occurred frequently, posing a severe threat to the life and property safety of personnel in mining areas as well as the stability of the ecological environment1–4. Therefore, high-precision prediction of mine deformation has become a key measure for preventing the occurrence of such disasters. Currently, in the field of mine deformation prediction, the Global Navigation Satellite System (GNSS) has established itself as one of the core monitoring technologies, owing to its advantages of all-weather operation, high precision, and strong real-time performance5,6. Among these technologies, GNSS monitoring stations based on Real-Time Kinematic (RTK) technology can provide centimeter-level 3D position information, laying a robust data foundation for capturing minor mine deformations7. In addition, the GNSS time series data from each monitoring station contain abundant information regarding deformation evolution trends, which can reveal the dynamic process of deformation, predict future development trends, and identify potential risks.
However, the utilization of Global Navigation Satellite System Real-Time Kinematic (GNSS RTK) data for high-precision mine deformation prediction still confronts numerous challenges8–10. For instance, influenced by multipath effects, atmospheric delays, satellite orbit and clock errors, as well as external environmental interference, the original observation data contain significant levels of noise that severely degrades the extraction accuracy of deformation signals. Additionally, the mine deformation process is comprehensively governed by factors including geological structures, mining activities, and hydrological conditions, typically exhibiting nonlinear and non-stationary characteristics. Further, the accurate capture of such complex evolutionary trends is a critical requirement in practical prediction workflows. While existing studies have also addressed these two core challenges, they still encounter notable limitations in practical applications. With respect to mixed noise suppression, the complex mine environment features diverse noise sources with varying characteristics, meaning that commonly used noise suppression algorithms generally fail to deliver optimal denoising performance for such mixed, highly time-varying complex noise. For deformation prediction, traditional prediction methods struggle to simultaneously capture complex mapping relationships within deformation sequences, including long-term trends and periodic variations. These methods also lack sufficient generalization ability when confronted with data offsets, noise, or unknown states, and demonstrate significant error accumulation particularly in medium-to-long-term prediction scenarios.
To address the aforementioned issues, this paper proposes a high-precision mine deformation prediction method based on stacking ensemble learning, and its core contributions are outlined as follows:
Established a fusion filtering preprocessing module. This module innovatively integrates median filtering, Butterworth filtering, Savitzky-Golay filtering, and the Adaptive Kalman Filter (AKF) to form a cascaded processing workflow. The workflow effectively suppresses various types of noise in the original GNSS RTK data, significantly improves data quality, and lays a robust data foundation for subsequent deformation prediction.
Proposed a stacking ensemble deformation prediction module. Based on the cumulative deformation time series, trend, seasonal, and residual components are decomposed. Concurrently, the deformation rate and acceleration are calculated, all of which jointly serve as input features for the deformation prediction model. A base model pool is constructed, comprising three time series models: Autoregressive Integrated Moving Average (ARIMA), Seasonal ARIMA (SARIMA), and Automatic ARIMA (AutoARIMA); alongside four machine learning models: Support Vector Regression (SVR), Random Forest (RF), Multilayer Perceptron (MLP), and eXtreme Gradient Boosting (XGBoost). A stacking ensemble strategy is adopted, with Elastic Net Regression (ENR) introduced as the meta-model to dynamically optimize the combination weights of each base model, thereby achieving high-precision prediction of cumulative deformation.
The structure of this paper is organized as follows: Sect. 2 reviews related work; Sect. 3 elaborates on the proposed method and workflow; Sect. 4 presents the experimental data, designs relevant experiments, and analyzes the experimental results; Sect. 5 summarizes the advantages and limitations of this study and provides an outlook on future research.
Related works
Presently, the field of GNSS-based mine deformation prediction confronts two core challenges: mixed noise suppression and complex deformation prediction. This section provides a systematic review of relevant research progress from the perspectives of technical advantages and application limitations.
Mixed noise suppression
GNSS observation data are inevitably affected by various noise interferences during acquisition. Such noise can mask true deformation signals, rendering noise suppression the primary step in data processing. Presently, commonly used noise suppression methods primarily fall into two categories: those based on statistical tests and those based on time series analysis.
For methods based on statistical tests, Schmid et al.11 employed the Cumulative SUM (CUSUM) control chart to detect abrupt change points in sequences and investigated the distribution characteristics of the Average Run Length (ARL). Yu et al.12 compared the performance of the Exponential Weighted Moving Average (EWMA) control chart with that of the CUSUM control chart via Monte Carlo simulation, concluding that the EWMA control chart exhibits higher efficiency. Wang et al.13 proposed a GPS cycle slip detection method based on the CUSUM control chart. This method significantly improves the detection sensitivity for small cycle slips by amplifying offset effects through cycle slip accumulation. Yin et al.14 applied the control chart theory in Statistical Process Control (SPC) to the anomaly detection of GNSS data sequences, further enhancing the suppression effect on abrupt noise. Methods of this type exhibit remarkable performance in the detection of abrupt noise. However, they lack sensitivity to continuous small-offset fluctuations and generally require the pre-setting of parameter thresholds, which introduces a certain degree of subjectivity in practical applications.
For methods based on time series analysis, they can be subdivided into three categories according to specific technical approaches: wavelet decomposition methods, principal component analysis methods, and moving average methods. Duan et al.15 investigated the issues of boundary oscillation and frequency aliasing in wavelet transform, proposing a noise signal processing method that combines the Volterra series model with the anti-aliasing lifting wavelet packet. Zhang et al.16 proposed an Ensemble Empirical Mode Decomposition (EEMD)-based noise identification and extraction method for GNSS data, which prevents the misclassification of physically meaningful periodic components as noise. Li et al.17 developed a Principal Component Analysis (PCA) method that accounts for formal errors to construct weight factors for GNSS time series processing, extracting common mode errors and reducing the average error of residual time series. Verbesselt et al.18 proposed the Breaks for Additive Season and Trend (BFAST) algorithm, which effectively distinguishes between trend components, seasonal components, and residual components. Burrell et al.19 combined the BFAST algorithm with residual trend analysis, proposing a time series segmentation and residual trend analysis method to address the problem of missing critical deformation information. Methods of this type demonstrate significant advantages in noise suppression and effective signal extraction for GNSS time series. However, they possess inherent drawbacks such as sensitivity to parameter settings, insufficient real-time performance, and poor dynamic adaptability. Furthermore, in scenarios where slow deformation and sudden deformation coexist, these methods generally yield suboptimal noise suppression results.
Complex deformation prediction
In recent years, scholars worldwide have carried out extensive research on the application of GNSS in deformation prediction fields such as mines, dams, and the earth’s surface, leading to two major research directions primarily dominated by physical models and data-driven models.
For physical models, relevant research efforts primarily focus on the intrinsic mechanisms of geological hazards and reveal deformation laws through the construction of mechanical or dynamic models. Saito et al.20 were the first to propose a physical model-based slope displacement prediction method, which clarifies the relationship between landslide failure time and deformation rate. Cina et al.21 utilized low-cost GNSS receivers and verified the applicability of RTK technology in landslide monitoring via a landslide simulation device. The results demonstrate that the monitoring accuracy of this approach can reach the millimeter level. Dietrich et al.22 developed a prediction model that integrates physical mechanisms and sliding surface depth, and they systematically analyzed the effects of topographic, geological, and hydrological characteristics on landslide occurrence. Bathrellos et al.23 combined physical models with statistical methods to identify four distinct deformation stages, and they emphasized the regulatory effect of groundwater dynamic processes on deformation magnitude. Methods of this type can reflect the physical essence of deformation processes. However, their modeling processes are highly dependent on accurate knowledge of geological structures and mechanical parameters. In complex scenarios involving the coupling of multiple factors such as rainfall, earthquakes, and human activities, the corresponding prediction accuracy is subject to significant limitations.
As practical application conditions grow increasingly complex, data-driven models based on statistical theory or machine learning have gradually supplanted the aforementioned physical models. These models do not require explicit mechanistic assumptions and can directly mine implicit patterns embedded in data to perform prediction tasks. Wang et al.24 constructed a temporal prediction framework for land subsidence based on ARIMA, which effectively captured the linear trends and short-term fluctuations in the subsidence process. Liu et al.25 proposed an Equilibrium Optimization Algorithm-SVR (EOA-SVR) model for the reliable prediction of reservoir landslide displacement. Xu et al.26 developed a landslide displacement prediction model optimized by the Genetic Algorithm (GA), which integrates XGBoost, SVR, and the Recurrent Neural Network (RNN). Huang et al.27 integrated the Discrete Wavelet Transform (DWT) with the Extreme Learning Machine (ELM) to construct a hybrid landslide displacement prediction model. Wang et al.28 comparatively evaluated the applicability of multiple machine learning methods in deformation prediction. The aforementioned studies have advanced the application depth of data-driven models in deformation prediction through strategies such as algorithm optimization, method integration, and efficiency improvement. However, most of these studies focus on single-model improvement or mechanical integration modes, leaving considerable room for enhancement in dynamic adaptation to complex scenarios and in fully exploiting the collaborative potential of multiple models.
Methodology
The workflow of the proposed high-precision mine deformation prediction method based on stacking ensemble learning is illustrated in Fig. 1. The overall framework comprises two core components: fusion filtering preprocessing and stacking ensemble deformation prediction, corresponding to Sect. 3.1 and Sect. 3.2 respectively. Additionally, based on the deformation prediction results generated by the proposed method, further early warning is implemented for abnormal deformation signals, as detailed in Sect. 3.3.
Fig. 1.
Flowchart of the high-precision mine deformation prediction method based on stacking ensemble learning and abnormal early warning.
Fusion filtering preprocessing
GNSS RTK raw data are subject to factors including multipath effects, equipment noise, and electromagnetic interference, which gives rise to random errors and abnormal fluctuations. To accurately extract true deformation signals, this section designs a two-stage filtering architecture comprising multi-stage filtering processing and AKF optimization.
Multi-stage filtering processing
The integration of median filtering29, Butterworth filtering30, and Savitzky-Golay filtering31 prioritizes noise suppression and the preservation of true deformation features, thereby providing high-quality data for subsequent processing procedures.
First, median filtering is employed to eliminate impulse noise in the data. To facilitate subsequent processing, the RTK observation data of the survey station are converted into $X$, $Y$, and $Z$ Gaussian plane rectangular coordinate sequences. A dynamic sliding filtering window is established, and its size is adaptively adjusted based on the sampling frequency. By taking the median of the data within the window as the output, this approach effectively identifies impulse noise induced by instantaneous signal loss of lock and electromagnetic pulse interference. Since this process is only sensitive to extreme outliers, it prevents over-smoothing of the true deformation trend while eliminating noise.
Subsequently, Butterworth filtering is employed to extract low-frequency deformation trends. Taking advantage of its maximally flat passband characteristic inherent to analog filter design, cutoff frequencies are assigned separately for statically stable zones and dynamically active zones within the mining area. For static monitoring regions such as long-term stable industrial plazas and unmined ore body zones in the mining area, the cutoff frequency is set to 0.01 Hz. This setting enables the passage of slowly accumulated low-frequency signals while effectively attenuating high-frequency fluctuations such as equipment circuit noise and environmental micro-vibration interference, thereby aligning the data trend more closely with the motion state assumed by the Kalman filter state equation. For dynamic monitoring areas including slopes and stope blasting-affected zones, the cutoff frequency is increased to 0.1 Hz to accommodate mid-low frequency abrupt deformations triggered by factors such as rainfall and blasting. This configuration suppresses high-frequency noise while preserving valuable mid-low frequency deformation information, ensuring that subsequent processing steps can accurately capture trend variations in dynamic deformations.
Lastly, Savitzky-Golay filtering is employed to preserve the local features of deformation. Building on the least squares method, a local polynomial fitting model is established to smooth differential data. This model preserves the high-order moment properties of the signal while accurately retaining the local features of deformation. This process primarily performs fine-grained optimization on the data that has undergone median filtering and Butterworth filtering, allowing the deformation data to exhibit both excellent smoothness and accurate reflection of the true deformation state. Thereby, it supplies high-quality data that more closely aligns with the actual deformation for subsequent Kalman filtering processing.
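The cascade described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `multi_stage_filter`, the filter order, the window lengths, the 1 Hz sampling rate, and the synthetic test signal are all assumptions chosen for illustration, and only the 0.01 Hz cutoff for a static zone is taken from the text.

```python
import numpy as np
from scipy.signal import butter, medfilt, savgol_filter, sosfiltfilt


def multi_stage_filter(series, fs, cutoff_hz, median_window=5,
                       sg_window=11, sg_order=3):
    """Median -> Butterworth low-pass -> Savitzky-Golay, applied in cascade."""
    x = np.asarray(series, dtype=float)
    # 1) Median filter removes isolated impulse outliers (window must be odd).
    x = medfilt(x, kernel_size=median_window)
    # 2) Zero-phase Butterworth low-pass keeps the slow deformation trend.
    sos = butter(N=4, Wn=cutoff_hz, btype="low", fs=fs, output="sos")
    x = sosfiltfilt(sos, x)
    # 3) Savitzky-Golay local polynomial fit smooths while keeping local shape.
    return savgol_filter(x, window_length=sg_window, polyorder=sg_order)


# Synthetic single-axis example (assumed values, in metres).
rng = np.random.default_rng(0)
fs = 1.0                                    # assumed 1 Hz sampling
t = np.arange(2000) / fs
truth = 0.002 * np.sin(2 * np.pi * t / 1000.0)
raw = truth + rng.normal(0.0, 0.003, t.size)
raw[::200] += 0.05                          # injected impulse outliers

filtered = multi_stage_filter(raw, fs=fs, cutoff_hz=0.01)  # static-zone cutoff
rmse_raw = float(np.sqrt(np.mean((raw - truth) ** 2)))
rmse_filtered = float(np.sqrt(np.mean((filtered - truth) ** 2)))
```

For a dynamic monitoring area, the same function would be called with `cutoff_hz=0.1`. The zero-phase `sosfiltfilt` call is a design choice of this sketch that avoids phase lag in offline processing; a real-time pipeline would need a causal filter instead.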
Adaptive Kalman filter processing
Building upon the data processed via the aforementioned multi-stage filtering, precise adaptation to both static and dynamic deformation scenarios is realized through dynamic adjustment of AKF parameters32,33, ultimately outputting real deformation information that is both stable and timely. For the case where deformation is reflected using $X$, $Y$, and $Z$ coordinates, position differencing is employed to estimate the motion state. Combined with the adaptive adjustment strategy for noise parameters, an adaptive filtering model suitable for mine deformation prediction is constructed.
Core model construction
Firstly, state variables are established based on the coordinates of the survey station at epoch $k$; meanwhile, the velocity is indirectly derived via the position difference at epoch $k-1$. The state equation, constructed on the basis of the motion model, is presented as follows:
![]() |
1 |
where the noise term is the process noise vector, which characterizes the deviation between the model and the actual motion; its covariance matrix $\mathbf{Q}$ is the core object of adaptive adjustment.
Subsequently, the coordinates processed via multi-stage filtering are adopted as the observed values of $X$, $Y$, and $Z$, and the observation equation is constructed as follows:
![]() |
2 |
where the noise term is the observation noise vector.
Adaptive mechanism design
The process noise covariance $\mathbf{Q}$ is adjusted dynamically according to the deformation rate. The average deformation rate $\bar{v}$ over 6 consecutive epochs is calculated as follows:
![]() |
3 |
where $\Delta t$ denotes the time interval. When the average deformation rate $\bar{v}$ is below the state threshold, the current survey station is determined to be in a relatively stable state; when $\bar{v}$ exceeds it, the current survey station is determined to be in a relatively active state. Regarding the threshold setting, relevant studies indicate that the critical threshold for distinguishing between stable and active deformation states in mining areas generally ranges from 0.3 to 0.8 mm/day34,35. Additionally, statistical analysis of long-term measured data in the study area shows that the median daily average deformation is 0.28 mm/day and the 90th percentile is 0.53 mm/day. Therefore, 0.5 mm/day is adopted as the threshold for determining the survey station state in this paper. Furthermore, preliminary experimental results show that when 0.3 mm/day and 0.7 mm/day are used as the threshold, the Standard Deviations (SD) of the three-axis data after filtering are 0.029652 m, 0.005086 m, 0.044895 m and 0.029576 m, 0.005173 m, 0.045021 m respectively; in both cases the filtering effect is inferior to that obtained with the current threshold. In practice, the threshold should therefore be selected within the range of 0.3 to 0.8 mm/day according to the actual on-site deformation conditions.
For the relatively stable state, $\mathbf{Q}$ is decreased to strengthen the filter's dependence on historical states and suppress minor noise; for the relatively active state, $\mathbf{Q}$ is increased to raise the weight of new observations and accelerate the response speed to abrupt deformations.
Execution of iterative operations
Based on the construction of the core model and the design of the adaptive mechanism, data optimization is achieved through the cyclic iteration of prediction, update, and adaptive adjustment. State values and covariance are initialized separately for each survey station: the $X$, $Y$, and $Z$ state values adopt the coordinates of the first epoch after multi-stage filtering, and the covariance matrix is assigned its initial value.
Based on the state and velocity estimation of epoch $k-1$, the current position is predicted as follows:
![]() |
4 |
Prediction covariance update is performed as follows:
![]() |
5 |
Subsequently, the filter gain is calculated as follows:
![]() |
6 |
On this basis, the observed values are fused to correct the predicted results, as follows:
![]() |
7 |
Update the covariance matrix as follows:
![]() |
8 |
Within this process, the average deformation rate $\bar{v}$ is calculated and $\mathbf{Q}$ is adaptively updated every 100 epochs; additionally, a smooth filtering strategy is employed in the transition phase to avoid data fluctuations induced by abrupt parameter changes.
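The predict, update, and adaptive-adjustment cycle can be sketched for a single coordinate axis as follows. This is a hedged illustration rather than the paper's filter: it assumes a standard constant-velocity state (position and velocity carried in the state vector) instead of the position-differencing formulation described above, and the function name `adaptive_kalman_1d`, the noise levels, the rate threshold, the `blend` smoothing factor, and the synthetic data are all hypothetical. Only the 6-epoch rate window and the 100-epoch update interval come from the text.

```python
import numpy as np


def adaptive_kalman_1d(obs, dt, rate_threshold, q_stable, q_active, r,
                       window=6, update_every=100, blend=0.1):
    """Constant-velocity Kalman filter whose process noise follows the rate."""
    obs = np.asarray(obs, dtype=float)
    F = np.array([[1.0, dt], [0.0, 1.0]])          # state transition
    H = np.array([[1.0, 0.0]])                     # only position is observed
    # Process-noise shape for a white-acceleration model; q scales it.
    G = np.array([[dt**4 / 4, dt**3 / 2], [dt**3 / 2, dt**2]])
    x = np.array([obs[0], 0.0])                    # start at first filtered epoch
    P = np.eye(2)                                  # assumed initial covariance
    q_now = q_target = q_stable
    est = np.empty_like(obs)
    est[0] = obs[0]
    for k in range(1, obs.size):
        if k >= window and k % update_every == 0:
            # Average rate over the last `window` epochs (window - 1 intervals).
            mean_rate = abs(est[k - 1] - est[k - window]) / ((window - 1) * dt)
            q_target = q_active if mean_rate > rate_threshold else q_stable
        # Move Q gradually towards its target to avoid abrupt parameter jumps.
        q_now += blend * (q_target - q_now)
        # Predict.
        x = F @ x
        P = F @ P @ F.T + q_now * G
        # Update: the gain weighs the prediction against the new observation.
        S = (H @ P @ H.T)[0, 0] + r
        K = (P @ H.T) / S
        innovation = obs[k] - (H @ x)[0]
        x = x + K[:, 0] * innovation
        P = (np.eye(2) - K @ H) @ P
        est[k] = x[0]
    return est


# Synthetic axis: stable for 500 epochs, then creeping (assumed values, metres).
rng = np.random.default_rng(1)
truth = np.concatenate([np.zeros(500), 0.0005 * np.arange(500)])
obs = truth + rng.normal(0.0, 0.002, truth.size)
est = adaptive_kalman_1d(obs, dt=1.0, rate_threshold=2e-4,
                         q_stable=1e-10, q_active=1e-6, r=0.002**2)
rmse_obs = float(np.sqrt(np.mean((obs - truth) ** 2)))
rmse_est = float(np.sqrt(np.mean((est - truth) ** 2)))
```

A smaller process noise makes the filter lean on its own history, while a larger one lets new observations dominate, which mirrors the stable and active regimes described above.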
Stacking ensemble deformation prediction
In this section, based on the coordinate sequences obtained from fusion filtering preprocessing, the cumulative deformation sequences relative to the initial epoch are computed, followed by the decomposition of trend components, seasonal components, and residual components. Concurrently, deformation rates and accelerations are derived to train an appropriate cumulative deformation prediction model, thereby furnishing technical support for the proactive prevention and control of deformation-related disasters.
Calculation of cumulative deformation
Based on high-precision coordinates, the cumulative deformation sequences for each survey station are established separately for the three single dimensions $X$, $Y$, and $Z$, as well as the overall 3D dimension, thereby providing the most fundamental data support for the deformation prediction model.
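Taking a single survey station as an example, the displacements $\Delta X_t$, $\Delta Y_t$, and $\Delta Z_t$ of epoch $t$ relative to epoch $t-1$ are calculated as follows:
$$\Delta X_t = X_t - X_{t-1}, \quad \Delta Y_t = Y_t - Y_{t-1}, \quad \Delta Z_t = Z_t - Z_{t-1} \tag{9}$$
Based on the calculation results of $\Delta X_t$, $\Delta Y_t$, and $\Delta Z_t$, the cumulative deformation variables $D_{X,t}$, $D_{Y,t}$, and $D_{Z,t}$ of epoch $t$ relative to the initial epoch are calculated as follows:
$$D_{X,t} = \sum_{i=1}^{t} \Delta X_i, \quad D_{Y,t} = \sum_{i=1}^{t} \Delta Y_i, \quad D_{Z,t} = \sum_{i=1}^{t} \Delta Z_i \tag{10}$$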
Similarly, the cumulative deformation at each epoch is calculated, and the time series of relative cumulative deformation in each dimension for the current survey station is established, which are denoted sequentially as $D_X$, $D_Y$, and $D_Z$.
Based on the Euclidean Distance (ED), the three-dimensional overall relative displacement $\Delta S_t$ of the current survey station at epoch $t$ relative to epoch $t-1$ is calculated as follows:
$$\Delta S_t = \sqrt{\Delta X_t^2 + \Delta Y_t^2 + \Delta Z_t^2} \tag{11}$$
where $\Delta X_t$, $\Delta Y_t$, and $\Delta Z_t$ are the single-axis displacements between the two epochs. Similarly, the 3D overall cumulative deformation of each epoch of the current survey station relative to the initial epoch is calculated, and the corresponding time series is constructed, which is denoted as $D_{3D}$.
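As a minimal numerical illustration of the steps above, the following Python sketch builds the per-axis and 3D cumulative deformation series from a coordinate array. The function name `cumulative_deformation` and the toy coordinates are hypothetical. The sketch also assumes that the 3D cumulative deformation is the running sum of epoch-to-epoch Euclidean displacements; taking the straight-line distance from the initial epoch would be an equally plausible reading of the text.

```python
import numpy as np


def cumulative_deformation(coords):
    """coords: (n, 3) filtered X, Y, Z per epoch.

    Returns the per-axis cumulative deformation, shape (n, 3), and the
    3D cumulative deformation, shape (n,), both relative to the first epoch.
    """
    coords = np.asarray(coords, dtype=float)
    steps = np.diff(coords, axis=0)              # epoch-to-epoch displacement
    per_axis = np.vstack([np.zeros(3), np.cumsum(steps, axis=0)])
    step_3d = np.linalg.norm(steps, axis=1)      # Euclidean distance per step
    total_3d = np.concatenate([[0.0], np.cumsum(step_3d)])
    return per_axis, total_3d


# Toy coordinates in arbitrary units.
coords = np.array([[0.0, 0.0, 0.0],
                   [3.0, 4.0, 0.0],
                   [3.0, 4.0, 12.0]])
per_axis, total_3d = cumulative_deformation(coords)
# per_axis[-1] is [3, 4, 12]; total_3d is [0, 5, 17]
```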
Analysis of deformation components and extraction of kinematic characteristics
Mine deformation processes are typically driven by the combined action of trend, periodic, and random factors. Accordingly, decomposing the cumulative deformation sequence into a trend component, a seasonal component, and a residual component clarifies the physical significance of each component and facilitates the separation of abnormal signals36. In this paper, the trend component is extracted using the Locally Weighted Scatterplot Smoothing (LOWESS) algorithm37. The optimal window width is determined through cross-validation, and a polynomial is fitted to each local data subset by weighted least squares, which adaptively captures the long-term gradual variation of deformation, filters out short-term fluctuations, and reflects the macro-scale deformation trend. For the seasonal component, a combination of Fast Fourier Transform (FFT)38 and Ensemble Empirical Mode Decomposition (EEMD)39 is adopted. First, FFT is used for frequency-domain analysis of the cumulative deformation sequence to identify characteristic frequencies dominated by environmental factors and mining operation rules, thereby determining potential periodic components. Then, the sequence is decomposed into multiple Intrinsic Mode Functions (IMFs) based on EEMD. IMFs with significant periodicity are screened against the characteristic frequencies and superimposed to obtain the seasonal component. This process overcomes the difficulty that the Fourier transform alone has in handling nonlinear periodic signals. The residual component is the part of the cumulative deformation sequence that remains after subtracting the trend component and the seasonal component, and usually corresponds to irregular random deformation signals. The Ljung-Box test is applied to this component to verify its randomness, ensuring that the residual contains only unpredictable components such as measurement noise and instantaneous interference.
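The FFT step of the seasonal analysis can be sketched as below. This covers only the identification of a dominant period; the LOWESS trend extraction, the EEMD decomposition, and the Ljung-Box test are not reproduced. The function name `dominant_period`, the linear detrending used in place of LOWESS, and the synthetic hourly series with a 24-epoch cycle are assumptions for illustration.

```python
import numpy as np


def dominant_period(series, dt):
    """Return the period of the strongest non-zero frequency in the series."""
    x = np.asarray(series, dtype=float)
    # Remove a straight-line trend so it does not swamp the low-frequency bins.
    idx = np.arange(x.size)
    x = x - np.polyval(np.polyfit(idx, x, 1), idx)
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=dt)
    peak = 1 + np.argmax(spectrum[1:])           # skip the zero-frequency bin
    return 1.0 / freqs[peak]


# Synthetic series: linear drift plus a cycle of 24 epochs (assumed values).
t = np.arange(720.0)
series = 0.01 * t + 2.0 * np.sin(2 * np.pi * t / 24.0)
period = dominant_period(series, dt=1.0)
```

The period found this way would then serve as the characteristic frequency against which the IMFs are screened.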
For the extraction of kinematic characteristics, this study primarily computes the deformation rate and deformation acceleration to objectively characterize the intensity and trend of deformation changes at the survey station. Based on the data acquisition time interval and the cumulative deformation sequence $D$, the first-order central difference method is employed to compute the deformation rate. Taking epoch $t$ of a single survey station as an example, the corresponding cumulative deformation is denoted as $D_t$, the cumulative deformation at its previous epoch as $D_{t-1}$, and the cumulative deformation at its next epoch as $D_{t+1}$; the deformation rate at epoch $t$, denoted as $v_t$, is calculated as follows:
$$v_t = \frac{D_{t+1} - D_{t-1}}{2\Delta t} \tag{12}$$
where $\Delta t$ denotes the time interval. A positive value of $v_t$ indicates that the deformation trend intensifies over time, while a negative value indicates that the deformation trend weakens over time.
Since the first epoch $t = 1$ and the last epoch $t = n$ in the sequence lack complete cumulative deformation information for the preceding and subsequent epochs, respectively, they are calculated using the forward difference and backward difference methods as follows:
$$v_1 = \frac{D_2 - D_1}{\Delta t}, \quad v_n = \frac{D_n - D_{n-1}}{\Delta t} \tag{13}$$
Furthermore, three-point moving average processing is performed on the calculation results to eliminate instantaneous fluctuations induced by sampling noise, while retaining the authentic variation trend of the deformation rate. Based on the above procedure, the deformation rate at each epoch of the survey station is computed, thereby constructing the deformation rate sequence $V$.
Similarly, taking epoch $t$ of the survey station as an example, the second-order central difference method is employed to compute the deformation acceleration $a_t$ from the cumulative deformation $D_t$ and the time interval $\Delta t$, as follows:
$$a_t = \frac{D_{t+1} - 2D_t + D_{t-1}}{\Delta t^2} \tag{14}$$
where a positive value of $a_t$ indicates that the degree of deformation intensifies over time, and a negative value indicates that the degree of deformation weakens over time.
Similarly, for the first epoch $t = 1$ and the last epoch $t = n$, calculations are performed using the forward difference and backward difference methods, respectively, as follows:
$$a_1 = \frac{D_3 - 2D_2 + D_1}{\Delta t^2}, \quad a_n = \frac{D_n - 2D_{n-1} + D_{n-2}}{\Delta t^2} \tag{15}$$
Based on the above process, the deformation acceleration at each epoch of the survey station is calculated, and the deformation acceleration sequence $A$ is constructed.
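The difference schemes described in this subsection can be checked numerically with a short Python sketch. The function names are hypothetical, and the one-sided formulas used at the first and last epochs (first-order for the rate, three-point for the acceleration) are an assumption about how the forward and backward differences are formed. The test series is a quadratic, for which the central differences are exact.

```python
import numpy as np


def deformation_rate(d, dt):
    """Central differences inside, one-sided differences at both ends."""
    return np.gradient(np.asarray(d, dtype=float), dt)


def smooth3(v):
    """Three-point moving average; the two end values are left unchanged."""
    v = np.asarray(v, dtype=float)
    out = v.copy()
    out[1:-1] = (v[:-2] + v[1:-1] + v[2:]) / 3.0
    return out


def deformation_acceleration(d, dt):
    """Second-order central difference with one-sided forms at the ends."""
    d = np.asarray(d, dtype=float)
    a = np.empty_like(d)
    a[1:-1] = (d[2:] - 2.0 * d[1:-1] + d[:-2]) / dt**2
    a[0] = (d[2] - 2.0 * d[1] + d[0]) / dt**2        # forward form
    a[-1] = (d[-1] - 2.0 * d[-2] + d[-3]) / dt**2    # backward form
    return a


dt = 0.5
t = np.arange(10) * dt
d = 3.0 * t**2 + 2.0 * t            # true rate 6t + 2, true acceleration 6
rate = deformation_rate(d, dt)
rate_smoothed = smooth3(rate)
accel = deformation_acceleration(d, dt)
```

`np.gradient` is used here because, with its default settings, it applies the central difference at interior points and first-order one-sided differences at the two boundaries.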
Construction of deformation prediction model
Although both the deformation rate and deformation acceleration are derived from cumulative deformation, these three variables characterize distinct deformation features from the perspectives of static accumulation, dynamic trend, and evolutionary dynamics, thereby endowing them with notable synergistic research value. Accordingly, this study integrates both time series models and machine learning models to develop a stacking ensemble framework of base models, with the aim of realizing full-coverage prediction spanning from linear trends to nonlinear abrupt variations.
Construction of time series base model pool
Based on the autocorrelation and seasonality of time series, the traditional ARIMA model40, along with two enhanced models, namely SARIMA41 and AutoARIMA42, are selected as prediction models, which are primarily applicable to the basic prediction and mechanism analysis of deformation trends.
(1) The ARIMA model takes the cumulative deformation sequence as its core input and is suitable for scenarios such as uniform settlement and stable creep in mines. Non-stationarity is addressed through differencing, and by combining analyses of the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF), the orders of Autoregressive (AR) and Moving Average (MA) terms are determined, thereby capturing linear trends. Meanwhile, the deformation rate sequence is added as an auxiliary feature, and its derivative relationship with cumulative deformation is leveraged to improve prediction accuracy during stable phases of deformation rates and assist in analyzing the mechanism of linear deformation.
(2) The SARIMA model is tailored for deformations driven by periodic factors and integrates seasonal differencing within the ARIMA framework. Drawing on the separation results of the trend components, seasonal components, and residual components, it focuses on capturing the periodic peaks of deformation rates and leverages the derivative relationship between deformation rate and cumulative deformation to deduce the laws governing periodic deformation. It is capable of accommodating the coupling between mining-induced cycles and environmental factors, thereby providing basic predictions for the prevention and control of seasonal risks.
(3) The AutoARIMA model takes the cumulative deformation sequence and the deformation rate sequence as its inputs, automatically optimizes the differencing order, AR order, and MA order via grid search, and screens parameters by integrating the Akaike Information Criterion (AIC). It utilizes the Augmented Dickey-Fuller (ADF) test to address non-stationarity, and leverages the sensitivity of the deformation rate to the derivative of cumulative deformation to improve the model's efficiency in identifying trend turning points, while reducing the cost of random parameter tuning.
Construction of machine learning base model pool
Given that cumulative deformation, deformation rate, and deformation acceleration exhibit a nonlinear numerical relationship among themselves, the cumulative deformation at the current epoch is defined as the prediction target. The trend components, seasonal components, and residual components derived from the decomposition of cumulative deformation, as well as the deformation rate and acceleration, from the 30 preceding epochs are employed as input features to train the SVR model43, RF model44, MLP model45, and XGBoost model46.
(1) The SVR model takes the components of
and
as its core inputs, incorporates deformation acceleration to quantify the variation trend of the deformation rate, and utilizes the Radial Basis Function (RBF) for high-dimensional space mapping. By constructing an optimal hyperplane, it accurately fits the coupling pattern of the nonlinear accumulation of deformation, the increase in deformation rate, and the abrupt increase in deformation acceleration.
(2) The RF model takes the components of cumulative deformation, the deformation rate, and the deformation acceleration as its input features, utilizes the fitting capability of decision trees for nonlinear numerical correlations to capture the characteristics of stages with nonlinear variation in cumulative deformation, and quantifies the contributions of the three variables in different stages based on feature importance evaluation, thereby providing an interpretable basis for deformation prediction.
(3) The MLP model also takes the components of cumulative deformation, the deformation rate, and the deformation acceleration as its inputs. Via a fully connected network, it learns the nonlinear variation characteristics of cumulative deformation and the complex patterns of coupled responses between deformation rate and deformation acceleration. It focuses on capturing pre-catastrophe signals such as cumulative deformation exceeding critical thresholds, abrupt increases in deformation rate, and steep rises in deformation acceleration, fits deformations with large-scale or complex patterns, and enhances the accuracy of regional trend prediction.
(4) The XGBoost model, with the same feature inputs, enhances its sensitivity to nonlinear abrupt changes in cumulative deformation. Built on gradient boosting trees, it independently learns the interactive relationships between features through regularization constraints and parallel computing optimization, while maintaining favorable processing efficiency.
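The input construction described at the start of this subsection can be sketched as follows. The sketch assumes an additive decomposition (cumulative deformation equal to the sum of its trend, seasonal, and residual components) and backward finite differences for the rate and acceleration; the function names and the zero padding of the first differences are illustrative assumptions rather than details stated in the text.

```python
WINDOW = 30  # number of preceding epochs used as input, as stated in the text


def finite_difference(values):
    """Backward difference per epoch; the first entry is padded with 0.0 (assumption)."""
    return [0.0] + [values[i] - values[i - 1] for i in range(1, len(values))]


def build_samples(trend, seasonal, residual, window=WINDOW):
    """Build (features, target) pairs from the decomposed cumulative deformation.

    Each feature row concatenates the five channels (trend, seasonal, residual,
    rate, acceleration) over the `window` epochs preceding the target epoch.
    """
    # Assumed additive decomposition of the cumulative deformation.
    cumulative = [a + b + c for a, b, c in zip(trend, seasonal, residual)]
    rate = finite_difference(cumulative)
    acceleration = finite_difference(rate)
    channels = [trend, seasonal, residual, rate, acceleration]

    features, targets = [], []
    for t in range(window, len(cumulative)):
        row = []
        for channel in channels:
            row.extend(channel[t - window:t])
        features.append(row)
        targets.append(cumulative[t])
    return features, targets


# Synthetic example: a pure linear trend of 0.1 per epoch over 40 epochs.
demo_trend = [0.1 * t for t in range(40)]
demo_zero = [0.0] * 40
X, y = build_samples(demo_trend, demo_zero, demo_zero)
```

Each row of `X` then holds 5 × 30 = 150 values and can be passed to any of the four regressors; note that the zero padding produces a spurious rate and acceleration in the first two epochs, which a production pipeline would typically discard.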
Construction of base model stacking ensemble framework
This study integrates the three aforementioned time series models and four machine learning models to construct a base model layer. It leverages the advantages of multiple models to build a stacking framework that integrates the prediction results of base models with meta-model fusion. Through the dynamic optimization of weight allocation, the robustness and deformation prediction accuracy of the stacking model in complex scenarios are enhanced.
Taking epoch $t$ of a single survey station as an example, the seven base models independently predict the cumulative deformation at this epoch. Among them, the time series models primarily capture linear trends and periodicity, while the machine learning models mainly strengthen the capture of nonlinear correlations and abrupt changes, thereby ensuring the prediction performance of the stacking model across different deformation states. The meta-model layer is built on ENR47, which takes the prediction results of each base model as feature inputs and the actual deformation as the true value $y$. Through training, it obtains the optimal weight for each base model. The convergence condition of the model is set to minimizing the prediction error, and a regularized loss function of the standard elastic-net form is constructed on this basis as follows:

$$L(\mathbf{w}) = \frac{1}{2N}\sum_{i=1}^{N}\Big(y_i - \sum_{j=1}^{M} w_j\,\hat{y}_{i,j} - b\Big)^2 + \lambda_1 \lVert \mathbf{w} \rVert_1 + \lambda_2 \lVert \mathbf{w} \rVert_2^2 \tag{16}$$

where $\mathbf{w}$ denotes the weight vector, $N$ denotes the number of samples in the training set, $M$ denotes the number of base models, $w_j$ denotes the weight of the $j$-th base model, $\hat{y}_{i,j}$ denotes the prediction result of the $j$-th base model for the $i$-th sample, $b$ denotes the residual correction term, and $\lambda_1$ and $\lambda_2$ both denote regularization parameters.
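To make the role of the meta-model concrete, the following sketch fits elastic-net weights to base-model predictions by cyclic coordinate descent. It assumes a loss of the form squared error divided by 2N plus an L1 penalty and a squared L2 penalty on the weights, with an unpenalized residual correction term (intercept); the scaling of the terms, the function names, and the synthetic data are assumptions, and the solver is a generic one rather than the implementation used in the experiments.

```python
def soft_threshold(value, threshold):
    """Soft-thresholding operator arising from the L1 part of the penalty."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_loss(preds, targets, weights, intercept, lam1, lam2):
    """Assumed loss: squared error / (2N) + lam1 * L1 norm + lam2 * squared L2 norm."""
    n = len(targets)
    squared_error = 0.0
    for row, y in zip(preds, targets):
        fitted = intercept + sum(w * p for w, p in zip(weights, row))
        squared_error += (y - fitted) ** 2
    l1 = sum(abs(w) for w in weights)
    l2 = sum(w * w for w in weights)
    return squared_error / (2 * n) + lam1 * l1 + lam2 * l2


def fit_meta_model(preds, targets, lam1, lam2, sweeps=200):
    """Cyclic coordinate descent on centered data; returns (weights, intercept)."""
    n, m = len(targets), len(preds[0])
    col_means = [sum(row[j] for row in preds) / n for j in range(m)]
    y_mean = sum(targets) / n
    xc = [[row[j] - col_means[j] for j in range(m)] for row in preds]
    residual = [y - y_mean for y in targets]  # centered targets minus current fit
    weights = [0.0] * m
    for _ in range(sweeps):
        for j in range(m):
            z = sum(xc[i][j] ** 2 for i in range(n)) / n
            # Correlation of column j with the residual that excludes model j.
            rho = sum(xc[i][j] * (residual[i] + weights[j] * xc[i][j])
                      for i in range(n)) / n
            denom = z + 2.0 * lam2
            new_w = soft_threshold(rho, lam1) / denom if denom > 0.0 else 0.0
            delta = new_w - weights[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * xc[i][j]
                weights[j] = new_w
    # The unpenalized intercept is recovered from the means.
    intercept = y_mean - sum(w * mu for w, mu in zip(weights, col_means))
    return weights, intercept


# Synthetic predictions of two hypothetical base models (rows are samples).
base_preds = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 4.0], [6.0, 8.0]]
true_values = [0.7 * a + 0.3 * b + 0.01 for a, b in base_preds]
weights, intercept = fit_meta_model(base_preds, true_values, lam1=0.0, lam2=0.0)
```

With both regularization parameters set to zero the fit reduces to ordinary least squares and recovers the generating weights; increasing the L1 parameter drives the weights of weakly contributing base models to exactly zero, which is the mechanism by which the meta-model can down-weight redundant base models.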
The true value in Eq. (16) and the reference true values of deformation adopted in subsequent experiments are both the 3D overall cumulative deformation calculated from the GNSS RTK data after fusion filtering. The fusion filtering effectively mitigates the masking of real deformation signals by various types of noise, but it may also over-smooth instantaneous mutations caused by blasting, strong winds, and other factors, so that the filtered data slightly lack detail in depicting such signals. From the perspective of application value, however, this over-smoothing reduces the impact of such random external interference on the subsequent prediction process and lowers the risk of misjudgment during early warning. Therefore, adopting the deformation calculated from the filtered data as the true value meets the practical requirements of mine deformation prediction and disaster prevention early warning.
Early warning of abnormal deformation signals
To address the practical demand for abnormal deformation early warning in mines, this study designs a dynamic deformation early warning framework based on the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Z-score detection. By adaptively adjusting the early warning thresholds for different deformation stages, the framework improves the accuracy of abnormal deformation early warning.
Deformation stage classification
The mine deformation process is comprehensively affected by complex factors and typically exhibits multi-stage evolution characteristics. Accordingly, this study takes the corresponding deformation rate and deformation acceleration at each epoch as indicators to construct a two-dimensional point set for a single survey station. On this basis, the DBSCAN algorithm is employed for clustering analysis48, which can automatically identify clusters with different density distributions and separate noise points. Among them, the k-distance algorithm is used to determine the neighborhood radius parameter, and the minimum sample size parameter is dynamically adjusted according to the number of samples. Based on the clustering results, the centers of each cluster are calculated, and the deformation stages corresponding to the centers of each cluster are analyzed in accordance with the classification criteria in Table 1.
Table 1.
Classification criteria for deformation stages.
| Deformation stage | Condition (/mm) |
|---|---|
| Stable creep stage | |
| Slow acceleration stage | |
| Rapid instability stage | |
| Noise or transient anomaly | Other cases |
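The clustering step described in this subsection can be sketched with a compact DBSCAN written from scratch. The points below are synthetic (deformation rate, deformation acceleration) pairs, the neighborhood radius is fixed by hand instead of being read from a k-distance curve, and the minimum sample size is a fixed value; all three are assumptions for illustration. The mapping from cluster centers to deformation stages via Table 1 is not reproduced here.

```python
import math

NOISE = -1


def region_query(points, index, eps):
    """Indices of all points within eps of points[index] (the point itself included)."""
    px, py = points[index]
    return [k for k, (qx, qy) in enumerate(points)
            if math.hypot(px - qx, py - qy) <= eps]


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [k for k in neighbors if k != i]
        while queue:
            k = queue.pop()
            if labels[k] == NOISE:
                labels[k] = cluster  # border point: joins the cluster, not expanded
            if labels[k] is not None:
                continue
            labels[k] = cluster
            k_neighbors = region_query(points, k, eps)
            if len(k_neighbors) >= min_samples:
                queue.extend(k_neighbors)  # core point: keep expanding
    return labels


def cluster_centers(points, labels):
    """Mean (rate, acceleration) of each cluster, ignoring noise points."""
    sums = {}
    for (x, y), label in zip(points, labels):
        if label == NOISE:
            continue
        sx, sy, count = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, count + 1)
    return {label: (sx / c, sy / c) for label, (sx, sy, c) in sums.items()}


# Synthetic (rate, acceleration) pairs: two dense groups and one isolated point.
points = [
    (0.010, 0.000), (0.012, 0.001), (0.011, -0.001), (0.009, 0.000),
    (0.100, 0.020), (0.102, 0.021), (0.099, 0.019), (0.101, 0.020),
    (0.500, 0.300),
]
labels = dbscan(points, eps=0.005, min_samples=3)
centers = cluster_centers(points, labels)
```

The resulting centers are the quantities that would then be compared against the stage conditions, while points labeled as noise correspond to the "Noise or transient anomaly" row.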
Z-score anomaly detection based on dynamic thresholds
Taking epoch $t$ of a single survey station as an example, a sliding window is built from the cumulative deformation of the 30 epochs preceding epoch $t$. The mean value $\mu_t$ and standard deviation $\sigma_t$ of the cumulative deformation within this window are calculated as follows:

$$\mu_t = \frac{1}{30}\sum_{i=t-30}^{t-1} D_i, \qquad \sigma_t = \sqrt{\frac{1}{30}\sum_{i=t-30}^{t-1}\left(D_i - \mu_t\right)^2} \tag{17}$$

where $D_i$ denotes the cumulative deformation of epoch $i$.

Based on the cumulative deformation $D_t$ of the current epoch, the corresponding Z-score value $Z_t$ is calculated49 as follows:

$$Z_t = \frac{D_t - \mu_t}{\sigma_t} \tag{18}$$
If the Z-score value is greater than the warning threshold, an abnormal deformation warning is triggered for the current epoch. The threshold depends on the deformation stage: if the current epoch belongs to the stable creep stage, the threshold is set to avoid false alarms caused by minor cumulative changes; if the current epoch belongs to the slow acceleration stage, it is set to enhance sensitivity to potential anomalies; if the current epoch belongs to the rapid instability stage, it is set to ensure timely warning of high-risk cumulative deformation. Through adaptive adjustment of the warning threshold, this framework not only avoids the missed detection of true anomalies due to overly strict thresholds but also prevents a large number of false alarms caused by overly loose thresholds, thus ensuring the reliability of abnormal deformation early warning.
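A minimal sketch of the sliding-window Z-score check with stage-dependent thresholds is given below. The three threshold values in `STAGE_THRESHOLDS` are hypothetical placeholders chosen only to illustrate the ordering implied by the text, not the values used in this study; the use of the population standard deviation and the one-sided comparison are likewise assumptions.

```python
import math

WINDOW = 30  # number of preceding epochs in the sliding window, as stated in the text

# Hypothetical thresholds (NOT the values used in the study): a stricter threshold
# for the stable stage and progressively looser ones for the riskier stages.
STAGE_THRESHOLDS = {
    "stable_creep": 3.0,
    "slow_acceleration": 2.5,
    "rapid_instability": 2.0,
}


def z_score(history, current):
    """Z-score of `current` relative to the mean and population SD of `history`."""
    mean = sum(history) / len(history)
    variance = sum((d - mean) ** 2 for d in history) / len(history)
    std = math.sqrt(variance)
    if std == 0.0:
        # Flat window: any departure is infinitely many standard deviations away.
        return 0.0 if current == mean else math.copysign(math.inf, current - mean)
    return (current - mean) / std


def is_abnormal(series, t, stage, window=WINDOW):
    """True if the cumulative deformation at epoch t exceeds the stage threshold."""
    if t < window:
        raise ValueError("not enough preceding epochs for the sliding window")
    return z_score(series[t - window:t], series[t]) > STAGE_THRESHOLDS[stage]


# Synthetic cumulative deformation: 30 epochs alternating around 10.1, then a jump.
series = [10.0 if k % 2 == 0 else 10.2 for k in range(WINDOW)] + [10.36]
z_now = z_score(series[:WINDOW], series[WINDOW])
```

With these placeholder values, the same jump triggers a warning when the epoch is classified as slow acceleration or rapid instability but not when it is classified as stable creep, which is the intended effect of the stage-dependent thresholds.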
Experiments and analysis
Experimental data sources
This paper focuses on the Zhundongxiheishan Mining Area located in the Xinjiang Uygur Autonomous Region, China, which is administratively affiliated with the Zhundong Economic and Technological Development Zone of Qitai County. The geographic coordinates of the mining area center are 90°21′E and 44°30′N, and it is situated on the southeastern margin of the Junggar Basin. The overall terrain is higher in the south and lower in the north, with mountains stretching in an east-west direction and a moderate degree of topographic dissection. The surface is predominantly covered by Quaternary loose sediments, with local outcrops of Jurassic coal-bearing strata. The vegetation is mainly composed of drought-tolerant shrubs, representing a typical temperate desert ecosystem. The mining area has a temperate continental arid climate, with a multi-year average temperature of 6.8 °C, an average annual precipitation of 152.3 mm, and an annual evaporation capacity as high as 2800 mm. It is characterized by severe cold in winter, extreme heat in summer, scarce precipitation, and intense evaporation.
GNSS RTK data were acquired from 19 GNSS survey stations deployed in an open-pit mine within the mining area, and their specific distribution is illustrated in Fig. 2. The detailed location information of these GNSS survey stations is presented in Table 2, where the Gauss plane rectangular coordinates have been subjected to appropriate conversion processing to balance data confidentiality and usability.
Fig. 2.
Detailed distribution map of GNSS survey stations.
Table 2.
Distribution information of GNSS survey stations.
| Survey station | X coordinate (/m) | Y coordinate (/m) | Z coordinate (/m) |
|---|---|---|---|
| NPS7602 | 4931344.1025 | 526129.9465 | 586.6406 |
| WPE7206 | 4933790.6572 | 525693.4056 | 547.4409 |
| WPS7403 | 4931814.6114 | 526618.7134 | 542.9016 |
| WPE7604 | 4931698.9523 | 526220.0845 | 586.4834 |
| NPS6801 | 4930902.0315 | 526611.6671 | 507.4848 |
| WPE7205 | 4933739.6985 | 526038.4851 | 547.1491 |
| WPE7405 | 4932762.9342 | 525889.5877 | 586.6129 |
| NPS6802 | 4933653.7820 | 526253.0425 | 547.7804 |
| WPE7006 | 4933853.4511 | 525658.1723 | 528.1484 |
| WPS7603 | 4931674.2162 | 526539.3795 | 583.6568 |
| WPE7001 | 4932756.7021 | 526230.0651 | 528.8952 |
| WPW7208 | 4932351.2935 | 525529.5157 | 548.2544 |
| NPN7602 | 4931619.7468 | 526804.9332 | 583.1649 |
| WPE7203 | 4933555.8574 | 526341.3757 | 546.9416 |
| WPW7607 | 4932753.0542 | 525763.9673 | 585.9279 |
| WPW7008 | 4932356.2854 | 525469.4292 | 529.5165 |
| NPN7002 | 4931790.9382 | 526834.5780 | 523.4092 |
| NPS7601 | 4931403.5850 | 526757.3346 | 584.3491 |
| WPW7609 | 4931956.4956 | 525670.5664 | 585.8832 |
This study extracts GNSS RTK data from each GNSS survey station spanning the period from July 13, 2023 to March 31, 2025 as the basis for experimental data. Specifically, each survey station adopts a 24-hour continuous observation mode with a data sampling interval of one sample per hour. The collected data cover mine dynamic monitoring information under various working conditions, thus ensuring the provision of continuous time-series data support.
Validation experiment of the fusion filtering preprocessing module
This experiment is primarily designed to verify the practical effectiveness of the proposed fusion filtering preprocessing module. Two comparative methods are selected from References14,19, namely the control chart-based noise suppression method and the TSS-RESTREND-based noise suppression method, which correspond to the statistical test-based method and the time series analysis-based method respectively. For the first 10 GNSS survey stations listed in Table 2, the SD of the original coordinate sequences, the sequences processed by the proposed method, and those processed by the two comparative methods are calculated separately, and the results are presented in Table 3.
Table 3.
Performance validation metrics of three noise suppression methods.
| Survey station | Indicator (/m) | Original data | Comparative method 1 | Comparative method 2 | Proposed method |
|---|---|---|---|---|---|
| NPS7602 | X-direction | 0.052193 | 0.052181 | 0.052200 | 0.052162 |
| | Y-direction | 0.010926 | 0.010847 | 0.010722 | 0.010616 |
| | Z-direction | 0.082508 | 0.082489 | 0.082426 | 0.082442 |
| WPE7206 | X-direction | 0.050267 | 0.050253 | 0.050216 | 0.050298 |
| | Y-direction | 0.002059 | 0.001586 | 0.001650 | 0.001189 |
| | Z-direction | 0.069890 | 0.069772 | 0.069801 | 0.069748 |
| WPS7403 | X-direction | 0.011692 | 0.011252 | 0.011211 | 0.011184 |
| | Y-direction | 0.005801 | 0.004997 | 0.004990 | 0.004803 |
| | Z-direction | 0.037998 | 0.037959 | 0.037992 | 0.037927 |
| WPE7604 | X-direction | 0.106842 | 0.106786 | 0.106777 | 0.106775 |
| | Y-direction | 0.005009 | 0.004886 | 0.004861 | 0.004738 |
| | Z-direction | 0.037553 | 0.037414 | 0.037449 | 0.037393 |
| NPS6801 | X-direction | 0.020329 | 0.015937 | 0.015927 | 0.015874 |
| | Y-direction | 0.019217 | 0.002291 | 0.002438 | 0.001828 |
| | Z-direction | 0.016108 | 0.015861 | 0.015865 | 0.015678 |
| WPE7205 | X-direction | 0.036262 | 0.034088 | 0.033918 | 0.034027 |
| | Y-direction | 0.011543 | 0.007865 | 0.007352 | 0.004552 |
| | Z-direction | 0.107064 | 0.101378 | 0.101928 | 0.100410 |
| WPE7405 | X-direction | 0.010264 | 0.010205 | 0.010186 | 0.010125 |
| | Y-direction | 0.003031 | 0.002909 | 0.002881 | 0.002757 |
| | Z-direction | 0.286550 | 0.028627 | 0.028687 | 0.028625 |
| NPS6802 | X-direction | 0.004317 | 0.004299 | 0.004295 | 0.004245 |
| | Y-direction | 0.014790 | 0.014787 | 0.014799 | 0.014777 |
| | Z-direction | 0.057246 | 0.057249 | 0.057274 | 0.057102 |
| WPE7006 | X-direction | 0.002030 | 0.001968 | 0.001921 | 0.001830 |
| | Y-direction | 0.002259 | 0.002202 | 0.002205 | 0.002175 |
| | Z-direction | 0.003879 | 0.003799 | 0.003868 | 0.003433 |
| WPS7603 | X-direction | 0.006445 | 0.006356 | 0.006385 | 0.006317 |
| | Y-direction | 0.001803 | 0.001656 | 0.001631 | 0.001481 |
| | Z-direction | 0.014631 | 0.014593 | 0.014627 | 0.014515 |
| Mean | X-direction | 0.030064 | 0.029333 | 0.029304 | 0.029284 |
| | Y-direction | 0.007644 | 0.005403 | 0.005353 | 0.004892 |
| | Z-direction | 0.071343 | 0.044914 | 0.044992 | 0.044727 |
Taking the first three survey stations in Table 2 as examples, the time series plots of the data processed by the proposed method and the two comparative methods are presented in Figs. 3, 4, and 5, respectively.
Fig. 3.
Visualization results of noise suppression for NPS7602 survey station.
Fig. 4.
Visualization results of noise suppression for WPE7206 survey station.
Fig. 5.
Visualization results of noise suppression for WPS7403 survey station.
A comprehensive analysis of the noise suppression effects of the proposed method and the two comparative methods, as presented in Table 3 and Figs. 3, 4, and 5, shows the following. Comparative Method 1 relies on fixed statistical thresholds to identify outliers and adapts poorly to nonlinear or non-stationary sequences. This results in numerous instances of over-correction or retained abnormal fluctuations, as illustrated in the figures, reflecting its inadequate overall stability. Such limitations highlight the strong dependence of statistical test-based methods on assumptions about the noise distribution; when the noise deviates from the preset distribution, their performance degrades significantly. In contrast, Comparative Method 2 extracts valid information through sequence segmentation and trend analysis but relies too heavily on the linear trend assumption. This limits its capability to process coupled signals of short-term noise and long-term deformation and ultimately causes excessive retention of high-frequency noise.
On the mean indicators, the proposed method outperforms both comparative methods in noise suppression effectiveness and stability, with the precision across all survey stations showing a balanced trend. After multi-stage filtering, impulsive outliers are eliminated and high-frequency noise is significantly suppressed. The trends of static settlement and dynamic displacement become evident with only a slight lag and are further refined by the AKF algorithm. A quantitative comparison of the mean indicators in Table 3 shows that the SDs in the X, Y, and Z directions processed by the proposed method are reduced by 2.60%, 36.01%, and 37.31% compared with the original data, by 0.17%, 9.46%, and 0.42% compared with Comparative Method 1, and by 0.07%, 8.62%, and 0.59% compared with Comparative Method 2, respectively.
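The percentage reductions quoted above can be checked directly from the mean rows of Table 3 with the relative-reduction formula (reference − proposed) / reference × 100. The dictionary layout and function name below are illustrative; the numbers are copied from the table.

```python
# Mean SDs (in m) from the "Mean" rows of Table 3.
MEAN_SD = {
    "X": {"original": 0.030064, "method_1": 0.029333, "method_2": 0.029304, "proposed": 0.029284},
    "Y": {"original": 0.007644, "method_1": 0.005403, "method_2": 0.005353, "proposed": 0.004892},
    "Z": {"original": 0.071343, "method_1": 0.044914, "method_2": 0.044992, "proposed": 0.044727},
}


def reduction_percent(reference, value):
    """Relative reduction of `value` with respect to `reference`, in percent."""
    return (reference - value) / reference * 100.0


reductions = {
    axis: {
        baseline: reduction_percent(row[baseline], row["proposed"])
        for baseline in ("original", "method_1", "method_2")
    }
    for axis, row in MEAN_SD.items()
}

for axis, row in reductions.items():
    print(axis, {name: round(value, 2) for name, value in row.items()})
```

The recomputed values agree with the quoted percentages to within about 0.01 percentage points; the small residual differences are presumably due to rounding of the tabulated means.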
In summary, the fusion filtering preprocessing module proposed in this paper effectively overcomes the threshold dependence problem of statistical test-based methods and the model assumption constraints of time series analysis-based methods. It exhibits significant advantages in both noise suppression and the retention of valid information, thereby providing more reliable preprocessed data for subsequent deformation prediction.
Validation experiment of the stacking ensemble deformation prediction module
To verify the actual prediction performance of the proposed stacking ensemble deformation prediction module, the method in Reference24 and the method in Reference26 are selected as Comparative Method 1 and Comparative Method 2, namely the ARIMA-based deformation prediction method and the GA-optimized XGBoost-SVR-RNN deformation prediction method, respectively. In addition, the first 5 survey stations in Table 2 are used as the validation data source, where 80% of the data is allocated to the training set and the remaining 20% to the test set. The prediction performance of each method is evaluated based on the prediction accuracy of the 3D overall cumulative deformation. Therefore, based on the true values of each test set, the prediction Root Mean Squared Errors (RMSE), Maximum Errors (ME), and 95% confidence interval based on the Z-distribution of the three aforementioned methods are calculated separately, with the specific quantitative results of these metrics presented in Table 4.
Table 4.
Prediction RMSE, ME, and 95% confidence intervals of the three methods on each test set.
| Metrics | Survey station | Comparative method 1 (/m) | Comparative method 2 (/m) | Proposed method (/m) |
|---|---|---|---|---|
| RMSE | NPS7602 | 0.028003 | 0.029621 | 0.000574 |
| | WPE7206 | 0.031684 | 0.021663 | 0.000453 |
| | WPS7403 | 0.003620 | 0.031476 | 0.000765 |
| | WPE7604 | 0.002708 | 0.021525 | 0.000556 |
| | NPS6801 | 0.009653 | 0.025734 | 0.000283 |
| | Mean | 0.015134 | 0.026004 | 0.000526 |
| ME | NPS7602 | 0.045324 | 0.076836 | 0.002179 |
| | WPE7206 | 0.056786 | 0.079198 | 0.001879 |
| | WPS7403 | 0.012136 | 0.082946 | 0.002596 |
| | WPE7604 | 0.009478 | 0.078562 | 0.002113 |
| | NPS6801 | 0.015445 | 0.080125 | 0.001094 |
| | Mean | 0.027834 | 0.079533 | 0.001972 |
| 95% confidence interval based on the Z-distribution | NPS7602 | [0.025358, 0.026618] | [0.023503, 0.025512] | [0.000418, 0.000463] |
| | WPE7206 | [0.026825, 0.028448] | [0.016082, 0.017514] | [0.000341, 0.000370] |
| | WPS7403 | [0.002477, 0.002716] | [0.026450, 0.027950] | [0.000561, 0.000608] |
| | WPE7604 | [0.002478, 0.002716] | [0.020825, 0.022025] | [0.000533, 0.000579] |
| | NPS6801 | [0.008977, 0.009663] | [0.024934, 0.026534] | [0.000269, 0.000287] |
| | Mean | [0.016543, 0.017211] | [0.022535, 0.023508] | [0.000458, 0.000484] |
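For reference, the three metric types reported in Table 4 can be computed as sketched below. The text does not state which quantity the confidence interval is built on; the sketch assumes it is the mean absolute prediction error, with the normal-approximation interval mean ± 1.96·s/√n. That choice, the function names, and the sample data are assumptions.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def max_error(y_true, y_pred):
    """Maximum absolute error (ME)."""
    return max(abs(t - p) for t, p in zip(y_true, y_pred))


def z_confidence_interval(values, z=1.96):
    """Normal-approximation interval for the mean of `values` (95% for z = 1.96)."""
    n = len(values)
    mean = sum(values) / n
    sample_var = sum((v - mean) ** 2 for v in values) / (n - 1)
    half_width = z * math.sqrt(sample_var / n)
    return mean - half_width, mean + half_width


# Synthetic true and predicted cumulative deformation (in m).
y_true = [0.300, 0.301, 0.302, 0.303]
y_pred = [0.301, 0.300, 0.304, 0.301]

abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
metrics = {
    "rmse": rmse(y_true, y_pred),
    "me": max_error(y_true, y_pred),
    "ci95": z_confidence_interval(abs_errors),
}
```

Because the interval half-width scales with 1/√n, long test sets yield narrow intervals even when individual errors vary, which should be kept in mind when comparing interval widths across methods.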
Taking the first three survey stations from Table 2 as examples, the time series plots of the true values of the 3D overall cumulative deformation in the test set and the prediction results of the three methods are respectively plotted, as presented in Fig. 6. To further verify the adaptability of the proposed method under different prediction time scales, four prediction time scales including 24 h, 36 h, 48 h and 72 h are set respectively for the five survey stations in Table 4. Similarly, based on historical data, the cumulative deformation of the corresponding time scales is obtained through rolling continuous extrapolation, and the prediction RMSE is calculated based on the predicted results and true values, as presented in Table 5. Furthermore, rolling window cross-validation is adopted to thoroughly verify the robustness of the proposed method. The corresponding datasets of the five survey stations in Table 4 are respectively divided into 24 sample intervals in chronological order of timestamps, with the training set containing 3000 samples and the test set containing 500 samples in each interval. Model training and prediction are performed separately for the dataset of each aforementioned sample interval, and the prediction RMSE of cumulative deformation on each test set is calculated. The corresponding time series plot is presented in Fig. 7, where the values in the legend represent the corresponding mean values.
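The rolling-window partition described above can be sketched as follows. The text fixes the number of intervals (24) and the training and test sizes (3000 and 500 samples) but not the stride between consecutive intervals; the sketch assumes evenly spaced window starts spanning the whole series, and the total of 15,000 samples is a hypothetical value.

```python
def rolling_window_splits(n_samples, n_intervals=24, train_size=3000, test_size=500):
    """Return (train_range, test_range) index pairs in chronological order.

    Window starts are spaced evenly so that the first window begins at index 0
    and the last window ends at the final sample (an assumption, see lead-in).
    """
    span = train_size + test_size
    if n_samples < span:
        raise ValueError("series is shorter than one train + test window")
    if n_intervals == 1:
        starts = [0]
    else:
        step = (n_samples - span) / (n_intervals - 1)
        starts = [round(k * step) for k in range(n_intervals)]
    return [
        (range(start, start + train_size), range(start + train_size, start + span))
        for start in starts
    ]


# Hypothetical series length; each test block directly follows its training block.
splits = rolling_window_splits(n_samples=15000)
first_train, first_test = splits[0]
last_train, last_test = splits[-1]
```

Each test block lies strictly after its training block in time, so no future information leaks into training; with the hypothetical length used here the 24 test blocks also tile the series without overlap.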
Fig. 6.
Visual comparison of predicted values and true values among the three methods. (a) NPS7602 survey station, (b) WPE7206 survey station, (c) WPS7403 survey station.
Table 5.
Deformation prediction RMSE of the proposed method under different time scales.
| Survey station | 24 h | 36 h | 48 h | 72 h |
|---|---|---|---|---|
| NPS7602 | 0.000282 | 0.000353 | 0.000439 | 0.000574 |
| WPE7206 | 0.000376 | 0.000371 | 0.000401 | 0.000453 |
| WPS7403 | 0.000528 | 0.000575 | 0.000653 | 0.000765 |
| WPE7604 | 0.000389 | 0.000437 | 0.000476 | 0.000556 |
| NPS6801 | 0.000217 | 0.000224 | 0.000251 | 0.000283 |
| Mean | 0.000358 | 0.000392 | 0.000444 | 0.000526 |
Fig. 7.
Visualization results of RMSE under rolling window cross-validation.
Based on comprehensive comparisons of the cumulative deformation prediction results between the proposed method and the two comparative methods, as presented in Tables 4 and 5 and Figs. 6 and 7, the following observations are derived:
Comparative Method 1 adopts the ARIMA model as its core prediction model, which can effectively identify linear trends and short-term autocorrelation characteristics but cannot adapt well to the complex evolutionary characteristics of mine deformation. By solely relying on modeling the autocorrelation of the overall sequence, it fails to accurately isolate the independent impacts of periodic factors, thereby resulting in significant prediction deviations for periodic deformation. Furthermore, it performs extrapolation solely based on the linear relationship of the displacement itself, leading to extremely poor performance in predicting nonlinear mutation characteristics; its prediction errors also increase significantly during phases where the deformation trend reverses. In contrast, Comparative Method 2 improves its nonlinear fitting capability by integrating machine learning algorithms; however, a single feature cannot fully reflect the dynamic evolutionary laws of deformation. In the model integration process, it overlooks the inherent fitting advantages of time series models for linear trends. Moreover, its K-Nearest Neighbors (KNN) optimization strategy focuses on sample similarity matching and lacks a mechanism for the dynamic adaptive adjustment of model weights, which results in insufficient robustness of the model under complex scenarios.
Through the collaborative design of feature engineering and model ensembling, the proposed method compensates for the deficiencies of the comparative methods. This results in a high degree of agreement between the predicted values and the true values, with high consistency observed in both overall trends and local fluctuations. The method can accurately capture the periodic variations in the original data and demonstrates strong responsiveness at mutation points. Quantitatively, based on the mean metrics in Table 4, the proposed method exhibits significant advantages across multiple evaluation metrics. Compared with Comparative Method 1 and Comparative Method 2, the prediction RMSE of the 3D overall cumulative deformation is reduced by 96.52% and 97.98%, respectively; the ME is decreased by 92.92% and 97.52%, respectively; and the width of the 95% confidence interval based on the Z-distribution is narrowed by 96.11% and 97.33%, respectively. As presented in Table 5, the RMSE of the proposed method under each prediction time scale for all survey stations remains between 0.217 mm and 0.765 mm, generally meeting the requirements of high-precision prediction. As the prediction horizon lengthens from 24 h to 72 h, the mean RMSE increases from 0.358 mm to 0.526 mm, while the standard deviation of the mean RMSE across the four time scales is only 0.073 mm, demonstrating the good adaptability of the proposed method across the tested prediction horizons. Furthermore, the rolling window cross-validation results presented in Fig. 7 demonstrate that the prediction RMSE of the proposed method remains stable across all sample intervals for each survey station, without significant fluctuations. Therefore, in complex mine environments, the proposed method can maintain reliable processing performance across different time periods and survey stations, which confirms both its prediction accuracy and its robustness.
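The 0.073 mm dispersion figure quoted above coincides with the sample standard deviation of the four mean RMSE values in the last row of Table 5, as the short check below shows. The values are copied from the table and converted to millimeters; the variable names are illustrative.

```python
import statistics

# Mean RMSE per prediction time scale from the "Mean" row of Table 5 (m converted to mm).
MEAN_RMSE_MM = {24: 0.358, 36: 0.392, 48: 0.444, 72: 0.526}

values = list(MEAN_RMSE_MM.values())
sample_sd = statistics.stdev(values)          # divides by n - 1, unit: mm
sample_variance = statistics.variance(values)  # unit: mm^2
growth = MEAN_RMSE_MM[72] - MEAN_RMSE_MM[24]   # increase from 24 h to 72 h, in mm

print(round(sample_sd, 3), round(sample_variance, 4), round(growth, 3))
```

The sample standard deviation rounds to 0.073 mm, whereas the corresponding variance is roughly 0.005 mm², so the quoted figure is a standard deviation in the strict sense.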
Overall, the proposed method first reduces various types of noise in the original data through fusion filtering, significantly improving the quality of input data for deformation prediction. It then leverages the stacking prediction model to accurately fit the long-term deterministic variation law of 3D cumulative deformation, with the residual errors inherently incorporated into the fitting process, thereby further mitigating the impact of measurement errors on prediction accuracy. Ultimately, the deformation prediction residual reaches the sub-millimeter level. However, this precision does not break through the measurement accuracy limitations of GNSS RTK hardware; instead, it represents the model’s deformation prediction residual based on high-quality filtered data, which is the result of the synergistic effect between high-quality signals and adaptive models. Essentially, it reflects the model’s high degree of fitting to the true deformation.
In summary, the stacking ensemble deformation prediction module proposed in this paper effectively achieves high-precision deformation prediction through systematic deformation component analysis, multi-dimensional extraction of kinematic characteristics, and integrated model ensembling. It exhibits relatively consistent prediction performance across different prediction time scales and maintains stable performance even under complex conditions involving periodic fluctuations and nonlinear mutations. This fully verifies the practical value of the proposed method in the active prevention and control of deformation disasters, providing more reliable technical support for high-precision deformation prediction.
Similarly, taking the aforementioned three survey stations as examples, based on the trained stacking prediction model, the 3D overall cumulative deformation for the next 72 h is predicted for each station respectively. Early warnings are issued for abnormal deformation signals in both the historical stage and the prediction stage. The prediction results are presented in Table 6, and the visualization results of the abnormal deformation signal early warning are shown in Fig. 8.
Table 6.
Prediction results of 3D overall deformation of survey stations for the next 72 h (sampled at 6-hour intervals).
| Hour | NPS7602 (/m) | WPE7206 (/m) | WPS7403 (/m) |
|---|---|---|---|
| 1 | 0.326668 | 0.320025 | 0.166160 |
| 7 | 0.331101 | 0.324486 | 0.164942 |
| 13 | 0.336227 | 0.328952 | 0.165550 |
| 19 | 0.336533 | 0.332102 | 0.165292 |
| 25 | 0.338556 | 0.335354 | 0.163768 |
| 31 | 0.340383 | 0.337910 | 0.164490 |
| 37 | 0.342800 | 0.339881 | 0.164647 |
| 43 | 0.342639 | 0.341752 | 0.163691 |
| 49 | 0.343610 | 0.343043 | 0.163876 |
| 55 | 0.344417 | 0.343696 | 0.164028 |
| 61 | 0.343627 | 0.344277 | 0.163539 |
| 67 | 0.344661 | 0.344766 | 0.163482 |
| 72 | 0.344459 | 0.344946 | 0.163550 |
Fig. 8.
Visualization results of early warning of abnormal deformation signals. (a) NPS7602 survey station, (b) WPE7206 survey station, (c) WPS7403 survey station.
Analysis of deformation characteristics and causes in the study area
Based on a systematic analysis of the long-term GNSS RTK monitoring data from the study area, the deformation of the mining area exhibits complex spatiotemporal evolution patterns and shows significant regional differences, yet it is highly correlated with the mining activities in the area and local geological conditions.
From the perspective of temporal evolution, the cumulative deformation curves at most GNSS monitoring stations show a macroscopic trend dominated by slow and stable subsidence, which mainly reflects the gradual compaction and consolidation of overlying strata under self-weight. Two types of characteristics are superimposed on this macroscopic trend: periodic fluctuations and sporadic abrupt changes. The periodic components are usually consistent with seasonal factors such as winter freeze-thaw cycles or with regular operational schedules such as periodic loads on transportation roads. The sporadic abrupt changes are mostly associated with discrete events, including blasting operations and the movement of heavy equipment.
From the perspective of spatial distribution, the deformation degree in different regions within the study area exhibits high heterogeneity. The area is dominated by low-risk zones, with a small number of medium-risk zones and an extremely small number of high-risk zones, presenting an obvious annular risk distribution pattern centered on the inner dump. Specifically, the higher the terrain in the inner dump, the higher the risk level. The farther the distance from the mining face, the smaller the impact of mining activities, the more stable the geological conditions, and the lower the risk level. In detail, the deformation degree in the inner dump and the south slope of the outer dump is significantly higher than that in undisturbed stable areas such as the industrial square. These regions have exhibited significant and continuous deformation characteristics throughout the entire monitoring period, featuring large deformation magnitude and high cumulative rate. The Interferometric Synthetic Aperture Radar (InSAR) monitoring results also identify the inner dump and the south slope of the outer dump as areas with sustained and significant deformation, which further verifies the robustness of the GNSS deformation monitoring adopted in this study and highlights the technical advantage of GNSS monitoring in terms of high temporal resolution.
For the aforementioned regions with significant deformation, the rationality of the monitoring results can be further verified by combining with the internal geomechanical causes. The waste dump slopes in the study area are formed by artificial accumulation of loose crushed rock and soil. The stability of such slopes depends not only on the inherent strength of the accumulated materials but also on factors such as the bedrock dip angle and the shear strength of the bedrock soil mass. Meanwhile, the slope edges and waste dump boundaries are topographic abrupt change zones, which are usually subjected to the combined effects of gravitational effects, mining-induced disturbances and unloading stress concentration. These zones are prone to become initiation areas for deformation disasters such as sliding or instability. Once abnormal deformation occurs in these regions, it may indicate early signs of local landslides or slope failure. The east slope of the inner waste dump has the characteristics of both the maximum height and the steepest angle among all waste dump slopes in the study area. Its convex slope morphology induces a significant stress concentration effect, and thus it is reasonably identified as the region with the highest deformation risk level in the study area.
Based on the aforementioned analysis of the spatiotemporal evolution patterns and genetic mechanisms of deformation in the study area, combined with the high-precision deformation prediction capability of the stacking ensemble deformation prediction model proposed in this paper, accurate quantification of the future deformation trends in high-risk areas can be achieved, thus providing reliable data support for the proactive prevention and control of deformation disasters. Meanwhile, the results of the hierarchical early warning model can be derived from the deformation prediction outcomes. Combined with the reference to the deformation causes of the corresponding areas, tailored response and disposal plans can be formulated in advance, thereby improving the intelligent and refined level of safety management in the mining area. In summary, this paper establishes a closed-loop application system incorporating cause analysis, trend prediction, and early warning and control. It provides a referenceable and reproducible practical paradigm for the prevention and control of deformation disasters in open-pit mines, further highlighting the application value of this study.
Conclusion and discussion
Conclusions
To address the demand for high-precision prediction of mine deformation, this paper proposes a deformation prediction method based on GNSS RTK and stacking ensemble learning. Through systematic theoretical construction and comprehensive experimental validation, the following conclusions are drawn:
The fusion filtering preprocessing module constructed in this paper effectively improves the quality of GNSS RTK data. By cascading median filtering, Butterworth filtering, and Savitzky-Golay filtering, and then applying dynamic optimization via the AKF, the module suppresses complex mixed noise. Experimental results demonstrate that the SDs of the processed data in the three dimensions are reduced by 2.60%, 36.01%, and 37.31%, respectively, compared with the original data, and that the module outperforms the comparative methods, thereby laying a reliable data foundation for subsequent deformation prediction.
The stacking ensemble deformation prediction module proposed in this paper achieves high-precision prediction of mine deformation. The module decomposes the cumulative deformation sequence into trend, seasonal, and residual components, adds deformation rate and acceleration as features, and integrates three time series models (ARIMA, SARIMA, and AutoARIMA) with four machine learning models (SVR, RF, MLP, and XGBoost). ENR serves as the meta-model and dynamically optimizes the weights of the base models. Experimental results demonstrate that the RMSE of the predicted cumulative deformation on the measured dataset remains below 0.3 mm and that prediction performance is relatively consistent across different prediction time scales. Furthermore, both the ME and the confidence interval span are significantly better than those of the comparative methods, and the model remains robust and applicable in complex scenarios where periodic fluctuations and nonlinear mutations coexist.
Based on historical epoch data and predicted cumulative deformation, early warning of dynamic abnormal signals is implemented for different deformation stages, such as stable creep, slow acceleration, and rapid instability, thereby providing decision support for the early prevention and control of mine deformation disasters. In addition, the deformation characteristics of the study area are systematically analyzed from three perspectives (temporal evolution, spatial distribution, and deformation magnitude), the causes of deformation are explored by integrating internal and external factors, and the applications of future deformation prediction and hierarchical early warning are discussed, which further enhances the academic and engineering value of this paper.
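To make the role of the ENR meta-model in the conclusions above more concrete, the following minimal Python sketch fits elastic-net weights to the predictions of several base models. It is an illustration under stated assumptions, not the implementation used in this study: the solver is a plain coordinate descent written for this example, the objective is the common form (1/(2n))·||y − Xw||² + α·ρ·||w||₁ + 0.5·α·(1 − ρ)·||w||² without an intercept, and the three base-model prediction series are invented stand-ins rather than outputs of ARIMA, SVR, or any other model from the paper.

```python
from typing import List, Sequence


def soft_threshold(value: float, threshold: float) -> float:
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net_weights(base_preds: Sequence[Sequence[float]],
                            target: Sequence[float],
                            alpha: float = 1e-3,
                            l1_ratio: float = 0.5,
                            n_iter: int = 200) -> List[float]:
    """Coordinate descent for elastic-net weights without an intercept.

    base_preds[i][j] is the prediction of base model j at epoch i.
    """
    n = len(target)
    m = len(base_preds[0])
    weights = [0.0] * m
    for _ in range(n_iter):
        for j in range(m):
            rho = 0.0  # correlation of column j with the partial residual
            z = 0.0    # mean squared value of column j
            for row, y in zip(base_preds, target):
                partial = sum(w * x for k, (w, x) in enumerate(zip(weights, row))
                              if k != j)
                rho += row[j] * (y - partial)
                z += row[j] ** 2
            rho /= n
            z /= n
            denom = z + alpha * (1.0 - l1_ratio)
            weights[j] = soft_threshold(rho, alpha * l1_ratio) / denom if denom > 0 else 0.0
    return weights


def stack_predict(row: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted combination of the base-model predictions for one epoch."""
    return sum(w * x for w, x in zip(weights, row))


# Hypothetical cumulative deformation (mm) and three invented base models.
target = [0.1 * t for t in range(12)]
base_preds = [
    [y + (0.02 if t % 2 else -0.02),   # small alternating error
     y - 0.10,                          # constant lag
     1.05 * y]                          # slight overestimation
    for t, y in enumerate(target)
]
weights = fit_elastic_net_weights(base_preds, target)
print([round(w, 3) for w in weights])
print(round(stack_predict(base_preds[-1], weights), 3))
```

In a full stacking pipeline the meta-model would normally be fitted on out-of-fold predictions of the base models, so that the weights are not biased toward base models that overfit the training epochs.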
Discussion
The proposed method performs well in mine deformation prediction. It is discussed below from three aspects: its advantages, its existing limitations, and future research directions.
The fusion filtering preprocessing module overcomes the limited adaptability of single filtering algorithms to complex noise. Through the functional complementarity of multi-level filtering and dynamic adjustment via the adaptive Kalman filter, it achieves a favorable balance between noise suppression and feature preservation. The stacking ensemble deformation prediction module relies on a synergistic mechanism in which linear models capture deformation trends, nonlinear models fit mutation characteristics, and the meta-model optimizes the base model weights. This addresses the insufficient generalization ability of single models under complex deformation conditions and makes the method better suited to complex mine environments with multi-factor coupling effects.
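The complementarity of a fixed filter and an adaptive one can be illustrated with a minimal Python sketch that chains a sliding median filter (to remove isolated spikes) with a scalar Kalman filter whose measurement-noise variance is re-estimated from the innovations. This is a simplified sketch under stated assumptions: the Butterworth and Savitzky-Golay stages of the fusion module are omitted, the state model is a random walk, and the innovation-based adaptation rule with a forgetting factor is one common variant that may differ from the AKF formulation used in this paper. All parameter values are hypothetical.

```python
from statistics import median
from typing import List, Sequence


def median_filter(series: Sequence[float], window: int = 5) -> List[float]:
    """Sliding median; the window shrinks at the two ends of the series."""
    half = window // 2
    n = len(series)
    return [median(series[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]


def adaptive_kalman(series: Sequence[float],
                    process_var: float = 1e-4,
                    init_meas_var: float = 1.0,
                    forgetting: float = 0.95) -> List[float]:
    """Scalar random-walk Kalman filter with adaptive measurement variance.

    The measurement variance is tracked from the innovations using
    E[innovation^2] = predicted state variance + measurement variance.
    """
    state = series[0]
    state_var = init_meas_var
    meas_var = init_meas_var
    smoothed = [state]
    for z in series[1:]:
        pred_var = state_var + process_var       # prediction step
        innovation = z - state
        meas_var = max(1e-8, forgetting * meas_var
                       + (1.0 - forgetting) * (innovation ** 2 - pred_var))
        gain = pred_var / (pred_var + meas_var)  # update step
        state = state + gain * innovation
        state_var = (1.0 - gain) * pred_var
        smoothed.append(state)
    return smoothed


# Hypothetical displacement series (mm) with one gross outlier at index 3.
raw = [0.0, 0.1, 0.2, 5.0, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
despiked = median_filter(raw, window=5)
filtered = adaptive_kalman(despiked)
print([round(v, 3) for v in filtered])
```

Running the median stage first keeps the outlier from inflating the innovation statistics, so the adaptive stage only has to track the slowly varying noise level.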
The proposed method still has room for improvement in predicting instantaneous mutations. The existing experimental data are mainly derived from deformations under conventional mining conditions, and the scarcity of samples from extreme scenarios may reduce the prediction accuracy of the model in such scenarios. In addition, although the current model exhibits favorable prediction performance and meets the requirements of practical applications, its feature system does not include external influencing factors such as geological and mechanical conditions; incorporating them could further improve the prediction accuracy.
In future research, efforts will be devoted to multi-source data fusion, with in-depth integration of GNSS data and monitoring data from instruments such as InSAR and tiltmeters. Meanwhile, samples from extreme scenarios will be supplemented to further improve the universal applicability of the method. Furthermore, physical mechanisms will be incorporated into the data-driven model proposed in this paper, with geological parameters and other relevant factors added as constraint conditions in the prediction process, thereby providing more reliable technical support for the active prevention and control of mine deformation disasters.
Acknowledgements
We would especially like to thank the reviewers and editors for their constructive comments and efforts to improve the quality of this paper. We also thank those who provided language proofreading assistance. Finally, we acknowledge funding from the National Natural Science Foundation of China under Grant number 42404045.
Author contributions
Conceptualization: X.D. and R.M.; Data curation: X.D. and W.T.; Formal analysis: X.D., R.M., and K.Z.; Funding acquisition: W.T.; Investigation: Z.Z.; Methodology: X.D. and J.Z.; Software: X.D. and G.S.; Validation: X.D. and H.X.; Writing—original draft preparation: X.D.; Writing—review and editing: X.D., R.M., W.T., K.Z., Z.Z., J.Z., G.S., and H.X. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Natural Science Foundation of China, grant number 42404045.
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Ma, S. et al. Surface multi-hazard effect of underground coal mining. Landslides 20(1), 39–52. 10.1007/s10346-022-01961-0 (2023).
- 2. Mu, C. et al. The formation mechanism of surface landslide disasters in the mining area under different slope angles. Adv. Civ. Eng. 2021(1), 6697790–6697802. 10.1155/2021/6697790 (2021).
- 3. Li, X., Chen, S., Wang, E. & Li, Z. Rockburst mechanism in coal rock with structural surface and the microseismic (MS) and electromagnetic radiation (EMR) response. Eng. Fail. Anal. 124, 105396–105409. 10.1016/j.engfailanal.2021.105396 (2021).
- 4. Li, X. et al. Rock burst monitoring by integrated microseismic and electromagnetic radiation methods. Rock Mech. Rock Eng. 49(11), 4393–4406. 10.1007/s00603-016-1037-6 (2016).
- 5. Yang, Y., Zheng, Y., Yu, W., Chen, W. & Weng, D. Deformation monitoring using GNSS-R technology. Adv. Space Res. 63(10), 3303–3314. 10.1016/j.asr.2019.01.033 (2019).
- 6. Nguyen, H. V., Pham, K. C., Nguyen, D. B. & Nguyen, L. Q. Application of the GNSS method in the monitoring of mine surface displacement: A systemic review. Inż. Miner. 2(1), 247–255. 10.29227/IM-2024-01-115 (2024).
- 7. Takahashi, S., Kubo, N., Yamaguchi, N. & Yokoshima, T. Real-time monitoring for structure deformations using hand-held RTK-GNSS receivers on the wall. In 2017 International Conference on Indoor Positioning and Indoor Navigation (IPIN), 1–7 (2017). 10.1109/IPIN.2017.8115945
- 8. Jiang, Y. & Wang, J. Extracting relevant patterns from GNSS observations to mitigate multipath in RTK deformation monitoring. GPS Solutions 28(4), 200–214. 10.1007/s10291-024-01745-0 (2024).
- 9. Du, Y., Huang, G., Zhang, Q., Gao, Y. & Gao, Y. Asynchronous RTK method for detecting the stability of the reference station in GNSS deformation monitoring. Sensors 20(5), 1320–1331. 10.3390/s20051320 (2020).
- 10. Liu, C., Zhou, F., Gao, J. & Wang, J. Some problems of GPS RTK technique application to mining subsidence monitoring. Int. J. Min. Sci. Technol. 22(2), 223–228. 10.1016/j.ijmst.2012.03.001 (2012).
- 11. Schmid, W. CUSUM control schemes for Gaussian processes. Stat. Pap. 38(2), 191–217. 10.1007/BF02925223 (1997).
- 12. Yu, L. & Liu, F. Application of EWMA control chart instability analysis of MSA. J. Sys Eng. 23(3), 381–384. 10.1016/j.nima.2021.165947 (2008).
- 13. Wang, J. & Huang, D. Dual-frequency GPS cycle slip detection and repair based on dynamic test. KSCE J. Civ. Eng. 27(12), 5329–5337. 10.1007/s12205-023-0388-2 (2023).
- 14. Yin, Y. H., Guo, Q. & Li, H. N. The research on detection methods of GPS abnormal monitoring data based on control chart. Eng. Mech. 30(08), 133–141. 10.6052/j.issn.1000-4750.2012.04.0280 (2013).
- 15. Duan, L. X., Zhang, L. B. & Chen, J. L. Boundary treatment of lifting wavelet transform based on volterra series model and its application. Chin. J. Sci. Instrum. 33(01), 7–12 (2012).
- 16. Zhang, H. J. & Cheng, P. F. Noise recognition and extraction of GPS height time series based on EEMD. J. Geod. Geodyn. 34(02), 79–83 (2014).
- 17. Li, W. & Shen, Y. Z. The consideration of formal errors in spatiotemporal filtering using principal component analysis for regional GNSS position time series. Remote Sens. 10(4), 534–549. 10.3390/rs10040534 (2018).
- 18. Verbesselt, J., Hyndman, R., Newnham, G. & Culvenor, D. Detecting trend and seasonal changes in satellite image time series. Remote Sens. Environ. 114(1), 106–115. 10.1016/j.rse.2009.08.014 (2016).
- 19. Burrell, A. L., Evans, J. P. & Liu, Y. Detecting dryland degradation using time series segmentation and residual trend analysis (TSS-RESTREND). Remote Sens. Environ. 197, 43–57. 10.1016/j.rse.2017.05.018 (2017).
- 20. Saito, M. Forecasting time of slope failure by tertiary creep. In Proceedings of the 7th International Conference on Soil Mechanics and Foundation Engineering, Mexico City, Mexico, 2, 677–683 (1969).
- 21. Cina, A. & Piras, M. Performance of low-cost GNSS receiver for landslides monitoring: Test and results. Geomatics Nat. Hazards Risk 6(5–7), 497–514. 10.1080/19475705.2014.889046 (2015).
- 22. Dietrich, W. E., Reiss, R., Hsu, M. L. & Montgomery, D. R. A process-based model for colluvial soil depth and shallow landsliding using digital elevation data. Hydrol. Process. 9(3–4), 383–400. 10.1002/hyp.3360090311 (1995).
- 23. Bathrellos, G. D., Kalivas, D. P. & Skilodimou, H. D. Landslide susceptibility assessment mapping: a case study in Central Greece. In Remote Sensing of Hydrometeorological Hazards 493–512 (CRC Press, 2017).
- 24. Wang, Z. et al. Landslide displacement prediction from on-site deformation data based on time series ARIMA model. Front. Environ. Sci. 11, 1249743–1249756. 10.3389/fenvs.2023.1249743 (2023).
- 25. Liu, Z. et al. Toward the reliable prediction of reservoir landslide displacement using earthworm optimization algorithm-optimized support vector regression (EOA-SVR). Nat. Hazards 120(4), 3165–3188. 10.1007/s11069-023-06322-1 (2024).
- 26. Xu, J., Jiang, Y. & Yang, C. Landslide displacement prediction during the sliding process using XGBoost, SVR and RNNs. Appl. Sci. 12(12), 6056–6071. 10.3390/app12126056 (2022).
- 27. Huang, F. et al. Landslide displacement prediction using discrete wavelet transform and extreme learning machine based on chaos theory. Environ. Earth Sci. 75(20), 1376–1393. 10.1007/s12665-016-6133-0 (2016).
- 28. Wang, Y. et al. A comparative study of different machine learning methods for reservoir landslide displacement prediction. Eng. Geol. 298, 106544–106555. 10.1016/j.enggeo.2022.106544 (2022).
- 29. Justusson, B. I. Median filtering: Statistical properties. In Two-Dimensional Digital Signal Processing II: Transforms and Median Filters 161–196 (Springer Berlin Heidelberg, 2006). 10.1007/BFb0057597.
- 30. Mahata, S., Herencsar, N. & Kubanek, D. Optimal approximation of fractional-order Butterworth filter based on weighted sum of classical Butterworth filters. IEEE Access 9, 81097–81114. 10.1109/ACCESS.2021.3085515 (2021).
- 31. Krishnan, S. R. & Seelamantula, C. S. On the selection of optimum Savitzky-Golay filters. IEEE Trans. Signal Process. 61(2), 380–391. 10.1109/TSP.2012.2225055 (2012).
- 32. Ge, Q. et al. Adaptive Kalman filtering based on model parameter ratios. IEEE Trans. Autom. Control 69(9), 6230–6237. 10.1109/TAC.2024.3376306 (2024).
- 33. Ge, Q., Hu, X., Li, Y., He, H. & Song, Z. A novel adaptive Kalman filter based on credibility measure. IEEE/CAA J. Autom. Sin. 10(1), 103–120. 10.1109/JAS.2023.123012 (2023).
- 34. Zhe, Y., Hou, K., Niu, X. & Liang, W. Early warning technique research of surface subsidence for safe mining in underground goaf in Karst Plateau zone. Front. Earth Sci. 11, 1266649–1266663. 10.3389/feart.2023.1266649 (2023).
- 35. Wang, L., Guo, Q. & Yu, X. Stability-level evaluation of the construction site above the goaf based on combination weighting and cloud model. Sustainability 15(9), 7222–7238. 10.3390/su15097222 (2023).
- 36. Bai, D. et al. Prediction interval estimation of landslide displacement using bootstrap, variational mode decomposition, and long and short-term time-series network. Remote Sens. 14(22), 5808–5836. 10.3390/rs14225808 (2022).
- 37. Cleveland, W. S. Robust locally weighted regression and smoothing scatterplots. J. Am. Stat. Assoc. 74(368), 829–836. 10.1080/01621459.1979.10481038 (1979).
- 38. Duhamel, P. & Vetterli, M. Fast Fourier transforms: A tutorial review and a state of the art. Signal Process. 19(4), 259–299. 10.1016/0165-1684(90)90158-U (1990).
- 39. Wu, Z. & Huang, N. E. Ensemble empirical mode decomposition: A noise-assisted data analysis method. Adv. Adapt. Data Anal. 1(01), 1–41. 10.1142/S1793536909000047 (2009).
- 40. Yadav, D. K., Soumya, K. & Goswami, L. Autoregressive integrated moving average model for time series analysis. In 2024 International Conference on Optimization Computing and Wireless Communication (ICOCWC), 1–6 (2024). 10.1109/ICOCWC60930.2024.10470488
- 41. Soni, A. & Yadav, A. Seasonal autoregressive integrated moving average model for forecasting electrical load of Chhattisgarh state. In 2025 Fourth International Conference on Power, Control and Computing Technologies (ICPC2T), 46–51 (2025). 10.1109/ICPC2T63847.2025.10958744
- 42. Al-Qazzaz, R. A. & Yousif, S. A. High performance time series models using auto autoregressive integrated moving average. Indones. J. Electr. Eng. Comput. Sci. 27, 422–430. 10.11591/IJEECS.V27.I1.PP422-430 (2022).
- 43. Usman, A. G. et al. Optimized SVR with nature-inspired algorithms for environmental modelling of mycotoxins in food virtual-water samples. Sci. Rep. 15(1), 16569–16585. 10.1038/s41598-025-99908-7 (2025).
- 44. Hu, Y., Fu, Y. & Chen, Y. A lightweight LSTM-based open-set RF fingerprinting identification for edge deployment. Sci. Rep. 15(1), 41568–41577. 10.1038/s41598-025-25417-2 (2025).
- 45. Kruse, R. et al. Multi-layer perceptrons. In Computational Intelligence: A Methodological Introduction 53–124 (Springer International Publishing, 2022). 10.1007/978-3-030-42227-1_5.
- 46. Nemati, N., Meshgini, S., Rezaii, T. Y. & Afrouzian, R. Neonatal seizure detection from EEG using inception ResNetV2 feature extraction and XGBoost optimized with particle swarm optimization. Sci. Rep. 15(1), 41493–41514. 10.1038/s41598-025-25361-1 (2025).
- 47. Parkavi, R., Karthikeyan, P. & Sheik, A. A. Predicting academic performance of learners with the three domains of learning data using neuro-fuzzy model and machine learning algorithms. J. Eng. Res. 12(3), 397–411. 10.1016/j.jer.2023.09.006 (2024).
- 48. Wei, Z., Gao, Y., Zhang, X., Li, X. & Han, Z. Adaptive marine traffic behaviour pattern recognition based on multidimensional dynamic time warping and DBSCAN algorithm. Expert Syst. Appl. 238, 122229–122242. 10.1016/j.eswa.2023.122229 (2024).
- 49. Zhao, Y., Wei, X. & Zhao, K. Analysis of Z-score and total score of athleticism on drafted and undrafted players from the NFL scouting combine. Sci. Rep. 15(1), 21742–21751. 10.1038/s41598-025-07383-x (2025).