Scientific Reports. 2025 Apr 14;15:12804. doi: 10.1038/s41598-025-95498-6

Factor of safety prediction for slope stability using PCA and BPNN in Guangdong’s H mining area

Yangfan Jing 1,2, Yuefeng Li 3, Jian Chang 1,2, Zhenbiao Liu 1,2, Zhiwei Ni 1,2, Qian Wang 1,2, Difa Gao 1,2
PMCID: PMC11997092  PMID: 40229417

Abstract

Evaluating slope failure is a primary concern in geotechnical engineering, and employing advanced machine learning techniques to design the Factor of Safety (FOS) has become a critical focus. This study introduces a method that integrates Principal Component Analysis (PCA) with Back Propagation Neural Networks (BPNN) to predict the FOS. Compared to existing machine learning design approaches, the PCA-BPNN method demonstrates superior accuracy, achieving an R² of 0.917, RMSE of 0.061, and MAE of 0.047 for the training set, and an R² of 0.879, RMSE of 0.071, and MAE of 0.057 for the testing set. This method is applied to assess the slope stability of the H mining area in Guangdong, China, resulting in a designed FOS of 1.409, which meets practical engineering requirements. The findings highlight the effectiveness of the PCA-BPNN method in enhancing slope stability assessments in geotechnical applications.

Keywords: Factor of safety, Principal component analysis, Back propagation neural networks, Engineering application

Subject terms: Civil engineering, Computer science

Introduction

Slope stability [1] assessment is a critical aspect of safety design in open-pit mining operations. The FOS [2–4] serves as an essential indicator of slope stability, directly reflecting the safety status of slopes. Consequently, it finds extensive application in geotechnical fields such as mining, geology, and civil engineering. Accurately determining the safety factor of slopes has become a central issue in related research.

Existing methods for calculating the FOS can be broadly classified into traditional methods, numerical methods, and machine learning methods. Traditional methods, such as the Mohr–Coulomb theory [5], the Swedish circle method [6], and the simplified Janbu method [7], are widely used because they are computationally efficient. However, they are less accurate and difficult to apply to complex slope conditions, which significantly limits their practical use. In contrast, numerical methods, such as the Finite Element Method (FEM) [8–10] and the Discrete Element Method (DEM) [11–14], offer higher precision. FEM divides the slope soil into multiple finite elements and computes the stress and strain for each element, allowing detailed simulation of nonlinear soil behavior, geometric configurations, and loading conditions. DEM, on the other hand, is specifically designed to simulate the behavior of soil-rock mixtures, particularly in analyzing sliding phenomena, and can handle complex behaviors such as particle flow, sliding, and collisions. While these numerical methods offer superior accuracy for complex terrains and demanding engineering projects, they are computationally intensive, cumbersome, and typically reliant on specialized software, which limits their efficiency and operability in practical engineering applications.

With the advancement of computational technology, machine learning techniques [15–21] have increasingly been applied to address slope stability issues. Models such as artificial neural networks (ANN) [22], multiple linear regression (MLR) [23], and support vector machines (SVM) [24–26] have shown promise in predicting the FOS of slopes. Additionally, heuristic algorithms have been integrated with machine learning for further research, including particle swarm optimization [27], firefly algorithms [28], artificial bee colony algorithms [29], fruit fly algorithms [30], and Monte Carlo methods [31]. However, many of these studies fail to adequately consider the multidimensional data structures inherent in slope stability analysis. Given the complexity and nonlinearity of slope problems, the curse of dimensionality may adversely affect research outcomes. Furthermore, existing studies predominantly focus on the predictive performance of datasets, with limited application in real engineering projects.

To address the limitations of existing methods, the development of a FOS prediction model capable of rapidly and accurately assessing slope stability under complex engineering conditions, especially in intricate geological environments, has become a critical task. This model aims to provide real-time support for practical engineering applications. In this study, PCA is introduced as a key research tool. PCA reduces the dimensionality of the original data by mapping it from a high-dimensional space to a lower-dimensional space. The core idea of PCA is to identify a new set of variables, known as principal components, which are linear combinations of the original variables and are ordered by their variance in descending order. By selecting the top few principal components with higher variance contribution, the dimensionality of the data can be significantly reduced while retaining the majority of the original information. This approach simplifies data complexity, eliminates multicollinearity issues between features, and enhances the accuracy and reliability of the data. Additionally, an adaptive BPNN is employed to predict the FOS, leading to the development of a PCA-BP neural network-based prediction model for FOS. This model is further applied to assess slope stability at the H mining area in Guangdong, China, for verification and practical application in engineering. Compared to traditional methods, numerical methods, and other machine learning approaches, the proposed method offers four key advantages, as outlined below:

  1. The dimensionality of the data is effectively reduced through PCA, thereby simplifying the data structure. This dimensionality reduction not only minimizes redundant information but also addresses issues of excessive computational complexity and feature redundancy that may arise in traditional methods and other machine learning algorithms when dealing with high-dimensional data. By mapping the original data to a lower-dimensional space that retains the most important information, PCA enables subsequent neural network models to be trained on fewer features, thereby improving training efficiency and reducing the risk of overfitting.

  2. Improvement of prediction accuracy and model stability: The PCA-BPNN method combines the dimensionality reduction capability of PCA with the nonlinear mapping ability of BPNN, enabling the model to better capture the nonlinear features within the data. Compared to traditional computational methods, PCA-BPNN is more adept at addressing complex slope stability issues, demonstrating significant accuracy advantages in the context of multivariate and nonlinear relationships. PCA helps eliminate correlations between features, enhancing the stability of the model, while BPNN further improves the model’s adaptability to complex patterns, ensuring highly accurate prediction results.

  3. Reduction of computational complexity and improvement of efficiency: Compared to traditional numerical methods, such as the FEM and DEM, the PCA-BPNN method significantly reduces computational complexity and minimizes the data processing workload. Traditional numerical methods require detailed calculations for each element and are relatively complex when handling high-dimensional data, leading to substantial computational resource consumption. In contrast, PCA-BPNN reduces the computational load through dimensionality reduction, while the efficient training mechanism of the neural network further accelerates the prediction of slope stability in practical applications, demonstrating higher computational efficiency.

  4. Enhanced generalization capability and applicability: The PCA-BPNN method demonstrates excellent generalization ability and adaptability, effectively addressing stability assessments under varying slope conditions. In traditional methods, modeling complex geological conditions often requires numerous assumptions and manual adjustments, while numerical methods are constrained by computational complexity and software dependencies. In contrast, PCA-BPNN automatically adapts to different input data, and through the dimensionality reduction of PCA and the nonlinear learning capabilities of BPNN, it exhibits greater flexibility and adaptability. When applied to a wide range of slope conditions, this method consistently provides stable prediction results, offering significant value for engineering applications.

Methodology

Principle

Principal component analysis (PCA)

PCA [32–35] is a widely used dimensionality reduction technique that aims to transform high-dimensional data into a lower-dimensional representation through linear transformation, while retaining as much variance as possible from the original dataset. PCA helps simplify data structures, reduce redundant information, and enhance computational efficiency, facilitating data visualization and feature selection. The mathematical principles underlying PCA are as follows.

Let there be n samples, with each sample described by p variables, represented as x. The original data matrix X is defined as shown in Eq. (1).

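$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix} \tag{1}$$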

At this point, the matrix X can be expressed as p vectors, as shown in Eq. (2).

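$$X = \left( (x_{11}, x_{21}, \ldots, x_{n1})^{\mathsf{T}},\ (x_{12}, x_{22}, \ldots, x_{n2})^{\mathsf{T}},\ \ldots,\ (x_{1p}, x_{2p}, \ldots, x_{np})^{\mathsf{T}} \right) \tag{2}$$

Equation (2) can be simplified, as demonstrated in Eq. (3).

$$X = (X_1, X_2, \ldots, X_p) \tag{3}$$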

After expressing the matrix linearly, the covariance matrix S of the sample data is calculated, as shown in Eq. (4).

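$$S = \frac{1}{n-1} \sum_{k=1}^{n} (x_k - \bar{x})(x_k - \bar{x})^{\mathsf{T}} \tag{4}$$

where $x_k$ is the k-th sample (the k-th row of X written as a column vector) and $\bar{x}$ is the mean of the n samples.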

The eigenvalues λ1, λ2, …, λp and the corresponding orthonormal eigenvectors of the covariance matrix S are obtained. Consequently, the i-th principal component Fi of X is expressed as shown in Eq. (5).

$$F_i = X a_i, \quad i = 1, 2, \ldots, p \tag{5}$$

where $a_i$ represents the i-th orthonormal eigenvector.

Ultimately, p principal components are calculated, and those components with a cumulative variance contribution rate exceeding 99% are selected for analysis.
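The procedure in Eqs. (1) to (5) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `pca_select`, the synthetic input, and the decision to centre the data before projection are assumptions made for the example.

```python
import numpy as np

def pca_select(X, threshold=0.99):
    """Keep the fewest principal components whose cumulative variance
    contribution reaches `threshold` (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                   # centre each variable (assumption)
    S = np.cov(Xc, rowvar=False)              # p x p sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # reorder by descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratios = eigvals / eigvals.sum()          # variance contribution rates
    k = int(np.searchsorted(np.cumsum(ratios), threshold)) + 1
    k = min(k, len(ratios))                   # guard against rounding at the top end
    A = eigvecs[:, :k]                        # columns are the eigenvectors a_1 .. a_k
    return Xc @ A, A, ratios                  # scores F_1 .. F_k, loadings, rates

# Synthetic stand-in for a 132-sample, 6-variable dataset (not the paper's data)
rng = np.random.default_rng(0)
latent = rng.normal(size=(132, 2))
X_demo = latent @ rng.normal(size=(2, 6)) + 0.01 * rng.normal(size=(132, 6))
scores, A, ratios = pca_select(X_demo)
print(scores.shape, np.round(ratios, 4))
```

`np.linalg.eigh` is used rather than `np.linalg.eig` because the covariance matrix is symmetric, which guarantees real eigenvalues and orthonormal eigenvectors.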

Back propagation neural networks (BPNN)

The BPNN [36] is a commonly used method for addressing complex nonlinear problems. In this study, the BPNN input layer consists of six neurons, each corresponding to a different predictive indicator. The configuration of the hidden layers follows references [37,38], and two hidden layers are used. After several iterations of tuning, the numbers of nodes in the first and second hidden layers are set to 10 and 1, respectively. The output layer contains a single neuron representing the safety performance indicator of the slope (the FOS), as illustrated in Fig. 1.

Fig. 1. Structure of the PCA-BP Neural Network for Predicting FOS.

The BP neural network model developed in this study sets the maximum number of training iterations to 1000, with a learning rate of 0.01 and an error threshold of 1 × 10⁻⁶. The training function is the Levenberg–Marquardt algorithm.

Integration of PCA and BPNN

  1. Data Standardization: The data is standardized so that all features are on the same scale. In this step, Min-Max normalization is applied, as shown in Eq. (6).
    $$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{6}$$
    where $x$ represents the original data, $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the data, respectively, and $x'$ is the normalized data.
  2. Covariance Matrix Calculation: The covariance of each feature in the dataset is calculated to analyze the relationships between these features.

  3. Eigenvalue and Eigenvector Decomposition: Eigenvalue decomposition of the covariance matrix yields the eigenvalues (indicating the variance explained by each principal component) and eigenvectors (representing the direction of each principal component, i.e., the weights of the original variables).

  4. Principal Component Selection: The top principal components are selected based on the magnitude of their eigenvalues, typically choosing enough components to achieve a cumulative explained variance of 99% to retain as much data information as possible.

  5. Data Transformation: The original data is linearly transformed using the selected principal components, resulting in a new low-dimensional data representation, where these components are linear combinations of the original features.

  6. BPNN Data Definition: The generated low-dimensional data is used as the input layer for the BP neural network model.

  7. Data Preparation: The low-dimensional data is collected as training data for the BP neural network model.

  8. Network Structure Design: The number of hidden layers and the number of nodes in the input and hidden layers are determined. The activation function used in the hidden layers is the Sigmoid function.

  9. Weight Initialization: The network’s weights and biases are randomly initialized, typically using Xavier or He initialization to prevent gradient vanishing or explosion.

  10. Forward Propagation: The input data is passed layer by layer through the network, calculating the weighted sum at each node, adding the bias, and applying the activation function. The output layer produces a continuous value as the regression prediction result.

  11. Loss Calculation: A loss function is used to compute the error between predicted and actual values, with the Mean Squared Error (MSE) reflecting the magnitude of the model’s prediction error. A smaller loss value indicates that the model’s predictions are closer to the actual values.

  12. Backward Propagation: The gradient of the loss function with respect to each weight and bias in the network is calculated. Using these gradients, the network updates the weights and biases through gradient descent, gradually reducing the loss function’s value. This process computes errors layer by layer using the chain rule and updates parameters accordingly.

  13. Iterative Training: The forward and backward propagation processes are repeated, with each iteration updating the model parameters. Appropriate settings for the number of epochs and learning rate are established to avoid overfitting or underfitting.

  14. Model Evaluation: The final results are output, and the regression performance of the model is evaluated.
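The normalization, initialization, forward propagation, loss, and backward propagation steps listed above can be sketched as follows. This is an illustrative sketch under stated assumptions: the network uses the 10-and-1 hidden layer layout with three inputs (standing in for F1 to F3), the data and target are synthetic, and plain gradient descent is used for the update instead of the Levenberg–Marquardt algorithm named in the text. All function names are hypothetical.

```python
import numpy as np

def min_max(X):
    """Scale each column to [0, 1] (Min-Max normalization)."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.where(hi == lo, 1.0, hi - lo)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(sizes, rng):
    """Xavier (Glorot) uniform weights and zero biases for each layer."""
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        limit = np.sqrt(6.0 / (n_in + n_out))
        params.append([rng.uniform(-limit, limit, (n_in, n_out)), np.zeros(n_out)])
    return params

def forward(params, X):
    """Return the activations of every layer; sigmoid hidden layers, linear output."""
    acts = [X]
    for i, (W, b) in enumerate(params):
        z = acts[-1] @ W + b
        acts.append(z if i == len(params) - 1 else sigmoid(z))
    return acts

def train_step(params, X, y, lr):
    """One forward pass, MSE loss, chain-rule gradients, gradient-descent update."""
    acts = forward(params, X)
    err = acts[-1] - y
    loss = float(np.mean(err ** 2))
    delta = 2.0 * err / len(X)                 # gradient of the MSE w.r.t. the output
    for i in reversed(range(len(params))):
        W, b = params[i]
        grad_W = acts[i].T @ delta
        grad_b = delta.sum(axis=0)
        if i > 0:                              # back through the sigmoid that produced acts[i]
            delta = (delta @ W.T) * acts[i] * (1.0 - acts[i])
        W -= lr * grad_W                       # in-place update of the stored arrays
        b -= lr * grad_b
    return loss

# Synthetic stand-in data: three inputs and a made-up smooth target
rng = np.random.default_rng(0)
F = min_max(rng.normal(size=(132, 3)))
y = 0.9 + 0.5 * F[:, :1] - 0.3 * F[:, 1:2]
params = init_params([3, 10, 1, 1], rng)       # input, two hidden layers, output
losses = [train_step(params, F, y, lr=0.01) for _ in range(1000)]
print(losses[0], losses[-1])
```

Gradient descent is chosen here only because it fits in a few lines; a Levenberg–Marquardt update would replace the two `-= lr * grad` lines with a damped Gauss-Newton step.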

Data preparation

Studies have shown that the main factors influencing slope stability include the slope’s bulk density, cohesion, friction angle, slope angle, slope height, and pore water pressure ratio, as illustrated in Fig. 2. By reviewing relevant literature [39–44], this study has established a slope FOS dataset encompassing various geological conditions. The dataset consists of 132 samples, covering multiple input features under different slope conditions. Some sample data are shown in Table 1, and the feature characteristics of the dataset are listed in Table 2. The input data include physical parameters of the slope, such as unit weight, cohesion, and friction angle, which, as seen in Fig. 2, represent various factors of the slope. The output data correspond to the slope’s FOS. The diversity and representativeness of the dataset ensure the model’s generalization capability, contributing to improved accuracy and reliability in slope stability assessments.

Fig. 2. Typical diagram of slope.

Table 1. Data set of FOS.

| Bulk density (kN/m³) | Cohesion (MPa) | Friction angle (°) | Slope angle (°) | Slope height (m) | Pore water pressure ratio | FOS |
|---|---|---|---|---|---|---|
| 25.00 | 46.00 | 35.00 | 47.00 | 443.00 | 0.25 | 1.28 |
| 27.00 | 40.00 | 35.00 | 47.10 | 292.00 | 0.25 | 1.15 |
| 27.00 | 50.00 | 40.00 | 42.00 | 407.00 | 0.25 | 1.44 |
| 27.00 | 32.00 | 33.00 | 42.20 | 289.00 | 0.25 | 1.30 |
| 21.00 | 20.00 | 24.00 | 21.00 | 565.00 | 0.00 | 1.26 |
| 22.00 | 21.00 | 23.00 | 30.00 | 257.00 | 0.00 | 1.10 |
| 20.00 | 20.00 | 36.00 | 45.00 | 50.00 | 0.20 | 0.96 |
| 22.40 | 10.00 | 35.00 | 45.00 | 10.00 | 0.40 | 0.90 |
| 27.30 | 10.00 | 39.00 | 40.00 | 470.00 | 0.15 | 1.42 |
| 31.30 | 68.00 | 37.00 | 47.00 | 360.50 | 0.25 | 1.20 |

Table 2. Statistical characteristics of FOS data set.

| Factors | Ave | Min | Max | Std | Qua |
|---|---|---|---|---|---|
| Bulk density (kN/m³) | 24.66 | 15.00 | 50.00 | 5.10 | 20.90 |
| Cohesion (MPa) | 37.71 | 10.00 | 100.00 | 41.21 | 15.74 |
| Friction angle (°) | 30.15 | 10.00 | 50.00 | 10.40 | 23.75 |
| Slope angle (°) | 39.69 | 18.00 | 60.00 | 11.39 | 30.00 |
| Slope height (m) | 194.11 | 6.00 | 565.00 | 173.87 | 48.75 |
| Pore water pressure ratio | 0.22 | 0.00 | 1.00 | 0.14 | 0.15 |

The dataset established in this study is a typical multi-indicator dataset. To assess the data correlations among the predictive indicators, a correlation analysis is conducted, with the results presented in Fig. 3.

Fig. 3. Diagram of correlation.

Figure 3 illustrates that there is a certain degree of correlation among the various indicators, which may negatively impact the effectiveness of data modeling. High correlation among indicators can lead to redundant information, thereby affecting the stability and accuracy of the model. To effectively reduce the interference of this correlation on the model, we employ PCA to process the dataset. PCA projects the original data into a new feature space through linear transformation, extracting the principal components that capture the most significant variance in the data. This approach enables us to reduce data dimensionality, enhancing modeling efficiency and effectiveness while ensuring the accuracy and reliability of the model during predictions.
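A correlation analysis of this kind can be sketched with a Pearson correlation matrix. The example below uses only the ten sample rows printed in Table 1, not the full 132-sample dataset, so its values should not be expected to match Fig. 3; the variable names are illustrative.

```python
import numpy as np

# Ten sample rows from Table 1; columns are X1..X6 (FOS column omitted)
rows = np.array([
    [25.00, 46.00, 35.00, 47.00, 443.00, 0.25],
    [27.00, 40.00, 35.00, 47.10, 292.00, 0.25],
    [27.00, 50.00, 40.00, 42.00, 407.00, 0.25],
    [27.00, 32.00, 33.00, 42.20, 289.00, 0.25],
    [21.00, 20.00, 24.00, 21.00, 565.00, 0.00],
    [22.00, 21.00, 23.00, 30.00, 257.00, 0.00],
    [20.00, 20.00, 36.00, 45.00, 50.00, 0.20],
    [22.40, 10.00, 35.00, 45.00, 10.00, 0.40],
    [27.30, 10.00, 39.00, 40.00, 470.00, 0.15],
    [31.30, 68.00, 37.00, 47.00, 360.50, 0.25],
])
names = ["bulk density", "cohesion", "friction angle",
         "slope angle", "slope height", "pressure ratio"]

corr = np.corrcoef(rows, rowvar=False)   # 6 x 6 Pearson correlation matrix

# List the off-diagonal pairs from most to least correlated (by absolute value)
pairs = [(abs(corr[i, j]), names[i], names[j])
         for i in range(6) for j in range(i + 1, 6)]
for strength, a, b in sorted(pairs, reverse=True):
    print(f"{a} vs {b}: |r| = {strength:.2f}")
```

Pairs with large |r| carry overlapping information, which is the redundancy that PCA is used to remove.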

The slope bulk density, cohesion, friction angle, slope angle, slope height, and pore water pressure ratio are denoted as X1 to X6, respectively. Upon calculation, six principal components are obtained, denoted as F1 to F6, with the corresponding calculation formulas provided in Eqs. (7) to (12). The contribution rates of the principal components F1 to F6 are 92.20%, 6.24%, 0.82%, 0.55%, 0.16%, and 0.03%, respectively. It can be observed that the cumulative contribution rate of F1 to F3 is 99.26%, which sufficiently represents all the data features. Therefore, F1 to F3 are selected as the representative principal components for the predictive analysis, and the resulting dataset is shown in Fig. 4.

$$F_i = a_{i1} X_1 + a_{i2} X_2 + \cdots + a_{i6} X_6, \quad i = 1, 2, \ldots, 6 \tag{7–12}$$

(The numerical coefficients $a_{ij}$ are given only in the original equation images.)

Fig. 4. Principal components dataset.
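The selection of F1 to F3 follows directly from the reported contribution rates. The short check below reproduces that arithmetic; the function name `components_needed` is illustrative.

```python
rates = [92.20, 6.24, 0.82, 0.55, 0.16, 0.03]  # contribution rates of F1..F6 (%)

def components_needed(rates, threshold=99.0):
    """Return the number of leading components whose cumulative
    contribution first reaches `threshold`, and that cumulative value."""
    total = 0.0
    for k, rate in enumerate(rates, start=1):
        total += rate
        if total >= threshold:
            return k, total
    return len(rates), total

k, cumulative = components_needed(rates)
print(k, round(cumulative, 2))  # 3 99.26
```

Two components reach only 98.44%, which falls short of the 99% criterion, so the third component is required.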

Results

The slope FOS dataset collected in this study is divided into a training set and a testing set in an 8:2 ratio. The number of training iterations (epochs) for the BPNN is set to 1000. The network architecture consists of two hidden layers, with 10 and 1 node(s) in each layer, respectively. The Sigmoid function is used as the activation function for the hidden layers of the BPNN. The learning rate is set to 0.01, with an error threshold of 1e−6, and the Levenberg–Marquardt algorithm is employed as the training function.

Both BPNN and PCA-BPNN models are established to conduct predictive research on slope FOS. The performance of the models is evaluated based on the mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²). The calculation formulas for each metric are provided in Eqs. (13) to (15).

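$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{13}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \tag{14}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2} \tag{15}$$

where $y_i$ is the actual FOS of the i-th sample, $\hat{y}_i$ is the predicted FOS, $\bar{y}$ is the mean of the actual values, and n is the number of samples.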
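The three metrics can be computed with standard-library Python alone. The sketch below uses the usual definitions of MAE, RMSE, and R²; the function names and the toy values are illustrative and are not taken from the study's data.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy values chosen so the results are easy to verify by hand
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 4.0]
print(mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```

For the toy values, the single error of 1.0 gives MAE = 1/3, RMSE = sqrt(1/3), and R² = 1 − 1/2 = 0.5.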

Figure 5 presents the prediction results of the slope safety factors using the BPNN. Specifically, Fig. 5a displays the predictions for the training set, while Fig. 5b shows the predictions for the testing set. The calculated R² for the training set is 0.782, with an RMSE of 0.099 and an MAE of 0.064. For the testing set, the R² is 0.545, with an RMSE of 0.133 and an MAE of 0.133.

Fig. 5. The BPNN model results ((a) is the prediction result of the training set, (b) is the prediction result of the test set).

Figure 6 illustrates the prediction results of the slope FOS using the PCA-BPNN. Specifically, Fig. 6a presents the predictions for the training set, while Fig. 6b displays the predictions for the testing set. The calculated R² for the training set is 0.917, with an RMSE of 0.061 and an MAE of 0.047. For the testing set, the R² is 0.879, with an RMSE of 0.071 and an MAE of 0.057.

Fig. 6. The PCA-BPNN model results ((a) is the prediction result of the training set, (b) is the prediction result of the test set).

Engineering application

The prediction results from the dataset indicate that the PCA-BPNN model developed in this study demonstrates high predictive accuracy. To assess the model’s practicality, it is applied to the evaluation of slope stability in Area I of the H mining site in Guangdong, China.

The mining area is situated in a western hilly region. The maximum elevation within the mining area reaches +436.0 m in the northeastern section, while the minimum elevation is +51.0 m at the base of a gully located directly north of the area. This results in a relative elevation difference of 385.0 m. The surface slopes range from 25° to 40°, with the area predominantly covered by Quaternary residual deposits.

The mining area experiences a subtropical monsoon climate characterized by intense solar radiation, abundant heat, ample sunshine, and significant rainfall. The climate is strongly seasonal, with long summers and short winters, humid and rainy springs and summers, and cold winters with occasional frost. The average annual temperature is 21.2 °C, with a mean temperature of 12 °C in January and 28.7 °C in July. The extreme minimum temperature reaches −1.0 °C, while the extreme maximum temperature can be as high as 38.3 °C. Annual precipitation averages 1648.5 mm, primarily occurring from April to September, with a maximum daily rainfall of 223 mm and an hourly maximum of 91.3 mm. The area experiences up to 118 days of thunderstorms per year. Annual evaporation exceeds 1300 mm, and the average annual sunshine duration is 1747.5 h. The maximum instantaneous wind speed is recorded at 38.1 m/s, and the frost-free period spans 310 to 345 days.

The mining area and its surrounding 300-m radius primarily consist of forest land, with no other mining or exploration rights present. According to the land use status map, the land use within the mining area is classified as general agricultural land and forest land, with no designated basic farmland, water source protection areas, nature reserves, or ecological protection zones. The location of the H mining area is illustrated in Fig. 7, while the specific location of Zone I is shown in Fig. 8.

Fig. 7. Location of mine (Map data ©2025 Google).

Fig. 8. Slope zoning and section line location map.

The slope parameters for Zone I of the H mining area, including bulk density, cohesion, friction angle, slope angle, slope height, and pore water pressure ratio, are presented in Table 3.


Table 3. FOS design of Zone I of the H mining area.

| Bulk density (kN/m³) | Cohesion (MPa) | Friction angle (°) | Slope angle (°) | Slope height (m) | Pore water pressure ratio | F1 | F2 | F3 | FOS (PCA-BPNN) |
|---|---|---|---|---|---|---|---|---|---|
| 18.9 | 22 | 27.9 | 43.9 | 60 | 0.45 | 62.88 | 26.25 | 39.86 | 1.409 |

The principal component indicators obtained are used as the input layer of the BPNN. At each node, the inputs are multiplied by the corresponding weights, the weighted sum is computed, a bias is added, and the result is passed through the activation function to the next layer. This process is repeated at each layer until the information reaches the output layer, which produces the predicted value. The error between the predicted and actual values is then quantified by the loss function. Next, through the backpropagation algorithm, the error is propagated backward from the output layer, and the gradients of the loss function with respect to each weight and bias are calculated. The Levenberg–Marquardt algorithm is employed to adjust the weights and biases to minimize the error, and training stops when the error threshold is reached. After training, the network performs forward propagation on new input data to generate a prediction. The calculated FOS for the slope in Zone I of the H mining area is 1.409, which exceeds the minimum allowable FOS of 1.25 specified in the regulations. This indicates that the FOS designed using the PCA-BPNN method meets the engineering requirements.

Discussion

Results analysis

In this study, the BPNN serves as a comparative model for predicting the FOS, with its fitting performance illustrated in Fig. 9. The results indicate that the prediction accuracy of the BPNN is relatively poor. In contrast, Fig. 10 presents the fitting curve of the PCA-BPNN model, revealing a significant improvement in prediction accuracy after applying PCA. The R² values for both the training and testing sets increase by at least 0.13 (from 0.782 to 0.917 for the training set and from 0.545 to 0.879 for the testing set). This enhancement can be attributed to the presence of high-dimensional data in the constructed database, where certain features may not contribute to model training and could even introduce noise. PCA performs a linear transformation, converting the original data features into a reduced set of principal components that retain the primary information while eliminating irrelevant or redundant features. Furthermore, PCA decreases the data dimensionality and reduces computational complexity, enabling the model to identify optimal solutions more quickly with fewer features, thus enhancing both training speed and accuracy. Importantly, the principal components are linear combinations of the original data features; these new features often reveal the data’s structure and primary directions of variation more clearly, making it easier for the model to capture underlying patterns within the data.

Fig. 9. BPNN model fitting results ((a) is the fitting result of the training set, (b) is the fitting result of the test set).

Fig. 10. PCA-BPNN model fitting results ((a) is the fitting result of the training set, (b) is the fitting result of the test set).

Comparison with traditional methods

In this study, the PCA-BPNN method is employed to predict the FOS of slopes. Although this method is faster and more efficient than traditional approaches, conventional methods remain widely used in engineering practice. To compare the two methodologies, the Morgenstern-Price method and the Bishop method are applied to design the FOS for the slopes in Zone I of the H mine. The FOS values obtained from the Morgenstern-Price and Bishop methods are 1.376 and 1.382, respectively. The FOS predicted by the PCA-BPNN method (1.409) is about 2% higher than both, so the traditional methods give the slightly more conservative assessment, while all three values exceed the minimum allowable FOS of 1.25.
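The size of the gap between the three FOS values can be checked with a few lines of arithmetic. The figures are those reported in the text; the variable and function names are illustrative.

```python
fos_pca_bpnn = 1.409
fos_reference = {"Morgenstern-Price": 1.376, "Bishop": 1.382}
fos_min_allowed = 1.25  # minimum allowable FOS quoted in the engineering application

def relative_difference(predicted, reference):
    """Percentage by which `predicted` exceeds `reference`."""
    return 100.0 * (predicted - reference) / reference

diffs = {name: relative_difference(fos_pca_bpnn, value)
         for name, value in fos_reference.items()}
for name, diff in diffs.items():
    print(f"PCA-BPNN vs {name}: {diff:+.2f}%")

# Every estimate, including the two traditional ones, clears the minimum
all_pass = all(v >= fos_min_allowed
               for v in [fos_pca_bpnn, *fos_reference.values()])
print(all_pass)
```

The PCA-BPNN value lies roughly 2.4% above the Morgenstern-Price result and 2.0% above the Bishop result.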

Comparison with other ML methods

Previous studies have considered various indicators and different outcomes of slope stability to conduct related research, primarily focusing on directly assessing slope stability and using FOS for evaluation, as detailed in Table 4. Currently, the application of BPNN in slope stability studies is relatively scarce. The evaluation of slope stability represents a complex system influenced by multiple factors, leading general machine learning assessment methods to employ multiple indicators in their analyses (see Table 4). However, the linear relationships among these indicators may impact the final results, potentially contributing to the lower reliability of slope stability assessments. As illustrated in Figs. 5 and 6, addressing the correlations among the indicators significantly enhances the model’s performance.

Table 4. Research that applied ML to slope stability.

| References | ML method | Slope conditions | Result | Evaluation index |
|---|---|---|---|---|
| Lin et al. [45] | BR, LR, EN, KNN, SVM, RF, ABM, GBM, ET, DT | Unit weight, cohesion, internal friction angle, slope angle, slope height, and pore water pressure ratio | FOS | R², MAE, MSE |
| Azmoon et al. [46] | DNN, CLEM | Geometry, soil properties | FOS | Accuracy, EL |
| Nanehkaran et al. [47] | MLP, SVM, KNN, DT, RF | Slope height, total slope angle, dry density, cohesion and internal friction angle | FOS | MSE, MAE, RMSE |
| Gupta et al. [48] | DNN, EAB | Upper clay, lower clay, peat, angle of internal friction, embankment | FOS | CC, NSE, RMSE, MAE, SI |
| Zhang et al. [49] | MDMSE | Unit weight, cohesion, internal friction angle, slope angle, slope height, and pore pressure ratio | Slope stability | Accuracy, F1-score |
| Mahmoodzadeh et al. [50] | GPR, SVM, DT, LSTM, DNN, KNN | Unit weight, cohesion, friction angle, slope angle, slope height, and pore pressure ratio | FOS | R², RMSE, MAE, MAPE |
| Goswami and Chakraborty [51] | MLR, MNLR, ANN | Slope angle, height of the slope, ratio of slope height to layer thickness, cohesion of layer 1, cohesion of layer 2, average angle of internal friction, and average unit weight of the soil | FOS | R, MSE, MAE |
| Liu et al. [52] | OPF-KNN | Unit weight, cohesion, the internal friction angle, the slope angle, the slope height, and the pore pressure ratio | FOS | Accuracy, F1-score, AUC |
| Nanehkaran et al. [53] | MLP, DT, SVM, RF | Slope height, slope angle, slope topography, water level in slope, layers number, tensile crack depth, sliding surfaces depth | FOS | R² |
| Yang et al. [54] | SVM, RF, KNN, DT, GB | Rock bulk density, cohesion, internal friction angle, slope angle, slope height, and pore water pressure | FOS | AUC |
| Yang [55] | ACE-QPSO, LSSVM | Unit weight, slope angle, height, internal cohesion, internal friction angle and pore water pressure | FOS | R², MAE, MSE |
| Zhang et al. [56] | RF, XGBoost | Elevation of front edge, elevation of back edge, slope height, slope angle, lithological property, inclination angle, dip direction, structure type, plane morphology, profile shape, landslide volume, influence degree of human activities | Slope stability | Recall rate, precision, accuracy |

EL: Euclidean loss, CC: correlation coefficient, NSE: Nash–Sutcliffe model efficiency coefficient, SI: scattering index, AUC: area under the curve, BR: Bayesian ridge, LR: linear regression, EN: elastic net, RF: random forest, ABM: adaptive boosting machine, GBM: gradient boosting machine, ET: extra trees, DT: decision trees, CLEM: conventional limit equilibrium methods, EAB: ensemble of ANN with bagging, MDMSE: margin distance minimization selective ensemble method, GPR: Gaussian process regression, MLP: multilayer perceptron, MLR: multiple linear regression, MNLR: multiple nonlinear regression, OPF-KNN: optimum-path forest algorithm based on KNN, GB: gradient boosting, ACE-QPSO: adaptive CE factor quantum behaved particle swarm optimization, LSSVM: least-square support vector machine, XGBoost: eXtreme Gradient Boosting.

Due to variations in datasets selected across studies, direct comparison based solely on evaluation metrics has limited significance. Based on the dataset used in this study, common models for slope stability evaluation (KNN, SVM, MLP, DNN, and LSTM) are selected for modeling and comparison, with results shown in Fig. 11. It is evident that the PCA-BPNN model performs best overall on both the training and test sets, exhibiting superior R², MAE, and RMSE values, as well as strong generalization capability. While BPNN also performs well on the training set, it shows poor performance on the test set, indicating significant overfitting. Although LSTM demonstrates relatively larger errors on the training set, it maintains a high R² on the test set, reflecting reasonable generalization capability, though its errors (MAE and RMSE) require further optimization. The KNN and MLP models show unsatisfactory performance on the test set, with lower R² values and higher errors, suggesting weaker generalization capability.

Fig. 11. Prediction by seven ML models ((a) is the prediction result of the training set, (b) is the prediction result of the test set).

Sensitivity analysis

To evaluate the impact of the input parameters (such as slope angle, friction angle, and pore water pressure ratio) on the model’s prediction results, a sensitivity analysis is conducted on the six parameters used for modeling. Specifically, the PCA-BPNN-based slope FOS prediction method developed in this study is run with six reduced parameter systems, each containing five of the six parameters, and with the full six-parameter system, to analyze the effect of each indicator on the model. The experimental plan and results are presented in Table 5.

Table 5.

Results of sensitivity analysis.

Factors                | F1    | F2    | F3    | FOS
X2, X3, X4, X5, X6     | 71.34 | 33.64 | 96.08 | 1.390
X1, X3, X4, X5, X6     | 23.00 | 78.04 | 56.65 | 1.010
X1, X2, X4, X5, X6     | 92.21 | 14.36 | 67.17 | 1.152
X1, X2, X3, X5, X6     | 49.32 | 83.35 | 31.07 | 1.270
X1, X2, X3, X4, X6     | 37.63 | 84.75 | 59.27 | 1.306
X1, X2, X3, X4, X5     | 5.89  | 88.02 | 62.31 | 1.365
X1, X2, X3, X4, X5, X6 | 62.88 | 26.25 | 39.86 | 1.409

From Table 5, it can be seen that when the method proposed in this study is used for slope stability evaluation, each factor directly affects the FOS value, which in turn affects the judgment of slope stability. The importance ranking of these factors is as follows: cohesion, friction angle, slope angle, slope height, pore water pressure ratio, and bulk density.
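One reading of Table 5 that reproduces this ranking is to order the parameters by how far the predicted FOS moves from the full-model value (1.409) when each parameter is left out. The sketch below does that. Two assumptions are made here and are not stated in this section: the absolute-deviation criterion itself, and the mapping of X1 to X6 onto parameter names, which follows the order in which the parameters are listed in the Conclusions.

```python
# FOS predicted when one parameter is omitted (values taken from Table 5).
fos_without = {
    "X1": 1.390, "X2": 1.010, "X3": 1.152,
    "X4": 1.270, "X5": 1.306, "X6": 1.365,
}
fos_full = 1.409  # all six parameters

# Assumed mapping of symbols to parameter names (order taken from the Conclusions).
names = {
    "X1": "bulk density", "X2": "cohesion", "X3": "friction angle",
    "X4": "slope angle", "X5": "slope height", "X6": "pore water pressure ratio",
}

def rank_by_deviation(fos_without, fos_full):
    """Sort parameters by |FOS_full - FOS_without|, largest deviation first."""
    deviation = {k: abs(fos_full - v) for k, v in fos_without.items()}
    return sorted(deviation, key=deviation.get, reverse=True)

ranking = rank_by_deviation(fos_without, fos_full)
for rank, key in enumerate(ranking, start=1):
    print(rank, key, names[key], round(abs(fos_full - fos_without[key]), 3))
```

Under these assumptions the ordering is X2, X3, X4, X5, X6, X1, which matches the ranking given in the text.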

Based on the parameter analysis results, the following conclusions can be drawn:

  1. Cohesion is the adhesive force between soil or rock particles, and it has the most significant impact on slope stability. It helps resist sliding or failure, especially on steeper slopes. Soils with higher cohesion maintain greater stability, particularly when external loads are minimal or absent. Cohesion ranks first in importance.

  2. The friction angle is an important parameter that measures the friction between soil or rock particles, directly affecting the shear strength of slope materials. A higher friction angle typically indicates stronger resistance to sliding and can effectively prevent landslides. Therefore, the friction angle’s influence ranks second, after cohesion.

  3. The slope angle directly determines the effect of gravity on the slope. As the slope angle increases, the tendency for sliding along the slope also increases. Thus, slope angle is one of the key factors influencing slope stability, especially when other factors (such as cohesion and friction angle) are lower. A larger slope angle is more likely to lead to landslides.

  4. Slope height affects the gravitational force exerted on the slope and the potential area of failure. A higher slope generally implies a greater risk of landslides, as it increases the potential energy along the sliding surface. However, the relationship between slope height and stability is also influenced by material strength and gradient, so its importance is slightly lower than that of cohesion, friction angle, and slope angle, ranking fourth.

  5. The pore water pressure ratio refers to the influence of water on the stability of the soil, especially in rainy areas or regions with high groundwater levels. When pore water pressure increases, the shear strength of the soil decreases, making landslides more likely. However, the extent of its impact depends on the permeability of the soil and rainfall conditions, so its importance comes after slope height.

  6. The slope’s bulk density also affects slope stability, particularly when considering the effects of gravity. A higher bulk density increases the gravitational stress on the slope. While it has some impact on stability, its importance is relatively low compared with the other factors, because its effect depends largely on its interaction with them (such as cohesion and friction angle). It therefore ranks last.

In conclusion, cohesion and friction angle are the most critical factors influencing slope stability. Slope height, pore water pressure ratio, and bulk density also play a role, but their impact is smaller.

Applicability of model and future work

This study proposes a PCA-BPNN model for predicting the FOS of slopes, which has been validated in the H mining area in Guangdong, China. The validation results indicate that the model has practical application value and shows particular potential for real-time slope stability assessment.

The method combines PCA and BPNN. PCA reduces the dimensionality of high-dimensional data, eliminates redundancy, and improves the model’s computational efficiency and accuracy. Meanwhile, BPNN effectively captures the complex nonlinear relationships in slope stability problems, further enhancing the model’s predictive capability. Because the PCA-BPNN model can process inputs and produce predictions quickly, it is capable of real-time assessment of slope stability based on live data, such as slope angle, friction angle, and pore water pressure, providing timely support for engineering decisions.
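The combination can be illustrated with the minimal sketch below, which is not the authors' implementation. It assumes NumPy, uses synthetic data in place of the 132-sample database (which is available only on request), retains three principal components, and picks a hypothetical hidden-layer size, learning rate, and epoch count. The function names `pca_fit_transform` and `train_bpnn` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the slope database: 132 samples, 6 features (X1..X6).
# Both the inputs and the target are made up for illustration.
X = rng.normal(size=(132, 6))
y = 1.2 + 0.3 * np.tanh(X[:, 1]) + 0.2 * X[:, 2] - 0.1 * X[:, 3]

def pca_fit_transform(X, n_components):
    """Standardize the columns, then project onto the leading principal axes."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

def train_bpnn(F, y, hidden=8, lr=0.05, epochs=2000):
    """One-hidden-layer network trained by full-batch gradient descent on MSE."""
    n, d = F.shape
    W1 = rng.normal(scale=0.1, size=(d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.1, size=hidden)
    b2 = 0.0
    losses = []
    for _ in range(epochs):
        H = np.tanh(F @ W1 + b1)      # forward pass: hidden layer
        pred = H @ W2 + b2            # forward pass: linear output
        err = pred - y
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: propagate the MSE gradient through both layers.
        g_pred = 2.0 * err / n
        g_W2 = H.T @ g_pred
        g_b2 = g_pred.sum()
        g_H = np.outer(g_pred, W2) * (1.0 - H ** 2)
        g_W1 = F.T @ g_H
        g_b1 = g_H.sum(axis=0)
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2
    return (W1, b1, W2, b2), losses

F = pca_fit_transform(X, n_components=3)   # assumed: three retained components
params, losses = train_bpnn(F, y)
print("initial MSE:", losses[0], "final MSE:", losses[-1])
```

A deployable version would also store the standardization statistics and principal axes so that new samples are projected in the same way as the training data.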

In dynamic and changing engineering environments, slope stability is influenced by various factors, such as rainfall and earthquakes, which makes traditional methods computationally expensive and slow. In contrast, PCA-BPNN can quickly update input data and provide fast predictions, offering a significant advantage for real-time monitoring systems. With the development of sensor technology and the ease of data collection, the PCA-BPNN method offers an efficient and reliable solution for slope stability assessment under complex geological conditions, which is of great practical value in improving engineering safety and real-time monitoring capabilities.

However, in practical applications, issues such as data noise and missing data often arise. Therefore, future work should focus on introducing data preprocessing and interpolation methods to further improve the stability and accuracy of the model.

Further research is needed to advance real-time slope stability assessment. Additionally, attention should be given to the development of new prediction methods, with an emphasis on exploring various machine learning techniques. In recent years, active learning and surrogate models have gained increasing importance in geotechnical engineering56–63. In the future, we will focus more on applying these advanced computational techniques to slope stability research. Moreover, we plan to integrate numerical simulation techniques64 with machine learning methods, particularly by developing an efficient and advanced unified approach that combines physical–mechanical measurements65, numerical simulations, and machine learning to advance slope stability research.

Conclusions

Slope stability is a critical issue in mining, civil engineering, and related fields. The design of FOS is predicated on this concern. This study establishes a database for slope FOS and employs the PCA-BPNN model to predict the FOS. The specific conclusions drawn from this research are as follows:

  1. This study identifies bulk density, cohesion, friction angle, slope angle, slope height, and pore water pressure ratio as the primary factors influencing slope stability. A database containing 132 samples of slope FOS has been established.

  2. This study proposes a novel method for slope stability assessment by integrating PCA with BPNN. The introduction of PCA effectively reduces data dimensionality and addresses multicollinearity among features, significantly enhancing data processing efficiency and model accuracy. Simultaneously, the BPNN, with its robust nonlinear mapping capabilities, is successfully applied to predict slope FOS, achieving an R² of 0.917 for the training set and 0.879 for the testing set.

  3. The engineering application at H Mine in Guangdong Province demonstrates that the PCA-BPNN model exhibits robust capabilities in determining design FOS. The results obtained align with practical requirements, thereby confirming the applicability and reliability of this method in complex geological conditions.

Compared to traditional and numerical methods, the model proposed in this study simplifies calculations while maintaining high precision and effectively handling multidimensional complex data. It overcomes the limitations of traditional approaches under complex slope conditions. This research provides a novel perspective and tool for evaluating slope stability, demonstrating significant practical engineering application value, especially in complex terrains and high-demand projects, thereby offering valuable insights for related studies and practices.

Acknowledgements

The authors thank the reviewers and the editor for their help in revising the paper. This research was supported by the National Key Research and Development Program "Solid Waste Resource" key project of China (2020YFC1909101) and the National Key Research and Development Project of China (2020YFC1909801).

Author contributions

Y. F. Jing: Writing-Original Draft. Y. F. Li: Conceptualization, Writing-Review and Editing, Methodology, Supervision. J. Chang: Writing, Investigation. Z. B. Liu: Writing, Data Curation, Formal Analysis. Z. W. Ni: Data Curation, Formal Analysis. Q. Wang: Data Curation. D. F. Gao: Formal Analysis.

Data availability

The data that support the findings of this study are available on request from the corresponding author.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Zhou, J. et al. Slope stability prediction for circular mode failure using gradient boosting machine approach based on an updated database of case histories. Saf. Sci. 118, 505–518 (2019).
  • 2. Leong, E. C. & Rahardjo, H. Two and three-dimensional slope stability reanalyses of Bukit Batok slope. Comput. Geotech. 42, 81–88 (2012).
  • 3. Xiao, S., Dai, T. & Li, S. Review and comparative analysis of factor of safety definitions in slope stability. Geotech. Geol. Eng. 42(6), 4263–4283 (2024).
  • 4. Acevedo, A. M. G., Passini, L. D. B., Talamini, A. A., Kormann, A. C. M. & Fiori, A. P. Assessing limit equilibrium method approach and mapping critical areas for slope stability analysis in Serra do Mar Paranaense—Brazil. Environ. Earth Sci. 80(17), 572 (2021).
  • 5. Jiang, X. Y., Cui, P. & Liu, C. Z. A chart-based seismic stability analysis method for rock slopes using Hoek-Brown failure criterion. Eng. Geol. 209, 196–208 (2016).
  • 6. Hazari, S., Sharma, R. P. & Ghosh, S. Swedish circle method for pseudo-dynamic analysis of slope considering circular failure mechanism. Geotech. Geol. Eng. 38, 2573–2589 (2020).
  • 7. Cheng, Y. M. & Yip, C. J. Three-dimensional asymmetrical slope stability analysis extension of Bishop’s, Janbu’s, and Morgenstern–Price’s techniques. J. Geotech. Geoenviron. Eng. 133(12), 1544–1555 (2007).
  • 8. Li, C., Su, L., Liao, H., Zhang, C. & Xiao, S. Modeling of rapid evaluation for seismic stability of soil slope by finite element limit analysis. Comput. Geotech. 133, 104074 (2021).
  • 9. Dyson, A. P. & Tolooiyan, A. Probabilistic investigation of RFEM topologies for slope stability analysis. Comput. Geotech. 114, 103129 (2019).
  • 10. Dyson, A. P. & Griffiths, D. V. An efficient strength reduction method for finite element slope stability analysis. Comput. Geotech. 174, 106593 (2024).
  • 11. Li, L., Hu, C., Yuan, Y., He, X. & Wu, Z. Efficient optimization parameter calibration method-based DEM simulation for compacted loess slope under dry–wet cycling. Sci. Rep. 14(1), 17418 (2024).
  • 12. Sarfarazi, V. et al. 2D discrete element analysis of the footing above excavated circle in soil. Sci. Rep. 14(1), 21399 (2024).
  • 13. Li, J. et al. Failure analysis of soil-rock mixture slopes using coupled MPM-DEM method. Comput. Geotech. 169, 106226 (2024).
  • 14. Silva, A. V., Gomes, G. J., Huertas, J. R. & Cândido, E. S. Exploring tailings dam stability considering uncertainties in the critical state parameters of the NorSand model. Geotech. Geol. Eng. 42(6), 4721–4741 (2024).
  • 15. Harabinová, S., Kotrasová, K., Kormaníková, E. & Hegedüsová, I. Analysis of slope stability. Civil Environ. Eng. 17(1), 192–199 (2021).
  • 16. Cai, M., Koopialipoor, M., Armaghani, D. J. & Thai Pham, B. Evaluating slope deformation of earth dams due to earthquake shaking using MARS and GMDH techniques. Appl. Sci. 10(4), 1486 (2020).
  • 17. Himanshu, N., Kumar, V., Burman, A., Maity, D. & Gordan, B. Grey wolf optimization approach for searching critical failure surface in soil slopes. Eng. Comput. 37(3), 2059–2072 (2021).
  • 18. Ahmed, A., Khan, S., Hossain, S., Sadigov, T. & Bhandari, P. Safety prediction model for reinforced highway slope using a machine learning method. Transp. Res. Rec. 2674(8), 761–773 (2020).
  • 19. Chen, W. W., Shen, Z. P., Wang, J. A. & Tsai, F. Scripting STABL with PSO for analysis of slope stability. Neurocomputing 148, 167–174 (2015).
  • 20. Kurnaz, T. F. et al. Comparison of machine learning algorithms for slope stability prediction using an automated machine learning approach. Nat. Hazards 120, 6991–7014. 10.1007/s11069-024-06490-8 (2024).
  • 21. Jia, H., Zhang, S., Wang, C. & Wang, X. Modelling of slope reliability analysis methods based on random field and asymmetric CNNs. Stoch. Environ. Res. Risk Assess. 38, 3799–3822. 10.1007/s00477-024-02774-4 (2024).
  • 22. Das, S. K., Biswal, R. K., Sivakugan, N. & Das, B. Classification of slopes and prediction of factor of safety using differential evolution neural networks. Environ. Earth Sci. 64(1), 201–210. 10.1007/s12665-010-0839-1 (2011).
  • 23. Chakraborty, A. & Goswami, D. Prediction of slope stability using multiple linear regression (MLR) and artificial neural network (ANN). Arab. J. Geosci. 10, 1–11 (2017).
  • 24. Zhang, W., Li, Y. & Sun, X. Stability prediction of rock slope based on fuzzy clustering GA-FNN model. In International Association for Engineering Geology and the Environment 55–67 (Springer Nature Singapore, 2023).
  • 25. Wang, J. W., Xu, Y. S. & Li, J. Prediction of slope stability coefficient based on grid search support vector machine. Railw. Eng. 59(5), 312–317 (2019).
  • 26. Tao, G. L., Yao, Z. S., Tan, B. Z., Gao, C. C. & Yao, Y. W. Application of support vector machine for prediction of slope stability coefficient considering the influence of rainfall and water level. Appl. Mech. Mater. 851, 840–845 (2016).
  • 27. Koopialipoor, M., Jahed Armaghani, D., Hedayat, A., Marto, A. & Gordan, B. Applying various hybrid intelligent systems to evaluate and predict slope stability under static and dynamic conditions. Soft Comput. 23, 5913–5929 (2019).
  • 28. Chu, X. & Li, L. Improved firefly optimization algorithm for location of minimum factor of safety considering spatial variability. J. Arch. Civil Eng. 35(6), 94–101 (2018).
  • 29. Wang, F., Liu, Y., Hao, J. & Wei, X. Prediction model of slope safety factor based on MABC-SVR. Saf. Environ. Eng. 26(2), 178–182 (2019).
  • 30. He, Z., Zhao, F., Wu, B., Wang, B. & Zhao, D. The application of FOA in searching for minimum safety factor of slope. J. Catastrophe 34(4), 29–33 (2015).
  • 31. Mahdiyar, A. et al. A Monte Carlo technique in safety assessment of slope under seismic condition. Eng. Comput. 33(4), 807–817 (2017).
  • 32. Nie, F., Chen, Q., Weizhong, Y. & Li, X. Row-sparse principal component analysis via coordinate descent method. IEEE Trans. Knowl. Data Eng. 36(7), 3460–3471. 10.1109/TKDE.2024.3351851 (2024).
  • 33. Leyder, S., Raymaekers, J. & Verdonck, T. Generalized spherical principal component analysis. Stat. Comput. 34(3), 104 (2024).
  • 34. Pavlů, I. et al. Principal component analysis for distributions observed by samples in Bayes spaces. Math. Geosci. 56, 1641–1669 (2024).
  • 35. You, J., Tulpan, D. & Ellis, J. L. Predicting pellet quality using multiple linear regression with principal component analysis (PCA). J. Anim. Sci. 102(Supplement_3), 154–155 (2024).
  • 36. Wang, C., Liu, Y., Li, Y., Liu, X. & Wang, Q. Classification of coal bursting liability of some Chinese coals using machine learning methods. Sci. Rep. 14(1), 14030 (2024).
  • 37. Lippmann, R. P. An introduction to computing with neural nets. ACM SIGARCH Comput. Arch. News 16(1), 7–25 (1988).
  • 38. Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2(4), 303–314 (1989).
  • 39. Rukhaiyar, S., Alam, M. N. & Samadhiya, N. K. A PSO-ANN hybrid model for predicting factor of safety of slope. Int. J. Geotech. Eng. 12(6), 556–566 (2018).
  • 40. Yixiang, F., Shikai, L. & Dapeng, L. Predicting models to estimate stability of rock slope based on RBF neural network. J. Wuhan Transp. Univ. 27(2), 170–173 (2003).
  • 41. Zhai, S. H., Wu, A. X., Gao, Q., Zhang, M. H. & Dong, L. Prediction of slope safety factor based on the RS-GP model. Chin. J. Eng. 33(1), 6–10 (2011).
  • 42. Qiao, J., Liu, B., Li, Y. & Gao, S. L. The prediction of the safety factor of the slope stability based on genetic programming. J. China Coal Soc. 35(9), 1466–1469 (2010).
  • 43. Li, G., Liu, Y., Zhao, G. & Pend, J. The prediction and application of slope stability based on RS-BPNN. J. Univ. Sci. Technol. China 29(3), 122–128 (2015).
  • 44. Sun, J., Wu, S., Zhang, H., Zhang, X. & Wang, T. Based on multi-algorithm hybrid method to predict the slope safety factor—stacking ensemble learning with Bayesian optimization. J. Comput. Sci. 59, 101587 (2022).
  • 45. Lin, S., Zheng, H., Han, C., Han, B. & Li, W. Evaluation and prediction of slope stability using machine learning approaches. Front. Struct. Civil Eng. 15, 821–833. 10.1007/s11709-021-0742-8 (2021).
  • 46. Azmoon, B., Biniyaz, A. & Liu, Z. Evaluation of deep learning against conventional limit equilibrium methods for slope stability analysis. Appl. Sci. 11, 6060. 10.3390/app11136060 (2021).
  • 47. Nanehkaran, Y. A. et al. Application of machine learning techniques for the estimation of the safety factor in slope stability analysis. Water 14, 3743. 10.3390/w14223743 (2022).
  • 48. Gupta, A., Aggarwal, Y. & Aggarwal, P. Deep neural network and ANN ensemble for slope stability prediction. Arch. Mater. Sci. Eng. 116, 14–27. 10.5604/01.3001.0016.0975 (2022).
  • 49. Zhang, H., Wu, S., Zhang, X., Han, L. & Zhang, Z. Slope stability prediction method based on the margin distance minimization selective ensemble. CATENA 212, 106055. 10.1016/j.catena.2022.106055 (2022).
  • 50. Mahmoodzadeh, A. et al. Prediction of safety factors for slope stability: Comparison of machine learning techniques. Nat. Hazards 111, 1771–1799. 10.1007/s11069-021-05115-8 (2022).
  • 51. Goswami, M. & Chakraborty, A. Stability prediction of a two-layered soil slope. In Advances in Geo-Science and Geo-Structures. Lecture Notes in Civil Engineering (eds Choudhary, A. K. et al.) 171–179 (Springer, 2022). 10.1007/978-981-16-1993-9_18.
  • 52. Liu, L., Zhao, G. & Liang, W. Slope stability prediction using K-NN-based optimum-path forest approach. Mathematics 11, 3071. 10.3390/math11143071 (2023).
  • 53. Nanehkaran, Y. A. et al. Comparative analysis for slope stability by using machine learning methods. Appl. Sci. 13, 1555. 10.3390/app13031555 (2023).
  • 54. Yang, Y. et al. Slope stability prediction method based on intelligent optimization and machine learning algorithms. Sustainability 15, 1169. 10.3390/su15021169 (2023).
  • 55. Wengang, Z., Hanlong, L., Lin, W., Xing, Z. & Yanmei, Z. Prediction of slope stability using ensemble learning techniques. In Application of Machine Learning in Slope Stability Assessment, 45–60 (Springer, 2023). 10.1007/978-981-99-2756-2_4.
  • 56. Doan, N. S. & Dinh, H. B. Effects of limit state data on constructing accurate surrogate models for structural reliability analyses. Probab. Eng. Mech. 76, 103595. 10.1016/j.probengmech.2024.103595 (2024).
  • 57. Li, Y. et al. Rockburst prediction based on the KPCA-APSO-SVM model and its engineering application. Shock. Vib. 2021, 7968730 (2021).
  • 58. Johari, A., Javadi, A. A. & Habibagahi, G. Modelling the mechanical behaviour of unsaturated soils using a genetic algorithm-based neural network. Comput. Geotech. 38(1), 2–13 (2011).
  • 59. Johari, A., Habibagahi, G. & Ghahramani, A. Prediction of SWCC using artificial intelligent systems: A comparative study. Sci. Iran. 18(5), 1002–1008 (2011).
  • 60. Li, Y., Wang, C. & Liu, Y. Classification of coal bursting liability based on support vector machine and imbalanced sample set. Minerals 13(1), 15 (2022).
  • 61. Doan, N. S., Mac, V. H. & Dinh, H. Machine learning applications to load and resistance factors calibration for stability design of caisson breakwater foundations. Comput. Geotech. 169, 106225. 10.1016/j.compgeo.2024.106225 (2024).
  • 62. Jouhari, A. K., Ghahramani, A. & Habibagahi, G. Prediction of a soil-water characteristic curve using a genetic-based neural network. Sci. Iran. 13, 284–294 (2006).
  • 63. Wang, C. et al. Optimization of BP neural network model for rockburst prediction under multiple influence factors. Appl. Sci. 13(4), 2741 (2023).
  • 64. Johari, A. & Fooladi, H. Simulation of the conditional models of borehole’s characteristics for slope reliability assessment. Transp. Geotech. 35, 100778 (2022).
  • 65. Zuo, T. et al. Insights into natural tuff as a building material: Effects of natural joints on fracture fractal characteristics and energy evolution of rocks under impact load. Eng. Fail. Anal. 163, 108584 (2024).

Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
