Abstract
Rockbursts pose substantial hazards in both deep and shallow underground construction, underscoring the critical need for accurate prediction methods. This study addresses this need by collecting and analyzing 69 real rockburst cases occurring within a burial depth of 500 m; the dataset is challenging because it is multi-class, unbalanced, and small. After comparing and screening 11 machine learning algorithms combined with KMeansSMOTE oversampling, the Random Forest algorithm emerged as the best choice. Its hyperparameters were tuned efficiently using the Optuna framework. The resulting KMSORF model, which integrates KMeansSMOTE, Optuna, and Random Forest, outperformed mainstream models such as Gradient Boosting (GB), Extreme Gradient Boosting (XGB), and Extra Trees (ET). Application of the model to a tungsten mine and a tunnel project showed that it accurately forecasts rockburst levels, providing valuable insights for risk management in underground construction. Overall, this study contributes to the advancement of safety measures in underground construction by offering an effective predictive model for rockburst occurrences.
Subject terms: Natural hazards, Engineering, Mathematics and computing
Introduction
Rockburst is a significant hazard in underground engineering operations1,2, characterized by the sudden release of energy within the rock mass due to high stress conditions induced by excavation, mining activities, or external disturbances3. This phenomenon can result in various complex geological disasters such as bursting, stripping, ejection, or even expulsion of rock fragments. Although it is generally understood that the likelihood of rockburst accidents increases with burial depth, this correlation is not absolute. Extensive research and analysis of rockburst engineering cases have shown that shallow engineering operations are also susceptible to such occurrences4,5.
Rockbursts pose a grave threat to the safety of workers and a significant challenge to efficient and safe production, particularly in underground engineering projects in China6–8. Consequently, there is an urgent need for research aimed at predicting and preventing rockburst disasters. Addressing this need is essential for the continued and healthy development of underground engineering, given the critical importance of safety in such operations9–11.
In recent years, researchers have developed numerous models for assessing rockburst intensity levels12,13. Sun et al.14 proposed a short-term rockburst prediction model based on microseismic monitoring and probabilistically optimized naive Bayes. Qiu et al.15 proposed a new hybrid model based on extreme gradient boosting (XGB) and meta-heuristic sand cat swarm optimization (SCSO). Liang et al.16 proposed an integrated classifier based on five basic learners to obtain better prediction results. Sun et al.17 showed that rockburst prediction based on 13 machine learning algorithms is more accurate than a single model. Li et al.18 proposed a new method combining t-distributed stochastic neighbor embedding (t-SNE) and Gaussian Mixture Model (GMM) clustering to relabel the dataset, which can effectively improve model prediction accuracy and generalization ability. Barkat et al.19 proposed a combination of t-SNE, K-Means clustering, and XGBoost to predict the rockburst intensity level, which provides a good benchmark for future high-accuracy modeling. Muhammad et al.20 developed an ISOMAP + FCM + KNN framework that allows high-precision short-term prediction of rockbursts under specific conditions.
While recent advances in model performance have contributed to theoretical development, there remain areas for improvement. Specifically, many existing models overlook the influence of burial depth on accuracy and expend considerable resources on hyperparameter optimization. Variations in burial depth induce changes in ground stress, thereby altering rockburst dynamics. Consequently, partitioning the dataset by burial depth is necessary to enhance model accuracy and applicability. Optuna21 is a promising tool in automated machine learning, offering efficiency gains over traditional grid and random search by quickly finding good hyperparameter combinations. This paper presents a low-cost and efficient model by collecting data according to the trend of the stress state within a given burial depth range and applying Optuna, and the model compares favorably in accuracy and practicality with other mainstream rockburst models. This study curated 69 sets of real rockburst case parameters within a 500 m burial depth from literature sources. KMeansSMOTE oversampling22 is employed to address the data imbalance, and the Random Forest23 integrated model is then optimized with Optuna to establish the KMSORF integrated model. The efficacy and accuracy of the approach are validated through case studies of a tungsten mine and a tunnel project.
The structure of this paper is as follows. The Introduction presents the study's context and objectives. The Shallow rockburst data acquisition and analysis section gathers and analyzes the relevant datasets, illustrating correlations and general characteristics graphically. The KMeansSMOTE oversampling processing and analysis section describes the algorithm, focusing on its efficacy in enhancing model performance on small datasets. The Random forest algorithm based on Optuna section explains the principles of the Optuna-based Random Forest algorithm and details the steps used to optimize and train the KMSORF model. The Assessment models and applications section evaluates the model's performance against existing benchmarks through comparative analyses and real-world case studies. The Strengths and limitations section gives a critical evaluation, noting the innovations and pointing out the limitations. Finally, the Conclusion synthesizes the key insights and proposes avenues for future research.
Shallow rockburst data acquisition and analysis
This study analyzes 69 shallow rockburst engineering cases24–30, each free of missing values and occurring within a burial depth of 500 m. Drawing on a review of domestic and international literature17,18,31–36, machine learning techniques and comprehensive evaluations of rockburst factors were employed, focusing on stress control, rock mechanical properties, and energy considerations. The case studies show that rockbursts share several salient characteristics: they occur in stress concentration areas, the failure mode is mainly tensile accompanied by shear, and the rock can store a large amount of elastic energy. At the same time, the chosen indicators should be common, easy to measure in practice, and documented in previous rockburst cases. Four key indicators were therefore selected to characterize rockburst prediction: the maximum tangential stress of the surrounding rock ($\sigma_\theta$), uniaxial compressive strength ($\sigma_c$), uniaxial tensile strength ($\sigma_t$), and elastic energy index ($W_{et}$). These indicators serve as features for predicting rockburst intensity levels, categorized as None, Light, Moderate, or Strong based on actual occurrences.
In the dataset of 69 rockburst samples, 19, 21, 19, and 10 cases are categorized as none, light, moderate, and strong, respectively. Notably, the proportion of strong rockburst data is relatively small (10 of 69, about 14.5%), as illustrated in Fig. 1. To address this imbalance, this study oversamples the dataset to ensure a more equitable representation across all intensity levels.
Figure 1.

Sample dataset share chart.
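The class shares behind Fig. 1 can be reproduced with a few lines of arithmetic. The sketch below uses only the four counts reported above; the function names `class_shares` and `imbalance_ratio` are illustrative and not part of the study's code.

```python
# Rockburst grade counts reported for the 69-sample dataset.
counts = {"none": 19, "light": 21, "moderate": 19, "strong": 10}


def class_shares(counts):
    """Return the fraction of the dataset that falls in each grade."""
    total = sum(counts.values())
    return {grade: n / total for grade, n in counts.items()}


def imbalance_ratio(counts):
    """Ratio of the largest class size to the smallest class size."""
    return max(counts.values()) / min(counts.values())


shares = class_shares(counts)
for grade, share in shares.items():
    print(f"{grade:>8}: {counts[grade]:2d} samples ({share:.1%})")
print(f"imbalance ratio (largest/smallest): {imbalance_ratio(counts):.2f}")
```

The largest class is about twice the size of the smallest one, which is the imbalance that the oversampling step is meant to reduce.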
To examine the relationships between the indicators, this study computed a Pearson correlation coefficient heat map, presented in Fig. 2. The analysis shows a strong correlation between the indicators and the rockburst intensity grade. Specifically, higher values of the elastic energy index and the surrounding rock stress correspond to a higher likelihood and intensity of rockburst.
Figure 2.
Pearson's correlation coefficient plot.
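For readers who want to see how each cell of such a heat map is obtained, the following is a minimal sketch of the Pearson coefficient. The numeric values are hypothetical and are not taken from the study's dataset, and encoding the grades as integers 0 to 3 is an assumption made here for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical illustrative values only (not the study's data).
tangential_stress = [18.0, 35.0, 52.0, 70.0, 90.0]  # MPa
grade = [0, 1, 1, 2, 3]  # assumed encoding: None=0, Light=1, Moderate=2, Strong=3

r = pearson(tangential_stress, grade)
print(f"r = {r:.3f}")
```

Repeating the call for every pair of indicators fills the full correlation matrix.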
Furthermore, to illustrate the distribution of each characteristic indicator, this study plotted the data as violin plots with embedded box plots, as shown in Fig. 3. The analysis reveals that the maximum tangential stress $\sigma_\theta$, uniaxial compressive strength $\sigma_c$, uniaxial tensile strength $\sigma_t$, and elastic energy index $W_{et}$ predominantly fall within the ranges of 20–80 MPa, 80–180 MPa, 2–10 MPa, and 2–8, respectively.
Figure 3.
Violin box plot.
KMeansSMOTE oversampling processing and analysis
The KMeansSMOTE oversampling method22, introduced by Douzas, Bacao et al. in 2018, is an algorithm designed to address sample imbalance. The algorithm first employs KMeans clustering to partition the original unevenly distributed samples into k clusters and retains only the clusters suitable for oversampling, denoted f. Next, the number of samples to generate in each filtered cluster is computed based on the weights of the sample categories. Finally, the filtered clusters are oversampled using the Synthetic Minority Over-sampling Technique (SMOTE). This methodology offers a systematic approach to mitigating sample imbalance, thereby enhancing the robustness and reliability of classification models. The specific calculation process unfolds as follows:
- Partition the original unbalanced dataset into k clusters by clustering, calculate the degree of imbalance of each cluster, and select the cluster, denoted f, if this degree is greater than a specified threshold. The value of k can be chosen based on N, the total number of samples (Eq. 1).
- Calculate the density of cluster f from the number of minority samples in the cluster and the average distance within the cluster (Eq. 2).
- Compute the sparsity of cluster f (Eq. 3).
- Calculate the sampling coefficient of each cluster f, where n is the number of clusters selected for synthesizing new samples (Eq. 4).
- Calculate the number of new samples to be synthesized in each cluster from its sampling coefficient and the total number m of samples to be generated (Eq. 5).
- Finally, use the SMOTE algorithm to generate new minority class samples within the clusters: each newly synthesized minority class sample is obtained by interpolating between existing minority samples using a random number within (0, 1) (Eq. 6).
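The steps above can be sketched in code. The sketch follows the commonly cited KMeans-SMOTE formulation, in which density is the minority count divided by a power of the mean pairwise distance, sparsity is the reciprocal of density, and sampling weights are normalized sparsities. Whether the study used exactly these forms is an assumption, as are the choice of the feature count as the exponent, the rounding rule, and all function names.

```python
import math
import random
from itertools import combinations


def mean_pairwise_distance(points):
    """Average Euclidean distance between all pairs of minority points."""
    pairs = list(combinations(points, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def sampling_plan(clusters, n_new, exponent=None):
    """Distribute n_new synthetic samples over the filtered clusters.

    clusters: list of clusters, each a list of minority points (tuples).
    exponent: power applied to the mean distance; defaults to the number
              of features (an assumption of this sketch).
    """
    sparsity = []
    for points in clusters:
        power = len(points[0]) if exponent is None else exponent
        density = len(points) / mean_pairwise_distance(points) ** power
        sparsity.append(1.0 / density)
    total = sum(sparsity)
    weights = [s / total for s in sparsity]
    counts = [round(n_new * w) for w in weights]  # rounding rule is assumed
    return weights, counts


def smote_point(x_i, x_j, rng):
    """Interpolate between two minority samples with a random factor in [0, 1)."""
    lam = rng.random()
    return tuple(a + lam * (b - a) for a, b in zip(x_i, x_j))


# Two toy clusters: the second is four times more spread out than the first.
clusters = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)],
]
weights, counts = sampling_plan(clusters, n_new=17)
new_point = smote_point((0.0, 0.0), (2.0, 4.0), random.Random(0))
```

In this toy example the sparser cluster receives most of the synthetic samples, which is the intended behavior: sparse minority regions are filled in more aggressively than dense ones.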
Numerous studies have underscored the profound impact of sample category imbalance on model performance37. Generally, three approaches are commonly employed to address the data imbalance issue: undersampling, oversampling, or a combination of both. Given the constraints imposed by limited sample numbers, this study opts for the oversampling method.
KMeansSMOTE represents an innovative oversampling technique that integrates the K-means clustering algorithm with the SMOTE. This approach effectively addresses challenges associated with inadequate sample data across categories and enhances model generalization capabilities. Consequently, it contributes to improving the accuracy and robustness of the model. This methodological choice aligns with the objective of mitigating data imbalance to facilitate more reliable and insightful model outcomes.
To assess the sensitivity and generalization capabilities of various algorithms on unbalanced small datasets, as well as to evaluate the efficacy of the KMeansSMOTE oversampling method in enhancing model performance, this study conducted a comparative analysis of prediction results using eleven machine learning models. The comparison involved evaluating the original dataset against both normalized and pre-processed datasets after KMeansSMOTE oversampling. To ensure consistency, the model hyperparameters were optimized using GridSearchCV and the same hyperparameter values were used across datasets. Cohen's Kappa coefficient was employed as the evaluation metric.
The eleven machine learning models considered in this study were: Decision Tree (DT), Support Vector Classification (SVC), k-Nearest Neighbor (KNN), Gaussian Process Regression (GPR), Naive Bayes model (NBM), Quadratic Discriminant Analysis Algorithm (QDA), Gradient Boosting (GB), Extreme Gradient Boosting (XGB), Random Forest (RF), Extra Trees (ET), and Light Gradient Boosting Machine (LGBM).
This comparative analysis provides insights into the performance variations among different algorithms and the impact of oversampling on model efficacy. By systematically evaluating these models across datasets and conditions, this study aims to validate the effectiveness of the proposed approach and to improve the understanding of how to handle unbalanced small datasets in machine learning applications.
As depicted in Fig. 4, KMeansSMOTE oversampling noticeably improves model performance. Across the models, the Kappa coefficient K averages 0.4639 on the original dataset and 0.5127 on the post-processed dataset, an average gain of 0.0488. Notably, 8 out of 11 algorithms exhibit improved performance, with KNN showing the largest improvement at 0.3252. On the post-processed dataset, the RF model achieves the highest score of 0.7578.
Figure 4.

Performance of KMeansSMOTE algorithm in different models.
Moreover, the integrated model displays a notable advantage over individual models, exhibiting an improvement of approximately 0.1693 in the original dataset and 0.0571 in the processed dataset. This observation underscores the superior performance of the integrated model compared to individual models.
Overall, the results highlight the effectiveness of the KMeansSMOTE algorithm in enhancing individual model performance. However, it is noteworthy that the integrated model outperforms both the single model and the KMeansSMOTE algorithm alone, underscoring the importance of model integration for achieving superior predictive accuracy. These findings contribute to a deeper understanding of model enhancement techniques and their impact on overall performance in machine learning applications.
Random forest algorithm based on Optuna
Random Forest38 (RF) is a machine learning algorithm rooted in the concept of Bagging ensemble learning, which constructs multiple weak classifiers and combines them to form a robust classifier. Within RF, the weak classifiers are typically Classification and Regression Trees (CART). During the construction of the CART trees, the training samples are drawn by bootstrap sampling (sampling with replacement), so that each CART tree has its own features and decision criteria. Moreover, a subset of features is randomly selected for training each tree, which produces diverse feature combinations. This diversity improves the algorithm's generalization ability and thereby supports more accurate and robust predictions. These principles underpin the efficacy of RF in various machine learning tasks and contribute to its widespread adoption in scientific research and practical applications. The detailed process is shown in Fig. 5. The RF integrated classification model can be expressed as follows:
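$$H(x) = \arg\max_{Y} \sum_{i=1}^{k} I\left(h_i(x) = Y\right) \tag{7}$$
where $H(x)$ is the integrated classification model, $Y$ is the output variable, $h_i(x)$ is the classification result of a single CART tree, $k$ is the number of trees, and $I(\cdot)$ is the indicator function.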
Figure 5.
Principle diagram of RF.
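The voting rule of the integrated model, in which the class receiving the most tree votes wins, can be illustrated with a minimal sketch. The per-tree votes below are hypothetical, and `majority_vote` is an illustrative name rather than part of any library.

```python
from collections import Counter


def majority_vote(tree_votes):
    """Return the class that receives the largest number of tree votes."""
    counts = Counter(tree_votes)
    return counts.most_common(1)[0][0]


# Hypothetical predictions of five CART trees for one sample.
votes = ["Light", "Moderate", "Light", "None", "Light"]
print(majority_vote(votes))  # prints "Light"
```

In practice a library implementation performs this aggregation internally; the sketch only makes the indicator-function sum explicit.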
This study uses the Tree-structured Parzen Estimator sampler (TPESampler) in Optuna, an improvement on the Bayesian optimization algorithm, whose surrogate probability model is a Gaussian process determined by the mathematical expectation and covariance functions, which can be expressed as follows:
$$f(x) \sim \mathcal{GP}\left(m(x), k(x, x')\right) \tag{8}$$
where the function value at any $x$ follows a normal distribution, $m$ is the mathematical expectation, and $k$ is the covariance (kernel) function.
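The choice of the kernel function is crucial, as it determines the similarity between function values and controls the shape of the fitted function. Assuming $m(x) = 0$, a number of points $x_1, \dots, x_n$ are taken uniformly and substituted into $f$ to obtain:
$$\mathbf{f} = \left[f(x_1), \dots, f(x_n)\right]^{T} \sim \mathcal{N}(\mathbf{0}, K) \tag{9}$$
where:
$$K = \begin{bmatrix} k(x_1, x_1) & \cdots & k(x_1, x_n) \\ \vdots & \ddots & \vdots \\ k(x_n, x_1) & \cdots & k(x_n, x_n) \end{bmatrix} \tag{10}$$
where $\mathbf{f}$ is the output, which obeys a multivariate Gaussian distribution with mean 0 and covariance matrix $K$, and $x_*$ is the new sampling point for the search.
This makes it possible to compute the function value at the new sampling point from a one-dimensional normal distribution. Fitting a Gaussian process to the prior function gives the probability distribution of the function value:
$$f(x_*) \mid \mathbf{f} \sim \mathcal{N}\left(K_* K^{-1} \mathbf{f},\; K_{**} - K_* K^{-1} K_*^{T}\right) \tag{11}$$
where $K_* = \left[k(x_*, x_1), \dots, k(x_*, x_n)\right]$ and $K_{**} = k(x_*, x_*)$.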
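The Gaussian process described above models $p(y \mid x)$. The tree-structured Parzen estimator instead replaces the generation process of the previously configured distribution with nonparametric densities to model $p(x \mid y)$, which can be expressed as:
$$p(x \mid y) = \begin{cases} l(x), & y < y^{*} \\ g(x), & y \ge y^{*} \end{cases} \tag{12}$$
where $y^{*}$ is a threshold set from the best values searched so far, $l(x)$ is the density formed by the observations $x^{(i)}$ whose objective values are better than $y^{*}$, and $g(x)$ is the density formed by the remaining search values.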
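The split into two densities can be made concrete with a small one-dimensional sketch. It assumes a minimization objective, a fixed-bandwidth Gaussian Parzen window, and a quantile parameter `gamma` for choosing the threshold. These choices and all function names are assumptions of this sketch and do not reproduce Optuna's internal implementation.

```python
import math


def split_trials(trials, gamma=0.25):
    """Split (x, y) trials into the best gamma fraction and the rest (lower y is better)."""
    ordered = sorted(trials, key=lambda t: t[1])
    n_good = max(1, math.ceil(gamma * len(ordered)))
    good = [x for x, _ in ordered[:n_good]]
    rest = [x for x, _ in ordered[n_good:]]
    return good, rest


def parzen_density(x, centers, bandwidth=1.0):
    """Gaussian Parzen-window density estimate at x."""
    norm = len(centers) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - c) / bandwidth) ** 2) for c in centers) / norm


def pick_candidate(trials, candidates, gamma=0.25):
    """Choose the candidate with the largest ratio l(x) / g(x)."""
    good, rest = split_trials(trials, gamma)
    return max(
        candidates,
        key=lambda x: parzen_density(x, good) / parzen_density(x, rest),
    )


# Toy objective with its minimum at x = 2.6, evaluated at x = 0, 1, ..., 7.
trials = [(x, (x - 2.6) ** 2) for x in range(8)]
best = pick_candidate(trials, candidates=[0.5, 2.5, 6.5])
```

The candidate closest to the region populated by the good trials has the highest density ratio, so the sampler concentrates new trials where earlier results were best.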
The RF algorithm has numerous hyperparameters to adjust, and manual tuning is time-consuming and prone to issues such as local optima. Moreover, manual tuning does not guarantee optimal prediction results. Thus, this study employed Optuna to determine suitable values quickly and effectively. The methodology unfolds as follows:
- Step 1:
Data preprocessing involves normalization and segmentation of the collected data. Following processing, the dataset is partitioned into training and test sets in an 8:2 ratio.
- Step 2:
Identification of the hyperparameters to be tuned in the RF algorithm. Given the multitude of hyperparameters, the focus is placed on those with significant impact, including the number of CART trees (n_estimators), the maximum depth of a CART tree (max_depth), the minimum number of samples required to split an internal node (min_samples_split), the minimum number of samples required at a leaf node (min_samples_leaf), and whether to use bootstrapping (bootstrap). Other hyperparameters retain their default values. The ranges of hyperparameter values are determined based on literature review and preliminary testing, as outlined in Table 1.
- Step 3:
Configuration of key modules in the TPESampler optimizer. This involves defining the search space for the hyperparameters, setting the optimization metric (e.g., accuracy), specifying the objective function, and determining the number of trials. For this study, the number of trials is set to 50.
- Step 4:
Determination of optimal hyperparameter values. Optimization is based on accuracy and employs four-fold cross-validation for evaluation. The algorithm's performance is compared with other mainstream integration algorithms on both training and test sets.
- Step 5:
Construction of the KMSORF long-term rockburst prediction model. The KMSORF model classifier is trained on the training set using the hyperparameter combination identified in Step 4.
Table 1.
RF algorithm hyperparameter optimization ranges and determined hyperparameter values.
| Hyperparameter | Upper limit | Lower limit | Optuna auto-adjusted values |
|---|---|---|---|
| n_estimators | 2500 | 500 | 1973 |
| max_depth | 50 | 10 | None |
| min_samples_split | 10 | 1 | 2 |
| min_samples_leaf | 5 | 1 | 2 |
| bootstrap | – | – | False |
This methodology ensures efficient hyperparameter optimization and robust model development.
Assessment models and applications
Model evaluation
After establishing the rockburst prediction model, it is crucial to select appropriate metrics for performance evaluation. Multi-class models are typically assessed using Cohen's Kappa coefficient (K)39, Accuracy (acc), and macro-averaged metrics such as Macro-Precision, Macro-Recall, and Macro-F1 Score. However, deriving these indicators as simple arithmetic averages over the classes can overlook sample imbalances. To address this issue, this study adopts the weighted method, comprising Weighted-Precision (WP), Weighted-Recall (WR), and Weighted-F1 Score (WF). Here, L represents the number of samples of a specific class in the multi-class dataset, and each evaluation index is computed using the following formulas. This approach accounts for sample imbalances and thereby makes the model evaluation more robust.
$$K = \frac{p_o - p_e}{1 - p_e} \tag{13}$$
where $p_o$ is the percentage of correctly categorized samples and $p_e$ is the consistency of labels when randomly assigning categories.
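$$\mathrm{acc} = \frac{1}{N} \sum_{i} TP_i \tag{14}$$
$$\mathrm{WP} = \sum_{i} \frac{L_i}{N} \cdot \frac{TP_i}{TP_i + FP_i} \tag{15}$$
$$\mathrm{WR} = \sum_{i} \frac{L_i}{N} \cdot \frac{TP_i}{TP_i + FN_i} \tag{16}$$
$$\mathrm{WF} = \sum_{i} \frac{L_i}{N} \cdot \frac{2\,TP_i}{2\,TP_i + FP_i + FN_i} \tag{17}$$
where the sums run over the rockburst classes $i$, $L_i$ is the number of actual samples of class $i$, $N = \sum_i L_i$ is the total number of samples, and the values $TP_i$, $FP_i$, $FN_i$, and $TN_i$ are taken according to Table 2.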
Table 2.
Confusion matrix.
| Predicted | Actual positive | Actual negative |
|---|---|---|
| Positive | TPi | FPi |
| Negative | FNi | TNi |
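As a worked illustration of these metrics, the sketch below computes Cohen's Kappa, accuracy, and the support-weighted precision, recall, and F1 score from a multi-class confusion matrix. The matrix layout (rows are actual classes, columns are predicted classes), the example counts, and the function name are assumptions of this sketch.

```python
def weighted_metrics(cm):
    """Compute kappa, acc, WP, WR, WF from cm[i][j] = actual class i predicted as j."""
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    actual = [sum(row) for row in cm]  # samples per actual class
    predicted = [sum(cm[i][j] for i in range(n_classes)) for j in range(n_classes)]

    p_o = sum(cm[i][i] for i in range(n_classes)) / total  # observed agreement
    p_e = sum(a * p for a, p in zip(actual, predicted)) / total ** 2  # chance agreement
    kappa = (p_o - p_e) / (1.0 - p_e)

    wp = wr = wf = 0.0
    for i in range(n_classes):
        tp = cm[i][i]
        fp = predicted[i] - tp
        fn = actual[i] - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
        weight = actual[i] / total  # class share used as the weight
        wp += weight * precision
        wr += weight * recall
        wf += weight * f1
    return {"kappa": kappa, "acc": p_o, "WP": wp, "WR": wr, "WF": wf}


# Hypothetical two-class confusion matrix (not the study's results).
example = [[5, 1], [2, 4]]
metrics = weighted_metrics(example)
```

With class shares as weights, the weighted recall equals the overall accuracy, which is consistent with the identical acc and WR values reported for the test set.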
The KMSORF model proposed in this study is compared with three mainstream tree-based integration algorithms, namely GB, XGB, and ET. Hyperparameter optimization for these comparison algorithms is conducted using RandomizedSearchCV. Four-fold cross-validation is employed to determine the specific parameter values, thereby enhancing the generalization performance of the models. The specific parameters for each algorithm are as follows: in GB, subsample = 0.8, n_estimators = 2289, min_samples_split = 2, min_samples_leaf = 2, learning_rate = 0.6; in XGB, max_depth = 10, learning_rate = 0.094, subsample = 0.911, colsample_bytree = 0.467, min_child_weight = 1; in ET, n_estimators = 138, min_samples_split = 13, min_samples_leaf = 7, max_features = sqrt, max_depth = 10. Other hyperparameter values are kept at their default settings. This parameter optimization process supports the comparability and reliability of the model evaluations conducted in this study.
The overall accuracy of each model is illustrated in Fig. 6. Upon examining all four models collectively, it becomes evident that both the KMSORF and XGB models exhibit superior accuracy on the training set compared to the other two integrated models, achieving an impressive 0.9855. This signifies the effectiveness of these two models in capturing the underlying features within the training data. On the test set, the KMSORF model achieves the highest accuracy of 0.8333, further affirming its efficacy. However, relying solely on accuracy metrics may not suffice for a comprehensive model assessment. Therefore, it is imperative to employ more comprehensive evaluation metrics to further scrutinize the models' performance.
Figure 6.

3D side-by-side bar graph of the accuracy of the four models.
Figure 7 delineates the specific performance of the four models concerning F1 score, recall, and precision on the test set. Notably, a substantial variance among the models is observed across these evaluation metrics. Remarkably, the KMSORF model demonstrates superior performance across all three evaluation indexes, with an F1-score of 0.8315, Recall of 0.8333, and Precision of 0.8588, all of which outperform the mainstream models. These metrics collectively underscore the model's commendable predictive performance and generalization capability. The closest competitor, the XGB model, exhibits excellent performance in various domains; however, it displays certain deficiencies in generalization for smaller datasets. This nuanced analysis highlights the KMSORF model's efficacy and its potential to excel even in challenging scenarios with limited data.
Figure 7.

3D band chart of three indicators.
Engineering overview and applications
The value of the model needs to be demonstrated in actual engineering. In this section, the model is applied to specific real cases: a tungsten mine in southern Jiangxi Province and a tunnel project in a heavily hilly, mountainous area are used to test whether the model's predictions are effective.
The tungsten mine is located in the southern region of Jiangxi Province, with an ore body situated mainly underground between the +260 m and +660 m levels. The terrain surrounding the mine is steep and mountainous, forming a distinctive "V"-shaped topography. The ground surface has slopes exceeding 45°, extending predominantly from the northwest to the southeast. The surrounding rock within the mine tunnel comprises semi-hard phyllite, slate, or hard metamorphic sandstone, with occasional exposures of hidden granite in deeper sections.
Geological investigations reveal an anticlinal fold structure with an axial orientation of 40° within the mining area. The most prominent fault in the vicinity, denoted F1, strikes 70°, dips to the southeast at a steep angle of 83°, and has formed a crushed zone with a maximum width of 27 m.
To comprehensively analyze the geological conditions associated with rockburst around the mine, extensive experiments were conducted on the main lithology, strength, and mechanical parameters of the granite within the mine. Figure 8 illustrates some of the mechanical parameter tests, while the complete test results are presented in Table 3. These results improve the understanding of the mechanical characteristics of the granite at the mine site.
Figure 8.
Selected mechanical parameter tests.
Table 3.
Physical and mechanical parameters of mine surrounding rock.
| Rock group | Density (g/cm³) | Compressive strength (MPa) | Modulus of elasticity (GPa) | Poisson's ratio | Softening factor (%) | Tensile strength (MPa) | Cohesion (MPa) | Friction angle (°) |
|---|---|---|---|---|---|---|---|---|
| Perimeter rock | 2.695 | 127.780 | 25.391 | 0.250 | 80.200 | 11.757 | 24.075 | 29.061 |
Based on the actual geological conditions of the tungsten mine, typical geological scenarios within the mine were selected for rockburst prediction with the KMSORF model. Specifically, the rock layer predominantly consists of granite, with a burial depth of 380 m. Additionally, the model was applied to predict rockburst occurrences in a tunnel project situated in a heavily hilly, mountainous area40. In this project, the rock layer primarily comprises limestone, with a burial depth of 410 m. The mechanical parameters and prediction results are summarized in Table 4.
Table 4.
Engineering measurement data and validation results.
| Project name | Lithology | h/m | $\sigma_\theta$/MPa | $\sigma_c$/MPa | $\sigma_t$/MPa | $W_{et}$ | Actual grade | Forecast grade |
|---|---|---|---|---|---|---|---|---|
| A tunnel in a heavily hilly, mountainous area | Limestone | 410 | 12.7 | 190 | 8.9 | 7.1 | None | None |
| A tungsten mine in southern Jiangxi | Granite | 380 | 37.9 | 127.8 | 11.8 | 6.8 | None | None |
The prediction outcomes indicate a congruence between the model's predictions and the actual rockburst grades, affirming the accuracy of the model. This validation of the prediction results provides robust evidence of the model's reliability and its applicability to diverse geological settings.
Strengths and limitations
Compared to existing studies, this study introduces several notable innovations. Firstly, it employs Optuna for automatic hyperparameter adjustment, which reduces model training time and enhances efficiency without compromising performance. Secondly, it distinguishes shallow rockburst from other forms of rockburst, providing a fresh perspective on the prediction of shallow rockburst occurrences. This differentiation enriches predictive models and improves their accuracy in assessing specific geological conditions. Thirdly, the evaluation shows that the KMSORF model is the top performer among mainstream integrated models, particularly in small dataset scenarios, in terms of predictive accuracy and generalization. The accuracy of the model in this study is 0.8333, compared to 0.8 for mainstream models used for rockbursts with small datasets (e.g., the FA-SSA-PNN model41), which highlights the advantage of the present model.
However, despite these advancements, certain limitations persist. Chief among them is the challenge posed by small sample sizes, necessitating a greater influx of real rockburst data to augment model training and validation. Addressing this limitation will be crucial for further refining and validating predictive models in real-world applications.
Conclusion
This study investigated a long-term prediction model for shallow rockburst using machine learning techniques and data gathered from literature sources. Through a series of algorithmic combinations and enhancements, the KMSORF model was found to perform exceptionally well. The key conclusions drawn from this study are as follows:
The optimized KMSORF model, incorporating the KMeansSMOTE oversampling algorithm and the Optuna learning framework on the RF base algorithm, demonstrates remarkable efficacy in long-term prediction of shallow rockburst. The model achieves superior performance metrics compared to other mainstream models, with acc reaching 0.8333, and WP, WR, and WF values of 0.8588, 0.8333, and 0.8315, respectively. Application of the model to tungsten mining in southern Jiangxi Province and tunneling projects in hilly terrains yields predictions consistent with real-world outcomes, thereby validating the reliability and accuracy of the model. These findings suggest that the model can serve as a valuable tool for guiding exploration, design, and safe construction practices in shallow underground engineering projects.
The KMeansSMOTE algorithm effectively transforms data structures, leading to performance improvements across most models. While the enhancement for individual models surpasses that of integrated models, the latter exhibit superior overall performance both before and after dataset processing. Notably, among the 11 machine learning algorithms assessed, the RF algorithm achieves the highest Kappa coefficient of 0.7578. Additionally, the KNN model shows the largest improvement after KMeansSMOTE oversampling, with a gain of 0.3252.
Data analysis reveals strong correlations between certain parameters (e.g., depth and geological characteristics) and rockburst grade in shallow engineering regions. Addressing these correlations poses a crucial challenge, and future research should focus on leveraging multi-source, multi-phase data to deepen the understanding of rockburst occurrences and their underlying causal factors. This deeper understanding will inform the refinement of rockburst prevention and control programs, ultimately improving safety measures in shallow underground engineering.
Acknowledgements
This work was supported by the 2023 Jiangxi Province "Science and Technology + Emergency Response" Joint Program Project (2023KYG01002).
Author contributions
G.R. carried out the modeling design and theoretical analysis of this study and wrote the first draft. Y.R., J.W. and Q.H. completed the data analysis and guided the writing and revision of the paper. Q.L., Z.Y., R.X. and L.Z. participated in the modeling process and the analysis of results. All authors read and approved the final manuscript.
Data availability
All data that support the findings of this study are available from the corresponding author upon reasonable request.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Zhou J, Li X, Mitri HS. Evaluation method of rockburst: State-of-the-art literature review. Tunn. Undergr. Space Technol. 2018;81:632–659. doi: 10.1016/j.tust.2018.08.029.
- 2. He M, Cheng T, Qiao Y, Li H. A review of rockburst: Experiments, theories, and simulations. J. Rock Mech. Geotech. Eng. 2023;15(5):1312–1353. doi: 10.1016/j.jrmge.2022.07.014.
- 3. Xiating FENG, Yaxun XIAO, Guangliang FENG, et al. Study on the development process of rockbursts. Chin. J. Rock Mech. Eng. 2019;38(04):649–673.
- 4. Rong, H., Yu, S., Zhang, H. & Liang, B. Quantitative calculation of critical depth in typical rockburst mine. Adv. Civil Eng. 2020(1), 7968160 (2022).
- 5. Askaripour M, Saeidi A, Rouleau A, Mercier-langevin P. Rockburst in underground excavations: A review of mechanism, classification, and prediction methods. Undergr. Space. 2022;7:577–607. doi: 10.1016/j.undsp.2021.11.008.
- 6. Ma T, Lin D, Tang L, Li L, Tang CA, Yadav KP, Jin W. Characteristics of rockburst and early warning of microseismic monitoring at qinling water tunnel. Geomatics Nat. Hazards Risk. 2022;13(1):1366–1394. doi: 10.1080/19475705.2022.2073830.
- 7. Dong LIU, Xi-bing LI, Zhi-xiang LIU, et al. The induced mechanism for shallow rock burst below group goafs. Min. Metall. Eng. 2016;36(02):23–27.
- 8. Li TZ, Li YX, Yang XL. Rock burst prediction based on genetic algorithms and extreme learning machine. J. Cent. South Univ. 2017;24(9):2105–2113. doi: 10.1007/s11771-017-3619-1.
- 9. Kamran M, Wattimena RK, Armaghani DJ, Asteris PG, Jiskani IM, Mohamad ET. Intelligent based decision-making strategy to predict fire intensity in subsurface engineering environments. Process Saf. Environ. Prot. 2023;171:374–384. doi: 10.1016/j.psep.2022.12.096.
- 10. Kamran M, Shahani NM. Decision support system for the prediction of mine fire levels in underground coal mining using machine learning approaches. Min. Metall. Explor. 2022;39(2):591–601.
- 11. Kadkhodaei MH, Ghasemi E. Development of a semi-quantitative framework to assess rockburst risk using risk matrix and logistic model tree. Geotech. Geol. Eng. 2022;40(7):3669–3685. doi: 10.1007/s10706-022-02122-9.
- 12. Liu GF, Jiang Q, Feng GL, et al. Microseismicity-based method for the dynamic estimation of the potential rockburst scale during tunnel excavation. Bull. Eng. Geol. Environ. 2021;80(5):3605–3628. doi: 10.1007/s10064-021-02173-x.
- 13. Li X, Mao HY, Li B, et al. Dynamic early warning of rockburst using microseismic multi-parameters based on Bayesian network. Eng. Sci. Technol. Int. J. 2021;24(3):715–727.
- 14. Jia-hao SUN, Wen-jie WANG, Lian-ku XIE. Short-term rockburst prediction model based on microseismic monitoring and probability optimization Naive Bayes. Rock Soil Mech. 2024;2024(06):1–11.
- 15. Qiu Y, Zhou J. Short-term rockburst damage assessment in burst-prone mines: An explainable XGBOOST hybrid model with SCSO algorithm. Rock Mech. Rock Eng. 2023;2023:1–26.
- 16. Liang W, Sari YA, Zhao G, et al. Probability estimates of short-term rockburst risk with ensemble classifiers. Rock Mech. Rock Eng. 2021;54:1799–1814. doi: 10.1007/s00603-021-02369-3.
- 17. Sun L, Hu N, Ye Y, et al. Ensemble stacking rockburst prediction model based on Yeo-Johnson, K-means SMOTE, and optimal rockburst feature dimension determination. Sci. Rep. 2022;12(1):15352. doi: 10.1038/s41598-022-19669-5.
- 18. Li J, Fu H, Hu K, et al. Data preprocessing and machine learning modeling for rockburst assessment. Sustainability. 2023;15(18):13282. doi: 10.3390/su151813282.
- 19. Ullah B, Kamran M, Rui Y. Predictive modeling of short-term rockburst for the stability of subsurface structures using machine learning approaches: T-SNE, K-Means clustering and XGBoost. Mathematics. 2022;10(3):449. doi: 10.3390/math10030449.
- 20. Kamran M, Ullah B, Ahmad M, Sabri MMS. Application of KNN-based isometric mapping and fuzzy c-means algorithm to predict short-term rockburst risk in deep underground projects. Front. Public Health. 2022;10:1023890. doi: 10.3389/fpubh.2022.1023890.
- 21. Akiba, T. et al. Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. Vol. 2019, 2623–2631 (2019).
- 22. Douzas G, Bacao F, Last F. Improving imbalanced learning through a heuristic oversampling method based on k-means and SMOTE. Inf. Sci. 2018;465:1–20. doi: 10.1016/j.ins.2018.06.056.
- 23. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011;12:2825–2830.
- 24. Wang Y, Li W, Lee PKK, et al. Method of fuzzy comprehensive evaluations for rockburst prediction. Chin. J. Rock Mech. Eng. 1998;1998(05):15–23.
- 25. Zhangjun LIU, Qiuping YUAN, Jianlin LI. Application of fuzzy probability model to prediction of classification of rockburst intensity. Chin. J. Rock Mech. Eng. 2008;2008(S1):3095–3103.
- 26. Afraei S, Shahriar K, Madani SH. Developing intelligent classification models for rock burst prediction after recognizing significant predictor variables, Section 1: Literature review and data preprocessing procedure. Tunn. Undergr. Space Technol. 2019;83:324–353. doi: 10.1016/j.tust.2018.09.022.
- 27. Liu Guofeng Du, Chenghao FG, et al. Causative characteristics and prediction model of rockburst based on large and incomplete data set. Earth Sci. 2023;48(05):1755–1768.
- 28. Wang Y, Xu Q, Chai H, et al. Rock burst prediction in deep shaft based on RBF-AR model. J. Jilin Univ. Earth Sci. Ed. 2013;43(06):1943–1949+1965.
- 29. Hang Z, Xin L, Shikuo C, et al. Rockburst risk assessment of deep lying tunnels based on combination weight and unascertained measure theory: A case study of Sangzhuling tunnel on Sichuan-tibet traffic corridor. Earth Sci. 2022;47(06):2130–2148.
- 30. Feng XT, Wang LN. Rockburst prediction based on neural networks. Trans. Nonferrous Met. Soc. China. 1994;4(1):7–14.
- 31. Kidega R, Ondiaka MN, Maina D, Jonah KAT, Kamran M. Decision based uncertainty model to predict rockburst in underground engineering structures using gradient boosting algorithms. Geomech. Eng. 2022;30(3):259.
- 32. Kadkhodaei MH, Ghasemi E, Sari M. Stochastic assessment of rockburst potential in underground spaces using Monte Carlo simulation. Environ. Earth Sci. 2022;81(18):447. doi: 10.1007/s12665-022-10561-z.
- 33. Lin Y, Zhou K, Li J. Application of cloud model in rock burst prediction and performance comparison with three machine learning algorithms. IEEE Access. 2018;6:30958–30968. doi: 10.1109/ACCESS.2018.2839754.
- 34. Sun Y, Li G, Zhang J, Huang J. Rockburst intensity evaluation by a novel systematic and evolved approach: Machine learning booster and application. Bull. Eng. Geol. Environ. 2021;80:8385–8395. doi: 10.1007/s10064-021-02460-7.
- 35. Liu, Y. & Hou, S. Rockburst prediction based on particle swarm optimization and machine learning algorithm. In Information Technology in Geo-Engineering: Proceedings of the 3rd International Conference (ICITG), Guimarães, Portugal Vol. 3, 292–303 (Springer International Publishing, 2020).
- 36. Shukla R, Khandelwal M, Kankar PK. Prediction and assessment of rock burst using various meta-heuristic approaches. Min. Metall. Explor. 2021;38(3):1375–1381.
- 37. Liu Q, Xue Y, Li G, et al. Application of KM-SMOTE for rockburst intelligent prediction. Tunn. Undergr. Space Technol. 2023;138:105180. doi: 10.1016/j.tust.2023.105180.
- 38. Breiman L. Random forests. Mach. Learn. 2001;45:5–32. doi: 10.1023/A:1010933404324.
- 39. Kvålseth TO. Note on Cohen's kappa. Psychol. Rep. 1989;65(1):223–226. doi: 10.2466/pr0.1989.65.1.223.
- 40. Sun C. A prediction model of rock burst in tunnel based on the improved MATLAB-BP neural network. J. Chongqing Jiaotong Univ. Nat. Sci. 2019;38(10):41–49.
- 41. Xu G, Li K, Li M, Qin Q, Yue R. Rockburst intensity level prediction method based on FA-SSA-PNN model. Energies. 2022;15(14):5016. doi: 10.3390/en15145016.