Scientific Reports. 2025 Feb 28;15:7135. doi: 10.1038/s41598-025-91980-3

Physics-informed modeling of splitting tensile strength of recycled aggregate concrete using advanced machine learning

Kennedy C Onyelowe 1,2, Viroon Kamchoom 3, Shadi Hanandeh 4, S Anandha Kumar 5, Rolando Fabián Zabala Vizuete 6, Rodney Orlando Santillán Murillo 7, Susana Monserrat Zurita Polo 6, Rolando Marcel Torres Castillo 8, Ahmed M Ebid 9, Paul Awoyera 10, Krishna Prakash Arunachalam 11
PMCID: PMC11871005  PMID: 40021733

Abstract

Physics-informed modeling (PIM) using advanced machine learning (ML) represents a paradigm shift in the field of concrete technology, offering a potent blend of scientific rigor and computational efficiency. By harnessing the synergies between physics-based principles and data-driven algorithms, PIM-ML not only streamlines the design process but also enhances the reliability and sustainability of concrete structures. As research continues to refine these models and validate their performance, their adoption promises to revolutionize how concrete materials are engineered, tested, and utilized in construction projects worldwide. In this research work, an extensive literature review was conducted, which produced a globally representative database for the splitting tensile strength (Fsp) of recycled aggregate concrete. The studied concrete components, namely C, W, NCAg, PL, RCAg_D, RCAg_P, RCAg_wa, Vf, and F_type, were measured and tabulated. The collected 257 records were partitioned into a training set of 200 records (approximately 80%) and a validation set of 57 records (approximately 20%), in line with a more reliable partitioning of the database. Five advanced machine learning techniques, created using the “Weka Data Mining” software version 3.8.6, were applied to predict Fsp. The Hoffman & Gardener method and a set of performance metrics were used to evaluate the sensitivity of the variables and the performance of the ML models, respectively. The results show that the Kstar model demonstrates the highest level of performance and reliability among the models, achieving exceptional accuracy with an R2 of 0.96 and an Accuracy of 94%. Its RMSE and MAE are both low at 0.15 MPa, indicating minimal deviations between predicted and actual values. Additional metrics such as WI (0.99), NSE (0.96), and KGE (0.96) further confirm the model’s superior efficiency and consistent performance, making it the most dependable tool for practical applications. The sensitivity analysis also shows that water content (W) exerts the most significant impact at 40%, demonstrating that the amount of water in the mix is a critical factor for achieving optimal tensile strength. This underscores the need for careful water management to balance workability and strength in sustainable concrete production. Coarse natural aggregate (NCAg) has a substantial impact of 38%, indicating its essential role in maintaining the structural integrity of the concrete mix.

Keywords: Recycled aggregate concrete, Splitting tensile strength, Physics-informed modeling, Sustainable construction, Concrete structures

Subject terms: Engineering, Materials science

Introduction

Recycled aggregate concrete is a sustainable construction material that incorporates recycled aggregates, typically derived from crushed concrete waste, into new concrete mixes. This approach helps reduce the environmental impact of concrete production by reusing materials that would otherwise end up in landfills (see Fig. 1). Recycled aggregates can replace a portion of the natural aggregates, such as sand and gravel, used in traditional concrete1. The inclusion of recycled aggregates often results in concrete with slightly reduced strength compared to conventional concrete, but this can be mitigated by optimizing the mix design. Recycled aggregate concrete is environmentally beneficial as it reduces the demand for virgin raw materials, conserves natural resources, and lowers the carbon footprint associated with construction activities2. This type of concrete is gaining popularity in green building projects and sustainable infrastructure, offering a cost-effective and eco-friendly alternative to traditional concrete materials3. Concrete is the most ubiquitous human-engineered substance on Earth and the second most consumed commodity after water. In contrast to other technical materials such as steel, polymers, and wood, concrete is essential in the building industry owing to its distinctive attributes of strength, cost-effectiveness, mouldability, and durability4. Structures such as buildings, roads, bridges, and dams, along with several common infrastructure components, are predominantly constructed from concrete. Global annual concrete use has reached 35 billion tons, which is double the total of all other construction materials combined1. Zhang et al.2 employed four machine learning algorithms to evaluate the compressive strength and splitting tensile strength of steel fiber recycled aggregate concrete (SFR-RAC). The models were trained using Bayesian optimization with datasets of 465 and 339 strength records with varying mix proportions. 
The impact of various components on SFR-RAC strength was examined by partial correlation analysis and SHapley Additive exPlanations. AdaBoost and GBRT exhibited commendable performance, with a 20% discrepancy between predicted and measured data. The study advocates for the enhanced integration of out-of-range data and features in further research endeavors. Han et al.3 noted that nonlinear fitting approaches struggle to characterize material fatigue performance and that comprehensive material fatigue tests are costly. To resolve these concerns, they presented a physics-informed neural network integrated with Viscoelastic Continuum Damage Mechanics (VECD). The PINN-AFP can precisely forecast the entire material C-S curve using a minimal quantity of pre-fatigue data, resulting in accurate fatigue life estimations. The case study employs AC-25 fatigue test data, showcasing robust generalization capability and predictive precision, attaining state-of-the-art outcomes with an average fatigue life prediction error of 5.2%. Pande et al.4 investigated the influence of nano-materials on the strength of high-performance concrete, a subject that has been predominantly neglected due to constraints in current research. The research suggests a comprehensive methodology that combines Generative Adversarial Networks, Finite Element Analysis, Molecular Dynamics, and Long Short-Term Memory. Generative Adversarial Networks (GANs) produce synthetic data, thereby improving robustness and training efficacy. The ensemble learning model, trained on the augmented dataset, enhances predictions by 15–20% and decreases Root Mean Squared Error. Multiscale modeling using molecular dynamics simulation elucidates nanoscale material interactions, supplying parameters for finite element analysis, which forecasts macroscale structural behavior. This leads to enhanced compressive strength and improved model accuracy. 
LSTM networks are employed for predictive maintenance, projecting performance deterioration over time, and SHapley Additive exPlanations (SHAP) are used to interpret the black-box model. The suggested framework may reduce the RMSE in anticipated strength deterioration by 5–10% and indicates that nano-materials account for up to 35% of the variance in strength predictions. Tipu et al.5 used machine learning algorithms to forecast the compressive strength of concrete containing recycled coarse aggregate. The dataset was drawn from literature studies, and three models were used: Random Forest Regression, Gradient Boosting Regression, and XGBoost Regression. The results indicate exemplary performance across all models, with XGBoost emerging as the leading performer. The research emphasizes the impact of variables such as curing age, cement, and fly ash on feature significance. This study enhances comprehension of concrete characteristics with recycled coarse aggregate and underscores the potential of machine learning in sustainable construction methodologies. Ahmad et al.6 introduced statistical models employing linear regression (LR), non-linear regression (NLR), and artificial neural networks (ANN) to forecast the compressive strength of foam concrete. The models utilize 97 experimental data sets and are evaluated using statistical metrics such as R2, RMSE, and MAE. The ANN model demonstrates superior efficacy, achieving a 36% R2 value, a 22% RMSE, and markedly reduced MAE and RMSE values in comparison to both LR and NLR models. The findings underscore the efficacy of foam concrete in decreasing building expenses and enhancing overall performance. Li et al.7 employed Machine Learning (ML) models to enhance the prediction of concrete compressive strength (CS), examining 1030 experimental data points from prior research databases. 
The models comprised both non-ensemble and ensemble variants, utilizing eight input parameters: cement, blast-furnace slag, coarse and fine aggregates, fly ash, water, superplasticizer, and curing duration. Thorough performance assessments were executed utilizing visual and quantitative techniques with k-fold cross-validation to evaluate dependability and precision. A sensitivity analysis employing SHapley Additive exPlanations (SHAP) was performed to ascertain the impact of each input variable on CS. The Categorical Gradient Boosting (CatBoost) model had the highest accuracy during testing, achieving a determination coefficient (R2) of 0.966 and a minimal Root-Mean-Square-Error (RMSE) of 3.06 MPa. The age of the concrete was determined to be the most significant input for prediction. A Graphical User Interface (GUI) was provided for designers to efficiently and affordably forecast concrete compressive strength, as an alternative to expensive computational or experimental evaluations. Mater et al.8 combined waste management with artificial intelligence within the construction sector by using an artificial neural network (ANN) model to forecast the compressive strength of green concrete. The model incorporates recycled coarse aggregate (RCA), recycled fine aggregate (RFA), and fly ash (FA) as partial substitutes for concrete components. The model is developed, trained, and verified in Python with experimental data obtained from the literature. The results indicated that substituting 10% of cement with fly ash (FA) resulted in a marginal decrease of up to 9% in compressive strength, particularly at early ages. A roughly 40% reduction in 28-day compressive strength was noted when fine aggregate was substituted with 25% recycled fine aggregate (RFA). This study focuses on the typical compressive strength range of green concrete, from 25 to 40 MPa. 
The model is designed to be adaptable and user-friendly, thereby facilitating the sustainable development of the construction sector by saving time, effort, and costs in experimental testing. Green concrete incorporating waste materials can address environmental issues including waste disposal, resource depletion, and energy usage (see Fig. 1).

Fig. 1. Schematic illustration of the sustainable use of industrial waste in construction.

Wang et al.9 employed artificial intelligence to improve the durability, sustainability, safety, and recyclability of concrete, fiber-reinforced composites, and metals. This technology tackles material research difficulties and emphasizes the distinctive attributes of the building sector within the Industry 4.0 paradigm. The incorporation of artificial intelligence in construction materials will commence with digitization, advance to sophisticated production, and ultimately lead to intelligent building operations. This transformative future entails the precise integration of artificial intelligence, big data, theoretical frameworks, experiments, and computations. Baduge et al.10 offered an extensive analysis of the application of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) within the context of Industry 4.0 in the building and construction sector. It encompasses multiple facets of architectural design, material design, structural design, offsite manufacturing, construction management, intelligent operation, building management, health monitoring, durability, life cycle analysis, and circular economy. The paper examines data gathering strategies, data cleaning techniques, and data storage for model development, emphasizing the obstacles and approaches to address them. It additionally delineates prospective trends and research opportunities in these domains. In this paper, five different ML techniques, namely “Semi-supervised classifier (Kstar)”, “M5 classifier (M5Rules)”, “Elastic net classifier (ElasticNet)”, “Correlated Nystrom Views (XNV)”, and “Decision Table (DT)”, were applied to predict the splitting tensile strength of recycled aggregate concrete using the collected database. This combination of advanced machine learning techniques has not been used previously.

Innovative statement and research gap

The present research work introduces an innovative approach by employing a combination of advanced machine learning techniques—Kstar, M5Rules, ElasticNet, XNV, and DT—to predict the splitting tensile strength of recycled aggregate concrete (RAC). This integration of diverse machine learning models is a novel contribution, as previous studies have primarily focused on predicting the compressive strength of concrete using traditional machine learning and statistical methods. While earlier research has successfully demonstrated the application of ML models in forecasting the mechanical properties of concrete, most studies have been limited to compressive strength prediction, often neglecting the critical aspect of splitting tensile strength, which is a key factor in structural performance and durability assessments. Additionally, the reviewed literature indicates a predominant reliance on conventional regression-based models and ensemble learning techniques, with limited exploration of semi-supervised learning and correlated feature analysis in concrete performance modeling. The identified research gap lies in the lack of comprehensive investigations utilizing a diverse range of machine learning algorithms to predict RAC’s splitting tensile strength with high accuracy. Furthermore, existing studies have not extensively examined the sensitivity of key mix design parameters, such as void fraction, water content, and aggregate properties, in influencing RAC’s mechanical behavior. This research fills this gap by incorporating a multi-model approach, performing an in-depth sensitivity analysis, and validating the potential of RAC as a sustainable construction material with optimized mechanical properties. The study’s novel methodology not only enhances predictive accuracy but also provides practical insights into optimizing mix proportions for RAC, ultimately supporting the development of more sustainable and efficient construction practices.

Methodology

Collected database and preliminary analysis

This work includes an extensive literature review, which produced a globally representative database for the splitting tensile strength (Fsp) of recycled aggregate concrete11. The studied concrete components, namely C, W, NCAg, PL, RCAg_D, RCAg_P, RCAg_wa, Vf, and F_type, were measured and tabulated. The collected 257 records were divided into a training set (200 records, approximately 80%) and a validation set (57 records, approximately 20%), in line with a more reliable partitioning of the database published in the literature12. Table 1 summarizes their statistical characteristics. Finally, Fig. 2 shows the violin distribution of each input and Fig. 3 shows the Pearson correlation matrix, histograms, and the relations between variables.
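The partitioning described above can be sketched as follows. This is only an illustration: the source does not state how records were assigned to each set, so the seeded random shuffle and the function name `split_records` are assumptions.

```python
import random


def split_records(n_records=257, n_train=200, seed=42):
    """Shuffle record indices and split them into training and validation sets.

    The random shuffle and the seed are assumptions; the paper only reports
    the resulting set sizes (200 and 57 records).
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    return indices[:n_train], indices[n_train:]


train_idx, valid_idx = split_records()
print(len(train_idx), len(valid_idx))            # 200 57
print(round(100 * len(train_idx) / 257, 1))      # 77.8
print(round(100 * len(valid_idx) / 257, 1))      # 22.2
```

The exact shares are 77.8% and 22.2%, which the text rounds to the nominal 80/20 split.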

Table 1. Statistical analysis of collected databases.

                  C        W        NCAg     RCAg     PL       RCAg_D   RCAg_P   RCAg_wa  Vf       F_Type   Fsp
                  (kg/m3)  (kg/m3)  (kg/m3)  (kg/m3)  (kg/m3)  (mm)     (kg/m3)  (%)      (%)      (-)      (MPa)
Training set
  Max             548.4    343.5    1143.0   1474.0   7.8      25.0     2640.0   10.9     1.8      6.0      7.0
  Min             158.0    98.3     0.0      59.0     0.0      10.0     2010.0   1.9      0.0      0.0      1.4
  Avg             371.9    185.6    322.4    750.4    1.1      17.9     2423.3   5.6      0.3      1.3      3.1
  SD              60.7     35.2     361.8    377.2    1.7      4.4      153.7    1.9      0.4      1.8      1.1
  Var             0.2      0.2      1.1      0.5      1.6      0.2      0.1      0.3      1.6      1.4      0.4
Validation set
  Max             600.0    343.5    1143.0   1474.0   5.0      25.0     2610.0   10.9     1.8      5.0      6.3
  Min             210.0    98.3     0.0      180.8    0.0      10.0     2010.0   1.9      0.0      0.0      1.5
  Avg             372.3    185.8    332.8    741.2    0.7      18.2     2414.0   5.3      0.3      1.5      3.1
  SD              69.8     39.8     365.6    397.0    1.2      3.9      157.4    1.7      0.4      1.8      1.0
  Var             0.2      0.2      1.1      0.5      1.6      0.2      0.1      0.3      1.4      1.2      0.3

Fig. 2. Violin distribution for each input.

Fig. 3. Correlation, distribution and interpreting chart.

Sensitivity analysis

The Hoffman and Gardener sensitivity analysis is a powerful tool for identifying and quantifying the influence of key variables on the performance metrics of a system13. In the case of predicting the splitting tensile strength of recycled aggregate concrete, the analysis provided valuable insights into the relative importance of specific parameters in optimizing the concrete mix design for sustainable construction. The results highlighted that the void fraction (Vf) consistently emerged as the most critical variable across both datasets, exerting the highest impact on tensile strength. This finding underscores the significance of managing porosity and ensuring adequate compaction in concrete production to achieve superior mechanical properties. The void fraction’s dominant role suggests it is a priority area for process optimization, particularly in applications where recycled materials are used. Water content (W) was also identified as a major contributor, with its impact varying slightly between the datasets but remaining highly influential. This result emphasizes the need for precise control over the water-cement ratio, as it directly affects workability and strength development. In conjunction with void fraction, water content forms the cornerstone of an effective mix design strategy. The natural coarse aggregate (NCAg) and the properties of recycled coarse aggregates, including water absorption (RCAg_wa), particle purity (RCAg_P), and particle density (RCAg_D), also demonstrated significant influence. These findings highlight the importance of selecting high-quality aggregates and controlling their characteristics to enhance the mechanical performance of recycled aggregate concrete. Recycled aggregate properties were shown to play a particularly vital role in ensuring the compatibility and durability of the final product, aligning with sustainable production goals. 
Cement content (C) showed a substantial but secondary impact, indicating that while it is a crucial binder in the mix, its influence on tensile strength is moderated by other factors like void fraction and water content. Fine aggregate type (F_type) contributed moderately to the tensile strength, reflecting its role in improving particle packing and overall cohesion in the mix. Interestingly, the plastic limit (PL) had no measurable effect in this analysis, suggesting it may not be a critical parameter in the context of splitting tensile strength for recycled aggregate concrete. This result allows for the deprioritization of plastic limit considerations, focusing efforts on more impactful variables. In summary, the Hoffman and Gardener sensitivity analysis provides a comprehensive understanding of the factors influencing the tensile strength of recycled aggregate concrete. It emphasizes the need for rigorous control of void fraction, water content, and aggregate properties to achieve high-performance, sustainable concrete. These insights serve as a foundation for optimizing mix designs, improving efficiency, and advancing the sustainable use of recycled materials in construction. A preliminary sensitivity analysis was carried out on the collected database to estimate the impact of each input on the (Y) values. “Single variable per time” technique is used to determine the “Sensitivity Index” (SI) for each input using Hoffman & Gardener formula13 as follows:

$$SI(X_n) = \frac{Y(X_{\max}) - Y(X_{\min})}{Y(X_{\max})} \tag{1}$$
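A minimal sketch of the “single variable per time” procedure, assuming the commonly cited Hoffman and Gardener form SI = (Y(Xmax) − Y(Xmin)) / Y(Xmax): each input is moved from its minimum to its maximum while the others stay at a baseline value. The surrogate `toy_model` and its coefficients are hypothetical and are not taken from the paper; the absolute values and the conversion of the indices into percentage shares are likewise assumptions made for illustration. The input ranges and averages are the training-set values of C and W from Table 1.

```python
def sensitivity_indices(model, baseline, bounds):
    """One-at-a-time sensitivity index for each input.

    model    : callable taking a dict of inputs and returning a scalar Y
    baseline : dict of input name -> value held fixed for the other inputs
    bounds   : dict of input name -> (minimum, maximum)
    """
    si = {}
    for name, (x_min, x_max) in bounds.items():
        y_at_min = model({**baseline, name: x_min})
        y_at_max = model({**baseline, name: x_max})
        # Absolute values keep the index non-negative (an assumption).
        si[name] = abs(y_at_max - y_at_min) / abs(y_at_max)
    return si


def as_percent_shares(si):
    """Rescale the indices so that they sum to 100%."""
    total = sum(si.values())
    return {name: 100.0 * value / total for name, value in si.items()}


def toy_model(inputs):
    # Hypothetical linear surrogate for Fsp (MPa); NOT a model from the paper.
    return 2.0 + 0.004 * inputs["C"] - 0.005 * inputs["W"]


baseline = {"C": 371.9, "W": 185.6}                  # training-set averages
bounds = {"C": (158.0, 548.4), "W": (98.3, 343.5)}   # training-set min and max
si = sensitivity_indices(toy_model, baseline, bounds)
shares = as_percent_shares(si)
print(si)
print(shares)
```

With a fitted model in place of `toy_model`, the same loop would produce one index per mix variable.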

Research program

Five different ML techniques were used to predict the splitting tensile strength of the concrete using the collected database. These techniques are “Semi-supervised classifier (Kstar)”, “M5 classifier (M5Rules)”, “Elastic net classifier (ElasticNet)”, “Correlated Nystrom Views (XNV)”, and “Decision Table (DT)”. All models were created using “Weka Data Mining” software version 3.8.6. The following section discusses the results of each model. The accuracies of the developed models were evaluated by comparing the sum of squared errors (SSE), mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), Error (%), Accuracy (%), coefficient of determination (R2), correlation coefficient (R), Willmott index (WI), Nash–Sutcliffe Efficiency (NSE), Kling-Gupta Efficiency (KGE) and symmetric mean absolute percentage error (SMAPE) between the predicted and calculated splitting tensile strength values. The definition of each measurement is presented in Eqs. (2–12). The results of all developed models are summarized and reported.

[Equations (2)–(12): definitions of the performance metrics; the equation images (M3.gif to M13.gif) are not reproduced in this text.]
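For orientation, the sketch below computes most of the listed metrics from their standard textbook definitions. These may differ in detail from the paper's Eqs. (2–12), which are not available here. In particular, taking R2 as the square of the correlation coefficient is an assumption, and Error (%) and Accuracy (%) are omitted because their definitions cannot be recovered from the text. The function name `regression_metrics` and the sample values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Standard goodness-of-fit metrics for paired observed/predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    errors = [o - p for o, p in zip(observed, predicted)]

    sse = sum(e * e for e in errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sse / n
    rmse = math.sqrt(mse)

    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    r = cov / math.sqrt(ss_o * ss_p)            # Pearson correlation coefficient

    nse = 1.0 - sse / ss_o                      # Nash-Sutcliffe efficiency
    wi = 1.0 - sse / sum(                       # Willmott index of agreement
        (abs(p - mean_o) + abs(o - mean_o)) ** 2
        for o, p in zip(observed, predicted)
    )
    sd_ratio = math.sqrt(ss_p / n) / math.sqrt(ss_o / n)
    mean_ratio = mean_p / mean_o
    kge = 1.0 - math.sqrt(                      # Kling-Gupta efficiency
        (r - 1.0) ** 2 + (sd_ratio - 1.0) ** 2 + (mean_ratio - 1.0) ** 2
    )
    smape = 100.0 / n * sum(                    # symmetric MAPE, in percent
        abs(o - p) / ((abs(o) + abs(p)) / 2.0)
        for o, p in zip(observed, predicted)
    )
    return {
        "SSE": sse, "MAE": mae, "MSE": mse, "RMSE": rmse,
        "R": r, "R2": r ** 2,                   # R2 = R^2 is an assumption
        "WI": wi, "NSE": nse, "KGE": kge, "SMAPE": smape,
    }


# Hypothetical strengths in MPa, for illustration only.
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.1, 2.9, 4.2, 4.8]
metrics = regression_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MAE and RMSE carry the units of the target (MPa) and SSE and MSE carry squared units, while R, R2, WI, NSE and KGE are dimensionless and SMAPE is a percentage.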

Theory of the selected machine learning methods

Semi-supervised classifier (Kstar)

A semi-supervised classifier like K* (Kstar) is a machine learning algorithm designed to work with both labeled and unlabeled data. It combines the strengths of supervised learning, which uses labeled data, and unsupervised learning, which leverages the structure of unlabeled data, to improve classification performance when labeled data is scarce or expensive to obtain14–25. K* is based on the nearest-neighbor approach and employs a probabilistic framework to classify instances. It computes similarities between labeled and unlabeled data points to propagate the labels across the dataset, making use of the inherent structure in the data15. By incorporating both labeled and unlabeled data, K* can achieve higher accuracy than traditional supervised classifiers, especially in scenarios where labeled data is limited16. This makes K* particularly useful in fields like natural language processing, image recognition, and bioinformatics, where obtaining a large labeled dataset can be challenging. The instance-based semi-supervised classifier, sometimes referred to as Kstar (or K*), classifies data using an entropy-based distance function26. The main concept is to use a probability-based metric to assess how similar two examples are. It is semi-supervised because it may employ both labeled and unlabeled data during training to create a more reliable classifier. Figure 4 shows a general structure of the Kstar approach.

Fig. 4. General structure of the Kstar approach (adapted from ref. 26).

The transformation probability P(x → y) is defined as the probability of transforming instance x into instance y. The computation involves all possible transformations:

$$P(x \to y) = \sum_{\bar{t}\,:\,\bar{t}(x) = y} \; \prod_{t \in \bar{t}} P_t \tag{13}$$

where P_t represents the probability of each transformation step t in a transformation sequence \bar{t}. The distance between two instances x and y is given by:

$$d(x, y) = -\log_2 P(x \to y) \tag{14}$$

The class of the nearest neighbors is used for classification. The most probable class is determined using:

$$\hat{c} = \arg\max_{c} \sum_{i\,:\,y_i = c} w_i \, P(x_i \to x) \tag{15}$$

where w_i is the weight of neighbor x_i, y_i is its class, and P(x_i → x) is the probability of transformation.
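The entropy-based distance and the probability-weighted use of neighbors can be illustrated with a simplified, K*-inspired sketch for numeric attributes. It assumes an exponential transformation density per attribute with a fixed, user-chosen scale, whereas the Weka implementation derives the scale from the blend parameter. It also returns a probability-weighted average of the targets, because Fsp is a continuous quantity. All names (`transform_prob`, `kstar_distance`, `kstar_like_predict`) and the sample values are hypothetical.

```python
import math


def transform_prob(a, b, scales):
    """Probability density of transforming instance a into instance b.

    Each numeric attribute contributes exp(-|a_j - b_j| / s_j) / (2 s_j);
    the fixed scales s_j are an assumption of this sketch.
    """
    p = 1.0
    for a_j, b_j, s_j in zip(a, b, scales):
        p *= math.exp(-abs(a_j - b_j) / s_j) / (2.0 * s_j)
    return p


def kstar_distance(a, b, scales):
    """Entropy-based distance: minus log2 of the transformation probability."""
    return -math.log2(transform_prob(a, b, scales))


def kstar_like_predict(x, train_X, train_y, scales):
    """Average of the training targets weighted by P(x_i -> x)."""
    weights = [transform_prob(x_i, x, scales) for x_i in train_X]
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)


# Hypothetical (W, C) pairs in kg/m3 and strengths in MPa, for illustration.
train_X = [(180.0, 350.0), (200.0, 400.0), (160.0, 300.0)]
train_y = [3.0, 3.6, 2.5]
scales = (20.0, 50.0)

d_same = kstar_distance(train_X[0], train_X[0], scales)
d_far = kstar_distance(train_X[0], train_X[1], scales)
pred = kstar_like_predict(train_X[0], train_X, train_y, scales)
print(d_same, d_far, pred)
```

For instances that are very far apart the product of exponentials can underflow to zero, so a production implementation would work with log-probabilities instead.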

M5 classifier (M5Rules)

The M5 classifier, specifically M5Rules, is a rule-based machine learning algorithm used for regression and classification tasks. It is an extension of the M5 model tree, which generates a decision tree structure where each leaf node contains a regression model. M5Rules works by transforming the decision tree into a set of human-readable rules, which represent the learned patterns in the data17. These rules are derived by partitioning the feature space into smaller regions and associating each region with a prediction, improving interpretability. The algorithm builds an ensemble of regression trees and then extracts the rules from each tree to form a rule set that predicts outputs based on input features. M5Rules is valuable for tasks that require not only predictive power but also the ability to understand and interpret the decision-making process. It is especially useful in domains where transparency is crucial, such as in medical diagnosis or financial modeling18. The M5 model tree is the source of the rule-based classifier M5Rules. For predictive modeling, it blends linear regression and decision tree techniques27. Linear regression models are found at the leaf nodes of the decision tree, which divides data according to attributes (Fig. 5).

Fig. 5. M5 model framework (adapted from Khalid et al.28).

Each leaf node uses a regression equation:

$$y = \beta_0 + \sum_{i=1}^{n} \beta_i x_i \tag{16}$$

where β0 is the intercept, βi are the coefficients, and xi are the input features. The decision tree splits are determined by minimizing the variance in the target variable y:

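$$\mathrm{Var}(y) = \frac{1}{N} \sum_{j=1}^{N} \left( y_j - \bar{y} \right)^2 \tag{17}$$

where N is the number of records reaching the node and \bar{y} is their mean target value.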
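As an illustration of how a split can be chosen by reducing the spread of the target, the sketch below scores every candidate threshold on a single feature by the standard deviation reduction (SDR) that is commonly associated with M5 model trees. It is a simplification: there is no pruning, no smoothing, no minimum leaf size and no linear models in the leaves. The function names and sample data are hypothetical.

```python
import statistics


def sdr(parent, left, right):
    """Standard deviation reduction achieved by splitting parent into two parts."""
    n = len(parent)
    return (
        statistics.pstdev(parent)
        - len(left) / n * statistics.pstdev(left)
        - len(right) / n * statistics.pstdev(right)
    )


def best_split(x, y):
    """Return (threshold, sdr) of the best binary split on one feature."""
    pairs = sorted(zip(x, y))
    targets = [t for _, t in pairs]
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        gain = sdr(targets, targets[:i], targets[i:])
        if best is None or gain > best[1]:
            best = (threshold, gain)
    return best


# Hypothetical water contents (kg/m3) and strengths (MPa), for illustration.
water = [150.0, 160.0, 170.0, 200.0, 210.0, 220.0]
fsp = [3.8, 3.7, 3.9, 2.5, 2.4, 2.6]
threshold, gain = best_split(water, fsp)
print(threshold, round(gain, 3))
```

On this toy data the best threshold falls at 185.0, midway between the two clusters of water content.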

Elastic net classifier (ElasticNet)

The Elastic Net classifier, or ElasticNet, is a machine learning algorithm that combines the properties of both Lasso (L1 regularization) and Ridge (L2 regularization) regression techniques. This hybrid approach is designed to improve model performance, especially when dealing with datasets that have a large number of correlated features19. ElasticNet performs feature selection like Lasso but also maintains the stability of Ridge regression, which is particularly useful in cases where there are more features than observations or when features are highly correlated25. By incorporating both L1 and L2 penalties, ElasticNet minimizes the risk of overfitting while ensuring that the model generalizes well to new, unseen data. The classifier is particularly useful for linear classification problems, where it can handle sparse, high-dimensional data effectively, and it is widely applied in fields such as bioinformatics, finance, and marketing. Elastic Net is a linear regression method that combines L1-norm (lasso) and L2-norm (ridge) penalties29. It is particularly useful when features are correlated.

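Objective function:

$$\min_{\beta} \; \frac{1}{2n} \lVert y - X\beta \rVert_2^2 + \lambda \left[ \alpha \lVert \beta \rVert_1 + \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2 \right] \tag{18}$$

where \lVert \beta \rVert_1 = \sum_j |\beta_j| is the L1 (lasso) penalty, \lVert \beta \rVert_2^2 = \sum_j \beta_j^2 is the L2 (ridge) penalty, \lambda is the regularization parameter, and \alpha is the mixing parameter (0 \le \alpha \le 1). Elastic Net uses coordinate descent to optimize the coefficients β.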
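The coordinate descent idea can be sketched in a few lines. The sketch assumes the widely used scaling of the objective, (1/2n)·||y − Xβ||² + λ[α·||β||₁ + ((1 − α)/2)·||β||₂²], which may differ from the exact form in the paper. It also assumes centered data with no intercept and uses a fixed number of sweeps instead of a convergence test. The function names and the toy data are hypothetical.

```python
def soft_threshold(rho, t):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    if rho > t:
        return rho - t
    if rho < -t:
        return rho + t
    return 0.0


def elastic_net_objective(X, y, beta, lam, alpha):
    """Value of the (assumed) elastic net objective for the coefficients beta."""
    n = len(y)
    rss = sum(
        (y_i - sum(b * x for b, x in zip(beta, row))) ** 2
        for row, y_i in zip(X, y)
    )
    l1 = sum(abs(b) for b in beta)
    l2 = sum(b * b for b in beta)
    return rss / (2.0 * n) + lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2)


def elastic_net_cd(X, y, lam, alpha, n_sweeps=200):
    """Cyclic coordinate descent for the elastic net (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0
            z = 0.0
            for row, y_i in zip(X, y):
                # Residual with the contribution of feature j removed.
                partial = y_i - sum(beta[k] * row[k] for k in range(p) if k != j)
                rho += row[j] * partial
                z += row[j] ** 2
            rho /= n
            z /= n
            beta[j] = soft_threshold(rho, lam * alpha) / (z + lam * (1.0 - alpha))
    return beta


# Toy centered data generated from y = 2*x1 - 1*x2 (illustration only).
X = [[1.0, 2.0], [2.0, 1.0], [-1.0, -2.0], [-2.0, -1.0], [1.0, -1.0], [-1.0, 1.0]]
y = [2.0 * a - 1.0 * b for a, b in X]

beta_ols = elastic_net_cd(X, y, lam=0.0, alpha=0.5)    # no penalty
beta_en = elastic_net_cd(X, y, lam=0.5, alpha=0.5)     # moderate penalty
beta_big = elastic_net_cd(X, y, lam=100.0, alpha=1.0)  # heavy lasso penalty
print(beta_ols, beta_en, beta_big)
```

With no penalty the sketch recovers the generating coefficients, and with a sufficiently large L1 penalty every coefficient is shrunk to exactly zero, which is the feature selection behavior described above.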

Correlated Nystrom Views (XNV)

Correlated Nystrom Views (XNV) is a machine learning method that aims to improve the efficiency and accuracy of large-scale kernel-based learning algorithms. It combines the Nystrom method, a technique for approximating large kernel matrices, with the concept of correlated views to enhance the approximation process. The Nystrom method works by selecting a subset of data points (landmarks) and computing the kernel matrix for those points, which is then used to approximate the full kernel matrix for all data points24. XNV extends this idea by considering the correlations between the selected landmarks, allowing for a more accurate and stable approximation. This method is especially useful when dealing with large datasets where direct computation of kernel matrices is computationally expensive20. XNV improves both the efficiency and scalability of kernel-based algorithms, making them more practical for applications in areas like computer vision, natural language processing, and bioinformatics. XNV employs the Nystrom approximation for kernel approaches to handle large-scale data effectively30. Using linked embeddings, it integrates several data views (such as feature sets). Nystrom Approximation for Kernel Matrix K:

$$K \approx C \, W^{+} \, C^{\top} \tag{19}$$

where C is the submatrix of K formed by the columns corresponding to the sampled points, W is the submatrix of K at the intersection of the rows and columns of the sampled points, and W^{+} is the pseudoinverse of W. Correlated Nystrom Views combines the embeddings from different views using a correlation matrix R:

[Equation (20): the equation image (M27.gif) is not reproduced in this text.]

where Z_i is the embedding from the ith view and R is the correlation matrix learned during training. The embeddings Z are fed into a classifier (e.g., SVM or logistic regression).
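The Nystrom step on its own, without the correlated-views combination, can be sketched as follows. The sketch assumes an RBF kernel and distinct landmark points, so that W is invertible and a plain matrix inverse can stand in for the pseudoinverse. The kernel choice, the `gamma` value, the function names and the sample points are all assumptions.

```python
import math


def rbf(a, b, gamma=0.5):
    """RBF kernel between two points (the kernel choice is an assumption)."""
    return math.exp(-gamma * sum((a_i - b_i) ** 2 for a_i, b_i in zip(a, b)))


def invert(M):
    """Gauss-Jordan inverse with partial pivoting; assumes M is invertible."""
    n = len(M)
    A = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [v_r - factor * v_c for v_r, v_c in zip(A[r], A[col])]
    return [row[n:] for row in A]


def nystrom(points, landmark_idx, gamma=0.5):
    """Approximate the full kernel matrix as C W^{-1} C^T."""
    n, m = len(points), len(landmark_idx)
    C = [[rbf(p, points[j], gamma) for j in landmark_idx] for p in points]
    W = [[rbf(points[i], points[j], gamma) for j in landmark_idx]
         for i in landmark_idx]
    W_inv = invert(W)
    CW = [[sum(C[i][k] * W_inv[k][j] for k in range(m)) for j in range(m)]
          for i in range(n)]
    return [[sum(CW[i][k] * C[j][k] for k in range(m)) for j in range(n)]
            for i in range(n)]


# Hypothetical 2-D points; points 0 and 3 act as landmarks.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 2.0)]
K_approx = nystrom(points, landmark_idx=[0, 3])
K_exact = [[rbf(p, q) for q in points] for p in points]
print(K_approx[0][4], K_exact[0][4])
```

Rows and columns that belong to the landmark points are reproduced exactly; the remaining entries are approximations whose quality depends on how well the landmarks cover the data.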

Decision table (DT)

A Decision Table (DT) is a model used for classification tasks that represents rules in a tabular format. Each row of the table corresponds to a decision rule, while the columns represent the conditions (input attributes) and the corresponding outcomes (class labels)23. Decision tables are structured in such a way that the conditions are combined logically to form rules, which are then used to make predictions based on new input data31,32. The model is intuitive and transparent, making it easy to understand and interpret the decision-making process21. Decision Tables are particularly useful when dealing with discrete data, as they allow for clear representation of complex decision boundaries22. Though not as flexible as some other machine learning models, Decision Tables are simple, computationally efficient, and easy to implement, making them suitable for small to medium-sized problems, especially when interpretability is a key requirement. A decision table is a basic rule-based classifier that uses a table with actions (class labels) and conditions (attributes) to represent knowledge31. Key equations: each row in the decision table corresponds to a rule: If A_1 = v_1 and A_2 = v_2 and … then Class = c. When multiple rows match, the class is determined by majority voting:

$$\hat{c} = \arg\max_{c} \sum_{i=1}^{m} \mathbb{1}(c_i = c) \tag{21}$$

where \mathbb{1}(\cdot) is the indicator function and c_i is the class of the ith of the m matching rows.
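The lookup and majority-voting steps can be sketched as follows, assuming discretized attributes and exact matching of all conditions. The attribute names, bin labels, class labels and the fallback `default` value are hypothetical, and the sketch does not attempt the attribute selection a full decision table learner would perform.

```python
from collections import Counter


def decision_table_predict(table, query, default=None):
    """Return the majority class among the rows whose conditions match the query.

    table   : list of (conditions, label) pairs, with conditions as a dict
    query   : dict of attribute name -> discretized value
    default : label returned when no row matches (an assumption of this sketch)
    """
    matches = [label for conditions, label in table if conditions == query]
    if not matches:
        return default
    return Counter(matches).most_common(1)[0][0]


# Hypothetical discretized rows, for illustration only.
table = [
    ({"W": "high", "Vf": "low"}, "low strength"),
    ({"W": "high", "Vf": "low"}, "low strength"),
    ({"W": "high", "Vf": "low"}, "medium strength"),
    ({"W": "low", "Vf": "high"}, "high strength"),
]

print(decision_table_predict(table, {"W": "high", "Vf": "low"}))
print(decision_table_predict(table, {"W": "low", "Vf": "high"}))
print(decision_table_predict(table, {"W": "low", "Vf": "low"}, default="unknown"))
```

For a continuous target such as Fsp, the vote over the matching rows would naturally be replaced by an average of their target values.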

Results and discussion

Kstar model

Figure 6 shows the considered hyper-parameters of the Kstar model and Fig. 7 presents the relation between the measured and predicted values of the splitting tensile strength of the recycled aggregate concrete. The hyperparameters in the provided Fig. 6 influence its performance in various ways: batchSize (100) determines the number of instances processed at once, affecting memory usage and computational efficiency. debug (False) disables debugging output, likely improving runtime efficiency. doNotCheckCapabilities (False) ensures that input data compatibility checks are performed before execution. entropicAutoBlend (False) disables automatic adjustment of entropic blending, meaning manual control is required. globalBlend (20) defines the weight assigned to the entropic distance measure, impacting classification sensitivity.

Fig. 6. The considered hyper-parameters of (Kstar) model.

Fig. 7. Relation between predicted and calculated strength using (Kstar).

missingMode (Average column entropy curves) specifies how missing values are handled, using entropy-based imputation for better predictive performance. numDecimalPlaces (2) controls numerical precision in the output, ensuring consistent formatting. These settings indicate a manually controlled configuration with entropy-based missing value handling and a moderate level of entropic blending. The model produced SSE of 3.5, MAE of 0.15 MPa, MSE of 0.5 MPa², RMSE of 0.15 MPa, average Error of 6%, Accuracy of 94%, R2 of 0.96, R of 0.98, WI of 0.99, NSE of 0.96, KGE of 0.96, and SMAPE of 5.37%. The performance of the Kstar model for predicting the splitting tensile strength of recycled aggregate concrete demonstrates excellent predictive accuracy and efficiency based on a comprehensive set of metrics. The sum of squared errors (SSE) of 3.5 indicates a low overall deviation in the predictions, reinforcing the model’s precision. The mean absolute error (MAE) of 0.15 MPa and root mean square error (RMSE) of 0.15 MPa reflect minimal average and root-mean-square deviations between predicted and observed values, confirming the model’s robustness in capturing the actual behavior of the concrete. The mean squared error (MSE) of 0.5 MPa² highlights the squared deviations’ consistency, showing a reliable error distribution. An average error of 6% and accuracy of 94% underscore the model’s strong predictive reliability, with high agreement between predictions and true values. The coefficient of determination (R2) of 0.96 and correlation coefficient (R) of 0.98 reveal an excellent fit, indicating the model accounts for 96% of the variance in the data and has a near-perfect linear relationship with observed values. Additional metrics such as the Willmott Index (WI) of 0.99, Nash–Sutcliffe Efficiency (NSE) of 0.96, Kling-Gupta Efficiency (KGE) of 0.96, and symmetric mean absolute percentage error (SMAPE) of 5.37% further affirm the model’s high reliability and efficiency. 
The WI close to 1 highlights near-ideal predictive performance, while the NSE and KGE values confirm that the model balances predictive accuracy and error sensitivity effectively. The SMAPE value, although higher in magnitude due to percentage scaling, remains low, reflecting consistent predictive reliability across varying data points. Overall, the Kstar model demonstrates exceptional performance in predicting the splitting tensile strength of recycled aggregate concrete, making it a highly effective tool for structural analysis and sustainable construction practices.
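The error and efficiency measures quoted for each model can be reproduced from paired observed and predicted strengths. The following is a minimal sketch under stated assumptions, not the authors' Weka workflow: it uses the standard textbook definitions of each metric, takes R2 as the square of the correlation coefficient R (which matches the reported pairs, for example 0.98 and 0.96), and uses the 2009 form of KGE. The function name `regression_metrics` and the sample values are illustrative only.

```python
import numpy as np

def regression_metrics(obs, pred):
    """Return common goodness-of-fit metrics for observed vs predicted values."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = pred - obs
    obs_mean = obs.mean()

    sse = float(np.sum(err ** 2))            # sum of squared errors (MPa^2)
    mae = float(np.mean(np.abs(err)))        # mean absolute error (MPa)
    mse = float(np.mean(err ** 2))           # mean squared error (MPa^2)
    rmse = float(np.sqrt(mse))               # root mean square error (MPa)
    r = float(np.corrcoef(obs, pred)[0, 1])  # Pearson correlation coefficient

    # Nash-Sutcliffe efficiency: 1 minus SSE over the variance sum of the observations.
    nse = 1.0 - sse / float(np.sum((obs - obs_mean) ** 2))
    # Willmott index of agreement.
    wi = 1.0 - sse / float(
        np.sum((np.abs(pred - obs_mean) + np.abs(obs - obs_mean)) ** 2)
    )
    # Kling-Gupta efficiency (assumed 2009 form: correlation, variability, bias).
    alpha = pred.std() / obs.std()
    beta = pred.mean() / obs_mean
    kge = 1.0 - float(np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))
    # Symmetric mean absolute percentage error, in percent.
    smape = 100.0 * float(np.mean(2 * np.abs(err) / (np.abs(obs) + np.abs(pred))))

    return {"SSE": sse, "MAE": mae, "MSE": mse, "RMSE": rmse, "R": r,
            "R2": r ** 2, "NSE": nse, "WI": wi, "KGE": kge, "SMAPE": smape}

# Illustrative (made-up) strengths in MPa, not data from the study.
metrics = regression_metrics([2.0, 2.5, 3.0, 3.5], [2.1, 2.4, 3.2, 3.3])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Two consequences of these definitions are useful when reading the reported values: MSE and SSE carry units of MPa², SMAPE is a percentage, and RMSE can never be smaller than MAE for the same set of predictions.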

M5Rules model

Online Appendix 1 shows the rule iterations of the M5Rules model. Figure 8 shows the considered hyper-parameters of the M5Rules model and Fig. 9 presents the relation between the measured and predicted values of the splitting tensile strength of the recycled aggregate concrete. The hyperparameters influence the model’s performance as follows. batchSize (100) sets the number of instances processed per batch, which affects computational efficiency. buildRegressionTree (False) means that model rules with linear models at the leaves are generated rather than regression rules with constant predictions. debug (False) disables debug mode and avoids unnecessary logging. doNotCheckCapabilities (False) ensures capability checks are performed before execution, preventing potential errors. minNumInstances (4.0) sets the minimum number of instances per leaf node, affecting rule granularity and model generalization.

Fig. 8.

The considered hyper-parameters of (M5Rules) model.

Fig. 9.

Relation between predicted and calculated strength using (M5Rules).

numDecimalPlaces (4) sets the number of decimal places in the output. unpruned (False) enables pruning to remove unnecessary branches, improving rule simplicity and model interpretability. useUnsmoothed (False) means that smoothed predictions are used, which enhances numerical stability. These settings describe a pruned, rule-based regression model with controlled complexity. The model produced an SSE of 22.5, MAE of 0.35 MPa, MSE of 0.2 MPa², RMSE of 0.45 MPa, average error of 14.5%, accuracy of 85.5%, R2 of 0.78, R of 0.88, WI of 0.925, NSE of 0.765, KGE of 0.77, and SMAPE of 11.28%. The M5Rules model therefore exhibits moderate to strong predictive performance, though it falls short of the precision of the Kstar model. The sum of squared errors (SSE) of 22.5 indicates a relatively high total deviation in the model’s predictions. The mean absolute error (MAE) of 0.35 MPa and root mean square error (RMSE) of 0.45 MPa indicate that the predictions deviate moderately from the observed values, with errors distributed somewhat unevenly. The mean squared error (MSE) of 0.2 MPa² reflects moderate squared deviations, while the average error of 14.5% and accuracy of 85.5% show that the model is reasonably reliable but less precise than the best-performing model. The coefficient of determination (R2) of 0.78 and correlation coefficient (R) of 0.88 indicate that the model explains 78% of the variance in the data and has a strong, though not exceptional, linear relationship with the observed values. The Willmott Index (WI) of 0.925 and Nash–Sutcliffe Efficiency (NSE) of 0.765 demonstrate decent predictive consistency, though they point to room for improvement.

The Kling-Gupta Efficiency (KGE) of 0.77 supports this conclusion: the model balances bias, variability, and correlation reasonably well but lacks the precision of the top-performing model. The symmetric mean absolute percentage error (SMAPE) of 11.28% points to moderate inconsistencies in the predictions, especially for lower strength values. In summary, the M5Rules model provides a reliable but less accurate prediction of the splitting tensile strength of recycled aggregate concrete. It is suitable for applications requiring reasonable accuracy but may require further optimization or comparison with alternative models for high-precision structural analyses.

ElasticNet model

Figure 10 shows the considered hyper-parameters of the ElasticNet model and Fig. 11 presents the relation between the measured and predicted values of the splitting tensile strength of the recycled aggregate concrete. The hyperparameters shape the model’s behavior as follows. additionalStats (False) disables extra statistical output. alpha (0.001) controls the regularization, with a small value allowing more flexibility in model fitting. batchSize (100) processes data in batches of 100, which affects computational efficiency. custom_lambda_sequence (Empty) means that no custom lambda sequence is provided, so the default regularization path is used. debug (False) disables debug mode. doNotCheckCapabilities (False) ensures model compatibility checks before execution. epsilon (1.0E−4) sets the stopping criterion for optimization. maxIt (10,000,000) defines the maximum number of iterations allowed for convergence. numDecimalPlaces (2) limits the output to two decimal places. numInnerFolds (10) uses tenfold cross-validation to tune hyperparameters. numModels (100) builds 100 models for ensemble learning or parameter selection. sparse (False) disables sparse representation, meaning full coefficient matrices are used. threshold (1.0E−7) sets a low convergence threshold. use_method2 (True) enables an alternative computation method. use_stderr_rule (False) means that the standard-error rule is not applied in model selection. These settings configure an ElasticNet model with extensive iterations, cross-validation, and controlled regularization. The model produced an SSE of 68.5, MAE of 0.7 MPa, MSE of 0.55 MPa², RMSE of 0.7 MPa, average error of 23.5%, accuracy of 76.5%, R2 of 0.375, R of 0.61, WI of 0.715, NSE of 0.37, KGE of 0.425, and SMAPE of 22.85%.

The ElasticNet model demonstrates relatively low predictive accuracy and higher error margins than the other models. The sum of squared errors (SSE) of 68.5 reflects a significant total deviation in the model’s predictions. The mean absolute error (MAE) of 0.7 MPa and root mean square error (RMSE) of 0.7 MPa indicate a notable average deviation from the observed values, and the mean squared error (MSE) of 0.55 MPa² is correspondingly high. The average error of 23.5% and accuracy of 76.5% highlight reduced reliability compared to the other models. The coefficient of determination (R2) of 0.375 and correlation coefficient (R) of 0.61 indicate that the model captures only 37.5% of the variance in the data and has a weak to moderate linear relationship with the observed values. The Willmott Index (WI) of 0.715 and Nash–Sutcliffe Efficiency (NSE) of 0.37 suggest limited predictive consistency, and the Kling-Gupta Efficiency (KGE) of 0.425 confirms a suboptimal balance of bias, variability, and correlation. The symmetric mean absolute percentage error (SMAPE) of 22.85% highlights substantial discrepancies in the predictions, particularly for lower strength values. In summary, the ElasticNet model exhibits moderate to low performance in predicting the splitting tensile strength of recycled aggregate concrete, which makes it less suitable for applications requiring precise and reliable predictions. It would need further optimization, or replacement by an alternative model, to deliver more accurate and consistent results.
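To make the role of the regularization settings more concrete, the sketch below fits a generic elastic net by cyclic coordinate descent. It is an illustration under stated assumptions rather than the Weka implementation used in the study: the objective is the common form (1/2n)·||y − Xw||² + lam·(l1_ratio·||w||₁ + 0.5·(1 − l1_ratio)·||w||₂²), the inputs are assumed to be centered so that no intercept is needed, and the names `lam`, `l1_ratio`, and `elastic_net_cd` are hypothetical. How Weka’s `alpha` maps onto these two parameters is not asserted here.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def elastic_net_cd(X, y, lam, l1_ratio, max_iter=1000, tol=1e-7):
    """Cyclic coordinate descent for the elastic net (no intercept).

    max_iter and tol play the same role as an iteration cap and a
    convergence threshold in any iterative solver.
    """
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # (1/n) * x_j^T x_j for each column
    residual = y - X @ w
    for _ in range(max_iter):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                    # constant-zero column: nothing to fit
            # Correlation of column j with the partial residual (w_j removed).
            rho = X[:, j] @ residual / n + col_sq[j] * w[j]
            new_wj = soft_threshold(rho, lam * l1_ratio) / (
                col_sq[j] + lam * (1.0 - l1_ratio)
            )
            change = new_wj - w[j]
            if change != 0.0:
                residual -= X[:, j] * change   # keep residual in sync with w
                w[j] = new_wj
                max_change = max(max_change, abs(change))
        if max_change < tol:
            break
    return w
```

With `lam = 0` the procedure reduces to ordinary least squares, and with a large `lam` and `l1_ratio = 1` every coefficient is shrunk to zero. Because the fitted model is linear in the inputs, it cannot represent strongly nonlinear dependencies, which is consistent with the low R2 reported for ElasticNet.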

Fig. 10.

The considered hyper-parameters of (ElasticNet) model.

Fig. 11.

Relation between predicted and calculated strength using (ElasticNet).

XNV model

Figure 12 shows the considered hyper-parameters of the XNV model and Fig. 13 presents the relation between the measured and predicted values of the splitting tensile strength of the recycled aggregate concrete. The hyperparameters in Fig. 12 influence the model’s performance as follows. Regularization parameter gamma (0.01) controls the complexity of the model and hence the risk of overfitting. Sample size for Nyström method (100) defines the number of samples used for the Nyström approximation, affecting computational efficiency and kernel approximation accuracy. Kernel function (RBFKernel -C 250007 -G 0.01) selects the Radial Basis Function (RBF) kernel, where -C 250007 sets the kernel cache size and -G 0.01 sets the kernel width parameter. Do not apply standardization (False) ensures data is standardized before training, improving numerical stability. batchSize (100) processes data in batches of 100. debug (False) disables debugging mode. doNotCheckCapabilities (False) ensures capability checks are performed before execution. numDecimalPlaces (2) limits the output to two decimal places. seed (1) sets a fixed random seed for reproducibility. These settings configure an RBF kernel-based model with Nyström approximation, controlled complexity, and standardized input. This model produced an SSE of 26, MAE of 0.4 MPa, MSE of 0.25 MPa², RMSE of 0.5 MPa, average error of 15.5%, accuracy of 84.5%, R2 of 0.75, R of 0.865, WI of 0.915, NSE of 0.735, KGE of 0.75, and SMAPE of 12.845%. The XNV model thus estimates the splitting tensile strength of recycled aggregate concrete with reasonable accuracy and consistency. The sum of squared errors (SSE) of 26 reflects moderate overall deviations in the predictions.

The mean absolute error (MAE) of 0.4 MPa and root mean square error (RMSE) of 0.5 MPa suggest a satisfactory level of precision, with moderate deviations from the observed values, and the mean squared error (MSE) of 0.25 MPa² is consistent with this. An average error of 15.5% and accuracy of 84.5% indicate that the XNV model predicts outcomes reliably. The coefficient of determination (R2) of 0.75 and correlation coefficient (R) of 0.865 reveal that the model captures 75% of the variance in the data and exhibits a strong linear relationship with the observed values. The Willmott Index (WI) of 0.915 demonstrates high predictive consistency, while the Nash–Sutcliffe Efficiency (NSE) of 0.735 and Kling-Gupta Efficiency (KGE) of 0.75 show that the model strikes a good balance between bias, variability, and correlation. The symmetric mean absolute percentage error (SMAPE) of 12.845% is moderate. In summary, the XNV model exhibits robust performance in predicting the splitting tensile strength of recycled aggregate concrete. Its accuracy, strong correlation, and consistent errors make it a reliable tool for practical applications in structural and material engineering.
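The role of the Nyström sample size and the RBF kernel width can be illustrated with a small NumPy sketch. It shows only the Nyström feature map followed by ridge regression; it does not reproduce the XNV algorithm itself or its Weka implementation. The helper names `rbf_kernel`, `nystroem_features`, and `ridge_fit` are hypothetical, and the random landmark selection is an assumption made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """RBF kernel matrix k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq_dist = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def nystroem_features(X, landmarks, gamma, eps=1e-10):
    """Map X to features Z such that Z @ Z.T approximates the full kernel.

    Z = K_nm @ K_mm^(-1/2), with K_mm the kernel among the landmarks.
    Eigen-directions with eigenvalue <= eps are dropped for stability.
    """
    K_mm = rbf_kernel(landmarks, landmarks, gamma)
    K_nm = rbf_kernel(X, landmarks, gamma)
    eigval, eigvec = np.linalg.eigh(K_mm)
    keep = eigval > eps
    inv_sqrt = eigvec[:, keep] / np.sqrt(eigval[keep])
    return K_nm @ inv_sqrt

def ridge_fit(Z, y, reg):
    """Closed-form ridge regression weights on the Nystroem features."""
    return np.linalg.solve(Z.T @ Z + reg * np.eye(Z.shape[1]), Z.T @ y)

# Illustrative synthetic data (not the study's database).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]
landmark_idx = rng.choice(len(X), size=100, replace=False)  # sample size of 100
Z = nystroem_features(X, X[landmark_idx], gamma=0.01)
weights = ridge_fit(Z, y, reg=0.01)
predictions = Z @ weights
```

A larger landmark sample gives a closer approximation of the full kernel at a higher computational cost; when every training point is used as a landmark, the approximation is exact up to numerical tolerance.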

Fig. 12.

The considered hyper-parameters of (XNV) model.

Fig. 13.

Relation between predicted and calculated strength using (XNV).

DT model

Online Appendix 2 shows the range of variables for the optimized strength. Figure 14 shows the considered hyper-parameters of the DT model and Fig. 15 presents the relation between the measured and predicted values of the splitting tensile strength of the recycled aggregate concrete. The DT (Decision Table) model hyperparameters in Fig. 14 define its behavior for training and evaluation. batchSize (100) processes data in batches of 100, which affects computational efficiency. crossVal (2) uses two-fold cross-validation for model assessment, balancing performance evaluation and training time. debug (False) disables debugging mode. displayRules (True) enables rule display, making the model’s decision process more interpretable. doNotCheckCapabilities (False) ensures that required capabilities are checked before execution. evaluationMeasure (Default: accuracy for discrete classes, RMSE for numeric classes) uses accuracy for classification tasks and root mean square error (RMSE) for regression tasks such as the present one. numDecimalPlaces (2) limits the output to two decimal places. search (BestFirst -D 1 -N 5) uses the BestFirst method to select the features of the table, searching in the forward direction (-D 1) and stopping after five consecutive non-improving nodes (-N 5). useIBk (False) disables the IBk (nearest-neighbour) fallback, so predictions come from the decision table alone. These settings configure the decision table for efficient training, evaluation, and feature selection, balancing interpretability and performance. This model produced an SSE of 35, MAE of 0.45 MPa, MSE of 0.3 MPa², RMSE of 0.55 MPa, average error of 18.5%, accuracy of 81.5%, R2 of 0.645, R of 0.8, WI of 0.88, NSE of 0.63, KGE of 0.735, and SMAPE of 14.18%. The DT model thus demonstrates moderate predictive performance, with accuracy and precision acceptable for general applications.

The sum of squared errors (SSE) of 35 indicates a higher level of overall prediction deviation than the better-performing models. The mean absolute error (MAE) of 0.45 MPa and root mean square error (RMSE) of 0.55 MPa suggest reasonable, though not exceptional, prediction accuracy. The mean squared error (MSE) of 0.3 MPa² is in line with this, while the average error of 18.5% and accuracy of 81.5% indicate that the model provides reasonably reliable results with room for improvement. The coefficient of determination (R2) of 0.645 and correlation coefficient (R) of 0.8 suggest the model explains 64.5% of the variance in the data and exhibits a strong, though not outstanding, linear relationship with the observed values. The Willmott Index (WI) of 0.88 and Nash–Sutcliffe Efficiency (NSE) of 0.63 reflect good predictive consistency, though they also highlight the model’s limitations in capturing the full complexity of the dataset. The Kling-Gupta Efficiency (KGE) of 0.735 suggests a fair balance between bias, variability, and correlation. The symmetric mean absolute percentage error (SMAPE) of 14.18% reveals some inconsistency in predictions, particularly for lower-value observations. In summary, the DT model provides moderately reliable predictions for the splitting tensile strength of recycled aggregate concrete. It does not achieve the precision and robustness of the better-performing models, so further optimization or alternative modeling approaches may be necessary for applications requiring higher accuracy.

Fig. 14.

The considered hyper-parameters of (DT) model.

Fig. 15.

Relation between predicted and calculated strength using (DT).

Table 2 and Fig. 16 compare the performance of the selected models. Overall, the performance, efficiency, and reliability of the models predicting the splitting tensile strength of recycled aggregate concrete vary significantly, with key differences in their accuracy, error levels, and consistency [32–36]. The Kstar model demonstrates the highest level of performance and reliability among the models, achieving exceptional accuracy with an R2 of 0.96 and Accuracy of 94%. Its RMSE and MAE are both low at 0.15 MPa, indicating minimal deviations between predicted and actual values. Additional metrics such as WI (0.99), NSE (0.96), and KGE (0.96) further confirm the model’s superior efficiency and consistent performance, making it the most dependable tool for practical applications. The XNV model also performs strongly, with an R2 of 0.75 and Accuracy of 84.5%, though it lags behind Kstar in precision. Its RMSE and MAE values of 0.5 MPa and 0.4 MPa, respectively, indicate good but less consistent predictions, which agrees with previous studies [37–40]. Metrics like WI (0.915), NSE (0.735), and KGE (0.75) underscore the model’s reasonable balance between error minimization and reliability, making it a viable alternative for applications where some reduction in precision is acceptable. The M5Rules model offers moderate performance, with an R2 of 0.78 and Accuracy of 85.5%. It exhibits higher errors than Kstar, with an RMSE of 0.45 MPa and MAE of 0.35 MPa. Its NSE of 0.765 and KGE of 0.77 reflect good efficiency but indicate limitations in managing data complexity, particularly for diverse datasets. While reliable, it may require optimization for more critical applications. The DT model performs acceptably, with an R2 of 0.645 and Accuracy of 81.5%, but falls behind the leading models.

Its RMSE of 0.55 MPa and MAE of 0.45 MPa suggest moderate prediction accuracy, with additional metrics such as WI (0.88), NSE (0.63), and KGE (0.735) indicating fair reliability. It is suitable for general predictions but may struggle where high precision is required. The ElasticNet model demonstrates the weakest performance, with an R2 of 0.375 and Accuracy of 76.5%. It exhibits the highest errors, with RMSE and MAE both at 0.7 MPa, and metrics like WI (0.715), NSE (0.37), and KGE (0.425) confirm limited efficiency and reliability. This model is less suitable for precise applications and may require significant refinement to achieve competitive performance. In conclusion, the Kstar model emerges as the most efficient and reliable tool for predicting the splitting tensile strength of recycled aggregate concrete, followed by the XNV model. The M5Rules and DT models provide reasonable alternatives but are less precise, while the ElasticNet model performs the poorest and needs further optimization before practical use.

Table 2.

Performance measurements of developed models.

Model       Dataset     SSE  MAE (MPa)  MSE (MPa²)  RMSE (MPa)  Error (%)  Acc (%)  R2    R     WI    NSE   KGE   SMAPE (%)
Kstar       Training    4    0.1        0.0         0.1         4          96       0.97  0.99  0.99  0.97  0.98  5.14
Kstar       Validation  3    0.2        0.1         0.2         8          92       0.95  0.97  0.99  0.95  0.94  5.61
M5Rules     Training    27   0.3        0.1         0.3         11         89       0.83  0.91  0.95  0.83  0.84  10.54
M5Rules     Validation  18   0.4        0.3         0.6         18         82       0.73  0.85  0.90  0.70  0.70  12.02
ElasticNet  Training    96   0.7        0.4         0.6         20         80       0.41  0.64  0.72  0.40  0.43  23.09
ElasticNet  Validation  41   0.7        0.7         0.8         27         73       0.34  0.58  0.71  0.34  0.42  22.61
XNV         Training    32   0.4        0.1         0.4         12         88       0.81  0.90  0.94  0.80  0.78  11.64
XNV         Validation  20   0.4        0.4         0.6         19         81       0.69  0.83  0.89  0.67  0.72  14.05
DT          Training    39   0.4        0.2         0.4         13         87       0.76  0.87  0.92  0.76  0.78  12.19
DT          Validation  31   0.5        0.4         0.7         24         76       0.53  0.73  0.84  0.50  0.69  16.17
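Most of the single "overall" values quoted for each model in the text coincide with the unweighted mean of the training and validation rows of Table 2. The sketch below reproduces that arithmetic from the tabulated values. It is an observation about the numbers, not a procedure stated by the authors; the names `TABLE2` and `overall` are hypothetical.

```python
COLUMNS = ["SSE", "MAE", "MSE", "RMSE", "Error", "Acc",
           "R2", "R", "WI", "NSE", "KGE", "SMAPE"]

# (training row, validation row) for each model, copied from Table 2.
TABLE2 = {
    "Kstar": ([4, 0.1, 0.0, 0.1, 4, 96, 0.97, 0.99, 0.99, 0.97, 0.98, 5.14],
              [3, 0.2, 0.1, 0.2, 8, 92, 0.95, 0.97, 0.99, 0.95, 0.94, 5.61]),
    "M5Rules": ([27, 0.3, 0.1, 0.3, 11, 89, 0.83, 0.91, 0.95, 0.83, 0.84, 10.54],
                [18, 0.4, 0.3, 0.6, 18, 82, 0.73, 0.85, 0.90, 0.70, 0.70, 12.02]),
    "ElasticNet": ([96, 0.7, 0.4, 0.6, 20, 80, 0.41, 0.64, 0.72, 0.40, 0.43, 23.09],
                   [41, 0.7, 0.7, 0.8, 27, 73, 0.34, 0.58, 0.71, 0.34, 0.42, 22.61]),
    "XNV": ([32, 0.4, 0.1, 0.4, 12, 88, 0.81, 0.90, 0.94, 0.80, 0.78, 11.64],
            [20, 0.4, 0.4, 0.6, 19, 81, 0.69, 0.83, 0.89, 0.67, 0.72, 14.05]),
    "DT": ([39, 0.4, 0.2, 0.4, 13, 87, 0.76, 0.87, 0.92, 0.76, 0.78, 12.19],
           [31, 0.5, 0.4, 0.7, 24, 76, 0.53, 0.73, 0.84, 0.50, 0.69, 16.17]),
}

# Unweighted mean of the training and validation value for every metric.
overall = {
    model: {name: (train + val) / 2
            for name, train, val in zip(COLUMNS, train_row, val_row)}
    for model, (train_row, val_row) in TABLE2.items()
}

for model, values in overall.items():
    print(model, {name: round(value, 3) for name, value in values.items()})
```

Because the training set (200 records) is much larger than the validation set (57 records), an unweighted mean gives each validation record more weight than each training record; the validation rows alone are the more conservative indicator of how the models generalize.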

Fig. 16.

Comparing the accuracies of the developed models using Taylor charts.
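A Taylor chart places each model according to three statistics of its predictions: the standard deviation, the correlation with the observations, and the centered root-mean-square difference. The sketch below computes these quantities and checks the law-of-cosines identity that links them. It is a generic illustration of how such a chart is constructed; the function name `taylor_statistics` and the sample values are hypothetical and are not taken from Fig. 16.

```python
import numpy as np

def taylor_statistics(obs, pred):
    """Return the statistics that position one model on a Taylor chart."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    sigma_obs = float(obs.std())     # population standard deviation (ddof=0)
    sigma_pred = float(pred.std())
    r = float(np.corrcoef(obs, pred)[0, 1])
    # Centered RMS difference: RMS error after removing both means.
    crmsd = float(np.sqrt(np.mean(((pred - pred.mean()) - (obs - obs.mean())) ** 2)))
    return {"sigma_obs": sigma_obs, "sigma_pred": sigma_pred, "R": r, "CRMSD": crmsd}

# Illustrative (made-up) strengths in MPa.
stats = taylor_statistics([2.0, 2.5, 3.0, 3.5, 4.0], [2.2, 2.4, 3.1, 3.3, 4.2])

# Identity behind the chart geometry:
# CRMSD^2 = sigma_pred^2 + sigma_obs^2 - 2 * sigma_pred * sigma_obs * R
rhs = (stats["sigma_pred"] ** 2 + stats["sigma_obs"] ** 2
       - 2 * stats["sigma_pred"] * stats["sigma_obs"] * stats["R"])
print(stats, rhs)
```

On such a chart, a model plotted closer to the reference (observed) point has a higher correlation and a standard deviation closer to that of the measurements.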

Furthermore, the present research demonstrates superior predictive performance in modeling the splitting tensile strength of recycled aggregate concrete (RAC) compared to previous studies, owing to its use of diverse machine learning models and rigorous validation metrics. The Kstar model emerges as the most robust, achieving an R2 of 0.97 in training and 0.95 in validation, a low RMSE (0.1 MPa in training and 0.2 MPa in validation), and accuracies of 96% and 92%, respectively. This outperforms models from prior literature, where regression-based and ensemble learning techniques have typically struggled to achieve similar precision in tensile strength prediction. In contrast, the ElasticNet model performed the weakest in this study, with an R2 of 0.41 in training and 0.34 in validation and the highest error rates (20% in training and 27% in validation), reflecting its limitations in capturing complex nonlinear relationships in RAC performance. Previous studies using linear and nonlinear regression models exhibited similar challenges, often producing RMSE values and R2 scores that were suboptimal for reliable engineering applications [41–43]. Compared to prior works employing gradient boosting, decision trees, and neural networks for concrete strength prediction, the XNV and M5Rules models in this research showed moderate reliability, achieving validation R2 values of 0.69 and 0.73, respectively, with accuracies of 81% and 82%. These results align with studies where decision tree-based models generally provided reasonable predictions but suffered from increased error margins due to their sensitivity to training data.

The Decision Table (DT) model, with an R2 of 0.76 in training and 0.53 in validation, demonstrated moderate utility but was less effective for high-precision applications, a finding consistent with existing literature on decision tree approaches to concrete strength modeling. Overall, this research improves on previous works by integrating an advanced machine learning framework that enhances accuracy, minimizes errors, and identifies key influencing factors through sensitivity analysis. The use of the Kstar model in particular represents a significant advancement, as it outperforms the traditional and ensemble learning approaches reported in the literature and offers a more reliable tool for predicting RAC tensile strength in sustainable construction applications.

The present research on predicting the splitting tensile strength of recycled aggregate concrete (RAC) using machine learning (ML) models demonstrates significant advancements over previous works reported in the literature. Prior studies have employed a variety of ML approaches, including ensemble learning techniques, regression models, and neural networks, to enhance predictive accuracy and optimize mix designs. Zhang et al. [2] utilized Bayesian optimization with multiple ML algorithms to predict the strength of steel fiber-reinforced recycled aggregate concrete (SFR-RAC), achieving a 20% error margin. While their models demonstrated commendable performance, the present study improves on these results, particularly with the Kstar model, which achieves a lower RMSE (0.1 MPa in training and 0.2 MPa in validation) and accuracies of 96% and 92%, respectively. Han et al. [3] explored the application of physics-informed neural networks integrated with viscoelastic continuum damage mechanics (VECD) to predict material fatigue performance. Their approach demonstrated robust predictive capability, with an average fatigue life prediction error of 5.2%. While effective for fatigue modeling, their method did not focus on RAC tensile strength, and the present study expands the ML application to sustainable concrete. Pande et al. [4] incorporated generative adversarial networks (GANs) and finite element analysis to examine the influence of nano-materials on high-performance concrete. Their framework improved prediction accuracy by 15–20%, demonstrating the effectiveness of hybrid AI models in material science. The present research, by applying semi-supervised and rule-based learners, offers precise predictions for RAC without requiring extensive synthetic data generation.

Tipu et al. [5] employed ensemble learning models, including Random Forest, Gradient Boosting, and XGBoost regression, to predict the compressive strength of concrete with recycled coarse aggregate. Their findings emphasized the importance of curing age, cement content, and fly ash in strength prediction, with XGBoost performing best. Their models showed strong predictive ability but were not specifically focused on RAC tensile strength. The present study extends their approach by incorporating sensitivity analysis and evaluating various ML models, identifying the Kstar model as the most effective. Ahmad et al. [6] used artificial neural networks (ANN), linear regression (LR), and non-linear regression (NLR) for foam concrete strength prediction. While ANN demonstrated better accuracy, with an R2 of 0.36 and reduced errors compared to the regression models, it was still inferior to the Kstar model in the present study, which achieved an R2 of 0.97 in training and 0.95 in validation. Li et al. [7] examined 1030 experimental datasets using both ensemble and non-ensemble ML models to enhance concrete compressive strength predictions. Their findings highlighted that CatBoost exhibited the highest accuracy, with an R2 of 0.966 and an RMSE of 3.06 MPa. The present research parallels their approach by incorporating multiple ML models and conducting a sensitivity analysis, and it extends that approach to RAC tensile strength through model selection and error minimization. Mater et al. [8] introduced an ANN-based model for green concrete, incorporating recovered coarse aggregate (RCA), recycled fine aggregate (RFA), and fly ash (FA). Their findings revealed that replacing 10% of cement with FA led to a 9% reduction in compressive strength, while 25% substitution of fine aggregate with RFA resulted in a 40% reduction in 28-day compressive strength.

While their study effectively assessed the impact of recycled materials, the present work advances predictive modeling for RAC by integrating multiple ML techniques and comparing their performance comprehensively. Wang et al. [9] explored AI-driven improvements in the durability and sustainability of construction materials within Industry 4.0. Their study emphasized the integration of AI, big data, and digitalization to enhance material research and engineering applications. The present study aligns with this vision by implementing advanced ML techniques for sustainable concrete solutions, particularly in predicting RAC tensile strength. Baduge et al. [10] provided a broad review of AI, ML, and deep learning applications in construction, covering aspects such as material design, structural health monitoring, and life cycle analysis. Their work highlighted future trends and challenges in AI-driven construction, whereas the present research provides a focused, data-driven implementation of ML techniques for RAC strength prediction. The comparison between the present study and prior research highlights improvements in predictive accuracy, sensitivity analysis, and model robustness. Unlike previous works that relied primarily on regression models or neural networks with limited RAC-specific focus, this study integrates five different ML techniques and demonstrates the superior performance of the Kstar model. The sensitivity analysis further provides valuable insights into key influencing parameters, ensuring that the model predictions align with sustainable construction objectives.

Sensitivity analysis results

A sensitivity index of 1.0 indicates complete sensitivity, whereas a sensitivity index of less than 0.01 indicates that the model is insensitive to changes in the parameter [13]. Figure 17 shows the sensitivity analysis with respect to Fsp. The sensitivity analysis produced impacts of 32% for C, 40% for W, 38% for NCAg, 0% for PL, 8% for RCAg_D, 12% for RCAg_P, 32% for RCAg_wa, 66% for Vf, and 18% for F_type on the splitting tensile strength of the studied concrete. These results highlight the varying degrees of influence that the parameters have on the splitting tensile strength of recycled aggregate concrete, and their importance for sustainable production. Void fraction (Vf) has the highest influence at 66%, highlighting its paramount importance in controlling the porosity and mechanical properties of the mix. This factor must be carefully managed to enhance the durability and strength of recycled aggregate concrete. Water content (W) has the second-largest impact at 40%, demonstrating that the amount of water in the mix is a critical factor for achieving optimal tensile strength. This underscores the need for careful water management to balance workability and strength in sustainable concrete production. Coarse natural aggregate (NCAg) has a substantial impact of 38%, indicating its essential role in maintaining the structural integrity of the concrete mix. The cement content (C), with an impact of 32%, is another crucial factor, reflecting the importance of cement in binding the aggregates and achieving the desired strength. Recycled coarse aggregate properties also contribute to the tensile strength: water absorption (RCAg_wa) at 32%, and to a lesser degree purity (RCAg_P) at 12% and particle density (RCAg_D) at 8%. These findings suggest that the quality and characteristics of recycled aggregates must be optimized to improve the performance of sustainable concrete.

Finally, the type of fine aggregate (F_type at 18%) affects the tensile strength moderately, reflecting the role of fines in enhancing the particle packing and cohesion of the mix. Plastic limit (PL) has no recorded impact (0%), suggesting it is not a determining factor in this context. In summary, the analysis reveals that void fraction, water content, and aggregate quality are the most critical parameters for improving the splitting tensile strength of recycled aggregate concrete in sustainable production. Proper optimization and management of these variables are essential for achieving high performance while promoting environmental sustainability.
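The percentages above can be read as sensitivity indices expressed in percent. This section does not restate the formula, so the sketch below assumes the commonly cited Hoffman and Gardner form, SI = (D_max − D_min) / D_max, where D_max and D_min are the model outputs obtained when one input is set to its maximum and minimum while the others are held at baseline values. The function `sensitivity_index` and the toy model are hypothetical and are not the fitted Kstar model.

```python
import numpy as np

def sensitivity_index(model, baseline, index, low, high):
    """One-at-a-time sensitivity index, assuming SI = (D_max - D_min) / D_max.

    model    : callable mapping a 1-D input array to a scalar output
    baseline : input values at which all other parameters are held
    index    : position of the parameter being varied
    low/high : minimum and maximum of that parameter
    """
    x_low = np.array(baseline, dtype=float)
    x_high = np.array(baseline, dtype=float)
    x_low[index] = low
    x_high[index] = high
    out_low, out_high = float(model(x_low)), float(model(x_high))
    d_max, d_min = max(out_low, out_high), min(out_low, out_high)
    if d_max == 0.0:
        raise ValueError("index undefined when the largest output is zero")
    return (d_max - d_min) / d_max

# Toy strength model (hypothetical): depends on input 0 only.
def toy_model(x):
    return 2.0 + 0.5 * x[0]

baseline = [1.0, 3.0]
si_first = sensitivity_index(toy_model, baseline, index=0, low=0.0, high=4.0)
si_second = sensitivity_index(toy_model, baseline, index=1, low=0.0, high=10.0)
print(si_first, si_second)
```

Under this definition the indices of different parameters are computed independently, so they need not sum to 100%, which is consistent with the percentages reported above.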

Fig. 17.

Sensitivity analysis.

Conclusions

The analysis and modeling exercise for predicting the splitting tensile strength of recycled aggregate concrete (RAC) provides insights into the performance of the different models, the sensitivity of key parameters, and the potential for sustainable production practices. The exercise employed five computational techniques, namely the Kstar, XNV, M5Rules, DT, and ElasticNet models, to predict the splitting tensile strength of RAC. The following conclusions have been drawn:

  • Among these, the Kstar model emerged as the most robust, achieving an R2 of 0.96 and an accuracy of 94%, with minimal errors such as RMSE and MAE at 0.15 MPa. This superior performance demonstrates the model’s ability to provide highly accurate and consistent predictions, making it a reliable choice for practical engineering applications.

  • The XNV model followed closely, with an R2 of 0.75 and accuracy of 84.5%, delivering strong predictions but with slightly higher errors.

  • Similarly, the M5Rules model provided moderate reliability, with an R2 of 0.78 and accuracy of 85.5%. Both models represent viable options for predicting RAC performance, particularly when a balance between computational efficiency and precision is required.

  • In contrast, the DT model demonstrated moderate performance, with an R2 of 0.645 and accuracy of 81.5%, suitable for general applications but less effective for high-precision needs.

  • The ElasticNet model exhibited the weakest performance, with an R2 of 0.375 and accuracy of 76.5%, emphasizing the need for refinement to achieve competitive results.

  • The sensitivity analysis provided valuable insights into the key parameters affecting the splitting tensile strength of RAC. Void fraction (Vf) was identified as the most critical factor, with a 66% impact, underscoring its role in determining the porosity and overall mechanical properties of the mix. Water content (40%) and natural coarse aggregate (NCAg, 38%) were also highly influential, highlighting the importance of proper water-cement ratio and aggregate quality. Recycled aggregate properties, including water absorption (RCAg_wa, 32%), purity (RCAg_P, 12%), and particle density (RCAg_D, 8%), played significant roles in the mechanical performance of RAC, emphasizing the need for stringent quality control in recycled materials. Cement content (32%) and fine aggregate type (F_type, 18%) also contributed notably, while the plastic limit (PL) was found to have negligible influence.

  • The findings align with the principles of sustainable construction, demonstrating that RAC, when optimized through effective modeling and sensitivity analysis, can deliver comparable mechanical performance to conventional concrete. The use of recycled aggregates contributes to resource conservation and waste reduction, aligning with global efforts to minimize the environmental footprint of the construction industry.

  • Overall, the Kstar model is the most reliable tool for predicting RAC’s splitting tensile strength, followed by M5Rules and XNV. These models effectively balance accuracy, efficiency, and computational demand. Void fraction, water content, and aggregate quality (both natural and recycled) are the primary determinants of RAC’s tensile strength and require careful management for optimal performance. The integration of recycled aggregates in concrete production, supported by advanced modeling and sensitivity analysis, paves the way for sustainable and efficient construction practices.

  • In summary, this study underscores the importance of advanced modeling techniques, the critical role of parameter sensitivity, and the feasibility of using recycled materials in achieving high-performance, sustainable concrete solutions.

  • Future research should focus on refining weaker models, exploring additional influencing factors, and implementing these findings in large-scale construction scenarios to further validate their practical applicability.
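The metrics quoted in the conclusions above (R², RMSE, MAE, WI, NSE, and KGE) can be computed from paired measured and predicted strengths. The following is a minimal Python sketch using commonly cited definitions of these metrics; the exact formulas used in this study may differ, "Accuracy" is left out because its definition is not restated in this section, and the function name and sample values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return common goodness-of-fit metrics for paired observations.

    Assumed definitions (not confirmed against the paper):
      R2  = squared Pearson correlation
      NSE = 1 - SSE / SST (Nash-Sutcliffe efficiency)
      KGE = 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)
      WI  = Willmott index of agreement
    """
    o = [float(v) for v in observed]
    p = [float(v) for v in predicted]
    if len(o) != len(p) or len(o) < 2:
        raise ValueError("need two equal-length series with at least 2 points")

    n = len(o)
    mean_o = sum(o) / n
    mean_p = sum(p) / n
    errors = [pi - oi for oi, pi in zip(o, p)]

    sse = sum(e * e for e in errors)                 # sum of squared errors
    sst = sum((oi - mean_o) ** 2 for oi in o)        # spread of observations
    spp = sum((pi - mean_p) ** 2 for pi in p)        # spread of predictions
    sop = sum((oi - mean_o) * (pi - mean_p) for oi, pi in zip(o, p))

    r = sop / math.sqrt(sst * spp)                   # Pearson correlation
    alpha = math.sqrt(spp / sst)                     # variability ratio
    beta = mean_p / mean_o                           # bias ratio
    wi_den = sum((abs(pi - mean_o) + abs(oi - mean_o)) ** 2
                 for oi, pi in zip(o, p))

    return {
        "R2": r * r,
        "RMSE": math.sqrt(sse / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "NSE": 1.0 - sse / sst,
        "KGE": 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2),
        "WI": 1.0 - sse / wi_den,
    }


if __name__ == "__main__":
    # Hypothetical splitting tensile strengths in MPa, for illustration only.
    measured = [2.1, 2.8, 3.4, 3.9, 4.5]
    modelled = [2.0, 2.9, 3.3, 4.1, 4.4]
    for name, value in regression_metrics(measured, modelled).items():
        print(f"{name}: {value:.3f}")
```

RMSE and MAE carry the units of the target (MPa here), while the other four are dimensionless and equal 1 for a perfect fit, which is why they are usually reported together.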

Practical application

The practical application of this research is significant for the construction industry, particularly in the context of sustainable and high-performance concrete production. The findings demonstrate that machine learning models, especially Kstar, can be effectively utilized to predict the splitting tensile strength of recycled aggregate concrete (RAC) with high accuracy. This capability enables engineers and construction professionals to optimize mix proportions, reduce material testing costs, and enhance quality control processes without extensive experimental procedures. The identification of key influencing factors, such as void fraction, water content, and aggregate quality, provides a data-driven approach to improving RAC’s mechanical properties, ensuring that the structural integrity of recycled concrete meets industry standards.

The study supports the broader adoption of RAC by offering a reliable predictive framework that mitigates uncertainties associated with using recycled materials. This is particularly valuable for large-scale construction projects aiming to incorporate sustainable materials while maintaining durability and strength requirements. The integration of advanced computational techniques also streamlines decision-making, allowing engineers to select optimal RAC mix designs based on specific performance targets. By demonstrating that RAC can achieve mechanical properties comparable to conventional concrete, the study encourages its application in structural and non-structural elements, promoting circular economy principles in the construction sector. Additionally, the emphasis on void fraction and aggregate properties highlights the necessity for stringent quality control measures in recycling processes. This ensures that RAC maintains its performance consistency, making it a viable alternative to natural aggregate concrete.

The research also suggests pathways for further refinement of weaker predictive models and calls for real-world validation through large-scale implementation. Future developments in this area could lead to the creation of standardized RAC mix design protocols, ultimately reducing dependency on virgin aggregates and minimizing construction-related environmental impact. By aligning with sustainable construction practices, this study contributes to reducing the depletion of natural resources, lowering carbon emissions associated with cement and aggregate production, and minimizing construction waste. The findings provide a strong foundation for policymakers, engineers, and material scientists to advance green building initiatives and integrate RAC into mainstream construction, enhancing both economic viability and environmental responsibility.
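The ranking of influencing factors referred to above comes from a sensitivity analysis attributed to Hoffman and Gardner. A common form of their index is SI = (y_max - y_min) / y_max, where y_max and y_min are the largest and smallest model outputs obtained when one input is swept across its range while the others are held fixed. The sketch below assumes that form and a one-at-a-time sweep; the study's exact procedure is not restated here, and `toy_model`, its coefficients, and the input ranges are hypothetical stand-ins rather than the Kstar model or the study's data.

```python
def sensitivity_index(model, baseline, ranges, steps=11):
    """One-at-a-time sensitivity index, SI = (y_max - y_min) / y_max.

    model    : callable taking a dict of inputs and returning a positive number
    baseline : dict of input values held fixed while one input is varied
    ranges   : dict mapping input name -> (low, high) sweep limits
    steps    : number of evenly spaced points in each sweep (at least 2)
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    result = {}
    for name, (low, high) in ranges.items():
        outputs = []
        for i in range(steps):
            inputs = dict(baseline)                      # copy, keep others fixed
            inputs[name] = low + (high - low) * i / (steps - 1)
            outputs.append(model(inputs))
        y_max, y_min = max(outputs), min(outputs)
        result[name] = (y_max - y_min) / y_max           # assumes y_max > 0
    return result


def toy_model(x):
    """Hypothetical linear strength model in MPa (illustration only)."""
    return 1.0 + 0.004 * x["C"] - 0.005 * x["W"]


if __name__ == "__main__":
    baseline = {"C": 400.0, "W": 180.0}                  # assumed kg/m^3
    ranges = {"C": (300.0, 500.0), "W": (150.0, 210.0)}  # assumed limits
    for name, si in sensitivity_index(toy_model, baseline, ranges).items():
        print(f"{name}: SI = {si:.2%}")
```

Because each input is varied in isolation, an index of this kind does not capture interactions between inputs, so the percentages are best read as a ranking aid.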

Sustainability goals of the research

This research aligns with key sustainability goals by demonstrating how recycled aggregate concrete (RAC) can be optimized for structural performance through advanced modeling and sensitivity analysis. The study highlights that when key parameters such as void fraction, water content, and aggregate quality are effectively managed, RAC can achieve mechanical properties comparable to conventional concrete. This directly supports sustainable construction by promoting the use of recycled materials, reducing dependence on natural resources, and minimizing waste in the construction industry. The conclusions emphasize that models like Kstar, XNV, and M5Rules can provide accurate predictions of RAC’s tensile strength, ensuring that sustainable materials are not only viable but also reliable in engineering applications. By identifying the critical factors influencing RAC’s performance, the study underscores the importance of proper mix design and quality control in recycled aggregates, further enhancing the material’s feasibility for widespread use. The integration of recycled aggregates into concrete production supports global efforts to lower the environmental footprint of the construction sector by reducing landfill waste and conserving raw materials. Additionally, the research highlights the potential for computational models to drive more efficient material usage, minimizing experimental trial-and-error processes and promoting data-driven decision-making. The study also paves the way for future improvements by encouraging further refinement of weaker models and expanding the analysis to large-scale applications, ensuring that sustainability in construction is not just theoretical but practically achievable.

Recommendation for future research

Future research should focus on refining the weaker models, such as the ElasticNet, to improve their predictive performance and reduce the errors observed in this study. Investigating alternative algorithms or hybrid approaches that combine the strengths of different models could also lead to better accuracy and generalization for predicting the splitting tensile strength of recycled aggregate concrete (RAC). Additionally, expanding the dataset to include a broader range of mix proportions, environmental conditions, and aggregate types would improve model robustness and increase the applicability of the findings to real-world construction scenarios. Future studies could also explore the incorporation of additional influencing factors, such as the impact of curing conditions, temperature variations, and long-term durability, to better reflect the actual performance of RAC in construction projects. Implementing advanced feature engineering techniques or domain-specific knowledge could enhance the sensitivity analysis and provide deeper insights into the interactions between different parameters.

Moreover, it would be beneficial to explore the use of reinforcement learning or deep learning models, which may be better suited for handling large, complex datasets and optimizing predictions through continuous learning. Such models could be especially useful for large-scale construction projects, where predictive accuracy and efficiency are paramount. Lastly, there is a need to validate these machine learning models in real-world applications by conducting large-scale experimental studies to ensure that the predictions are reliable and applicable in diverse construction environments. This could involve collaborating with industry partners to test the models in actual construction projects and assess their performance over time.

Ultimately, integrating these advanced models into the decision-making process of sustainable construction practices could significantly contribute to the optimization of RAC, promoting the widespread adoption of recycled materials and enhancing the sustainability of the construction industry.

Author contributions

KCO conceptualized the study; KCO, VK, SH, AKS, RFZV, ROSM, SMZP, RMTC, AME, PA, and KPA wrote the main manuscript text; and KCO and AME prepared the figures. All authors reviewed the manuscript.

Funding

The authors received no external funding for this research work.

Data availability

The data supporting this research work will be made available on request from the corresponding author.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Kennedy C. Onyelowe, Email: kennedychibuzor@kiu.ac.ug, Email: konyelowe@mouau.edu.ng, Email: konyelowe@gmail.com

Viroon Kamchoom, Email: viroon.ka@kmitl.ac.th

Ahmed M. Ebid, Email: ahmed.abdelkhaleq@fue.edu.eg

Krishna Prakash Arunachalam, Email: k.prakash@utem.cl

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-91980-3.
