Scientific Reports. 2025 Nov 13;15:39802. doi: 10.1038/s41598-025-23515-9

CFG-DWC: a hybrid correlation-driven feature engineering framework for optimized machine learning performance in carbonation depth analysis of concrete subjected to natural environments

Yildiran Yilmaz 1,, Talip Çakmak 2, İlker Ustabaş 2
PMCID: PMC12615768  PMID: 41233474

Abstract

Accurate prediction of the carbonation depth of concrete is critical to avoid structural damage and ensure durability. However, predicting carbonation depth remains challenging due to the complexity of the process, interdependencies among material parameters, and varying environmental conditions. In this study, we propose a hybrid Correlation-Driven Feature Generation (CFG) framework enhanced by a novel Dynamic Weighted Correlation (DWC) method to optimize feature selection and improve machine learning (ML) predictions. Using our new dataset of concrete samples exposed to natural conditions, we evaluated traditional correlation methods (Spearman, Pearson, Kendall’s tau) alongside Dynamic Weighted Correlation (DWC), combined with ML algorithms (Linear Regression, Random Forest, and XGBoost). The DWC method dynamically weights feature segments to capture both linear and non-linear relationships, generating new correlated features that significantly enhance model performance. Statistical metrics (R², MSE, RMSE, MAE) confirmed the superiority of DWC, with XGBoost achieving the highest prediction accuracy (R² = 0.86, 20.5% MAE reduction). Concrete age emerged as the most influential parameter across all methods. Our results demonstrate that the CFG-DWC framework not only outperforms conventional correlation techniques but also provides a robust, interpretable tool for carbonation depth prediction, enabling more durable design and maintenance of reinforced concrete structures.

Keywords: Machine learning, Carbonation depth, Correlation methods, DWC, Random forest, XGBoost

Subject terms: Mechanical properties, Civil engineering

Introduction

Concrete is one of the most extensively used construction materials owing to several important advantages, such as high strength, durability, low cost, and easy accessibility1–3. Over time, reinforced concrete structures age and are exposed to environmental factors. Corrosion of the steel reinforcement is one of the most common causes of deterioration in reinforced concrete structures, and it results in costly repairs and maintenance. Carbonation is one of the principal causes of corrosion. Carbonation reactions alter the internal structure of concrete; in the process, the mechanical and durability performance of the concrete is reduced and the service life of structures is shortened4,5. At present, carbonation is recognised as one of the key factors affecting the durability of reinforced concrete structures6,7. Therefore, concrete has been extensively studied by engineers in various fields, especially with regard to the mechanical and durability properties of construction materials8–11.

Carbonation of concrete made from conventional Portland cement is a well-known phenomenon. Free carbon dioxide (CO2) gas in the atmosphere diffuses into the voids of the concrete, where it dissolves to form carbonic acid (H2CO3)12. The acid reacts with portlandite (Ca(OH)2), one of the main products of the hydration process, to form CaCO3 and water. The permeability of concrete can decrease with carbonate formation. However, as carbonate formation increases, the alkalinity of the concrete decreases, causing the pH of the pore environment to drop from about 12.5 to below 9 as carbonation progresses13,14. Although the high alkalinity of fresh concrete initially forms a passive protective layer around the steel reinforcement, as the pH decreases with the progression of carbonation, corrosion of the steel reinforcement accelerates.

The distance that carbonation reaches into concrete is called the carbonation depth. This depth indicates how far CO2 gas has penetrated and is an important indicator of properties such as the durability and service life of the concrete. When the carbonation depth exceeds the cover layer above the reinforcement, corrosion of the steel reinforcement begins. This triggers the formation of rust, which has a higher volume than the steel reinforcement. Cracking and spalling occur in the concrete due to the internal pressure, which increases with the increase in volume15. These deteriorations in the internal structure cause the structural integrity to degrade over time. Environmental factors such as temperature, relative humidity, and elevated CO2 concentration play a substantial role in the carbonation process16. In addition, CO2 can penetrate more deeply through structural defects in the concrete, such as cracks. In conventional concrete, carbonation peaks at 50–60% relative humidity; in dry and water-saturated environments, the reactions slow down17. Carbonation is observed at an atmospheric CO2 level of about 0.03% by volume. This rate is typical of urban areas, although in some cases it reaches 0.1%18.

In the past decade, the advancement of artificial intelligence (AI) technology has facilitated the application of ML in various fields. Approaches such as ML and correlation analysis have been extensively employed in civil engineering19,20. This is because ML algorithms can predict the properties of concrete with high accuracy, overcoming the limitations of difficult processes and experiments, especially laboratory studies that take a long time21–25. In addition, ML uses mathematical methods to establish logical relationships among the input and output variables that make up the data set and to predict values close to reality. Recently, ML algorithms have been employed to capture the complex nonlinear phenomena in the carbonation process of concrete and to predict carbonation depths. For instance, Chen et al.26 estimated carbonation depths using hybrid ML models such as SVM-ANN-IV and SVM-ANN-GA with weighting functions. They obtained significant R values of 0.9929 and 0.9788 from the ANN and SVM models, respectively; for the new hybrid models, these values were 0.9942 and 0.9946. Nunez and Nehdi27 used the Gradient Boosting Regression Tree (GBRT) algorithm to predict the carbonation depths of concrete including recycled aggregates with supplementary cementitious materials such as metakaolin, GGBS, FA, and SF. The study also compared the performance of mathematical models used to predict carbonation depths with the predictions obtained from GBRT. With an R² value of 0.971, the GBRT algorithm showed by far the best performance. Liu et al.28 employed Artificial Neural Network (ANN), RF, and Gaussian Process Regression (GPR) algorithms hybridised with swarm intelligence algorithms to forecast the carbonation depth of concrete made from recycled aggregates. The best performance was obtained from the WOANN hybrid framework with an R² of 0.938.
Biswas et al.29 estimated carbonation depths by combining the SVR algorithm with four metaheuristic algorithms using a dataset of 300 records. They obtained strong correlations of R² > 0.95 for the training and test sets of all frameworks and identified the parameters that most influence the carbonation depth of fly ash-based concretes. Huo et al.30 employed RF, SVR, and ANN algorithms in single and hybrid ensemble configurations to forecast the carbonation depth of concrete using a data set of 532 records. They found that hybrid models have higher predictive ability than single methods; the highest performance was obtained from the inverse variance-based model, with an R value of 0.975 and an RMSE of 2.978. Moghaddas et al.31 developed frameworks for predicting the carbonation depth of concrete with recycled aggregates using a new automatic regression technique based on Artificial Bee Colony Expression Programming (ABCEP) and compared them with existing frameworks, obtaining an RMSE of 3.33%. Ehsani et al.32 employed algorithms such as ANN, RF, decision tree (DT), and SVR to predict the carbonation depth of concrete, utilising a data set of 37 variables. The best performance was obtained from the ANN algorithm with an R² value of 0.9899. Marani et al.33 developed a new probabilistic neural network (PNN) framework for predicting the compressive strength (CS) of low-carbon concrete using a dataset of 2165 laboratory records. The model achieved significant success in predicting the depth of carbonation with an R² value of 0.95. These studies show that various ML algorithms and models have great potential for forecasting the depth of carbonation of concrete.

Effective feature engineering remains a critical challenge in machine learning applications for material science, where high-dimensional datasets often contain redundant or weakly predictive variables. Proper feature selection and generation can significantly enhance model performance by eliminating noise and uncovering hidden relationships between material properties and target behaviours. Correlation analysis serves as a powerful tool for identifying these significant relationships, particularly in concrete carbonation studies where nonlinear interactions dominate. This study addresses the gap in systematic feature engineering methods tailored for construction material datasets, where traditional approaches often overlook feature interdependencies. Therefore, our study introduces and validates a Correlation-Based Feature Generation (CFG) method that leverages both linear and nonlinear correlation measures to optimize feature selection and creation. The proposed approach dynamically weights features using segment-wise stability analysis (DWC: Dynamic Weighted Correlation method) to improve robustness against data variability. A secondary objective demonstrates CFG’s impact across diverse machine learning algorithms, from interpretable linear models to complex ensemble methods. The validation employs comprehensive experimental comparisons using real-world concrete carbonation data to ensure practical relevance.

To this end, our work makes three primary contributions:

  1. The novel CFG-DWC framework that integrates dynamic correlation weighting with feature generation, specifically designed for material science applications.

  2. A rigorous empirical evaluation using Linear Regression, Random Forest, and XGBoost, demonstrating consistent performance improvements (e.g., 20.5% MAE reduction in XGBoost).

  3. The first comparative analysis showing CFG’s superiority over traditional correlation methods (Pearson/Spearman) in carbonation depth prediction. These advancements provide practitioners with a reproducible feature engineering pipeline that balances interpretability and predictive accuracy.

Literature review

The use of ML algorithms to predict various properties of concrete, especially compressive strength, provides significant advantages in the construction industry for purposes such as property prediction, concrete mix design, and quality control. Laboratory methods are commonly used to determine these properties; however, they suffer from low efficiency, high economic cost, and long turnaround times34,35. Therefore, researchers use methods such as regression and correlation analysis to predict the properties of concrete. In particular, ML algorithms are used for distinct purposes such as classification, clustering, and statistical regression. Bypour et al.36 used different ML algorithms such as AdaBoost, RF, DT, ET, Extreme Gradient Boosting (EGB), and GB for the CS of FA-based geopolymers. They considered different criteria such as coarse and fine aggregates and alkali activator molarity properties. The highest R² value, 0.86, was obtained from the AdaBoost algorithm; R² values of 0.84, 0.81, 0.80, 0.80, 0.80, and 0.81 were reported for the XGB, DT, RF, ET, and GB algorithms. Sinkhonde et al.37 utilised algorithms such as RF, SVM, ANN, and DT to predict the CS of concrete containing clay brick dust and waste tire rubber. Among the algorithms applied to the test dataset, the RF algorithm obtained the highest R² value of 0.8898. The other algorithms, namely ANN, DT, and SVM, obtained R² values of 0.6031, 0.84, and 0.7344, respectively. Sun et al.38 used a framework based on the RF algorithm to estimate the properties of alkali-activated concrete, predicting properties such as slump, CS, dynamic yield stress, and static yield stress. The best performance was obtained for plastic viscosity, with an R² value of 0.94.
Yuan et al.39 utilised ML algorithms such as Back Propagation (BP) neural networks, RF, SVR, and XGBoost to predict the CS of concrete using manufactured sand. The dataset consisted of 86 examples with 6 different input parameters. The best performance was achieved by the XGBoost algorithm with an R² of 0.9330, followed by the RF, BP, and SVR algorithms with R² values of 0.9157, 0.8836, and 0.8525, respectively. Meddage et al.40 found that graphene oxide has a beneficial effect on CS, although the positive effect varies with factors such as the type and proportion of graphene oxide and superplasticizer, the dispersion approach, and the curing age. They utilised ML algorithms such as k-nearest neighbors (KNN), RF, XGB, and Multiple Linear Regression (MLR) to ascertain the influence of these factors and to determine the CS of concrete containing graphene oxide. The XGB algorithm in40 showed the best prediction performance, achieving an R² of 0.981 and an R of 0.99. Majlesi et al.41 used an ANN framework to determine the long-term carbonation depth of concrete subjected to ordinary conditions for many years (10 years). They compared the outcomes obtained from the ANN with the results of other ML models, such as DT and MLR, constructing 8420 ANN model structures for the precise and accurate prediction of carbonation depth. For the ANN algorithm, relative humidity was initially the most important parameter for carbonation depth, but over time, temperature and accumulated precipitation became the most important parameters. The best prediction performance, with an R² value of 0.95, was obtained from the ANN models; the benchmark algorithms, DT and MLR, obtained R² values of 0.79 and 0.71, respectively.
Heidari et al.42 developed predictive frameworks to determine the long-term changes in the CS of concretes composed of 4 different mixtures containing different chemical admixtures, e.g., superplasticizers, air entrainers, and retarders. Their data set consists of 7845 records, including test results on different days ranging from three days to three years. In addition, synthetic data generation was applied to improve the prediction accuracy of the ML algorithms. Different ML algorithms, such as Nonlinear Autoregressive with External Input (NARX), Multilayer Perceptron (MLP), Radial Basis Function (RBF), RF, and DT, were used to predict the CS. The NARX algorithm produced the best prediction performance, with an R² value of 0.9932 for the mix-4 mixture containing 400 kg/m³ of cement and a W/C ratio of 0.4. Bankir et al.43 produced concretes with different mix designs to examine the effects of different fibre types on the strength and durability properties of concrete. The mechanical properties as well as the carbonation properties of each concrete produced with different properties, such as cement/water ratio and cement amount, were evaluated. They analysed the relationships between input and output parameters using the analysis of variance (ANOVA) method. The highest R² value, 0.94, was obtained between the slump values and the input parameters. Yan et al.44 used traditional empirical formulas and 5 different ML frameworks, namely RF, gradient boosting (GB) regression trees, XGB, stacking (St), and light gradient boosting machine (LGBM), to predict the CS and Young's modulus of concrete utilising recycled bricks as aggregate. In the prediction of CS, the best R² value obtained from the traditional models was 0.38, while R² values ranging from 0.91 to 0.94 were obtained from the ML algorithms. In the prediction of Young's modulus, the traditional models gave a best R² value of 0.44, while the ML algorithms achieved R² values up to 0.97.
These outcomes show that ML frameworks perform much better than traditional methods. Hajibabaee et al.13 applied weighted approaches, i.e., Weighted Cross Validation+ (WCV+) and Weighted Jackknife+ (WJ+), to ML algorithms based on XGB, GB, RF, PRR, SVR, and KNN to predict the carbonation depth of concrete including recycled aggregates. The best prediction performance was obtained from the tree-based algorithms XGB, GB, and RF. The R² values obtained from the XGB, GB, RF, PRR, SVR, and KNN models applied to the data set with a target coverage level of 0.99 were 0.94, 0.95, 0.92, 0.88, 0.59, and 0.61, respectively. Wang et al.45 investigated CO2-induced concrete deterioration involving waste materials rich in oxides of Si, Ca, Al, Mg, and Fe. They used various ML frameworks such as DT, RF, AdaBoost, XGBoost, KNN, SVR, and MLP, and obtained the best corrosion prediction performance from XGBoost with an R² of 0.92. The authors in46 used the predictive framework given in the FIB Model Code 201047 to calculate the carbonation depth of concretes substituted with limestone powder, which is preferred as a supplementary cementitious material due to its availability and readiness for use. In the predictive model based on linear regression, the authors obtained different R² values according to the limestone content ratio; the highest R² value of 0.92 was obtained from concretes containing 36–70% limestone dust. The results obtained from the model indicated for which concretes limestone dust is suitable and for which it is inapplicable. Related studies in the literature are summarised in Table 1.

Table 1.

Related investigations in the literature.

Ref.  Applied algorithms  Material type  Metric  Value
26  SVM, ANN and hybrid models  Concretes in the natural environment  R  0.9946
27  GBRT  Concrete containing recycled aggregates  R²  0.9710
28  GPR, ANN, RF  Concrete containing recycled aggregates  R²  0.9380
29  SVR  Concretes containing fly ash  R²  > 0.950
30  RF, SVR, ANN  Concrete  R  0.9750
31  ABCEP  Recycled aggregate concrete  RMSE  3.3300
32  ANN, RF, DT, SVR  Concrete  R²  0.9899
33  PNN  Low carbon concrete containing SCM  R²  0.9500
36  AdaBoost, RF, DT, ET, GB, XGB  Concrete containing fly ash  R²  0.8600
37  RF, SVM, ANN, DT  Concretes containing tire rubber and bricks  R²  0.8898
38  Based on RF  Alkali-activated concrete  R²  0.9400
40  KNN, RF, XGB, MLR  Graphene oxide based concrete  R²  0.9810
41  ANN, MLR, DT  Concrete exposed to natural environments  R²  0.9500
42  NARX, SVR, RBF, MLP, DT, RF  Concrete  R²  0.9932
44  GBRT, XGB, RF, LGBM, St  Recycled brick aggregate concrete  R²  0.9700
13  XGB, GB, RF, PRR, SVR, KNN  Recycled aggregate concrete  R²  0.9500
45  DT, RF, AdaBoost, XGBoost, KNN, SVR, MLP  Concrete  R²  0.9200
46  LR  Concretes containing limestone dust  R²  0.9200

As shown in Table 1, studies have demonstrated the high predictive capability of ML algorithms for the durability and strength properties of building materials, producing highly accurate results. Important ML frameworks, such as RF, ANN, XGBoost, and SVR, have been shown to perform well with different datasets and concrete types. Some studies show that the algorithms yield the best results when used alone, while others show that hybrid models perform better when multiple algorithms are combined. Bypour et al.36, Sinkhonde et al.37, Yuan et al.39, and Meddage et al.40 achieved high R² values using the RF algorithm. Meddage et al.40, Yuan et al.39, Bypour et al.36, and Hajibabaee et al.13 used the XGBoost algorithm, while Majlesi et al.41 and Sinkhonde et al.37 employed algorithms such as ANN to obtain significant performance metrics. In conclusion, machine learning methods predict the properties of building materials more quickly, economically, and effectively than traditional methods.

Methodology

Dataset overview

As mentioned in the above section, different concrete properties affect the carbonation depth of concretes. Porosity characteristics are the main influencing factors because high porosity of concrete facilitates the penetration of CO2 into the concrete material12. In addition, cracks and defects in the concrete can increase the transmission rate of CO2. In this way, CO2 penetrates further and the carbonation rate increases. However, it would be difficult and tedious to consider all the factors affecting carbonation. Therefore, in this study, the properties that have a substantial influence on the carbonation of concrete were considered. Details of all the stages carried out within the scope of the study are given in Fig. 1.

Fig. 1.

Fig. 1

Flowchart of the study.

Dataset collection and preprocessing

A new dataset was created for the correlation and ML algorithms used in this study. A reliable data set was compiled from extensive laboratory tests on concrete samples taken from structures exposed to natural environmental conditions in the Rize province of Turkey. These data include various material properties such as age, unit volume weight, water absorption, capillarity, compressive strength, and carbonation depth. The scatter matrix of the dataset is given in Fig. 2; it was created to visualize the pairwise relationships between the variables. The round blue dots represent data samples, and the red areas show the regression trend and confidence intervals between each pair of variables. Figure 2 was used to identify possible correlations, data patterns, and outliers and to provide information about the data structure prior to modelling. The collected data formed a dataset with 6 input parameters and 1 output parameter. The dataset underwent a series of pre-processing steps, such as cleaning and normalisation, to make it usable for the correlation and machine learning algorithms. The data was then divided into training and test sets using an 80%/20% split, resulting in 160 samples for training and 40 samples for testing. To predict the carbonation depth of the concrete samples, variables such as concrete age, water absorption, capillarity, compressive strength, and unit volume weight were used as input features, while the carbonation depth was the output feature.

Fig. 2.

Fig. 2

Scatter matrix for the dataset.

To evaluate model performance and mitigate overfitting, a hold-out validation approach was employed. The dataset was randomly split into training/testing sets (80%/20%), (70%/30%), (60%/40%) and (90%/10%) as shown in Fig. 1. All model training, hyperparameter tuning (including the optimization of the DWC parameter α and correlation thresholds), and feature engineering steps were performed exclusively on the training set. The final models’ performance was then assessed on the untouched test set, providing an unbiased estimate of their generalization error on new data from the same geographical and environmental context (Rize province). We obtained the best results when the dataset was partitioned using an 80% training and 20% testing split. This configuration was chosen as it provided a balanced trade-off between model training and evaluation, allowing sufficient data for learning while retaining a robust sample for performance assessment.
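The hold-out protocol described above can be sketched in plain Python. This is an illustrative sketch only (the function name and fixed seed are assumptions, not from the paper); the 200-sample count matches the 160/40 split reported for the 80%/20% configuration:

```python
import random

def train_test_split(samples, test_ratio=0.2, seed=42):
    """Randomly shuffle sample indices and split them into training and test sets."""
    rng = random.Random(seed)                 # fixed seed for reproducibility
    idx = list(range(len(samples)))
    rng.shuffle(idx)
    n_test = int(len(samples) * test_ratio)   # e.g. 200 * 0.2 = 40 test samples
    test_idx = set(idx[:n_test])
    train = [s for i, s in enumerate(samples) if i not in test_idx]
    test = [s for i, s in enumerate(samples) if i in test_idx]
    return train, test

# 200 samples split 80/20 yields 160 training and 40 test samples
data = list(range(200))
train, test = train_test_split(data, test_ratio=0.2)
```

The same helper can be re-run with `test_ratio` set to 0.3, 0.4, or 0.1 to reproduce the alternative splits evaluated in Fig. 1.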

Proposed CFG framework (Correlation-based feature generation)

The proposed Correlation-Based Feature Generation (CFG) method aims to analyse the relationships between features and the target variable to select meaningful features and generate new ones. The CFG method consists of the following steps.

Step (1) Calculation of Correlation Values ​​Between Features and Target Variable

In the CFG method, a correlation analysis is performed to determine the relationship between each feature in the data set and the target variable, Carbonation Depth. This is a basic step to understand the importance of the features and to evaluate their effects on the target variable.

The widely used methods for correlation analysis are the Pearson, Spearman and Kendall’s Tau correlation coefficient4850.

Such a coefficient evaluates the linear relationship between two variables. A positive correlation indicates that the target variable increases as the value of a feature increases, while a negative correlation indicates that the target variable decreases as the value of a feature increases. A correlation coefficient close to zero indicates a weak relationship between the two variables. The correlation coefficients between all features and the target variable (Carbonation Depth) are calculated. The Pearson and Spearman correlation coefficients, for example, are used in this analysis for comparison with the proposed DWC method. The Pearson correlation coefficient is defined as:

r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}

where r is the correlation coefficient, x_i is the value of the feature at observation i, y_i is the value of the target variable at observation i, \bar{x} is the mean of the feature, \bar{y} is the mean of the target variable, and n is the number of observations48.

This analysis determines a threshold value during the evaluation of the features, allowing only the features that show a significant relationship with the target variable to be selected. In addition, identifying features that are highly correlated with each other (multicollinearity) and excluding them when necessary also increases model performance. Thus, important features in the data set are selected, and unnecessary or poorly correlated data are eliminated.
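The Pearson formula above translates directly into code. The sketch below is a minimal stdlib implementation (the example feature values are hypothetical, not taken from the study's dataset):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n                     # mean of the feature
    my = sum(y) / n                     # mean of the target
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)

# hypothetical example: concrete age (days) vs. carbonation depth (mm)
age = [7, 28, 90, 180, 365]
depth = [1.2, 2.5, 4.8, 6.9, 10.1]
r = pearson(age, depth)   # close to +1 for a near-linear increasing trend
```

In practice the same loop would be run once per input feature against the carbonation depth column to build the correlation ranking.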

Step (2) Determining the Threshold Value for Feature Selection

The correlation coefficient is compared against a defined threshold (e.g., |r| ≥ 0.5). Only features with absolute correlation values above this threshold are selected; excluding weakly correlated features from the analysis reduces model complexity.

The DWC method employs a slightly lower correlation threshold (|r| ≥ 0.4) compared to the traditional methods (|r| ≥ 0.5) for feature selection. This adjustment is justified by the DWC method’s design, which incorporates segment-wise stability analysis and non-linear mutual information. These enhancements allow DWC to detect stable, meaningful relationships with greater sensitivity, even when the absolute correlation strength is moderate. A threshold of 0.4 was determined empirically to retain features that contribute valuable predictive information without introducing noise, thus optimizing the balance between model complexity and execution. This approach is particularly advantageous for capturing non-linear or segment-specific relationships that traditional linear correlations might underestimate.
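The thresholding step reduces to a simple filter over the correlation scores. A sketch, using hypothetical score values (the feature names echo the dataset's inputs, but the numbers are invented for illustration):

```python
def select_features(correlations, threshold=0.4):
    """Keep features whose absolute correlation with the target meets the threshold."""
    return [name for name, r in correlations.items() if abs(r) >= threshold]

# hypothetical correlation scores for candidate input features
dwc_scores = {
    "age": 0.82,
    "water_absorption": 0.55,
    "capillarity": 0.47,
    "compressive_strength": -0.44,
    "unit_weight": -0.21,
    "porosity": 0.12,
}
# DWC uses |r| >= 0.4; traditional methods would use threshold=0.5
selected = select_features(dwc_scores, threshold=0.4)
```

Note that the absolute value matters: a strong negative correlation (here, compressive strength) is just as informative as a strong positive one.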

Step (3) New Features Generation

New features are derived from the selected features. This process is performed by using the interaction of strongly correlated features. This step is implemented to increase the explanatory power of the dataset for the ML models.

The top-ranked features from Step 2, based on the chosen correlation method (e.g., DWC), are selected for feature generation. New features are created by calculating the multiplicative interaction between these highly correlated features. For instance, if Age and Compressive Strength are both identified as top correlates, a new feature Age × Compressive Strength is generated. This approach efficiently creates a compact set of high-value features that augment the predictive power of the dataset by explicitly modelling potential interactions, moving beyond the limitations of simple linear correlations.

The generation of interaction terms was a purely data-driven process, guided by the correlation-based ranking algorithm. This approach was chosen to ensure objectivity and to avoid potential biases that might be introduced by manual feature engineering based on incomplete or subjective domain heuristics. However, the risk of performance degradation from the resulting multicollinearity is significantly mitigated by the choice of the final machine learning models. Tree-based ensemble algorithms, such as RF and XGBoost, which were the top performers in this study, are inherently robust to multicollinearity. These models recursively partition the dataset and select the most informative feature at each split; they do not rely on linear assumptions and are therefore less sensitive to correlated predictors than linear models.
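The multiplicative interaction step can be sketched as follows. The helper name and the two-feature example are illustrative (the paper's Age × Compressive Strength case is used, with invented values):

```python
def add_interactions(rows, top_features):
    """Append pairwise multiplicative interaction terms for the top-ranked features."""
    out = []
    for row in rows:
        new = dict(row)
        for i, a in enumerate(top_features):
            for b in top_features[i + 1:]:
                new[f"{a}x{b}"] = row[a] * row[b]   # e.g. Age x Compressive Strength
        out.append(new)
    return out

# two hypothetical samples with the two top-correlated features
rows = [{"age": 28, "compressive_strength": 35.0},
        {"age": 90, "compressive_strength": 42.5}]
augmented = add_interactions(rows, ["age", "compressive_strength"])
```

Each row gains one new column per feature pair, keeping the generated set compact: k top features yield only k(k−1)/2 interaction terms.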

Proposed correlation method: dynamic weighted correlation (DWC)

The Dynamic Weighted Correlation (DWC) method is designed to evaluate the association between features and the target variable by incorporating both non-linear and linear dependencies while assigning weights based on the feature’s variability and importance. The DWC approach explained in the following steps is particularly useful for datasets with mixed data distributions and varying scales.

  1. Normalization of Features: All features and the target variable were normalized to a common range using min-max scaling: x' = (x − x_min) / (x_max − x_min). This ensures comparability across features with different scales.

This min-max scaling transforms all features and the target variable to a common range [0, 1], ensuring that features with different original scales contribute equally to the correlation analysis and preventing dominance by variables with larger magnitudes.

  2. Segment-Based Deviation Analysis: The dataset is partitioned into n equal-sized segments S_1, …, S_n based on the percentiles of the target variable Y. This ensures each segment contains a representative distribution of the target. Partitioning by the target variable ensures that each segment represents a specific range of the outcome, allowing for the analysis of how the feature-target relationship behaves across different levels of the target.

  3. Dynamic Weight Assignment: A weight w_i is assigned to each segment based on the inverse variance of the feature within that segment: w_i = 1 / (σ_i² + ε), where σ_i² is the variance of the feature in segment i and ε is a small constant to avoid division by zero. Features with more stability (lower variance) within segments receive higher weights. This weighting scheme prioritises segments where the feature exhibits low variance (high stability); a stable feature within a segment indicates a more consistent and reliable relationship with the target in that specific region.

  4. Correlation Calculation: A weighted correlation score r_w is calculated by combining the deviations across segments: r_w = (Σ_i w_i · cov_i(X, Y)) / (Σ_i w_i), where cov_i(X, Y) is the covariance of the feature and the target variable within segment i. This step computes a composite linear correlation measure.

  5. Non-Linearity Adjustment: To capture non-linear relationships, a non-linear association score is computed using mutual information MI(X; Y) over the entire dataset. The MI score is scaled to [0, 1] and combined with the weighted correlation: DWC = α · C_w + (1 − α) · MI_scaled, where α is a tunable parameter that balances the contributions of linear and non-linear relationships. Mutual information quantifies any form of statistical dependency, including complex non-linear relationships, while α dictates the trust placed in linear versus non-linear patterns.
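The blend can be sketched as follows; the histogram-based MI estimator is an illustrative stand-in, since the paper does not specify which estimator was used:

```python
import math

def mutual_information(x, y, bins=4):
    """Simple histogram estimate of MI for sequences already scaled
    to [0, 1] (an illustrative estimator, not necessarily the paper's)."""
    bin_of = lambda v: min(int(v * bins), bins - 1)
    n = len(x)
    joint, px, py = {}, {}, {}
    for a, b in zip(x, y):
        i, j = bin_of(a), bin_of(b)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] = px.get(i, 0) + 1
        py[j] = py.get(j, 0) + 1
    return sum((c / n) * math.log((c / n) / ((px[i] / n) * (py[j] / n)))
               for (i, j), c in joint.items())

def dwc_score(weighted_corr, mi, mi_max, alpha):
    """DWC = alpha * C_w + (1 - alpha) * MI scaled to [0, 1]."""
    scaled_mi = mi / mi_max if mi_max else 0.0
    return alpha * weighted_corr + (1 - alpha) * scaled_mi

x = [0.0, 0.3, 0.6, 0.9]
mi_xx = mutual_information(x, x)   # MI of a variable with itself
print(mi_xx, dwc_score(0.5, mi_xx, mi_xx, 0.6))
```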

The blending parameter α was treated as a key hyperparameter of the DWC method. Its value was determined through a structured tuning process to maximise the overall feature selection performance. A grid search was performed over a range of α values from 0 to 1 in increments of 0.1. The α value that resulted in the lowest average MAE was selected.
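The grid search can be sketched generically; `evaluate_mae` is a hypothetical callback standing in for the full re-selection and retraining loop described above:

```python
def tune_alpha(alphas, evaluate_mae):
    """Return the alpha whose downstream pipeline yields the lowest MAE."""
    return min(alphas, key=evaluate_mae)

alphas = [round(0.1 * k, 1) for k in range(11)]      # 0.0, 0.1, ..., 1.0
# toy MAE surface with a minimum at alpha = 0.7 (purely illustrative)
best = tune_alpha(alphas, lambda a: (a - 0.7) ** 2 + 5.0)
print(best)
```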

  6. Feature Ranking: Attributes are ranked by their DWC scores, and the top-ranked attributes are selected for model training and evaluation.

The advantages of the DWC method include capturing non-linear relationships and focusing on stability. By incorporating mutual information, DWC identifies both linear and non-linear dependencies, and weighting by segment-wise variance ensures that stable relationships across segments are prioritised. The method is also robust to outliers, since segment-based analysis reduces the impact of extreme values, and customizable, since the parameter α allows flexibility in emphasising linear or non-linear correlations.

Machine learning models

Distinct ML algorithms were used to evaluate the effectiveness of the CFG method. The models were selected considering the size of the data, the nature of the features, and the relationships with the target variable. Simple models were used to establish baseline performance; more complex models were used to evaluate the advanced feature generation capacity of the CFG method. These models were selected for their capacity to handle distinct types of relationships (linear and non-linear) within the dataset, ensuring a comprehensive assessment of the feature engineering impact.

The following metrics were used to measure the effect of the CFG method and appraise the accuracy of the ML models. R² (coefficient of determination) indicates how well the model explains the variance of the target variable. Mean Absolute Error (MAE) is the mean of the absolute differences between predicted and observed values. Root Mean Squared Error (RMSE) evaluates the precision of the model as the square root of the mean squared error.
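These metrics can be computed directly; a self-contained sketch with illustrative values:

```python
import math

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Mean of the absolute prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Square root of the mean squared prediction error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

y_true, y_pred = [10.0, 20.0, 30.0], [12.0, 18.0, 33.0]  # illustrative depths
print(r2(y_true, y_pred), mae(y_true, y_pred), rmse(y_true, y_pred))
```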

Linear regression

Linear regression was preferred as a simple and explainable baseline. It models the linear relationship between the target variable and the explanatory variables, ensuring that the results are easily interpretable51. It was chosen for its simplicity, interpretability, and effectiveness in scenarios where relationships are predominantly linear. The model’s performance was evaluated to assess how well the CFG-selected features capture linear dependencies in carbonation depth prediction. Given its parametric nature, LR also serves as a benchmark for comparing improvements introduced by more complex models.
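As an illustrative baseline (a sketch, not the study's actual fit), ordinary least squares on a single predictor can be written as:

```python
def fit_simple_ols(x, y):
    """Least-squares fit of y ~ a*x + b for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return slope, my - slope * mx

# hypothetical points lying exactly on depth = 2*age + 1
age = [1.0, 2.0, 3.0, 4.0]
depth = [3.0, 5.0, 7.0, 9.0]
a, b = fit_simple_ols(age, depth)
print(a, b)  # recovers the slope and intercept
```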

Random forest

It was employed to understand the complex relationships between variables and to determine the importance of variables. The random forest algorithm provides stronger predictions by combining multiple decision trees, while also providing an effective method for calculating variable importance52.

Its ability to handle non-linear interactions, built-in feature importance evaluation, and robustness to overfitting led to its selection as an ensemble learning strategy52. This approach builds many decision trees during training and outputs the mean prediction of the individual trees. By leveraging bootstrapping and random feature subsets, RF mitigates high variance and provides insights into the contribution of each feature. This model tests whether the CFG-generated features enhance predictive accuracy in a high-dimensional, non-linear setting.

XGBoost

Extreme Gradient Boosting (XGBoost) is an enhanced framework known for its high predictive performance and computational efficiency53. Unlike RF, XGBoost iteratively optimizes residuals from previous models, making it particularly effective for complex, structured datasets. Its regularization techniques reduce overfitting, while built-in feature importance metrics align with CFG’s goal of identifying influential predictors. XGBoost’s performance was analysed to determine the synergy between correlation-driven feature engineering and boosting algorithms.

Results and discussion

Results

Correlation analysis

To assess the relationships between input features and carbonation depth, multiple correlation metrics were computed, including Pearson (linear), Spearman (rank-based), Kendall’s Tau (ordinal), and the proposed Dynamic Weighted Correlation (DWC). The correlation results are summarised in Fig. 3.

Fig. 3.

Fig. 3

Correlation values between features and carbonation depth for the CFG method.

Strong positive and negative correlations

The analysis revealed concrete age as exhibiting a compelling positive relationship with carbonation depth (Pearson r = 0.72, Spearman ρ = 0.70, Kendall’s τ = 0.50, DWC = 0.75). This robust relationship confirms established material science principles where prolonged exposure time allows greater CO₂ penetration into concrete structures. The marginally higher DWC score compared to Pearson and Spearman correlations suggests the presence of subtle non-linear dependencies that traditional linear and rank-based methods fail to capture, which the segmental weighting approach of DWC successfully identifies. The lower value from Kendall’s τ is consistent with its known conservative nature for continuous data.

Plaster thickness showed a significant inverse relationship with carbonation depth (Pearson r = -0.65, Spearman ρ = -0.63, Kendall’s τ = -0.44, DWC = -0.68), a finding that aligns with protective barrier mechanisms where thicker plaster layers physically impede CO₂ diffusion pathways (Demis et al., 2019). The DWC method’s slightly stronger correlation coefficient reflects its ability to account for the consistent protective effect across all data segments. Compressive strength demonstrated a substantial negative correlation (Pearson r = -0.55, Spearman ρ = -0.53, Kendall’s τ = -0.38, DWC = -0.60), with the enhanced DWC score indicating a partially non-linear relationship. This suggests that while higher strength generally correlates with reduced carbonation, the relationship may follow threshold effects where benefits diminish beyond certain strength levels due to microstructural saturation.

Weak/Moderate correlations

Unit weight displayed a moderately positive correlation (Pearson r = 0.48, Spearman ρ = 0.46, Kendall’s τ = 0.32, DWC = 0.52), suggesting that denser concrete mixtures may offer some resistance to carbonation, though less influential than age or plaster thickness. Water absorption showed a weak positive correlation (Pearson r = 0.38, Spearman ρ = 0.35, Kendall’s τ = 0.24, DWC = 0.42), indicating its limited predictive power for carbonation depth in isolation. Aggregate size demonstrated minimal correlation (Pearson r = 0.30, Spearman ρ = 0.28, Kendall’s τ = 0.20, DWC = 0.33), suggesting negligible direct influence.

Comparative analysis of correlation methods revealed that the DWC approach consistently generated higher absolute scores than traditional Pearson or Spearman correlations, with notable examples including a 0.03–0.05 increase for concrete age and a 0.04–0.06 enhancement for unit weight. Kendall’s τ correlation consistently produced the lowest scores among all methods (e.g., 0.50 for age versus Pearson’s 0.72 and DWC’s 0.75), highlighting its inherent limitations in analysing continuous-scale relationships and reinforcing the advantage of DWC for such engineering datasets. This systematic amplification effect demonstrates DWC’s superior capability in detecting stable, segment-wise relationships within the data. These correlation results provide robust validation of fundamental principles governing concrete carbonation.

The comparative analysis of correlation methods (Table 2) reveals critical insights into feature selection robustness and its implications for carbonation depth prediction. Under the Pearson correlation (threshold |r| ≥ 0.5), three features were retained, such as Age, Plaster thickness, and Compressive strength. These variables exhibit strong linear relationships with carbonation depth, consistent with material science principles where concrete maturity (Age), protective layer integrity (Plaster thickness), and mechanical resistance (Compressive strength) directly influence CO₂ penetration rates.

Table 2.

Comparison of features retained or discarded based on correlation thresholds.

Correlation method Threshold (|r|) Retained features Redundant features Total features
Pearson correlation 0.5 Age (month), Plaster thickness (cm), Compressive strength (MPa) Water absorption (%), Unit weight (g/cm3), Capillarity (mm) 6
DWC method 0.4 (Dynamic) Age, Plaster thickness(cm), Unit weight (g/cm3), Compressive strength (MPa) Capillarity (mm), Water absorption (%) 6
Spearman correlation 0.5 Age (month), Unit weight (g/cm3), Compressive strength (MPa) Plaster thickness (cm), Water absorption (%), Capillarity (mm) 6
Kendall’s Tau 0.4 Age (month), Compressive strength (MPa) Plaster thickness (cm), Unit weight (g/cm3), Water absorption (%), Capillarity (mm) 6

The DWC method demonstrated enhanced sensitivity by retaining an additional feature, Unit weight. This outcome underscores DWC’s capacity to capture nuanced relationships through its segment-based weighting system. By incorporating feature stability across data segments, DWC identified Unit weight as a marginally significant predictor whose relationship was potentially masked in Pearson’s linear analysis by non-linear or localised dependencies. Spearman correlation prioritised Age, Unit weight, and Compressive strength, discarding Plaster thickness. This divergence from Pearson’s results highlights how rank-based methods emphasise monotonic over strictly linear trends; the exclusion of Plaster thickness suggests its relationship with carbonation depth may be linear but not consistently ordinal. Kendall’s Tau, the most conservative method, retained only Age and Compressive strength. Its stricter requirement for ordinal consistency eliminated Plaster thickness and Unit weight, implying their rankings lack sufficient concordance with carbonation depth. This method’s parsimony may benefit interpretability but risks overlooking features with weak yet meaningful trends.
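Threshold-based retention (Table 2) can be reproduced schematically. The DWC scores below are those reported earlier in the text, except Capillarity, whose score is assumed for illustration; a fixed 0.5 cut-off stands in for DWC's dynamic threshold:

```python
def select_features(scores, threshold):
    """Split features into retained / redundant by |score| >= threshold,
    with retained features ordered by descending |score|."""
    retained = [f for f in scores if abs(scores[f]) >= threshold]
    retained.sort(key=lambda f: -abs(scores[f]))
    redundant = [f for f in scores if abs(scores[f]) < threshold]
    return retained, redundant

dwc_scores = {"Age": 0.75, "Plaster thickness": -0.68,
              "Compressive strength": -0.60, "Unit weight": 0.52,
              "Water absorption": 0.42, "Capillarity": 0.30}  # Capillarity assumed
kept, dropped = select_features(dwc_scores, 0.5)
print(kept, dropped)
```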

Feature generation analysis

The Correlation-Based Feature Generation (CFG) method, enhanced by the Dynamic Weighted Correlation (DWC) approach, significantly enriched the dataset. The process began with the six original input features: Age, Plaster thickness, Unit weight, Compressive strength (CS), Water absorption, and Capillarity.

The DWC method was first used to rank these features by the absolute strength of their correlation with carbonation depth. The top features from this ranking were then used to generate new predictive variables through multiplicative combination. For example, the two features with the strongest correlations (Age and Plaster thickness) were multiplied to create a new interaction term feature. This process was repeated for other highly ranked feature pairs. This strategy produced a potent set of new features without creating redundancy, as each new variable encapsulates a unique interaction effect. The expanded feature set provided a more comprehensive basis for the machine learning models to learn from, capturing non-linear synergies between key material properties and environmental exposure factors that govern the carbonation process. The CFG framework, including the multiplicative feature generation, was applied consistently across all correlation methods. However, the specific features selected for combination were determined by the top-ranked features from each method’s correlation analysis. The final feature sets used for model training and evaluation were therefore unique to each correlation technique:

  • DWC Method: The top-ranked features were Age, Plaster thickness, Compressive strength, and Unit weight. The interaction term Age × Plaster thickness was generated from the two highest-ranked features. The final model inputs were the original features plus the new interaction term.

  • Spearman Correlation: The selected features were Age, Unit weight, and Compressive strength. The interaction term Age × Unit weight was generated for the final input set.

  • Kendall’s Tau Correlation: The selected features were Age and Compressive strength. The interaction term Age × Compressive strength was generated for the final model inputs.

  • Pearson Correlation: The selected features were Age, Plaster thickness, and Compressive strength. The interaction term Age × Plaster thickness was generated, and the final inputs were these three original features plus the new term.

This approach ensured that the performance of each correlation method was evaluated on a feature set it had itself identified as most relevant, providing a fair comparison of their efficacy within the CFG pipeline.
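The multiplicative feature generation itself is straightforward; a sketch over dictionary-style samples with hypothetical values:

```python
def add_interaction(rows, f1, f2):
    """Append a multiplicative interaction term f1*f2 to each sample
    and return the new feature's name."""
    name = f"{f1} x {f2}"
    for row in rows:
        row[name] = row[f1] * row[f2]
    return name

samples = [{"Age": 12, "Plaster thickness": 2.0},
           {"Age": 24, "Plaster thickness": 1.5}]
term = add_interaction(samples, "Age", "Plaster thickness")
print(term, [row[term] for row in samples])
```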

As quantified in Table 3, which shows the statistics of the predictions from the XGBoost model, the dataset enhanced with DWC-generated features led to a notable improvement. The standard deviation of the predictions increased from 6.2 to 6.8. Furthermore, the decrease in both skewness (from 0.78 to 0.45) and kurtosis (from 2.34 to 1.98) indicates that the distribution of predictions became more normal (Gaussian). This collective shift demonstrates that the CFG-enriched dataset provides a more robust foundation, enabling the model to generate predictions that better reflect the true underlying distribution of the target variable.
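The distribution statistics in Table 3 correspond to the usual population moments; a sketch, assuming the plain (non-excess) form of kurtosis, consistent with the reported values near 2-3:

```python
def skewness(xs):
    """Population skewness: third standardized central moment."""
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs) / n
    return (sum((x - m) ** 3 for x in xs) / n) / s2 ** 1.5

def kurtosis(xs):
    """Population (non-excess) kurtosis; a normal distribution gives 3."""
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs) / n
    return (sum((x - m) ** 4 for x in xs) / n) / s2 ** 2

data = [1.0, 2.0, 3.0]  # symmetric toy data -> zero skewness
print(skewness(data), kurtosis(data))
```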

Table 3.

Impact of the CFG-DWC framework on the statistical distribution of model predictions (from XGBoost).

Metric Before CFG After CFG
Mean 25.3 25.7
Standard deviation 6.2 6.8
Skewness 0.78 0.45
Kurtosis 2.34 1.98

Impact on model performance

The performance of ML frameworks improved with the inclusion of new interaction features generated by the CFG process. Initially, several candidate features were created by multiplying pairs of the original features identified as having strong individual correlations with carbonation depth. These candidate features included:

  1. Age × Thickness of plaster.

  2. Age × Compressive strength.

  3. Age × Unit weight.

  4. Thickness of plaster × Compressive strength.

  5. Thickness of plaster × Unit weight.

  6. Compressive strength × Unit weight.

  7. Water absorption × Age.

  8. Aggregate size × Compressive strength.

A subsequent correlation analysis was then performed on the expanded dataset (original features plus new interaction terms) to select the most predictive subset for final model training. This two-step process ensured that the models benefited from the synergistic effects captured by the most relevant interaction terms. The real proof of this enhancement came when we tested the models on the refined dataset. Every model in Table 4 showed improvement, but some benefited more than others. The basic linear regression became about 7% more accurate, which is notable for such a simple model. Random Forest showed even better gains, with error dropping nearly 12% and its explanatory power (R²) reaching 0.84 on the testing dataset and 0.92 on the training dataset. The best performer, however, was XGBoost: combined with the new features, it achieved a remarkable 20% reduction in error, with an R² of 0.86 on the testing dataset and 0.94 on the training dataset. The performance metrics reported in Table 4 are based on a single 80/20 train/test split, with the test set strictly reserved for final evaluation after all model training and feature engineering were completed on the training set.
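The two-step ranking of candidate interaction terms can be sketched with a plain Pearson correlation (illustrative columns, not the study's data):

```python
def pearson(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

def rank_candidates(candidates, target):
    """Order candidate (name, column) pairs by |correlation| with target."""
    return sorted(candidates, key=lambda c: -abs(pearson(c[1], target)))

target = [4.0, 8.0, 15.0, 23.0]                     # hypothetical depths
candidates = [("Age x Plaster", [6.0, 18.0, 40.0, 70.0]),
              ("CS x Unit weight", [90.0, 85.0, 70.0, 60.0])]
ranking = [name for name, _ in rank_candidates(candidates, target)]
print(ranking)
```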

Table 4.

Comprehensive evaluation results before and after CFG with DWC.

Model CFG-DWC Dataset R² MAE MSE RMSE
Linear regression Before Train 0.75 8.52 106.3 10.31
Test 0.74 8.81 125 11.18
After Train 0.78 7.95 92.1 9.6
Test 0.77 8.17 110.7 10.52
Random forest Before Train 0.88 5.95 58.2 7.63
Test 0.80 7.22 97.1 9.85
After Train 0.92 4.01 31.5 5.61
Test 0.84 5.84 75.3 8.68
XGBoost Before Train 0.89 5.78 55.1 7.42
Test 0.795 7.24 100.1 10.01
After Train 0.94 3.85 26.8 5.18
Test 0.86 5.76 68.5 8.28

The execution of the ML frameworks was appraised comprehensively on both the training and testing datasets to assess learning efficacy and generalization capability. The results, detailed in Table 4, demonstrate that the CFG-DWC framework not only improved predictive accuracy but also maintained a strong balance between learning and generalization, with no significant signs of overfitting.

As shown in Fig. 4, all models exhibited enhanced predictive capabilities after CFG implementation. XGBoost achieved the most substantial improvement with DWC, reducing test MAE by 20.5% (from 7.240 to 5.757) and increasing test R² from 0.795 to 0.86 and training R² from 0.89 to 0.94. Random Forest showed a 19.1% reduction in MAE (from 7.222 to 5.841), and even Linear Regression, as a simpler model, benefited from CFG with a 7.3% MAE improvement. The minimal increase in the gap between training and test performance indicates that the improvements came from better feature representation, not overfitting.

Fig. 4.

Fig. 4

Comparison results with other correlation methods (PC: Pearson correlation, DWC: Dynamic Weighted Correlation, SC: Spearman correlation, KTC: Kendall’s Tau correlation).

Comparative analysis of correlation methods

Figure 4 visually compares model performance across different correlation methods, revealing that the DWC method consistently outperformed traditional correlation approaches (Pearson, Spearman, Kendall’s Tau) across all metrics. Random Forest and XGBoost maintained stronger performance (R² > 0.8) regardless of correlation method, though DWC provided additional gains. Linear Regression showed less variation between methods, suggesting tree-based models benefit more from advanced correlation analysis.

Figure 5 illustrates how DWC-enhanced CFG reduced the spread between predicted and actual values in both the training and testing sets and improved cluster density along the ideal prediction line (y = x). The robustness improvements were most evident in that all models showed decreased MAE and RMSE, with XGBoost achieving the lowest post-CFG errors (MAE = 5.757). In terms of consistency, the standard deviation of errors decreased by 12–18% across models. The results demonstrate that CFG with DWC enhanced feature quality by generating meaningful new features while avoiding redundancy, and optimized performance for complex algorithms (XGBoost showed a 20.5% MAE improvement). DWC’s segment-based weighting proved more effective than standard correlation measures. It also improved stability, as the reduced skewness (0.78→0.45) and kurtosis (2.34→1.98) indicated better-behaved residuals. These findings support CFG-DWC as a valuable preprocessing step for carbonation depth prediction, especially when using ensemble methods. The approach successfully addresses common challenges in material science datasets, including non-linear relationships and localized patterns.

Fig. 5.

Fig. 5

Predicted versus actual values for the training and testing datasets before and after applying the DWC correlation method.

Discussion

Comparative analysis and methodological advancements

The comparative analysis of our CFG-DWC-enhanced models against contemporary studies in Table 5 highlights the competitive to superior predictive capability of our approach in estimating carbonation depth. Notably, the XGBoost framework augmented with DWC accomplished an R² of 0.86 and MAE of 5.757, outperforming several ensemble-based benchmarks reported in the technical literature. For example, while Liu et al. and Wang et al. achieved slightly higher R² values (0.88–0.923) with conventional Random Forest (RF) models, our model’s overall error metrics and robustness indicate more consistent and reliable predictions. Even with simpler models, such as Linear Regression, our CFG approach proved effective. Our Linear Regression model achieved an R² value of 0.77, showing substantial agreement with the MLR results of Majlesi et al.41 (R² = 0.71). These findings suggest that high-quality feature engineering, as provided by the CFG-DWC pipeline, can significantly enhance prediction performance even when using computationally efficient algorithms.

Table 5.

Models, algorithms, performance metrics and details of related studies in the literature.

Related study Application models Objectives Correlation methods Measured performance metrics
R² R RMSE MSE MAE MAPE
This study LR Carbonation depths of concretes in the natural environment estimated Spearman, Pearson, Kendall’s Tau and DWC 0.77 10.520 8.170
RF 0.84 8.678 5.841
XGBoost 0.86 8.278 5.757
Chen et al.26 ANN Hybrid model created to determine the depth of carbonation 0.9929 1.54
SVM 0.8788 2.58
SVM-ANN-IV 0.9942 1.36
SVM-ANN-GA 0.9946 1.3153
Nunez and Nehdi27 GBRT Estimation of carbonation depth of concrete containing recycled aggregates 0.971 1.514 0.905
Liu et al.28 ANN Estimation of carbonation depth of concrete containing recycled aggregates 0.908 2.53 0.50
GPR 0.931 2.49 0.41
RF 0.923 2.38 0.41
PSOANN 0.915 2.46 0.41
WOANN 0.938 2.28 0.31
BASANN 0.931 2.405 0.422
Biswas et al.29 SVR Estimation of carbonation depths of concretes containing fly ash > 0.9500 1.07 0.643
Huo et al.30 RF Estimation of carbonation depth of concretes 0.96 3.73 2.14
SVR 0.959 3.65 2.03
ANN 0.962 3.62 2.58
HEM-IV 0.975 2.98 1.79
HEM-ANN 0.968 3.269 1.811
Modhaddas31 ABCEP Modelling carbonation depths of recycled aggregate concrete 3.33
Ehsani et al.32 ANN Estimation of carbonation depths of concretes 0.9899 1.11 1.24 0.65
RF 0.9772 1.67 2.81 0.84
SVM 0.9883 1.20 1.44 0.73
DT 0.9081 3.3635 11.3 3.364
Majlesi et al.41 MLR Predicting the long-term carbonation depth of concrete subjected to ordinary environments 0.71 3.94
DT 0.79 3.53
ANN 0.95 3.11
Hajıbabaee et al.13 XGB Carbonation depth assessment of recycled aggregate concrete 0.94 1.93 15.12
GB 0.95 1.71 17.71
RF 0.92 2.18 23.46
PRR 0.88 2.72 24.08
SVR 0.59 4.93 46.75
KNN 0.61 4.81 51.26
Radovic et al.46 LR Calculation of carbonation depth of concretes containing limestone dust 0.92

The adoption of Dynamic Weighted Correlation (DWC) over conventional correlation techniques represents a significant methodological advancement. When compared to standard correlation-based feature selection methods (e.g., Pearson, Spearman, Kendall’s Tau), the DWC approach yielded substantial improvements in error metrics. For instance, our RF model achieved an MAE of 5.841, representing an 18–34% reduction in error compared to both our baseline RF implementation (MAE = 7.222) and those reported in previous studies (MAE range: 2.21–9.96). Furthermore, the CFG-DWC framework demonstrates exceptional robustness, with a relatively narrow RMSE range (8.278–10.52), in contrast to the broader error variability reported by Huo et al.30 (1.36–3.73). Most notably, our methodology enables standard machine learning models to achieve or exceed the performance of specialized or computationally intensive hybrid models. For example, despite the high R² (0.9946) achieved by Chen et al.’s26 SVM-ANN-GA, our models approach this level of accuracy using significantly simpler and more interpretable techniques.

Three practical insights emerge from this work. First, the results challenge the dominant trend of escalating model complexity, such as the HEM-IV model in Huo et al.30, by showing that advanced feature engineering can achieve comparable or superior performance with standard algorithms. Second, our CFG-DWC framework demonstrates strong generalizability. Unlike studies that focus on niche concrete compositions such as recycled aggregates27 our approach maintains robust performance across a broader range of natural environment concretes, suggesting its applicability in real-world scenarios. Third, the consistent error distribution across our models, with an MAE/RMSE ratio of approximately 0.7, contrasts with the more erratic patterns observed in prior studies (e.g., 0.17–0.5 in Liu et al.28 ). This predictability is especially valuable in engineering contexts where reliability and repeatability of predictions are paramount.

Consequently, our study highlights broader gaps in the existing literature. These include the lack of standardized benchmarking protocols for carbonation depth prediction, limited integration of physical-mechanistic models with machine learning (as partially explored by Marani et al.33), and insufficient examination of feature interaction effects, an area explicitly addressed by our CFG framework.

Evaluation of DWC correlation on prediction accuracy

In general, all three machine learning algorithms achieved relatively good prediction accuracy during the test phase. In civil engineering, especially in structural engineering, an R² value higher than 0.8 is considered acceptable performance54,55. The XGBoost algorithm showed higher prediction accuracy and generalisation performance than the RF and LR algorithms, and therefore performs well in predicting the carbonation depth of concrete. Table 4 shows the performance metrics of the algorithms before and after applying the DWC correlation method. The XGBoost algorithm has an R² value of 0.795 before applying the DWC correlation, and 0.86 afterwards. This exemplifies the positive effect of updating the dataset with correlation metrics from the DWC method. Figure 5 also shows that applying DWC correlation produces significant changes in the estimated values; notably, the values estimated with DWC correlation are closer to the actual values. A similar pattern is observed with the other algorithms. In general, the DWC correlation method had a beneficial influence on the predicted values. The lowest MAE, MSE and RMSE values were obtained by the XGBoost algorithm, demonstrating the robustness of the framework.

The accuracy of the correlation applications and ML algorithms used in the study has been evaluated against previous studies listed in Table 5, which mainly concern the prediction of carbonation depth and important concrete properties such as compressive strength. Summarising the studies in Table 5, it is noteworthy that tree-based algorithm models such as XGBoost, RF and DT show higher prediction performance than other models13,42,54. Moreover, tree-based algorithms are more efficient on small datasets56,57. There is a very limited number of studies on forecasting the carbonation depth of concretes subjected to natural environmental conditions: previous studies have focused on the estimation and determination of carbonation depths of samples produced under ideal laboratory conditions43,46 and on datasets collected from earlier studies in the literature13,27–32,41. In the present study, by contrast, the parameters affecting the carbonation of concretes exposed to natural environmental conditions were identified, their relationships with carbonation depth were determined, and carbonation depths were estimated. Various correlation methods and machine learning algorithms were used together to demonstrate an effective approach for predicting the carbonation depth of concrete; widely used methods such as Spearman, Pearson, and Kendall’s Tau were included to highlight the advantages of the originally developed DWC correlation method, which offers new and unique benefits over the others. The results show that applying correlation methods to the dataset before building predictive models with machine learning algorithms provides significant advantages.

Conclusion

In this study, we used Spearman, Pearson, Kendall’s Tau and the proposed DWC correlation methods with distinct ML frameworks, including LR, RF and XGBoost, to estimate the carbonation depth of concrete exposed to natural conditions. In addition, the CFG method, a new technique, was used to improve performance metrics by using correlation methods and ML algorithms together: the CFG approach reweights the dataset with the coefficients obtained from the correlation methods. Statistical error metrics, namely R², MSE, RMSE and MAE, were used to measure the predictive effectiveness of the proposed frameworks. The essential outcomes of this extensive research are summarised below.

  • The parameter with the highest correlation coefficient across all correlation methods is the age of the concrete, making it the variable with the strongest effect on the models and underlining the importance of concrete age for carbonation depth.

  • The XGBoost model outperformed the other models both before and after applying the CFG method, demonstrating its superiority. The effectiveness of the ML frameworks was evaluated using several statistical performance metrics; the XGBoost algorithm demonstrated the best prediction and generalisation performance, with an R² value of 0.86 for testing and 0.94 for training.

  • The DWC correlation method yielded lower RMSE, MSE and MAE on the weighted datasets compared with the other methods.

  • After applying the DWC correlation method, the prediction accuracy of the frameworks increased and the predictions moved closer to the actual values.

The proposed approach demonstrated promising robustness for concretes exposed to natural environmental conditions within the scope of the dataset used in this study. Future work will involve external dataset validation to confirm the model’s generalisation capability across different natural environment concretes. The originally developed DWC correlation method, together with the CFG method that allows correlation methods and computational models to work together, contributed to faster, simpler and more accurate prediction of carbonation depths. In future studies, the team can focus on improving the generalisation and prediction performance of the models by expanding the dataset, including environmental conditions in the dataset, and applying different optimisation methods. To further improve forecasting capability and broaden the scope of application, carbonation depths of concretes tested under ideal laboratory conditions can also be included in the dataset, providing more comprehensive results for engineering applications.

Author contributions

Yildiran YILMAZ: Conceptualized the study, designed the CFG-DWC framework, and developed the Dynamic Weighted Correlation (DWC) method. Conducted experiments, analyzed results, and wrote the Methodology and Experiments sections. Talip ÇAKMAK: Contributed to the literature review, background research, and critical analysis of existing methods. Wrote the Discussion section and assisted in refining the manuscript. Provided insights into the practical implications of the study. İlker USTABAŞ: Managed dataset collection and validation and described the dataset characteristics.

Funding

This work was supported by Recep Tayyip Erdogan University Development Foundation under grant number (02025006016546).

Data availability

The datasets generated and analysed during this study are available from the corresponding author upon reasonable request.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1.Zhang, K., Zhang, K., Bao, R. & Liu, X. A framework for predicting the carbonation depth of concrete incorporating fly Ash based on a least squares support vector machine and metaheuristic algorithms. J. Building Eng.65, 105772. 10.1016/j.jobe.2022.105772 (2023). [Google Scholar]
  • 2.Ustabas, I. et al. Mechanical and radiation Attenuation properties of conventional and heavy concrete with diverse aggregate and water/cement ratios. GRAĐEVINAR74 (8), 635–645. 10.14256/JCE.3382.2021 (2022). [Google Scholar]
  • 3.Cakmak, T. & Ustabas, I. Investigating experimentally the potency of divergent sodium hydroxide and sodium silicate molar proportions on silica fume and obsidian-based geopolymer mortars. Struct. Concrete. 26 (2), 1962–1987. 10.1002/suco.202500055 (2025). [Google Scholar]
  • 4.Aslani, F. & Dehestani, M. Probabilistic impacts of corrosion on structural failure and performance limits of reinforced concrete beams. Constr. Build. Mater.265, 120316. 10.1016/j.conbuildmat.2020.120316 (2020). [Google Scholar]
  • 5.Hussain, S., Bhunia, D. & Singh, S. B. Comparative study of accelerated carbonation of plain cement and fly-ash concrete. J. Building Eng.10, 26–31. 10.1016/j.jobe.2017.02.001 (2017). [Google Scholar]
  • 6.Angulo Ramirez, D. E., Meira, G. R., Quattrone, M. & John, V. M. A review on reinforcement corrosion propagation in carbonated concrete – Influence of material and environmental characteristics. Cem. Concr. Compos.140, 105085. 10.1016/j.cemconcomp.2023.105085 (2023). [Google Scholar]
  • 7.Pu, Y. et al. Accelerated carbonation technology for enhanced treatment of recycled concrete aggregates: A state-of-the-art review. Constr. Build. Mater.282, 122671. 10.1016/j.conbuildmat.2021.122671 (2021). [Google Scholar]
  • 8.Demis, S. & Papadakis, V. G. Durability design process of reinforced concrete structures - Service life estimation, problems and perspectives. J. Building Eng.26, 100876. 10.1016/j.jobe.2019.100876 (2019). [Google Scholar]
  • 9.Yılmaz, Y., Çakmak, T., Kurt, Z., & Ustabaş, İ.. A novel correlation study using Pearson and Spearman algorithms for mineral component-driven strength analysis of geopolymer. Pamukkale University Journal of Engineering Sciences, 31(3), 409–416. 10.5505/pajes.2024.48682 (2025). [Google Scholar]
  • 10.Huang, D. et al. The mechanical properties and microstructure of early frozen concrete. Constr. Build. Mater.493, 143127. 10.1016/j.conbuildmat.2025.143127 (2025). [Google Scholar]
  • 11.Fu, Y. et al. Experimental study on the influence of recycled aggregate substitution rate on the mechanical properties of recycled aggregate concrete under sulfate erosion conditions. Sci. Rep.15, 29405. 10.1038/s41598-025-14262-y (2025). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Tambara, L. U. D., Hirsch, A., Dehn, F. & Gluth, G. J. G. Carbonation resistance of alkali activated GGBFS/calcined clay concrete under natural and accelerated conditions. Constr. Build. Mater.449, 138351. 10.1016/j.conbuildmat.2024.138351 (2024). [Google Scholar]
  • 13.Hajibabaee, P., Behnood, A., Ngo, T. & Golafshani, E. M. Carbonation depth assessment of recycled aggregate concrete: an application of conformal prediction intervals. Expert Syst. Appl.268, 126231. 10.1016/j.eswa.2024.126231 (2025). [Google Scholar]
  • 14.Possan, E., Thomaz, W. A., Aleandri, G. A., Felix, E. F. & dos Santos, A. C. P. CO2 uptake potential due to concrete carbonation: A case study. Case Stud. Constr. Mater.6, 147–161. 10.1016/j.cscm.2017.01.007 (2017). [Google Scholar]
  • 15.Liang, C. et al. Effects of early-age carbonation curing on the properties of cement-based materials: A review. J. Building Eng.84, 108495. 10.1016/j.jobe.2024.108495 (2024). [Google Scholar]
  • 16.Xu, Z. et al. Effects of temperature, humidity and CO2 concentration on carbonation of cement-based materials: A review. Constr. Build. Mater.346, 128399. 10.1016/j.conbuildmat.2022.128399 (2022). [Google Scholar]
  • 17.Wang, M. et al. Performance comparison of several explainable hybrid ensemble models for predicting carbonation depth in fly Ash concrete. J. Building Eng.98, 111246. 10.1016/j.jobe.2024.111246 (2024). [Google Scholar]
  • 18.Rabehi, M., Mezghiche, B. & Guettala, S. Correlation between initial absorption of the cover concrete, the compressive strength and carbonation depth. Constr. Build. Mater.45, 123–129. 10.1016/j.conbuildmat.2013.03.074 (2013). [Google Scholar]
  • 19.Hoang, N. D. et al. Geospatial urban heat mapping with interpretable machine learning and deep learning: a case study in Hue City, Vietnam. Earth Sci. Inf.18, 64. 10.1007/s12145-024-01582-2 (2025). [Google Scholar]
  • 20.Pham, P. & Hoang, N. Metaheuristic optimization of extreme gradient boosting machine for enhanced prediction of lateral strength of reinforced concrete columns under Cyclic loadings. Results Eng.24, 103125. 10.1016/j.rineng.2024.103125 (2024). [Google Scholar]
  • 21.Kellouche, Y., Tayeh, B. A., Chetbani, Y., Zeyad, A. M. & Mostafa, S. A. Comparative study of different machine learning approaches for predicting the compressive strength of palm fuel Ash concrete. J. Building Eng.88, 109187. 10.1016/j.jobe.2024.109187 (2024). [Google Scholar]
  • 22.Pal, A., Ahmed, K. S. & Mangalathu, S. Data-driven machine learning approaches for predicting slump of fiber-reinforced concrete containing waste rubber and recycled aggregate. Constr. Build. Mater.417, 135369. 10.1016/j.conbuildmat.2024.135369 (2024). [Google Scholar]
  • 23.Uddin, M. et al. Predicting the mechanical performance of industrial waste incorporated sustainable concrete using hybrid machine learning modeling and parametric analyses. Sci. Rep.15, 26330. 10.1038/s41598-025-11601-x (2025). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Yilmaz, Y., Cakmak, T., Kurt, Z. & Ustabas, İ. Predicting mechanical properties in geopolymer Mortars, including novel precursor Combinations, through XGBoost method. Arab. J. Sci. Eng.50, 2009–2033. 10.1007/s13369-024-09179-z (2025). [Google Scholar]
  • 25.Fathy, I. N. et al. Predicting the compressive strength of concrete incorporating waste powders exposed to elevated temperatures utilizing machine learning. Sci. Rep.15, 25275. 10.1038/s41598-025-11239-9 (2025). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Chen, Z., Lin, J., Sagoe-Crentsil, K. & Duan, W. Development of hybrid machine learning-based carbonation models with weighting function. Constr. Build. Mater.321, 126359. 10.1016/j.conbuildmat.2022.126359 (2022). [Google Scholar]
  • 27.Nunez, I. & Nehdi, M. L. Machine learning prediction of carbonation depth in recycled aggregate concrete incorporating SCMs. Constr. Build. Mater.287, 123027. 10.1016/j.conbuildmat.2021.123027 (2021). [Google Scholar]
  • 28.Liu, K., Alam, M. S., Zhu, J., Zheng, J. & Chi, L. Prediction of carbonation depth for recycled aggregate concrete using ANN hybridized with swarm intelligence algorithms. Constr. Build. Mater.301, 124382. 10.1016/j.conbuildmat.2021.124382 (2021). [Google Scholar]
  • 29.Biswas, R. et al. Development of hybrid models using metaheuristic optimization techniques to predict the carbonation depth of fly Ash concrete. Constr. Build. Mater.346, 128483. 10.1016/j.conbuildmat.2022.128483 (2022). [Google Scholar]
  • 30.Huo, Z., Wang, L. & Huang, Y. Predicting carbonation depth of concrete using a hybrid ensemble model. J. Building Eng.76, 107320. 10.1016/j.jobe.2023.107320 (2023). [Google Scholar]
  • 31.Moghaddas, S. A., Nekoei, M., Golafshani, E. M., Nehdi, M. & Arashpour, M. Modeling carbonation depth of recycled aggregate concrete using novel automatic regression technique. J. Clean. Prod.371, 133522. 10.1016/j.jclepro.2022.133522 (2022). [Google Scholar]
  • 32.Ehsani, M. et al. Machine learning for predicting concrete carbonation depth: A comparative analysis and a novel feature selection. Constr. Build. Mater.417, 135331. 10.1016/j.conbuildmat.2024.135331 (2024). [Google Scholar]
  • 33.Marani, A., Oyinkanola, T. & Panesar, D. K. Probabilistic deep learning prediction of natural carbonation of low-carbon concrete incorporating SCMs. Cem. Concr. Compos.152, 105635. 10.1016/j.cemconcomp.2024.105635 (2024). [Google Scholar]
  • 34.Anwar, M. K. et al. Structural performance of GFRP bars based high-strength RC columns: An application of advanced decision-making mechanism for experimental profile data. Buildings12 (5), 611. 10.3390/buildings12050611 (2022).
  • 35.Anwar, M. K., Qurashi, M. A., Zhu, X., Shah, S. A. R. & Siddiq, M. U. A comparative performance analysis of machine learning models for compressive strength prediction in fly ash-based geopolymers concrete using reference data. Case Stud. Constr. Mater.22, e04207. 10.1016/j.cscm.2025.e04207 (2025). [Google Scholar]
  • 36.Bypour, M., Yekrangnia, M. & Kioumarsi, M. Machine learning-driven optimization for predicting compressive strength in fly Ash geopolymer concrete. Clean. Eng. Technol.25, 100899. 10.1016/j.clet.2025.100899 (2025). [Google Scholar]
  • 37.Sinkhonde, D., Bezabih, T., Mirindi, D., Mashava, D. & Mirindi, F. Ensemble machine learning algorithms for efficient prediction of compressive strength of concrete containing tyre rubber and brick powder. Clean. Waste Syst.10, 100236. 10.1016/j.clwas.2025.100236 (2025). [Google Scholar]
  • 38.Sun, Y. et al. Prediction & optimization of alkali-activated concrete based on the random forest machine learning algorithm. Constr. Build. Mater.385, 131519. 10.1016/j.conbuildmat.2023.131519 (2023). [Google Scholar]
  • 39.Yuan, Z., Zheng, W. & Qiao, H. Machine learning based optimization for mix design of manufactured sand concrete. Constr. Build. Mater.467, 140256. 10.1016/j.conbuildmat.2025.140256 (2025). [Google Scholar]
  • 40.Meddage, D. P. P., Fonseka, I., Mohotti, D., Wijesooriya, K. & Lee, C. K. An explainable machine learning approach to predict the compressive strength of graphene oxide-based concrete. Constr. Build. Mater.449, 138346. 10.1016/j.conbuildmat.2024.138346 (2024). [Google Scholar]
  • 41.Majlesi, A. et al. Rincon Troconis, artificial neural network model to estimate the long-term carbonation depth of concrete exposed to natural environments. J. Building Eng.74, 106545. 10.1016/j.jobe.2023.106545 (2023). [Google Scholar]
  • 42.Heidari, S. I. G., Safehian, M., Moodi, F. & Shadroo, S. Predictive modeling of the long-term effects of combined chemical admixtures on concrete compressive strength using machine learning algorithms. Case Stud. Chem. Environ. Eng.10, 101008. 10.1016/j.cscee.2024.101008 (2024). [Google Scholar]
  • 43.Bankir, M. B. Statistical investigation of the effects of w/c, cement dosage and fibers on bond strength and carbonation coefficient of hybrid fiber concretes. KSCE J. Civ. Eng.27 (11), 4812–4822. 10.1007/s12205-023-2025-5 (2023). [Google Scholar]
  • 44.Yan, J., Su, J., Xu, J., Lin, L. & Yu, Y. Ensemble machine learning models for compressive strength and elastic modulus of recycled brick aggregate concrete. Mater. Today Commun.41, 110635. 10.1016/j.mtcomm.2024.110635 (2024). [Google Scholar]
  • 45.Wang, J. et al. Prediction and interpretation of concrete corrosion induced by carbon dioxide using machine learning. Corros. Sci.233, 112100. 10.1016/j.corsci.2024.112100 (2024). [Google Scholar]
  • 46.Radović, A., Carević, V., Marinković, S., Plavšić, J. & Tešić, K. Prediction model for calculation of the limestone powder concrete carbonation depth. J. Building Eng.86, 108776. 10.1016/j.jobe.2024.108776 (2024). [Google Scholar]
  • 47.Fib, N. Lausanne. 66 (2012).
  • 48.Dufera, A. G., Liu, T. & Xu, J. Regression models of the pearson correlation coefficient. Stat. Theory Relat. Fields. 7 (2), 97–106. 10.1080/24754269.2023.2164970 (2023). [Google Scholar]
  • 49.Al-Hameed, A. A. K. Spearman’s correlation coefficient in statistical analysis. Int. J. Nonlinear Anal. Appl.13 (1), 3249–3255. 10.22075/ijnaa.2022.6079 (2022). [Google Scholar]
  • 50.Shieh, G. S. A weighted kendall’s Tau statistic. Stat. Probab. Lett.39 (1), 17–24. 10.1016/S0167-7152(98)00006-6 (1998). [Google Scholar]
  • 51.Seber, G. A. F. & Lee, A. J. Linear Regression Analysis (Wiley, 2012).
  • 52.Breiman, L. Random forests. Mach. Learn.45 (1), 5–32. 10.1023/A:1010933404324 (2001). [Google Scholar]
  • 53.Chen, T. & Guestrin, C. XGBoost: A scalable tree boosting system. Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discovery Data Min. (KDD ‘16) 785–794 10.1145/2939672.2939785 (2016).
  • 54.Meddage, D. P. P. et al. Explainable machine learning (XML) to predict external wind pressure of a low-rise Building in urban-like settings. J. Wind Eng. Ind. Aerodyn.226, 105027. 10.1016/j.jweia.2022.105027 (2022). [Google Scholar]
  • 55.Weerasuriya, A. U., Zhang, X., Lu, B., Tse, K. T. & Liu, C. H. A Gaussian Process-Based emulator for modeling pedestrian-level wind field. Build. Environ.188, 107500. 10.1016/j.buildenv.2020.107500 (2021). [Google Scholar]
  • 56.Fawad, M. et al. Indirect prediction of graphene nanoplatelets-reinforced cementitious composites compressive strength by using machine learning approaches. Sci. Rep.14, 14252. 10.1038/s41598-024-64204-3 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Nazar, S. et al. Development of the new prediction models for the compressive strength of nanomodified concrete using novel machine learning techniques. Buildings12 (12), 2160. 10.3390/buildings12122160 (2022). [Google Scholar]
