Computational Intelligence and Neuroscience. 2022 Aug 1;2022:9541115. doi: 10.1155/2022/9541115

Jellyfish Search-Optimized Deep Learning for Compressive Strength Prediction in Images of Ready-Mixed Concrete

Jui-Sheng Chou 1, Stela Tjandrakusuma 1, Chi-Yun Liu 1
PMCID: PMC9359848  PMID: 35958762

Abstract

Most building structures that are built today are made of concrete, owing to its various favorable properties. Compressive strength is one of the mechanical properties of concrete that is directly related to the safety of the structures. Therefore, predicting the compressive strength can facilitate the early planning of material quality management. A series of deep learning (DL) models that suit computer vision tasks, namely convolutional neural networks (CNNs), are used to predict the compressive strength of ready-mixed concrete. To demonstrate the efficacy of computer vision-based prediction, which uses numerical data converted to images, its effectiveness was compared with that of the deep neural network (DNN) technique, which uses conventional numerical data. Various DL prediction models were compared and the best ones were identified with the relevant concrete datasets. The best DL models were then optimized by fine-tuning their hyperparameters using a newly developed bio-inspired metaheuristic algorithm, called the jellyfish search optimizer, to enhance accuracy and reliability. Analytical experiments indicate that the computer vision-based CNNs outperform the numerical data-based DNNs in all evaluation metrics except training time. Thus, the bio-inspired optimization of computer vision-based convolutional neural networks is a potentially promising approach to predicting the compressive strength of ready-mixed concrete.

1. Introduction

Structures like buildings, bridges, highways, and dams are currently built using concrete as their construction material, owing to its numerous advantages, such as strength, durability, and versatility. Its compression capacity, adaptability, and resistance to climate-induced erosion and corrosion make concrete one of the best construction materials. Compressive strength is one of the principal mechanical properties of concrete that is directly related to the safety of the structures that are built from it. The compressive strength of concrete must comply with relevant standard codes, which vary among countries.

To determine the compressive strength of concrete, a cubic or cylindrical sample is typically tested using a compressive testing machine after the required curing time. These tests are labor-intensive and time-consuming. Methods such as regression and numerical simulation have been proposed to solve this problem and to predict the compressive strength of concrete. However, the complex nonlinear correlation among the relevant variables makes obtaining accurate values of compressive strength difficult.

With the advances of artificial intelligence (AI) and increases in computing power [1, 2], deep learning (DL) is being applied in an increasing number of fields. DL, which is a form of AI, has been shown to make more accurate predictions than conventional methods in many situations. One major application of DL, computer vision, extracts information from visual media, such as images and videos. Used in various fields, computer vision-based techniques are effective for image classification, object detection, and semantic segmentation.

Several studies of the prediction of concrete compressive strength have involved the use of DL techniques [3] to improve model performance, but few have involved image recognition. The latest study of the use of image recognition to determine compressive strength had good results, but it used only 74 sets of concrete data and a single-layer convolutional neural network [4]. To examine and improve the effectiveness of image recognition, in this study, a large dataset of ready-mixed concrete is used with convolutional neural networks (CNNs) that involve a prediction model with deep layers to extract high-level features from inputs.

Model accuracy is often evaluated with the use of cross-validation or a random split method to partition the source data for testing the trained model [5]. Such methods are often called into question because overfitting occurs owing to information leakage within the original dataset during the training process; consequently, when the model is put into practice, it often exhibits relatively poor forecast performance. Because concrete data are accumulated over time in ready-mixed plants, the built model should be tested on the most recently collected dataset to reflect its realistic prediction accuracy in future use.

In this investigation, the effectiveness of a computer vision-based approach that converts numerical data to images for predicting the compressive strength of ready-mixed concrete is tested. Most research predicts the compressive strength of concrete from numerical data as inputs, so the results obtained using the computer vision-based techniques are compared with those obtained using numerical data. With this logic, a collection of numerical values represented as images is the input to a DL technique that uses CNN-based models, which have been shown to provide accurate image classification in the domain of computer vision.

The effectiveness of the computer vision-based technique was tested by comparing its results with those of another DL technique that uses deep neural networks (DNNs) with numerical data for model construction. To maximize the accuracy, a metaheuristic optimization algorithm was used to fine-tune the hyperparameters of the best DL models. Instead of using cross-validation or a random split within the original dataset, a dataset newly collected in the following year was used to test the trained model. This approach meets the practical needs and operations in estimating compressive strength at ready-mixed concrete plants.

This paper is organized as follows. Section 2 reviews the relevant literature. Section 3 describes the methodology and performance metrics that are used herein. Section 4 presents the collection and preprocessing of data, implementation of the DL models, the experimental results obtained using the optimized DL models, and sensitivity analysis of modeling performance. The final section summarizes the findings and limitations of the method and makes recommendations for future studies.

2. Literature Review

2.1. Conventional Compressive Strength Prediction of Ready-Mixed Concrete

Ready-mixed concrete is typically manufactured in a concrete plant before being transported to a construction site. In a concrete plant, ready-mixed concrete is manufactured by combining several raw materials with a specific design mix ratio to create concrete with certain desirable properties. Figure 1 presents the manufacturing process of ready-mixed concrete.

Figure 1. Ready-mixed concrete manufacturing process.

The compressive strength of concrete is commonly tested using a compression test machine, which performs a mechanical test to measure the maximum compressive load that can be borne by a concrete sample [6]. Before testing, the sample must be cured for a specified curing period. Non-destructive tests (such as ultrasonic or pulse velocity tests [7] and conductivity tests [8]) have also been proposed to determine the compressive strength, because the standard compression test value may not correlate well with the real strength of the concrete in a structure. These tests, however, have disadvantages with respect to time, cost, and labor.

Owing to the disadvantages of mechanical tests, empirical models [9, 10] for calculating the compressive strength of concrete have been developed. Empirical methods (e.g., multiple linear regression), however, have been shown to be somewhat ineffective for calculating the compressive strength of concrete because of the nonlinear relationships among the relevant concrete variables. The compressive strength of concrete is influenced by numerous factors, as it is formed by complex reactions among concrete materials (such as cement and aggregates) and the environment (as in curing) [11].

2.2. Deep Learning to Determine Concrete Compressive Strength

In recent years, the field of artificial intelligence (AI) has grown very rapidly. AI methods are used in a wide variety of fields, including seismology [12], energy systems [13], and civil engineering [14]. In several studies, AI has been used to determine the concrete compressive strength, using real data for concrete to build a prediction model. During the training of the prediction model, various composite materials of concrete, such as cement, water, sand, and gravel, are used as predictors to yield a model that best fits the given training data. After validation, the model is then used to predict the compressive strength.

An advanced branch of AI, deep learning (DL), has performed excellently in fields such as computer vision [15]. Many studies [16–18] have shown that DL exhibits outstanding prediction performance, especially in image and video recognition. In this field, the commonly used DL techniques include those based on convolutional neural networks (CNNs) [19]. A recent study confirmed that a CNN model (Visual Geometry Group, VGG) achieved 98% accuracy in concrete compressive strength prediction, which was 2% and 12% higher than that of the machine learning models random forest (RF) and support vector regression (SVR), respectively [20].

2.3. Hyperparameter Optimization with Metaheuristic Algorithm

In the training of DL models, additional optimizers are often required, as the models have several hyperparameters (such as the epsilon of batch normalization, batch size, epoch, learning rate, and dropout rate) that influence their predictive performance [21, 22]. To find the hyperparameter values that yield the best prediction model, optimization algorithms (such as the greedy algorithm [23]) that are based on iterative methods (such as gradient descent [24]) or heuristic methods [25] are often used. However, such methods may not always lead to the optimal solution and may consume significantly more computational time than modern metaheuristic algorithms.

Metaheuristic algorithms, with their ease of implementation and effectiveness in various fields, are becoming increasingly popular for solving optimization problems. Recently, several newly developed metaheuristic optimizers have outperformed the well-known metaheuristic algorithms [26, 27]. The Jellyfish Search (JS) algorithm [27], in particular, has great efficacy because it requires little tuning of algorithm-specific parameters. Consequently, the JS algorithm was used in this study to optimize the DL models.

3. Methodology

3.1. Deep Learning and Computer Vision-Based Techniques

3.1.1. Deep Neural Networks

Artificial neural networks (ANNs) consist of information processing units that are arranged in layers similar to neurons in the human brain. An ANN typically comprises layers of three types: an input layer, hidden layers, and an output layer. The architecture that is used in a deep learning model typically consists of more than four hidden layers. Figure 2 displays a simple ANN model architecture.

Figure 2. Simple ANN model architecture.

An input layer receives data and an output layer generates a prediction. In the hidden layers, inputs are processed and the information that is obtained from the processes is passed to the next layer. Values from the input layer are transformed by multiplying them by weights and adding bias values.

ANNs vary in implementation. A fully connected neural network is an ANN in which all neurons in a layer are connected to the neurons in the next layer. Likewise, standard feedforward neural networks (FNNs) consist of numerous connected neurons, and each connection transmits information to other neurons in one forward direction [28].

Notably, internal hyperparameters affect the learning of an ANN model. A hyperparameter is a constant parameter that is set before the training begins. Some examples of hyperparameters in ANNs are the number of hidden layers, learning rate, batch size, and epoch. In contrast, parameters such as weights and bias values change throughout the learning process.

A deep neural network (DNN) is a neural network that differs from a typical ANN with respect to architecture. DNNs have multiple hidden layers (Figure 3) that are used to extract high-level features from the input data. Additional layers typically correspond to additional parameters (such as weights and biases) in a model. Accordingly, DNNs can capture complex nonlinear relationships [28].

Figure 3. Deep neural network (DNN) model architecture.

3.1.2. Convolutional Neural Network-Based Models

A convolutional neural network (CNN or ConvNet) is a class of neural network that is generally effective for solving computer vision problems, such as image feature extraction, classification, object detection, and semantic segmentation. A CNN commonly learns patterns by processing image or video data. It can detect objects, identify their locations, and differentiate or segment them inside an image.

A generic CNN usually comprises an input layer; multiple hidden layers, which include convolutional layers, pooling layers, fully connected layers, and dropout layers; and an output layer (Figure 4). In the input layer, the model receives images as inputs and creates input tensors that contain the pixel values of the images. Input matrices of dimensions w × h × c are then fed to the hidden layer, where w represents the width of the image, h represents the height of the image, and c represents the number of channels. A standard colored image typically has three channels for red, green, and blue.

Figure 4. Generic convolutional neural network (CNN) model architecture.

The convolutional layer in the CNN model processes the previous matrices into smaller forms without losing important features by generating the weight values of a filter or kernel of a certain size (n × n) and then multiplying the filter by the (m × m) input matrices. The convolution operation is defined as follows [29]:

C = I ⊗ F. (1)

Here, I is the input image data; F is the filter; ⊗ denotes the convolution operation; and C is the convolution map of size (o × o), in which o = (m − n + 2zp)/s + 1, where s is the stride, denoting the number of pixels by which F slides over I, and zp is the zero padding. Usually, a bounding of zeros must be added around I to preserve complete image information. The values thus obtained are summed (Figure 5). Sliding over all parts of the input matrices, the convolutional layer generates, as an output, a new feature map of certain features in the image.
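The following is a minimal NumPy sketch of equation (1) for a square m × m input and an n × n filter; the toy image, the averaging filter, and the function name are illustrative assumptions rather than code from the study.

import numpy as np

def conv2d(I: np.ndarray, F: np.ndarray, s: int = 1, zp: int = 0) -> np.ndarray:
    m, n = I.shape[0], F.shape[0]
    o = (m - n + 2 * zp) // s + 1          # output size, as in the text
    Ipad = np.pad(I, zp)                   # bounding of zeros around I
    C = np.zeros((o, o))
    for i in range(o):
        for j in range(o):
            patch = Ipad[i*s:i*s+n, j*s:j*s+n]
            C[i, j] = np.sum(patch * F)    # element-wise multiply and sum
    return C

I = np.arange(25, dtype=float).reshape(5, 5)   # toy 5 x 5 "image"
F = np.ones((3, 3)) / 9.0                      # 3 x 3 averaging filter
print(conv2d(I, F, s=1, zp=1).shape)           # (5, 5): (5 - 3 + 2)/1 + 1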

Figure 5. Convolutional layer multiplication process and plot of ReLU.

After the multiplication processes, a CNN model typically applies an activation function that introduces nonlinearity to the model to help it learn complex patterns in the data. A general form of activation function is defined as follows:

Cm = f(C). (2)

Cm is the convolution map after the nonlinear activation function f is applied. Of the many available activation functions, the rectified linear unit (ReLU) is commonly used, as it provides better training results than other activation functions [30]. A ReLU function is a simple calculation that returns the input value if it is positive and zero otherwise (Figure 5).

The pooling layer in the model reduces the size of the feature maps, thereby reducing the number of parameters and the amount of computation in the network and helping to prevent overfitting. Similar to a convolutional layer, a pooling layer takes several input values inside a filter from the previous layer, and the filter is shifted over some pixels at a time until all parts of the input matrix are processed. Common pooling layer types are average pooling and max pooling (Figure 6). The pooling operation, also called the downsampling operation, is expressed as follows:

Pm = Po(Cm), (3)

where Pm is the pooling map and Po is the pooling operation.
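As a sketch of equation (3), the following NumPy code applies 2 × 2 max or average pooling with stride 2; the window size, stride, and example matrix are illustrative assumptions.

import numpy as np

def pool2d(Cm: np.ndarray, k: int = 2, s: int = 2, mode: str = "max") -> np.ndarray:
    o = (Cm.shape[0] - k) // s + 1
    Pm = np.zeros((o, o))
    for i in range(o):
        for j in range(o):
            window = Cm[i*s:i*s+k, j*s:j*s+k]
            Pm[i, j] = window.max() if mode == "max" else window.mean()
    return Pm

Cm = np.array([[1., 3., 2., 4.],
               [5., 6., 1., 2.],
               [7., 2., 9., 0.],
               [4., 8., 3., 5.]])
print(pool2d(Cm, mode="max"))   # [[6., 4.], [8., 9.]]
print(pool2d(Cm, mode="avg"))   # [[3.75, 2.25], [5.25, 4.25]]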

Figure 6. Example of max pooling and average pooling.

After the operation of several convolutional layers and pooling layers, a CNN model typically flattens the output matrix of the previous layer into a single vector of values. The single vector of values is input to a fully connected layer to extract the features that were learned in the previous layers and to classify the input images. In this layer, the probabilities that an object in the input image is a member of the possible classes are calculated. The model output of the ith fully connected hidden layer is expressed as follows [29]:

Yi = f(Hi), (4)

where the weighted sum vector Hi is

Hi = wi Yi−1 + Bi. (5)

wi is the connection weight of the artificial neurons; f is a nonlinear activation function (e.g., sigmoid, tanh, or ReLU); and the bias value Bi defines the activation level of the artificial neurons.

In neural networks, when the parameters of a layer change, so does the distribution of inputs to subsequent layers. These shifts in input distributions can be problematic for neural networks. To alleviate this concern, many normalization operations, such as Batch Normalization (BN), Layer Normalization (LN), and Instance Normalization (IN), have been proposed. For example, given an input batch of height h and width w with n samples and c channels, x ∈ R^(n×c×h×w), BN normalizes the mean and standard deviation of each individual feature channel during training [31].

BN(x) = γ · (x − μB)/σB + β, (6)

where γ, β ∈ R^c are referred to as the scale and shift parameters for the channel, and μB, σB ∈ R^c are the mean and standard deviation, respectively, computed across the batch and spatial dimensions independently for each feature channel.
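A minimal NumPy sketch of equation (6) is given below, assuming the (n, c, h, w) layout described above; the small epsilon added to the denominator for numerical stability is a common implementation detail rather than part of the equation.

import numpy as np

def batch_norm(x: np.ndarray, gamma: np.ndarray, beta: np.ndarray,
               eps: float = 1.001e-5) -> np.ndarray:
    # x has shape (n, c, h, w); mean/std are computed per feature channel
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    sigma = x.std(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mu) / (sigma + eps)
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)

x = np.random.randn(8, 3, 4, 4)               # batch of 8, 3 channels, 4 x 4
y = batch_norm(x, gamma=np.ones(3), beta=np.zeros(3))
print(y.mean(axis=(0, 2, 3)))                  # approximately zero per channel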

Adding a dropout layer is an effective regularization technique for improving the generalization capability and mitigating the overfitting of models. The dropout function can be formulated as follows [32]:

f̃l(xi) = fl(xi) ∗ mil, (7)

where ∗ denotes the element-wise product, and fl(xi) and f̃l(xi) are the original and distorted features, respectively. In addition, mil ∈ {0,1}^dl is the binary mask applied to the feature map fl(xi), in which dl is the dimension of the feature map of the l-th layer; each element in mil is drawn from a Bernoulli distribution and set to zero with the dropping probability. Implementing dropout on the features in the training phase forces the network to pay more attention to the non-zero regions, partially alleviating overfitting.
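The sketch below implements equation (7) with NumPy; the 1/(1 − p) rescaling is the common inverted-dropout variant, an assumption added here so that inference needs no extra scaling, and is not part of equation (7).

import numpy as np

def dropout(f: np.ndarray, p: float = 0.5) -> np.ndarray:
    # Each mask element is Bernoulli: 1 with probability 1 - p, 0 otherwise
    mask = (np.random.rand(*f.shape) >= p).astype(f.dtype)
    return f * mask / (1.0 - p)   # inverted dropout keeps the expected value

features = np.random.randn(4, 8)
print(dropout(features, p=0.25))  # roughly a quarter of entries zeroed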

In this decade, various CNN models and their advanced variants have been developed. Some common and popular CNN models are VGG [33], residual neural networks (ResNets) [34, 35], Inception [36, 37], extreme inception (Xception) [38], MobileNet [39, 40], DenseNet [41], NASNet [42], and EfficientNet [43]. These CNN-oriented models have different architectures, which are briefly introduced as follows:

(1) VGG. VGG [33] uses a very small kernel (3 × 3) rather than one of a previously common size, 5 × 5 or 7 × 7, which would have a wider scanning area. The small kernel is used uniformly throughout all layers. Although the overall architecture is simple, the VGG has an enormous number of parameters. Figure 7 displays the architectures of two common VGG models, VGG16 and VGG19, which comprise 16 and 19 deep layers, respectively. In the figure, the convolutional layer is denoted as “<kernel size> Conv, <filter>.”

Figure 7. VGG16 and VGG19 models' architectures.

(2) ResNet. Increasing the depth of a CNN by adding layers to its architecture up to a certain limit should help the corresponding CNN model to learn more complex features, but a vanishing gradient problem typically prevents the effective training of a CNN model in many-layered networks. A vanishing gradient problem can prevent the weights in the network from being updated, potentially stopping the training of the CNN model. To solve this problem in residual neural networks (ResNets), the network implements “residual connections” or “skip connections.”

A residual connection refers to a shortcut connection that is added inside a CNN architecture to allow information to be passed or added through layers of the convolutional block (Figure 8). In the original ResNet, a shortcut connection is added before the activation function is implemented, while in ResNet v2 [34], activation functions are implemented before the convolutional layer and the shortcut connection is added after. Figure 9 presents the architectures of ResNet50, ResNet101, and ResNet152, which comprise 50, 101, and 152 deep layers, respectively.
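As an illustrative Keras sketch of a pre-activation (ResNet v2 style) residual block, the following applies BN and ReLU before each convolution and adds the shortcut afterward; the filter count is an assumption, and the input is assumed to already have a matching number of channels.

import tensorflow as tf
from tensorflow.keras import layers

def residual_block_v2(x: tf.Tensor, filters: int = 64) -> tf.Tensor:
    # Assumes x already has `filters` channels so the shapes match for Add
    shortcut = x
    y = layers.BatchNormalization()(x)     # activation before convolution (v2)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    return layers.Add()([shortcut, y])     # the skip connection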

Figure 8. Residual connection.

Figure 9. ResNet model architecture.

(3) Inception. Inception architecture [36] is the first CNN model architecture that exhibits the advantages of branching a convolutional path into multiple paths. In Inception, the CNN model uses filters of various sizes in various paths. At the end of the block, the model concatenates the outputs of the paths. In Inception-v3 [36], the Inception model is improved by changing the original 5 × 5 and 7 × 7 convolution kernels to two 3 × 3 and three 3 × 3 convolutional kernels, respectively. These changes in the architecture help the model reduce the amount of computation that is required during the training process.

In Inception-ResNet-v1 [37] and Inception-ResNet-v2 [37], the original inception blocks are converted into residual inception blocks. The Inception-ResNet-v2 model differs from the Inception-ResNet-v1 model in that it is more computationally burdensome. However, it outperforms the original Inception and ResNet models. Figure 10 displays the Inception-v3 and Inception-ResNet-v2 models' architectures.

Figure 10. Inception-v3 and Inception-ResNet-v2 models' architectures.

(4) Xception. The Xception (or Extreme Inception) [38] architecture (Figure 11) is inspired by the Inception model. In Xception, the original inception blocks are replaced by depthwise separable convolutions. A depthwise separable convolution consists of a depthwise convolution and a 1 × 1 convolution. A depthwise convolution is a spatial convolution that performs convolutional multiplications independently over each channel. In depthwise convolution, a convolutional kernel only iterates one channel of the input, not all channels.
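The following Keras sketch contrasts an explicit depthwise-plus-pointwise pair with the equivalent built-in separable layer; the input shape and filter count are illustrative assumptions.

import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(224, 224, 3))
# Depthwise: one 3 x 3 spatial convolution per input channel
x = layers.DepthwiseConv2D(kernel_size=3, padding="same")(inputs)
# Pointwise: a 1 x 1 convolution that mixes information across channels
x = layers.Conv2D(filters=64, kernel_size=1)(x)
# The same combination expressed as a single layer:
y = layers.SeparableConv2D(filters=64, kernel_size=3, padding="same")(inputs)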

Figure 11. Xception model architecture.

(5) MobileNets. MobileNets [39] refer to a type of CNN model whose objectives are to reduce the number of parameters and the number of computations while maintaining accuracy. Accordingly, MobileNets use depthwise separable convolutions. They are typically used in mobile devices or embedded applications, and so have a small architecture. In MobileNets, width multiplier and resolution multiplier hyperparameters are implemented to thin the network and to rescale the input image, respectively.

Similar to the original MobileNet, MobileNetV2 [40] is built for mobile devices. In MobileNetV2, an inverted residual structure, which consists of linear bottleneck layers, is used. An inverted residual structure expands a low-dimensional feature map to a high-dimensional one, uses depthwise convolutions, and projects back features to a low-dimensional representation using a linear convolution. MobileNetV2 has fewer parameters than the original MobileNet. Figure 12 displays the original MobileNet and MobileNetV2 architectures.

Figure 12. MobileNet and MobileNetV2 models' architectures.

(6) DenseNet. The main intent of a dense convolutional network (DenseNet) [41] is to use short connections between layers by connecting each layer to all subsequent layers in the forward direction. Therefore, the inputs of each network layer include the feature maps of all preceding layers. This approach has been shown to improve the accuracy of a CNN. Figure 13 displays the DenseNet architecture.

Figure 13. DenseNet model architecture.

(7) NASNet. The neural architecture search network (NASNet) [42] addresses the problem of finding a good CNN architecture by searching for a neural network architecture, or the best combination of parameters in a CNN, with a recurrent neural network (RNN) acting as a controller. Figure 14 presents the neural architecture search method that is used in a NASNet model. Figure 15 displays one of the resulting architectures, NASNet-A (mobile version), which was found using the neural architecture search method.

Figure 14. Neural architecture search method.

Figure 15. NASNet-A mobile model architecture.

(8) EfficientNet. EfficientNet is a type of CNN model that uniformly scales all depth, width, and resolution dimensions using a compound scaling coefficient. A total of eight CNN models, named EfficientNetB0 through EfficientNetB7, were developed based on this idea. The EfficientNet architecture includes a total of seven network blocks (Figure 16). The number of subblocks inside varies with the EfficientNet model that is used [43].

Figure 16. EfficientNet model architecture.

3.2. Metaheuristic Optimization Algorithm: Jellyfish Search Optimizer

One of the challenges associated with deep learning models is finding optimal hyperparameters. To solve this hyperparameter optimization problem, a metaheuristic optimization algorithm is frequently used. Considerable research has been done on the development of metaheuristic algorithms, and some of them have become well known for their effectiveness in solving optimization problems [44–46]. The metaheuristic algorithms primarily vary in the balance between their two main phases: exploration and exploitation [47].

A newly developed metaheuristic optimization algorithm, the Jellyfish Search (JS) optimizer [27], has considerably outperformed many other well-known metaheuristic optimization algorithms and requires less algorithm-specific parameter tuning than them. The optimizer requires the setting of only two controlling parameters: the number of iterations and the population size. In a JS optimizer, the population of jellyfish is initialized using a logistic map, which generates varied initial populations.

The optimization algorithm is inspired by the behavior of jellyfish as they search for food in the ocean, so the objective function of the JS optimizer is the amount of food at a jellyfish's location, and the best location is the one with the most food. In a JS optimizer, the exploration phase involves the movement of jellyfish as they follow ocean currents in search of food, while the exploitation phase involves the passive and active motions of the jellyfish inside a jellyfish swarm. Figure 17 presents the six phases of jellyfish in the ocean [27]: phase 1, jellyfish in the ocean; phase 2, following the ocean current; phases 3–5, passive and active motions inside the jellyfish swarm, which are switched between according to a time control mechanism; and phase 6, reaching the jellyfish bloom.

Figure 17. Phases of the jellyfish search algorithm.

3.2.1. Movement Following Ocean Current

Ocean currents carry a large amount of food, attracting jellyfish to them, and thus jellyfish follow them. The following equations represent the direction of the ocean current (trend) and the new location of a jellyfish after it moves, Xi(t + 1) [27]:

trend = X∗ − 3 × rand(0,1) × μ,
Xi(t + 1) = Xi(t) + rand(0,1) × trend. (8)

Here, X∗ is the jellyfish at the best location, μ is the average location of all jellyfish, Xi(t) is the current location of jellyfish i at time t, and Xi(t + 1) is its updated location at time (t + 1).

3.2.2. Motions Inside Jellyfish Swarm

The motions of jellyfish in a swarm can be grouped into passive motion (type A) and active motion (type B). Passive motion signifies movement of a jellyfish around its original position, and active motion signifies its movement to another position. Initially, most jellyfish exhibit type A motion, but after some time, more jellyfish exhibit type B motion [27]. The new location of a jellyfish that exhibits type A motion is formulated as follows:

Xi(t + 1) = Xi(t) + 0.1 × rand(0,1) × (Ub − Lb), (9)

where rand(0,1) is a random number between 0 and 1, Ub is the upper bound on the search space, and Lb is the lower bound on the search space.

For type B motion, one other jellyfish, Xj, is randomly selected for use in determining the new location of the jellyfish of interest, Xi. If the amount of food at the location of Xj exceeds that at the location of Xi, then Xi will move toward Xj. Otherwise, Xi will move away from Xj. The direction of type B motion Direction and the updated jellyfish location are given by the following equations for minimization problems:

Direction = Xj(t) − Xi(t), if f(Xi) ≥ f(Xj),
Direction = Xi(t) − Xj(t), if f(Xi) < f(Xj),
Xi(t + 1) = Xi(t) + rand(0,1) × Direction, (10)

where f(Xi) and f(Xj) denote the objective functions at locations Xi and Xj, respectively.

3.2.3. Time Control Mechanism

A time control mechanism in a JS optimizer determines the type of jellyfish motion and controls the switching between the phases of the JS optimizer (following an ocean current and moving inside a jellyfish swarm). The equation below provides the time control function, c(t).

c(t) = |(1 − t/Maxiter) × (2 × rand(0,1) − 1)|. (11)

Here, t is the time specified as the iteration number and Maxiter is the maximum number of iterations.

If the value of c(t) exceeds 0.5, then the jellyfish will follow the ocean current; if it is less than or equal to 0.5, the jellyfish will move in a jellyfish swarm [27]. To determine the type of jellyfish motion inside a jellyfish swarm (passive motion and active motion), the function 1 − c(t) is used. When rand(0,1) exceeds (1 − c(t)), the jellyfish will exhibit passive motion (type A). When rand(0,1) is less than (1 − c(t)), the jellyfish will exhibit active motion (type B). As t increases, the value of 1 − c(t) also increases [27].
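The following is a minimal NumPy sketch of the JS optimizer assembled from equations (8)–(11) for a minimization problem; uniform initialization (instead of the logistic map), the per-iteration time control value, and the greedy replacement rule are simplifying assumptions of this sketch.

import numpy as np

def js_optimize(f, lb, ub, n_pop=30, max_iter=100):
    dim = len(lb)
    X = lb + np.random.rand(n_pop, dim) * (ub - lb)  # uniform initial population
    fitness = np.apply_along_axis(f, 1, X)
    for t in range(1, max_iter + 1):
        best = X[np.argmin(fitness)]
        c = abs((1 - t / max_iter) * (2 * np.random.rand() - 1))   # eq. (11)
        for i in range(n_pop):
            if c > 0.5:                          # follow the ocean current, eq. (8)
                trend = best - 3 * np.random.rand() * X.mean(axis=0)
                x_new = X[i] + np.random.rand(dim) * trend
            elif np.random.rand() > 1 - c:       # passive motion (type A), eq. (9)
                x_new = X[i] + 0.1 * np.random.rand() * (ub - lb)
            else:                                # active motion (type B), eq. (10)
                j = np.random.randint(n_pop)
                direction = X[j] - X[i] if fitness[j] < fitness[i] else X[i] - X[j]
                x_new = X[i] + np.random.rand(dim) * direction
            x_new = np.clip(x_new, lb, ub)       # keep jellyfish inside the bounds
            f_new = f(x_new)
            if f_new < fitness[i]:               # greedy replacement
                X[i], fitness[i] = x_new, f_new
    return X[np.argmin(fitness)], fitness.min()

# Toy usage: minimize the 2-D sphere function
best_x, best_f = js_optimize(lambda x: float(np.sum(x**2)),
                             lb=np.array([-5.0, -5.0]),
                             ub=np.array([5.0, 5.0]))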

3.2.4. Algorithmic Flowchart and Pseudocode

The algorithmic flowchart and pseudocode of the JS algorithm, from the problem definition, the definition of the controlling parameters, and initialization to the loop of phases, are presented in Figures 18 and 19, respectively.

Figure 18. Algorithmic flowchart of the jellyfish search algorithm.

Figure 19. Pseudocode of the jellyfish search algorithm.

3.3. Validation and Performance Evaluation

Validating the capability of a DL model that classifies data or analyzes datasets to predict a new dataset is essential. In neural network models, a loss function usually refers to the prediction error to be minimized. The training error, which is the average loss over the training sample, is not useful for evaluating the performance of the model, because a low training error may indicate that the model is overfitting the training data and so will generally perform poorly on new data [48]. Validation, therefore, should be conducted using the error on a separate sample.

During the development of a DL model, a dataset is typically split into three sets: the training set, the validation set, and the test set. The training set is used to learn the pattern of the inputs that correspond to a certain output; the validation set is used to evaluate the prediction error of the training model and to tune its hyperparameters; the test set is used to assess the error of the final model. No exact rule for splitting the dataset exists, as the split depends on the number and complexity of the available data.

3.3.1. Validation Method

A validation set is used as the input of a previously trained prediction model to evaluate the performance of the model when used with new, never-seen-before data. The validation process is repeated multiple times with various hyperparameter combinations, and thus the purpose of using a validation set is to assess the performance of the training model and to find the optimal hyperparameters.

Two of the most popular methods for evaluating the generalization ability of a prediction model are the holdout method and cross-validation. The holdout method randomly splits the data into a training set, a validation set, and a test set. The cross-validation method partitions a dataset into several subsets, implements the learning process on all but one of those subsets, and evaluates the performance using each left-out subset in turn. The cross-validation method is particularly suitable for a small dataset to enhance model validity.

For practical use in a ready-mixed concrete plant, the model is built on the accumulated historical data and is subsequently used to predict the compressive strength of newly produced concrete. To fairly reflect the prediction accuracy on-site, this study adapted the holdout method by training/validating the model with the whole historical dataset and testing it with concrete data newly collected in the following year. By doing so, one does not overestimate the model performance in practice and prevents information leakage from model training.
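A minimal pandas sketch of this temporal holdout is shown below; the file name and column names are illustrative assumptions, not those of the study's dataset.

import pandas as pd

df = pd.read_csv("ready_mixed_concrete.csv")   # hypothetical data file
train_df = df[df["year"] < 2019]               # 2001-2018: training/validation
test_df = (df[df["year"] == 2019]              # newest year held out for testing
           .dropna(subset=["compressive_strength"]))  # drop samples missing the label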

3.3.2. Performance Metrics

The performance metrics that are used in this study are the mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), training time, and synthesis index (SI). The MAE is the average of the absolute differences between the actual and predicted values. Taking the absolute difference makes all error values positive, avoiding the false determination of an accurate prediction when negative and positive differences are summed.

Mean squared error (MSE) is the average of the squared differences between the actual and predicted values. The RMSE, the square root of the MSE, brings the error back to the same order (and units) as the original values. MAPE is the average of the absolute errors divided by the actual values. The training times of the various modeling techniques are compared to examine their practicability of implementation.

A low value of MAE, RMSE, or MAPE indicates good performance, and a short training time is desirable. SI is the mean of the normalized values of the other metrics and indicates overall model performance; it ranges from zero to one, and zero indicates the best performance among all models. Table 1 provides the formulas for the performance measures.

Table 1.

Performance metrics.

Performance metric Formula
Mean absolute error (MAE) MAE = (1/n) ∑_{i=1}^{n} |y − y′|
Mean squared error (MSE) MSE = (1/n) ∑_{i=1}^{n} (y′ − y)²
Root mean squared error (RMSE) RMSE = [(1/n) ∑_{i=1}^{n} (y′ − y)²]^(1/2)
Mean absolute percentage error (MAPE) MAPE = (1/n) ∑_{i=1}^{n} |(y − y′)/y|
Synthesis index (SI) SI = (1/m) ∑_{i=1}^{m} |(P − Pmin)/(Pmax − Pmin)|

Note. n, number of predictions; y, actual value; y′, predicted value; m, number of performance metrics; P, value of the performance metric; Pmin, minimum value of performance metric; Pmax, maximum value of performance metric.
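A short NumPy sketch of these metrics follows; the SI helper assumes a table whose rows are models and whose columns are metric values, as in Tables 8–10.

import numpy as np

def mae(y, yp):  return np.mean(np.abs(y - yp))
def rmse(y, yp): return np.sqrt(np.mean((yp - y) ** 2))
def mape(y, yp): return np.mean(np.abs((y - yp) / y))

def si(metric_table: np.ndarray) -> np.ndarray:
    # metric_table: rows = models, columns = performance metrics;
    # normalize each metric across models, then average per model (0 is best)
    p_min = metric_table.min(axis=0)
    p_max = metric_table.max(axis=0)
    return np.mean((metric_table - p_min) / (p_max - p_min), axis=1)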

4. Analytical Results and Discussion

4.1. Experimental Settings

4.1.1. Software and Hardware

Model building and testing were implemented in the Anaconda software with the Python programming language on a machine with an NVIDIA GeForce RTX 2080 Ti graphics card. The Jupyter Notebook application in Anaconda [49] was used to display the inputs and outputs of the prediction models. Python packages support specific programming tasks, and Anaconda manages their installation to protect against incompatibility. Numerous Python packages available in Anaconda were used (such as NumPy, pandas, and matplotlib). For building and testing the DNN models, the TensorFlow package [50] was used. For building and testing the CNN-based models, the Keras Applications package [51] was used.

In particular, the Keras Application package supports the implementation of CNN models for prediction, fine-tuning, and feature extraction. It provides CNN models with pretrained weights from “ImageNet.” The package also provides a transfer learning feature that helps solve the practical problem of a lack of data resources and improves the accuracy of prediction using pretrained weights. Table 2 presents information about the models, with accuracies that were obtained using the 2012 ILSVRC ImageNet validation set [51]. The depth refers to the number of layers in the Keras Applications' CNN model, including the activation layer, batch normalization layer, and other layers.
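The sketch below illustrates this transfer learning setup, assuming DenseNet121 with ImageNet weights, a small input size, and a single-node regression head; these choices are illustrative, not the exact configuration of the study.

import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import DenseNet121

# Pretrained convolutional base without the ImageNet classification head
base = DenseNet121(weights="imagenet", include_top=False,
                   input_shape=(32, 32, 3))   # 32 x 32 is the smallest allowed size
x = layers.GlobalAveragePooling2D()(base.output)
output = layers.Dense(1)(x)                   # regression: compressive strength
model = models.Model(base.input, output)
model.compile(optimizer="adam", loss="mse")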

Table 2.

Convolutional neural network-based models in the Keras Application.

Model Top 1 accuracy Top 5 accuracy Depth Size (MB) Parameters Reference

VGG16 0.713 0.901 23 528 138,357,544 [33]
VGG19 0.713 0.900 26 549 143,667,240
ResNet50 0.749 0.921 98 25,636,712 [34]
ResNet101 0.764 0.928 171 44,707,176
ResNet152 0.766 0.931 232 60,419,944
ResNet50V2 0.760 0.930 98 25,613,800 [35]
ResNet101V2 0.772 0.938 171 44,675,560
ResNet152V2 0.780 0.942 232 60,380,648
InceptionV3 0.779 0.937 159 92 23,851,784 [36]
InceptionResNetV2 0.803 0.953 572 215 55,873,736 [37]
Xception 0.790 0.945 126 88 22,910,480 [38]
MobileNet 0.704 0.895 88 16 4,253,864 [39]
MobileNetV2 0.713 0.901 88 14 3,538,984 [40]
DenseNet121 0.750 0.923 121 33 8,062,504 [41]
DenseNet169 0.762 0.932 169 57 14,307,880
DenseNet201 0.773 0.936 201 80 20,242,984
NASNetMobile 0.744 0.919 23 5,326,716 [42]
NASNetLarge 0.825 0.960 343 88,949,818
EfficientNetB0 29 5,330,571 [43]
EfficientNetB1 31 7,856,239
EfficientNetB2 36 9,177,569
EfficientNetB3 48 12,320,535
EfficientNetB4 75 19,466,823
EfficientNetB5 118 30,562,527
EfficientNetB6 166 43,265,143
EfficientNetB7 256 66,658,687

4.1.2. Collection and Preprocessing of Data

A total of 8,310 data samples about ready-mixed concrete, relating to 32 variables, were collected from 2001 to 2019 by the Taiwan Construction Research Institute (TCRI). The data were split at the time of data sample collection to enable a prediction model to be built using historical data and tested using new data.

Accordingly, 339 data samples that covered one year (2019) were used in the testing process and the remaining 7,971 data samples were used in the training process. Of the 339 data samples for testing, 15 were removed because the value of compressive strength was missing, creating a test set of 324 data samples. The 7,971 data samples for training were further preprocessed according to the practical recommendations by a panel of domain experts in TCRI.

Among the 32 variables, the manufacturer's name, category of data, and date of collection were removed because the corresponding data were apparently irrelevant to the variability of concrete compressive strength. Ten other variables were removed because data were incomplete; these were the amount of admixture, the surface moisture content of sand (from a computer report and sieve analysis, respectively), silt charge, fineness modulus of sand, the strength of cement, specific surface area of cement, percentage of active blast furnace slag, fineness of blast furnace slag, and the ratio of water-reducing admixtures.

In total, 19 concrete variables are used for the prediction of the concrete compressive strength. One output variable is the test value of ready-mixed concrete compressive strength, and the other 18 input variables are the design strength of concrete, target strength of concrete, slump test, chloride ion content, temperature (temperature of the concrete taken on site), water-binder ratio, water content of concrete, cementitious material consumption, cement ratio, amount of cement, amount of slag powder, amount of fly ash, amount of fine aggregate, amount of coarse aggregate, sand ratio, location (north), location (middle), and location (south).

The preprocessed data were processed again to yield three sets of data with different variables for use in numerical experiments with various purposes. Dataset 1 included 13 variables that are recommended by the TCRI; dataset 2 included 7 variables that are frequently used in prior studies [52–55] on the prediction of compressive strength; and dataset 3 included the resulting 18 variables after preprocessing. Tables 3–5 display the variables in the datasets, the number of data points in the datasets, and the descriptive statistics of the variables in the datasets, respectively.

Table 3.

Variables in the datasets.

Dataset variable Dataset 1 Dataset 2 Dataset 3

Design strength of concrete - - ✓
Target strength of concrete - - ✓
Slump test - - ✓
Chloride ion content - - ✓
Temperature - - ✓
Water-binder ratio ✓ ✓ ✓
Water content of concrete ✓ ✓ ✓
Cementitious material consumption ✓ - ✓
Cement ratio ✓ - ✓
Amount of cement ✓ ✓ ✓
Amount of slag powder ✓ ✓ ✓
Amount of fly ash ✓ ✓ ✓
Amount of fine aggregate ✓ ✓ ✓
Amount of coarse aggregate ✓ ✓ ✓
Sand ratio ✓ - ✓
Location (north) ✓ - ✓
Location (middle) ✓ - ✓
Location (south) ✓ - ✓
Compressive strength test (output) ✓ ✓ ✓

Note. Dataset 1 = industry recommendation; dataset 2 = suggested by research community; dataset 3 = all features considered. Variables in dataset 2 are frequently used to determine the compressive strength of concrete in the literature.

Table 4.

Number of data points in the datasets.

Number of data points Dataset 1 Dataset 2 Dataset 3
Number of total samples 6705 6705 5856
Number of training samples 6381 6381 5532
Number of testing samples 324 324 324
Number of input variables 13 7 18
Number of output variables 1 1 1
Table 5.

Descriptive statistics of variables from the datasets.

Variables Unit Minimum Maximum Average

Dataset 1—industry recommendation
X6 Water-binder ratio 0.25 0.87 0.52
X7 Water content of concrete (kg/m3) 121.00 250.25 184.98
X8 Cementitious material consumption (kg/m3) 209.00 690.00 361.15
X9 Cement ratio (%) 30.67 100.00 70.33
X10 Amount of cement (kg/m3) 99.20 507.00 255.43
X11 Amount of slag powder (kg/m3) 0.00 209.35 68.90
X12 Amount of fly ash (kg/m3) 0.00 180.00 36.82
X13 Amount of coarse aggregate (kg/m3) 344.24 1281.00 919.30
X14 Amount of fine aggregate (kg/m3) 468.00 1376.96 860.22
X15 Sand ratio (%) 0.00 80.00 48.32
X16 Location (north) 0.00 1.00 0.44
X17 Location (middle) 0.00 1.00 0.12
X18 Location (south) 0.00 1.00 0.44
Y Compressive strength test (kgf/cm2) 125.00 724.00 343.49

Dataset 2—suggested by the research community
X6 Water-binder ratio 0.25 0.87 0.52
X7 Water content of concrete (kg/m3) 121.00 250.25 184.98
X10 Amount of cement (kg/m3) 99.20 507.00 255.43
X11 Amount of slag powder (kg/m3) 0.00 209.35 68.90
X12 Amount of fly ash (kg/m3) 0.00 180.00 36.82
X13 Amount of fine aggregate (kg/m3) 468.00 1376.96 860.22
X14 Amount of coarse aggregate (kg/m3) 344.24 1281.00 919.30
Y Compressive strength test (kgf/cm2) 125.00 724.00 343.49

Dataset 3—all features considered
X1 Design strength of concrete (kgf/cm2) 140.00 420.00 254.40
X2 Target strength of concrete (kgf/cm2) 160.00 660.00 320.27
X3 Slump test (cm) 8.50 69.00 19.48
X4 Chloride ion content (%) 0.00 0.14 0.04
X5 Temperature (°C) 14.00 35.00 26.19
X6 Water-binder ratio 0.25 0.83 0.52
X7 Water content of concrete (kg/m3) 121.00 250.25 184.87
X8 Cementitious material consumption (kg/m3) 209.00 690.00 363.42
X9 Cement ratio (%) 30.67 100.00 70.10
X10 Amount of cement (kg/m3) 99.20 507.00 256.17
X11 Amount of slag powder (kg/m3) 0.00 209.35 68.85
X12 Amount of fly ash (kg/m3) 0.00 180.00 38.40
X13 Amount of fine aggregate (kg/m3) 468.00 1376.96 860.88
X14 Amount of coarse aggregate (kg/m3) 344.24 1281.00 916.43
X15 Sand ratio (%) 0.00 80.00 48.41
X16 Location (north) 0.00 1.00 0.48
X17 Location (middle) 0.00 1.00 0.13
X18 Location (south) 0.00 1.00 0.39
Y Compressive strength (kgf/cm2) 162.00 724.00 344.82

4.1.3. Converting Numerical Data into Images

The original numerical data were converted to images to be used as inputs to the CNN-based models. Each collection of values in a data sample for concrete was represented as an image. To create the image, the numerical data were first normalized to values between 0 and 1. These normalized data were then multiplied by 255 to encode them as grayscale values between 0 and 255 (Figure 20). Black represents 0 and white represents 255.
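A sketch of this conversion for a single data sample is given below; the 3 × 6 pixel grid for 18 variables is an illustrative assumption, as the study does not state the exact image dimensions.

import numpy as np

def row_to_image(row: np.ndarray, x_min: np.ndarray, x_max: np.ndarray,
                 shape=(3, 6)) -> np.ndarray:
    norm = (row - x_min) / (x_max - x_min)        # normalize each variable to [0, 1]
    gray = np.round(norm * 255).astype(np.uint8)  # 0 = black, 255 = white
    return gray.reshape(shape)                    # e.g., 18 variables -> 3 x 6 pixels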

Figure 20. Conversion example of numerical data to image data.

For each of datasets 1 and 2, a total of 6705 images were created. For dataset 3, a total of 5856 images were created. Figure 21 presents an example (dataset 3) of the labeling of the image data. Each image is labeled with the corresponding continuous output value, the compressive strength of the ready-mixed concrete.

Figure 21. Input images and corresponding output labels.

4.2. Implementation and Comparison

Prediction models and sensitivity experiments with various purposes were carried out (Table 6). Baseline models were used with the hyperparameters set to the default values in TensorFlow and the Keras Applications. In the DNN, numerical data are input, while for the CNN-based models, the input numerical data are converted to image data. In this study, the size of the image input to each CNN-based model was the minimum size accepted by the model, to meet practical needs.

Table 6.

Experimental settings.

Research task Data type Purpose Method
Comparison of deep learning models Numerical data and image data Search for the best CNN model (using image data) and compare the best CNN model with a DNN model (using numerical data) CNNs and DNN: VGG, ResNet, ResNetV2, InceptionV3, InceptionResNetV2, MobileNets, MobileNetV2, NASNet, EfficientNets, DenseNet
Construction of optimized deep learning models Image data Enhance the best-performing models with optimized hyperparameters Optimizing deep learning models by jellyfish search algorithm

4.2.1. Deep Learning Models and Performance

Since the same model and hyperparameters yielded different performance values in different runs, each model was tested five times and the average performance value was reported. For both the CNN and DNN models, the loss function was set to the MSE. In the DNN model, 50 hidden layers with the selected numbers of hidden nodes (Table 7) gave the best prediction accuracy in comparison with other numbers of hidden layers and hidden nodes. This architecture was thus used to build the baseline DNN prediction model.

Table 7.

Number of hidden nodes in each hidden layer of the deep neural network (DNN).

Hidden layer 1st 2nd 3rd 4th 5th 6th 7th 8th–10th 11th–20th 21st–30th 31st–40th 41st–50th
Number of hidden nodes 4096 2048 1024 512 256 128 64 32 16 8 4 2
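A Keras sketch of this architecture, following the node counts in Table 7, is shown below; the ReLU activations and Adam optimizer are assumptions, as the text specifies only the MSE loss and the layer sizes.

import tensorflow as tf
from tensorflow.keras import layers, models

# Node counts per hidden layer, following Table 7: layers 1-7 halve from
# 4096 to 64, then blocks of layers share a count (8th-10th: 32, and so on)
node_counts = ([4096, 2048, 1024, 512, 256, 128, 64]
               + [32] * 3 + [16] * 10 + [8] * 10 + [4] * 10 + [2] * 10)
assert len(node_counts) == 50

def build_dnn(n_features: int) -> tf.keras.Model:
    model = models.Sequential([layers.Input(shape=(n_features,))])
    for n in node_counts:
        model.add(layers.Dense(n, activation="relu"))  # activation assumed
    model.add(layers.Dense(1))                         # compressive strength output
    model.compile(optimizer="adam", loss="mse")        # MSE loss, as stated above
    return model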

Tables 8–10 compare the performances of the DL models in predicting the compressive strength of ready-mixed concrete when they are trained and tested using the given data. The results indicate that the CNN models ResNet50V2, MobileNet, and DenseNet121, with their default parameters, performed best on datasets 1, 2, and 3, respectively, and that these CNN models with image data outperformed the baseline DNN model with numerical data. The results also indicate that the best CNN model on each dataset outperformed the DNN in terms of every performance metric except the training time.

Table 8.

Deep learning model performance on dataset 1.

Model Training time (h:m:s) MAPE (%) RMSE (kgf/cm2) MAE (kgf/cm2) SI

Xception 2:31:01 14.0264 76.4217 57.7471 0.285
VGG16 0:17:58 15.9598 85.6755 65.6588 0.249
VGG19 0:21:16 14.8719 79.9609 61.1835 0.147
ResNet50 0:18:28 15.0252 80.8499 62.5910 0.164
ResNet101 0:32:30 14.3817 75.4145 57.5325 0.091 (3)
ResNet152 0:44:12 14.3462 78.5345 59.2920 0.144
ResNet50V2 0:16:56 13.8000 73.7818 56.4419 0.027 (1)
ResNet101V2 0:29:17 14.7393 74.4318 58.7149 0.100
ResNet152V2 0:43:11 16.2188 85.7025 67.9747 0.318
InceptionV3 0:45:14 14.2727 75.2054 58.0613 0.111
InceptionResNetV2 1:43:38 14.7849 77.9702 60.6185 0.264
MobileNet 0:10:35 15.1504 79.4783 62.0158 0.141
MobileNetV2 0:12:09 17.2442 82.0049 63.9462 0.244
DenseNet121 0:18:49 15.6411 82.3700 64.8698 0.213
DenseNet169 0:24:41 15.3712 80.3540 63.4890 0.190
DenseNet201 0:31:28 15.4292 80.7504 63.7175 0.207
NASNetMobile 0:35:50 15.8436 82.1353 64.5839 0.244
EfficientnetB0 0:18:10 14.2585 75.3426 58.3330 0.069 (2)
EfficientnetB1 0:27:30 14.9977 80.3189 61.8780 0.169
EfficientnetB2 0:28:38 15.4503 80.6015 63.4244 0.200
EfficientnetB3 0:35:30 14.9118 80.7490 61.6706 0.181
EfficientnetB4 0:45:16 14.6289 79.1150 61.1313 0.173
EfficientnetB5 0:59:30 14.5750 76.2138 59.6796 0.164
EfficientnetB6 1:13:08 15.0979 81.5700 62.4580 0.261
EfficientnetB7 1:42:25 14.4783 74.6023 59.1117 0.218
DNN 0:00:46 21.4910 112.2759 87.9198 0.750
Table 9.

Deep learning model performance on dataset 2.

Model Training time (h:m:s) MAPE (%) RMSE (kgf/cm2) MAE (kgf/cm2) SI

Xception 1:12:19 17.2430 89.4229 70.1275 0.493
VGG16 0:10:26 18.3202 100.0902 73.6012 0.387
VGG19 0:12:32 17.7289 98.4250 74.1325 0.373
ResNet50 0:12:24 15.0208 78.1490 61.1607 0.105
ResNet101 0:21:34 14.5379 74.5514 58.2454 0.086 (2)
ResNet152 0:30:17 16.5355 90.8698 67.9069 0.321
ResNet50V2 0:11:28 15.7710 80.7428 63.8803 0.154
ResNet101V2 0:20:19 15.1000 76.9871 60.4406 0.124
ResNet152V2 0:28:43 14.4184 73.6713 57.4732 0.098
InceptionV3 0:27:04 16.4996 87.7770 69.3648 0.302
InceptionResNetV2 1:02:49 15.8074 84.5667 66.3267 0.371
MobileNet 0:06:18 15.0699 77.6579 61.3514 0.084 (1)
MobileNetV2 0:07:26 18.6955 83.8692 68.5122 0.264
DenseNet121 0:12:41 15.3784 83.6683 64.4323 0.167
DenseNet169 0:17:17 15.5968 85.7416 65.6683 0.209
DenseNet201 0:22:02 14.9204 83.5047 62.5952 0.175
NASNetMobile 0:20:32 22.3714 115.6740 93.5611 0.745
EfficientnetB0 0:12:27 15.0674 80.2897 61.9535 0.124
EfficientnetB1 0:17:52 15.5417 81.8874 63.5534 0.174
EfficientnetB2 0:18:12 14.5659 77.0040 59.4023 0.096 (3)
EfficientnetB3 0:21:51 15.4034 81.4909 63.2014 0.180
EfficientnetB4 0:27:17 15.7882 83.7080 64.3034 0.228
EfficientnetB5 0:36:30 14.9359 78.5506 60.4780 0.185
EfficientnetB6 0:45:47 15.0286 78.3096 61.4788 0.225
EfficientnetB7 1:01:55 15.3541 80.7350 62.9946 0.313
DNN 0:00:40 23.9215 119.4923 95.5619 0.750
Table 10.

Deep learning model performance on dataset 3.

Model Training time (h:m:s) MAPE (%) RMSE (kgf/cm2) MAE (kgf/cm2) SI

Xception 2:59:21 13.0524 64.3952 51.9907 0.326
VGG16 0:20:24 13.8902 69.6804 56.1444 0.165
VGG19 0:23:28 14.1965 69.1497 55.3852 0.169
ResNet50 0:20:26 13.9585 70.9084 57.2012 0.178
ResNet101 0:34:28 11.6479 59.5678 47.1273 0.049 (2)
ResNet152 0:48:51 11.7348 60.4651 47.2528 0.076
ResNet50V2 0:18:31 12.3769 64.2470 51.1966 0.083
ResNet101V2 0:31:55 12.1501 63.4072 49.7640 0.086
ResNet152V2 0:46:54 12.0281 61.4381 48.0813 0.087
InceptionV3 0:51:37 13.3665 64.9479 52.1857 0.157
InceptionResNetV2 1:57:42 13.5490 64.0620 51.9634 0.248
MobileNet 0:12:27 13.1265 65.2462 52.4753 0.100
MobileNetV2 0:14:00 13.0155 60.0301 47.8787 0.053
DenseNet121 0:20:09 11.7167 59.3511 47.1034 0.029 (1)
DenseNet169 0:25:51 11.7929 61.0411 47.9452 0.051 (3)
DenseNet201 0:32:50 11.6047 60.6139 47.7109 0.054
NASNetMobile 0:41:08 24.6483 109.9440 93.7760 0.780
EfficientnetB0 0:20:37 12.9179 64.7033 52.1055 0.103
EfficientnetB1 0:30:28 13.3862 68.6528 55.0783 0.159
EfficientnetB2 0:31:25 13.4363 68.1339 54.9089 0.159
EfficientnetB3 0:38:59 13.2016 67.0390 53.1422 0.150
EfficientnetB4 0:49:39 13.1075 65.5639 52.3979 0.153
EfficientnetB5 1:09:29 13.2787 67.5948 54.1713 0.203
EfficientnetB6 1:26:14 12.4606 63.7319 50.5952 0.174
EfficientnetB7 1:55:15 12.8867 63.4471 51.0061 0.224
DNN 0:00:44 22.8024 116.0370 90.6988 0.698

4.2.2. Optimized Convolutional Neural Network-Based Models

As the CNN models ResNet50V2, MobileNet, and DenseNet121 performed best on their corresponding datasets, a metaheuristic optimization algorithm, the jellyfish search (JS) optimizer, was used to optimize them. The CNN models were optimized to minimize the errors in the prediction of the compressive strength of ready-mixed concrete using the best values of the hyperparameters.

The JS optimizer was used to find the best hyperparameter values in a set of ranges. Several hyperparameters of a CNN, such as the epsilon of batch normalization, batch size, epoch, learning rate, and dropout rate, were selected to be adjusted during the search herein. For DenseNet121, two additional hyperparameters were optimized—the growth rate and the reduction value. Table 11 presents the default values of hyperparameters in the reference papers [34, 41] and the range of hyperparameters to be finetuned in this study.

Table 11.

Hyperparameter settings for deep learning models.

Hyperparameter Literature value Search range in this study

ResNet50V2 [34]
Batch normalization-epsilon 1.001e − 5 [1.001e − 5, 0.00005, 0.0001, 0.0005, 0.001]
Batch size 64, 256 [8, 16, 32, 64]
Epochs 40, 90, 300 [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
ADAM-learning rate 0.1 [0.001, 0.005, 0.01, 0.05, 0.1]
Dropout rate 0.5 0.00–0.99

MobileNet [39]
Batch size 64, 256 [8, 16, 32, 64]
Epochs 40, 90, 300 [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
ADAM-learning rate 0.1 [0.001, 0.005, 0.01, 0.05, 0.1]
Dropout rate 0.5 0.00–0.99

DenseNet121 [41]
Growth rate 32 12–48
Batch normalization-epsilon 1.001e − 5 [1.001e − 5, 0.00005, 0.0001, 0.0005, 0.001]
Batch size 64, 256 [8, 16, 32, 64]
Epochs 40, 90, 300 [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
Reduction 0.5 0.1–1.0
ADAM-learning rate 0.1 [0.001, 0.005, 0.01, 0.05, 0.1]
Dropout rate 0.2 0.00–0.99
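A sketch of how CNN training can be wrapped as the objective function for the JS optimizer is given below, using the search ranges of Table 11; build_and_train is a hypothetical helper that trains a model with the decoded hyperparameters and returns its validation RMSE. Such an objective can be passed to the js_optimize sketch above with lb = np.zeros(4) and ub = np.ones(4).

import numpy as np

BATCH_SIZES = [8, 16, 32, 64]
EPOCHS = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
LEARNING_RATES = [0.001, 0.005, 0.01, 0.05, 0.1]

def pick(options, u):
    # Map a continuous value u in [0, 1] to one of the discrete options
    return options[min(int(u * len(options)), len(options) - 1)]

def build_and_train(batch_size, epochs, lr, dropout) -> float:
    # Hypothetical helper: train the CNN with these hyperparameters and
    # return the validation RMSE; a real implementation would go here.
    raise NotImplementedError

def objective(v: np.ndarray) -> float:
    batch_size = pick(BATCH_SIZES, v[0])
    epochs = pick(EPOCHS, v[1])
    lr = pick(LEARNING_RATES, v[2])
    dropout = 0.99 * v[3]                 # continuous range 0.00-0.99
    return build_and_train(batch_size, epochs, lr, dropout)  # JS minimizes this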

Table 12 compares the performances of the best CNN models using default hyperparameters with those of the models optimized by JS in predicting the compressive strength of ready-mixed concrete. The results indicate that using the JS optimizer on the hyperparameters improved the accuracy of the prediction models. Table 13 shows the best hyperparameter settings for each optimized CNN model.

Table 12.

Performance of the best and optimized CNN models.

Dataset Model MAPE (%) RMSE (kgf/cm2) MAE (kgf/cm2)
1 ResNet50V2 13.8000 73.7818 56.4419
JS-ResNet50V2 13.1327 68.5794 52.4591

2 MobileNet 17.0406 91.6945 71.1198
JS-MobileNet 17.0671 90.4711 70.0870

3 DenseNet121 11.7167 59.3511 47.1034
JS-DenseNet121 11.5443 58.4346 45.8917
Table 13.

Optimal hyperparameters of the best CNN models.

Hyperparameter Optimal value
JS-ResNet50V2
Batch normalization-epsilon 0.0005
Batch size 64
Epochs 100
ADAM-learning rate 0.001
Dropout rate 0.26

JS-MobileNet
Batch size 16
Epochs 70
ADAM-learning rate 0.001
Dropout rate 0.65

JS-DenseNet121
Growth rate 38
Batch normalization-epsilon 0.00005
Batch size 64
Epochs 90
Reduction 0.7
ADAM-learning rate 0.001
Dropout rate 0.33

4.3. Influence of Feature and Image Pixel Orientation on Modeling Accuracy

To examine the sensitivity of the generalization ability of the prediction model, experiments were run on the resulting 18 variables (features) using the best CNN model (DenseNet121), removing one variable at a time and training the model on the remaining variables. The three location variables were removed simultaneously, and the remaining variables were used for the sensitivity analysis. These tests were conducted to investigate the effect of each feature (attribute) on the generalization ability of the model prediction. Table 14 displays the resulting MAPEs, in which a lower MAPE indicates better model performance without the specified attribute. The experiment demonstrated that the MAPEs do not differ much from one another. However, the slight increase of the MAPE in each numerical experiment relative to the baseline MAPE (11.72%) implies that including those variables (X1–X3, X5, and X16–X18) has a positive impact on the prediction accuracy of ready-mixed concrete compressive strength.

Table 14.

Sensitivity analysis of input features.

No. Removed feature(s) MAPE (%)

1 X1 11.94
2 X2 11.91
3 X3 11.95
4 X4 11.66
5 X5 11.99
6 X6 11.01
7 X7 11.32
8 X8 11.05
9 X9 11.57
10 X10 11.17
11 X11 11.63
12 X12 11.69
13 X13 11.29
14 X14 11.59
15 X15 11.26
16 X16–X18 11.98
17 None (baseline, all features) 11.72

Note. X1 = design strength of concrete, X2 = target strength of concrete, X3 = slump test, X4 = chloride ion content, X5 = temperature, X6 = water-binder ratio, X7 = water content of concrete, X8 = cementitious material consumption, X9 = cement ratio, X10 = amount of cement, X11 = amount of slag powder, X12 = amount of fly ash, X13 = amount of fine aggregate, X14 = amount of coarse aggregate, X15 = sand ratio, X16 = location (north), X17 = location (middle), and X18 = location (south).

Another numerical experiment was conducted to examine the influence of the image pixel orientation (pixel row order) on the computer vision-based modeling performance. Two types of image pixel orientation (IPO) formed by the input attributes (pixels) were tested, namely, the original pixel array and the correlated pixel array, the latter ordered according to the correlation values between the input attributes and the compressive strength. Specifically, the input image data were shaped by arranging the input attributes (pixels) in the original (random) order and in descending order of their correlation coefficients, respectively.

Table 15 displays the correlation coefficients between the input variables and the compressive strength of ready-mixed concrete. Two new datasets were created by ordering the pixels according to the magnitude of these coefficients: one IPO is arranged in descending order of the raw correlation coefficients and the other in descending order of their absolute values (a sketch of this construction follows Table 15).

Table 15.

Correlation between the feature and compressive strength of ready-mixed concrete.

Feature Correlation coefficient between feature and Y
X1 0.75
X2 0.82
X3 0.23
X4 0.05
X5 −0.15
X6 −0.74
X7 −0.15
X8 0.73
X9 0.14
X10 0.46
X11 0.05
X12 0.02
X13 −0.44
X14 −0.06
X15 −0.25
X16 0.24
X17 0.06
X18 −0.29

Note. X1 = design strength of concrete, X2 = target strength of concrete, X3 = slump test, X4 = chloride ion content, X5 = temperature, X6 = water-binder ratio, X7 = water content of concrete, X8 = cementitious material consumption, X9 = cement ratio, X10 = amount of cement, X11 = amount of slag powder, X12 = amount of fly ash, X13 = amount of fine aggregate, X14 = amount of coarse aggregate, X15 = sand ratio, X16 = location (north), X17 = location (middle), X18 = location (south), and Y = compressive strength of ready-mixed concrete.
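
A minimal sketch of how the two correlated pixel orderings could be constructed from the Table 15 coefficients is given below; the column names X1–X18 and Y follow the note above, and the subsequent reshaping of each reordered row into image pixels is omitted.

```python
# A minimal sketch (under the stated assumptions) of constructing the
# two correlated pixel orderings from the Table 15 coefficients.
import pandas as pd

def reorder_by_correlation(df: pd.DataFrame, target: str = "Y",
                           use_abs: bool = False) -> pd.DataFrame:
    corr = df.corr()[target].drop(target)   # Pearson r of each feature vs. Y
    key = corr.abs() if use_abs else corr
    ordered = key.sort_values(ascending=False).index.tolist()
    return df[ordered + [target]]           # columns now in correlated order

# desc_raw = reorder_by_correlation(df)                # descending r
# desc_abs = reorder_by_correlation(df, use_abs=True)  # descending |r|
```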

Table 16 presents the sensitivity analysis of image pixel orientation on the computer vision-based modeling performance. Both correlated orderings yield slightly worse values on all metrics than the original ordering under the same optimized CNN model (JS-DenseNet121). The analytical results thus indicate that ordering pixels by correlation when converting the ready-mixed concrete data to images does not improve, and indeed slightly degrades, the performance of the prediction model.

Table 16.

Results of the order importance analysis of the image-like dataset.

Image pixel orientation Type of pixel order MAPE (%) RMSE (kgf/cm²) MAE (kgf/cm²)
Original order Random arrangement 11.5443 58.4346 45.8917
Correlated order Descending by correlated values 12.0831 61.3435 48.1922
Descending by absolute correlated values 12.5888 64.5037 50.7435

5. Conclusions

The effectiveness of computer vision in predicting the compressive strength of ready-mixed concrete was analyzed to improve such predictions. Deep learning (DL) models were constructed using image-converted numerical data as inputs. Various prediction models were compared, the best DL prediction models were identified for different sets of concrete-related input features, and those models were then optimized after their performances had been further analyzed.

Models for predicting concrete compressive strength are frequently evaluated with cross-validation or a random split of in-sample data, which often gives optimistic results (overfitting) during training/testing while exhibiting poor performance in future use. This is mainly because the processes, materials, machines, and technicians involved in manufacturing ready-mixed concrete in batch plants are continually improved and periodically replaced, so newly produced samples may differ systematically from historical ones as batch processes evolve.

A prediction model is built from historical data but applied to newly collected data, which should be independent of the training samples; therefore, the common practice in the literature of testing compressive-strength models on randomly split in-sample data is questionable. To capture the actual performance of predicting the compressive strength of concrete, out-of-sample data (newly collected data) should be used for model testing to avoid potential information leakage. Although the measured accuracy may be lower than that obtained with in-sample cross-validation or random splits, such an out-of-sample test reflects the real predictive performance in practice.
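
As a hedged illustration of this recommendation, the records can be ordered by production date and the most recent portion held out, so that the test set mimics newly collected batch-plant data; the column name "date" is an assumption.

```python
# A hedged illustration of the out-of-sample split advocated above.
import pandas as pd

def chronological_split(df: pd.DataFrame, test_fraction: float = 0.2):
    df = df.sort_values("date")              # oldest batches first
    cut = int(len(df) * (1.0 - test_fraction))
    return df.iloc[:cut], df.iloc[cut:]      # train on the past, test on the future

# train_df, test_df = chronological_split(records)
```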

Furthermore, CNN-oriented models are often trained without tuning their hyperparameters. This study adopted a metaheuristic optimization algorithm, the jellyfish search (JS) optimizer, to tune the prediction models, thereby improving the predictive accuracy of the computer vision-based deep learning models. The JS optimizer finds the hyperparameters that optimize the performance metrics of the CNN models. The study thus contributes a novel computer vision-based method that integrates the latest CNN models with the newly developed JS optimizer to predict the compressive strength of ready-mixed concrete. The analytical experiments show that modeling with image-converted data outperforms modeling with the original numerical data.
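
Such metaheuristic tuning typically requires only an objective function that trains the CNN with a candidate hyperparameter vector and returns the validation error to be minimized. The decision-vector encoding and the build_and_train helper in the sketch below are illustrative assumptions, not the authors' implementation.

```python
# A sketch of the objective a metaheuristic such as the JS optimizer
# would minimize when tuning CNN hyperparameters.
def cnn_objective(position):
    dropout, log_lr, batch_exp, epochs = position
    # Hypothetical helper: trains the CNN once and returns validation MAPE.
    mape = build_and_train(
        dropout_rate=float(dropout),
        learning_rate=10.0 ** log_lr,             # searched on a log scale
        batch_size=2 ** int(round(batch_exp)),    # e.g., 16, 32, 64
        epochs=int(round(epochs)))
    return mape                                   # lower cost = more "food"
```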

In this investigation, the training data comprised samples of ready-mixed concrete only. Applying the computer vision approach to high-performance concrete or more complex engineering data would extend this work on predicting numerical outputs such as compressive strength. More cases should be studied to confirm the effectiveness of imaging data on ready-mixed and other types of concrete for identifying patterns of compressive strength through bio-inspired metaheuristic optimization of computer vision-based deep learning models.

Future studies could consider environment-oriented factors that may affect the compressive strength of ready-mixed concrete, such as the type of manufacturing equipment, the transportation process, and the handling speed of on-site operators, in addition to the material-oriented attributes considered herein. A fair comparison between laboratory-determined and on-site evaluations of concrete compressive strength should also be investigated.

Acknowledgments

The authors would like to thank Taiwan Construction Research Institute and the Ministry of Science and Technology, Taiwan, for financially supporting this research under grants NTUST-TCRI-No.109-0139-9257 and MOST 109-2221-E-011-040-MY3, respectively.

Abbreviations

AI: Artificial intelligence
ANN: Artificial neural network
BN: Batch normalization
CNN/ConvNet: Convolutional neural network
DenseNet: Dense convolutional network
DL: Deep learning
DNN: Deep neural network
FNN: Feedforward neural network
IN: Instance normalization
IPO: Image pixel orientation
JS: Jellyfish search
LN: Layer normalization
MAE: Mean absolute error
MAPE: Mean absolute percentage error
MSE: Mean squared error
NASNet: Neural architecture search network
ReLU: Rectified linear unit
ResNet: Residual neural network
RF: Random forest
RMSE: Root mean squared error
RNN: Recurrent neural network
SI: Synthesis index
SVR: Support vector regression
TCRI: Taiwan Construction Research Institute
VGG: Visual geometry group
Xception: Extreme inception.

Symbols

w × h × c: Width of image × height of image × number of channels
m: Dimension of an input image
n: Filter size
C: Convolution map
I: Input image data
Θ: Convolution operation
F: Filter
o: Dimension of convolution map
s: Stride
zp: Zero padding
f: Nonlinear activation function
Cm: Convolution map after applying the nonlinear activation function f
Pm: Pooling map
Po: Pooling operation
Yi: Model output of the ith fully connected hidden layer
Hi: Weight sum vector
Bi: Activation level of the artificial neurons
BN(x): Batch normalization at a given layer from x
γ: Scale parameter for the channel
β: Shift parameter for the channel
μB: Mean of the batch
σB: Standard deviation of the batch
∗: Element-wise product
fl(xi): Original feature
f̃l(xi): Distorted feature
m_i^l: Binary mask
dl: Dimension of the feature map of the l-th layer
trend: Direction of the ocean current
X*: Jellyfish with the optimal location
μ: Average location of all jellyfish
Xi: Jellyfish of interest
t: Time specified as an iteration number
Xi(t): Current location of a jellyfish
Xi(t+1): New location of a jellyfish after a movement
rand(0,1): Random number between 0 and 1
Ub: Upper bounds of the search space
Lb: Lower bounds of the search space
Xj: Jellyfish other than the jellyfish of interest
f(Xi): Quantity of food at the location of Xi
f(Xj): Quantity of food at the location of Xj
Direction: Direction of active motion (type B) of jellyfish
c(t): Time control function
Maxiter: Maximum number of iterations
n: Number of predictions
y: Actual value
y′: Predicted value
m: Number of performance metrics
P: Value of performance metric
Pmin: Minimum value of performance metric
Pmax: Maximum value of performance metric.
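
For reference, the convolution symbols above combine in the standard output-size relation (a textbook formula, restated here rather than drawn from this section): o = ⌊(m − n + 2·zp)/s⌋ + 1. For example, a 5 × 5 input (m = 5) convolved with a 3 × 3 filter (n = 3) at stride s = 1 and zero padding zp = 1 gives o = ⌊(5 − 3 + 2)/1⌋ + 1 = 5, preserving the spatial dimension.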

Data Availability

The datasets, codes, and replication of results generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

