PLOS One. 2020 Dec 17;15(12):e0243030. doi: 10.1371/journal.pone.0243030

Design deep neural network architecture using a genetic algorithm for estimation of pile bearing capacity

Tuan Anh Pham 1,*, Van Quan Tran 1, Huong-Lan Thi Vu 1, Hai-Bang Ly 1
Editor: Le Hoang Son
PMCID: PMC7746167  PMID: 33332377

Abstract

Determination of pile bearing capacity is essential in pile foundation design. This study focused on the use of evolutionary algorithms to optimize a Deep Learning Neural Network (DLNN) algorithm to predict the bearing capacity of driven piles. For this purpose, a Genetic Algorithm (GA) was developed to select the most significant features in the raw dataset. After that, a GA-DLNN hybrid model was developed to select the optimal parameters for the DLNN model, including the network algorithm, the activation function for hidden neurons, the number of hidden layers, and the number of neurons in each hidden layer. A database containing 472 driven pile static load test reports was used. The dataset was divided into three parts, namely the training set (60%), validation set (20%) and testing set (20%), for the construction, validation and testing phases of the proposed model, respectively. Various quality assessment criteria, namely the coefficient of determination (R2), Index of Agreement (IA), mean absolute error (MAE) and root mean squared error (RMSE), were used to evaluate the performance of the machine learning (ML) algorithms. The GA-DLNN hybrid model was shown to exhibit the ability to find the most optimal set of parameters for the prediction process. The results showed that the hybrid model using only the most critical features gave the highest accuracy, compared with the results obtained by the hybrid model using all input variables.

1. Introduction

In pile foundation design, the axial pile bearing capacity (Pu) is considered one of the most critical parameters [1]. Throughout years of research and development, five main approaches to determine the pile bearing capacity have been adopted, namely static analysis, dynamic analysis, dynamic testing, pile load testing, and in-situ testing [2]. Needless to say, each of the above methods possesses advantages and disadvantages. However, the pile load test is considered one of the best methods to determine the pile bearing capacity, since the testing process is close to the working mechanism of driven piles [3]. That said, this method remains time-consuming and unaffordable for small projects [3], so the development of a more feasible approach is vital. Thus, many studies have been conducted to determine the pile bearing capacity by taking advantage of in-situ test results [4]. Meanwhile, the European standard (Eurocode 7) [5] recommends using several ground field tests, such as the dynamic probing test (DP), press-in and screw-on probe test (SS), standard penetration test (SPT), pressuremeter test (PMT), plate loading test (PLT), flat dilatometer test (DMT), field vane test (FVT), and cone penetration test with pore pressure measurement (CPTu). Among the above approaches, the SPT is commonly used to determine the bearing capacity of piles [6].

Many contributions in the literature relying on SPT results have been suggested to predict the bearing capacity of piles. As examples, Meyerhof [7], Bazaraa and Kurkur [8], Robert [9], Shioi and Fukui [10], and Shariatmadari et al. [11] have proposed several empirical formulations for determining the bearing capacity of piles in sandy ground. Besides, Lopes and Laprovitera [12], Decourt [13], and the Architectural Institute of Japan (AIJ) [14] have introduced several formulations to determine the pile bearing capacity for various types of soil, including sandy and clayey ground. Overall, traditional methods have used several main parameters to estimate the mechanical properties of piles, such as the pile diameter, pile length, soil type, and SPT blow count of each soil layer. However, the choice of appropriate parameters, along with the failure to cover other parameters, has led to disagreement among the results given by these methods [15]. Therefore, the development of a universal approach for the selection of a suitable set of parameters is imperative.

Over the past half-decade, a newly developed approach using machine learning (ML) algorithms has been widely used to deal with real-world problems [16], especially in civil engineering applications. Employing ML algorithms, many practical problems have been successfully addressed, paving the way for many promising opportunities in the construction industry [17–26]. Moreover, miscellaneous ML algorithms have been developed, for instance, decision trees [22], hybrid artificial intelligence approaches [27–29], artificial neural networks (ANN) [30–35], adaptive neuro-fuzzy inference systems (ANFIS) [36,37] and support vector machines (SVM) [16], for analyzing technical problems, including the prediction of pile mechanical behavior.

It is worth noticing that the development of the artificial neural network (ANN) algorithm has gained intense attention for treating design issues in pile foundations. For example, Goh et al. [38,39] have presented an ANN model to predict the friction capacity of driven piles in clays, in which the algorithm was trained on field data records. Besides, Shahin et al. [40–43] have used an ANN model to predict the loading capacity of driven piles and drilled shafts using a dataset containing in-situ load tests along with CPT results. Moreover, Nawari et al. [44] have presented an ANN algorithm to predict the settlement of drilled shafts based on SPT data and shaft geometry. Momeni et al. [45] have developed an ANN model to predict the axial bearing capacity of concrete piles using Pile Driving Analyzer (PDA) data from project sites. Last but not least, Pham et al. [15] have also developed an ANN algorithm and a Random Forest (RF) to estimate the axial bearing capacity of driven piles. Regarding other ML models, Support Vector Machine Regression (SVR) combined with a "nature inspired" meta-heuristic algorithm, namely Particle Swarm Optimization (PSO-SVR) [46], has been used to predict soil shear strength. Furthermore, Pham et al. [47] have presented a hybrid ML model combining RF and PSO (PSO-RF) to predict the undrained shear strength of soil. Also, Momeni et al. [48] have developed an ANN-based predictive model optimized with the Genetic Algorithm (GA) technique to choose the best weights and biases of the ANN model in predicting the bearing capacity of piles. In addition, Hossain et al. [49] used GA to optimize the parameters of a three-hidden-layer deep belief neural network (DBNN), including the number of epochs, the number of hidden units, and the learning rates in the hidden layers.
It is interesting to notice that all these studies have confirmed the effectiveness of hybrid ML models as practical and efficient tools for solving geotechnical problems, and particularly for the axial bearing capacity of piles. Despite the recent successes of machine learning, the method has some limitations to keep in mind: it requires large amounts of hand-crafted, structured training data and cannot learn in real time. In addition, ML models still lack the ability to generalize to conditions other than those encountered during training. Therefore, an ML model predicts correctly only within a certain data range and is not generalized to all cases.

With a particular interest in the recently developed Deep Learning Neural Network (DLNN), which has gained tremendous success in many areas of application [50–54], the main objective of this study is the development of a novel hybrid ML algorithm using DLNN and GA to predict the axial load capacity of driven piles. For this aim, a dataset consisting of 472 pile load test reports from construction sites in Ha Nam, Vietnam, was gathered. The database was then divided into training, validation, and testing subsets, corresponding to the learning, validation and testing phases of the ML models. Next, a novel ML algorithm using a GA-DLNN hybrid model was developed. A GA-based model was first used to select the most important input variables and create a new, smaller dataset, since many unimportant input variables could reduce the accuracy of the output forecast. Next, the GA-DLNN hybrid model was used to optimize the parameters of the DLNN model, including the number of hidden layers, the number of neurons in each hidden layer, the activation function for the hidden layers and the training algorithm. The optimal DLNN architecture was then tested on the new dataset and compared with the case using the full set of input variables. Various error criteria, namely the mean absolute error (MAE), root mean squared error (RMSE), coefficient of determination (R2) and Index of Agreement (IA), were applied to evaluate the prediction capability of the algorithms. In addition, 1000 simulations involving random shuffling of the dataset were conducted for each model in order to evaluate the accuracy of the final DLNN model precisely.

2. Significance of the research study

The numerical or experimental methods in the existing literature still have some limitations, such as small dataset sizes (Marto et al. [55] with 40 samples; Momeni et al. [45] with 36 samples; Momeni et al. [56] with 150 samples; Bagińska and Srokosz [57] with 50 samples; Teh et al. [58] with 37 samples), limited refinement of the ML approaches, or failure to fully consider the key parameters which affect the predictive results of the model.

For this, the contribution of the present work can be marked through the following points: (i) a large dataset, including 472 experimental tests; (ii) a reduction of the input variables from 10 to 4, which helps the model achieve more accurate results with faster training times; (iii) automatic design of the optimal architecture for the DLNN model, in which all key parameters are considered, including the number of hidden layers, the number of neurons in each hidden layer, the activation function and the training algorithm. The number of hidden layers is not fixed but can be selected through cross-mating between parents with different chromosome lengths. Besides, randomness in the order of the training dataset is also considered, to assess the stability of the models' predictions on the training, validation and testing sets.

3. Data collection and preparation

3.1. Experimental measurement of bearing capacity

The experimental database used in this study was derived from pile load tests conducted on 472 reinforced concrete piles at the test site in Ha Nam province, Vietnam (Fig 1A). In order to obtain the measurements, pre-cast square-section piles with closed tips were driven into the ground by hydraulic pile-pressing machines at a constant rate of penetration. The tests started at least 7 days after the piles had been driven, and the experimental layout is depicted in Fig 1B. The load was increased gradually in each pile test. Depending on the design requirements, the load could be varied up to 200% of the pile design load. The time required to reach 100%, 150%, and 200% of the load could last about 6 h, 12 h, or 24 h, respectively. The bearing capacity of the piles was determined following these two principles: (i) when the settlement of the pile top at the current load level was 5 times or more the settlement of the pile top at the previous load level, the pile bearing capacity was taken as the given failure load; (ii) when the load-settlement curve was nearly linear at the last load level, condition (i) could not be used. In this case, the pile bearing capacity was approximated as the load level at which the settlement of the pile top exceeded 10% of the pile diameter.
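The two failure criteria above can be sketched as a small helper function. This is a hypothetical illustration, not the authors' processing code: the function name, argument layout and units (loads in kN, per-level settlements in mm, diameter in mm) are assumptions made for the sketch.

```python
def estimate_failure_load(loads, settlements, diameter_mm):
    """Apply the two failure criteria to a load-settlement record (sketch).

    loads: applied load at each level (kN)
    settlements: incremental pile-top settlement at each level (mm)
    diameter_mm: pile diameter D (mm)
    """
    # Criterion (i): settlement at the current level is >= 5x the previous level
    for i in range(1, len(settlements)):
        if settlements[i - 1] > 0 and settlements[i] >= 5 * settlements[i - 1]:
            return loads[i]
    # Criterion (ii): load at which cumulative settlement exceeds 10% of D
    total = 0.0
    for load, s in zip(loads, settlements):
        total += s
        if total > 0.10 * diameter_mm:
            return load
    return None  # no failure observed within the applied load range
```

Criterion (i) is checked first, mirroring the order stated in the text; criterion (ii) only applies when the load-settlement curve stays nearly linear.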

Fig 1.

Fig 1

(a) Experimental location (*); (b) experimental layout. (*) Source: CIA Maps.

3.2. Data preparation

The primary goal of the development of the ML algorithms is to estimate the axial bearing capacity of the pile accurately. Therefore, as a first attempt, all the known factors affecting the pile bearing capacity were considered. Besides, it was found that most traditional approaches have used three groups of parameters: the pile geometry, the pile constituent material properties, and the soil properties [7–14]. It is worth noticing that the depth of the water table was not considered, since this effect has been shown to be already accounted for in the SPT blow counts [59]. The bearing capacity of the piles was predicted based on the soil properties, determined through the SPT blow counts (N) along the embedded length of the pile. In this study, the average number of SPT blows along the pile shaft (Nsh) and at the tip (Nt) was used. In addition, according to Meyerhof's recommendation (1976) [7], the average SPT value (Nt) over 8D above and 3D below the pile tip was utilized, where D represents the pile diameter.

Consequently, the input variables in this work were: (1) pile diameter (D); (2) thickness of the first soil layer in which the pile is embedded (Z1); (3) thickness of the second soil layer (Z2); (4) thickness of the third soil layer (Z3); (5) elevation of the natural ground (Zg); (6) elevation of the pile top (Zp); (7) elevation of the extra segment pile top (Zt); (8) depth of the pile tip (Zm); (9) the average SPT blow count along the pile shaft (Nsh); and (10) the average SPT blow count at the pile tip (Nt). The axial pile bearing capacity was considered as the single output (Pu). For illustration purposes, a diagram of the soil stratigraphy and the input and output parameters is depicted in Fig 2.

Fig 2. Diagram for stratigraphy and pile parameters.

Fig 2

The dataset containing 472 samples is statistically summarized in Table 1, including the min, max, average and standard deviation of the input and output variables. As shown in Table 1, the pile diameter (D) ranged from 300 to 400 mm. The thickness of the first soil layer in which the pile is embedded (Z1) ranged from 3.4 m to 5.7 m. The thickness of the second soil layer (Z2) varied from 1.5 m to 8 m. The thickness of the third soil layer (Z3) ranged from 0 m to 1.7 m, where a value of 0 means that the pile was not embedded in this layer. Besides, the elevation of the pile top (Zp) varied from 0.7 m to 3.4 m. The elevation of the natural ground (Zg) ranged from 3.0 m to 4.1 m. The elevation of the extra segment pile top (Zt) varied from 1.0 m to 7.1 m. The depth of the pile tip (Zm) ranged from 8.3 m to 16.1 m. The average SPT blow count along the pile shaft (Nsh) ranged from 5.6 to 15.4. The average SPT blow count at the pile tip (Nt) ranged from 4.4 to 7.8. The axial bearing capacity of the piles (Pu) ranged from 407.2 kN to 1551 kN, with a mean value of 955.3 kN and a standard deviation of 355.4 kN. Besides, the histograms of all the input and output variables are shown in Fig 3. An example of 100 data samples is given in the appendix (S1 Appendix).

Table 1. Inputs and output of the present study.

No. D Z1 Z2 Z3 Zp Zg Zt Zm Nsh Nt Pu
Unit mm m m m m m m m - - kN
1 400 4.35 8 0.95 2.05 3.41 2.06 15.35 13.3 7.6 1110.6
2 300 3.4 5.25 0 3.4 3.47 3.42 12.05 8.65 6.75 610.7
3 400 4.35 8 1.06 2.05 3.56 2.1 15.46 13.41 7.66 1224.8
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
470 300 3.4 5.2 0 3.4 3.43 3.43 12 8.6 6.73 585.35
471 400 3.45 8 0.19 2.95 3.56 2.97 14.59 11.64 7.52 1318
472 400 3.45 8 0.27 2.95 3.63 2.96 14.67 11.72 7.57 1152
Min 300.0 3.4 1.5 0.0 0.7 3.0 1.0 8.3 5.6 4.4 407.2
Average 359.4 3.8 6.5 0.3 2.9 3.5 3.0 13.4 10.5 7.0 955.3
Max 400.0 5.7 8.0 1.7 3.4 4.1 7.1 16.1 15.4 7.8 1551.0
SD 49.2 0.5 1.6 0.4 0.6 0.1 0.6 1.8 2.2 0.6 355.4

SD = Standard deviation.

Fig 3. Histograms of the variables used in this study.

Fig 3

In this study, the collected dataset was divided into training, validation, and testing datasets. The training part (60% of the total data) was used to train the ML models. The validation part (20% of the total data) was used to estimate model skill and tune the models' hyperparameters, whereas the testing data (the remaining 20%), which were unknown during the training and validation phases, were used to evaluate the performance of the ML models.
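The 60/20/20 partition described above can be sketched with NumPy as follows. The function name and fixed seed are illustrative only; the paper reshuffles the data randomly over 1000 simulations rather than using a single fixed split.

```python
import numpy as np

def split_dataset(X, y, seed=0):
    """Shuffle and split into 60% train / 20% validation / 20% test (sketch)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))            # random shuffling of the samples
    n_train = int(0.6 * len(X))
    n_val = int(0.2 * len(X))
    tr, va, te = np.split(idx, [n_train, n_train + n_val])
    return (X[tr], y[tr]), (X[va], y[va]), (X[te], y[te])
```

For the 472-sample database this yields 283 training, 94 validation and 95 testing samples.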

4. Machine learning methods

4.1. Deep learning neural network (DLNN) with multi-layer perceptron

The multi-layer perceptron (MLP) is a kind of feedforward artificial neural network [60]. In general, an MLP includes at least three units, called layers: the input layer, the hidden layer, and the output layer. When there are two or more hidden layers, the multi-layer perceptron is often called a deep learning neural network (DLNN) [61,62]. In a DLNN, each node in a layer is connected, with a certain weight denoted wij, to every node in the adjacent layers, creating a fully linked neural system [63]. Except for the input layer, each node is a neuron that uses a non-linear activation function [64]. Besides, the MLP uses a supervised learning technique called backpropagation for the training process [64]. Thanks to its multiple layers and non-linear activation functions, a DLNN can distinguish data that are not linearly separable. Fig 4 shows the DLNN architecture used in this investigation, consisting of 10 inputs, three hidden layers and one output variable.

Fig 4. Illustration of the DLNN used in this study, including 10 inputs, three hidden layers, and one output variable.

Fig 4

A multi-layer perceptron having a linear activation function in all neurons represents a linear network that links the weighted inputs to the output. Using linear algebra, it can be shown that such a network, with any number of layers, reduces to a two-layer input-output model. Therefore, the use of non-linear activation functions in the DLNN is crucial to enhance the accuracy of the model and better mimic the working mechanism of biological neurons. Sigmoid functions are commonly adopted in DLNN networks, with two conventional activation functions given below:

$y(v_i) = \tanh(v_i)$ and $y(v_i) = (1 + e^{-v_i})^{-1}$ (1)

The first is a hyperbolic tangent, ranging from -1 to 1, whereas the second is a logistic function with a similar shape but ranging from 0 to 1. In these functions, y(vi) represents the output of the ith node, and vi is the weighted sum of the input connections. Besides, alternative activation functions, such as the rectifier, or more specialized functions, such as radial basis functions, have also been proposed.
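The activation functions of Eq (1), plus the rectifier mentioned above (one of the options tuned later in Table 3), can be written directly in NumPy. The final assertion also illustrates the point of the preceding paragraph: without a non-linearity, two stacked linear layers collapse to a single linear map.

```python
import numpy as np

def tanh_act(v):
    """Hyperbolic tangent activation, range (-1, 1)."""
    return np.tanh(v)

def logistic_act(v):
    """Logistic (sigmoid) activation, range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-v))

def relu_act(v):
    """Rectifier (ReLU), one of the activation options in Table 3."""
    return np.maximum(0.0, v)

# Two linear layers without activation reduce to one linear map:
rng = np.random.default_rng(0)
W1, W2, x = rng.normal(size=(5, 3)), rng.normal(size=(2, 5)), rng.normal(size=3)
assert np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x)
```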

The connection weights and biases are adjusted as a function of the error of the output compared with the target, which is how the learning process occurs. This can be considered an example of supervised learning using the least-mean-squares algorithm, generalized as the backpropagation algorithm. Precisely, the error at output node j for the nth data point is given by:

$e_j(n) = d_j(n) - y_j(n)$ (2)

where dj refers to the target value and yj denotes the value generated by the perceptron. The node weights are adjusted by error correction so as to minimize the total error of the predicted output:

$\varepsilon(n) = \frac{1}{2} \sum_j e_j^2(n)$ (3)

Furthermore, the following expression uses the gradient descent algorithm to calculate the change, or the correction, for each weight:

$\Delta\omega_{ji}(n) = -\eta \frac{\partial \varepsilon(n)}{\partial v_j(n)} y_i(n)$ (4)

where yi denotes the output of the previous neuron and η refers to the learning rate, which is chosen to ensure that the error converges quickly without oscillation. The derivative is calculated with respect to the induced local field vj, which can be expressed as:

$-\frac{\partial \varepsilon(n)}{\partial v_j(n)} = e_j(n)\,\phi'(v_j(n))$ (5)

where ϕ′ is the derivative of the activation function. For the change in weight associated with a hidden node, the relevant derivative is:

$-\frac{\partial \varepsilon(n)}{\partial v_j(n)} = \phi'(v_j(n)) \sum_k -\frac{\partial \varepsilon(n)}{\partial v_k(n)}\,\omega_{kj}(n)$ (6)

This expression depends on the weight changes of the nodes of the output layer (indexed by k). The algorithm thus reflects backpropagation: the output weights change according to the derivative of the activation function, and the weights of the hidden layers then change accordingly.
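Eqs (2) to (6) can be condensed into a single training step for a one-hidden-layer perceptron. This is a minimal sketch, not the authors' implementation: it assumes tanh hidden units, a linear output node, and omits biases for brevity.

```python
import numpy as np

def backprop_step(x, d, W1, W2, eta=0.1):
    """One gradient-descent weight update following Eqs (2)-(6) (sketch)."""
    # Forward pass
    v1 = W1 @ x            # induced local fields of the hidden layer
    y1 = np.tanh(v1)       # hidden activations
    y = W2 @ y1            # network output (linear output node)
    e = d - y              # Eq (2): output error
    # Backward pass: local gradients (negative derivatives of Eqs (5)-(6))
    delta2 = e                               # linear output node: phi' = 1
    delta1 = (1 - y1 ** 2) * (W2.T @ delta2) # Eq (6), tanh derivative 1 - y^2
    # Eq (4): weight corrections, scaled by the learning rate eta
    W2 = W2 + eta * np.outer(delta2, y1)
    W1 = W1 + eta * np.outer(delta1, x)
    return W1, W2
```

Repeated calls on a training sample drive the output error of Eq (2) toward zero.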

4.2. Genetic Algorithm (GA)

Holland was the first researcher to propose the genetic algorithm (GA), a stochastic search and optimization technique [65]. Later, the GA was investigated by other scientists, especially Deb et al. [66] and Houck et al. [67]. Generally, the GA is considered a simple solution for complex non-linear problems [68]. The method is based on the processes of mating and breeding within an initial population, along with operations such as selection, crossover, and mutation, which help create new, fitter individuals [69]. In the GA, the population size is an important factor reflecting the total number of solutions, and it significantly affects the results of the problem [70], whereas the so-called "generations" refer to the iterations of the optimization process. This process can be terminated by several selected stopping criteria [71].

Practically, the GA method has shown many benefits in finding an optimal resource set to optimize both cost and production [69]. In the field of construction, and especially in evaluating the load capacity of piles, many studies have used the GA method successfully and efficiently. As an example, Ardalan et al. [72] have used a GA combined with a neural network to predict the unit shaft resistance of driven piles from pile loading tests. In another study, 50 PDA (Pile Driving Analyzer) tests were conducted on pre-cast concrete piles to predict the pile bearing capacity, and the proposed hybrid method provided excellent results, with an R2 of 0.99 [71]. Moreover, other studies on the behavior of piles in soil have used the GA method, whose effectiveness has been clearly demonstrated [68,70,72–74].

In this work, the GA was used as an optimization technique to optimize the DLNN for predicting the bearing capacity of driven piles. The pseudocode is summarized below (Table 2):

Table 2. Pseudo algorithm of the GA algorithm used in this study.

FOR each chromosome i in the population
  FOR each gene j
    Initialize Gij randomly within a permissible range
  END FOR
END FOR
Generation k = 1
DO
  FOR each chromosome i in the population
    Calculate the fitness value of Gi
  END FOR
  Mate the best chromosomes to produce more children
  Mutate some children randomly to attempt to find even better candidates
  Remove the weakest chromosomes, based on fitness value, from the population
  k = k + 1
WHILE k <= maximum generation
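The pseudocode in Table 2 can be rendered as a compact Python sketch. The default population size, mating pool size and mutation rate follow the values later listed in Table 4; `fitness` (a cost to minimize) and `init_gene` (a draw of one random gene) are caller-supplied assumptions, not part of the paper.

```python
import random

def genetic_algorithm(fitness, init_gene, pop_size=25, n_genes=10,
                      generations=200, pool_size=10, mutation_rate=0.5):
    """Generic GA loop following the pseudocode in Table 2 (sketch)."""
    population = [[init_gene() for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)        # lower cost = fitter chromosome
        parents = population[:pool_size]    # mating pool of best chromosomes
        children = []
        while len(children) < pop_size - pool_size:
            pa, pb = random.sample(parents, 2)
            # Uniform crossover: each gene taken from either parent
            children.append([random.choice(pair) for pair in zip(pa, pb)])
        for child in children:
            if random.random() < mutation_rate:
                # Mutation: replace one random gene with a fresh random value
                child[random.randrange(n_genes)] = init_gene()
        population = parents + children     # weakest chromosomes are dropped
    return min(population, key=fitness)
```

For example, minimizing `sum` over binary genes drives the chromosome toward all zeros.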

4.3. Features selection with GA

It is well known that the training process of a DLNN is time-consuming and costly due to the computational resources required [75,76]. In addition, some features in the dataset might affect the regression results, and unnecessary features might generate noise and reduce prediction accuracy [77]. Exhaustive selection of appropriate features requires considerable effort: for a dataset containing 10 features, the number of possible subsets is the sum of the combinations C(10, i) for i from 1 to 10, i.e., 1023. In order to facilitate the feature selection process, the GA was used to choose the appropriate features within the dataset, expecting that fewer input variables could enhance the prediction accuracy of the GA-DLNN. The detailed selection mechanism is summarized in the following parts.

Firstly, genes inside the chromosome should be selected. In this study, each feature affecting the pile bearing capacity is considered as a gene. As a result, the length of the chromosome is 10, corresponding to 10 features, or 10 genes (Fig 5).

Fig 5. Chromosome representation for feature selection.

Fig 5

Within the chromosome, each gene takes a binary value: 1 when the corresponding feature is selected and 0 otherwise [78]. Next, to create the population, the original chromosomes are randomly generated [78]. After that, several parents are chosen for mating to create offspring chromosomes, based on the fitness value associated with each solution (i.e., chromosome). The fitness value is calculated using a fitness function; support vector regression (SVR) was chosen as the fitness function for this investigation. The regression model is trained on the training dataset and evaluated on the validation (or testing) dataset, with the mean absolute error (MAE) used as the cost function to evaluate the accuracy of the fitness function. A lower fitness value indicates a better solution. Based on the fitness function, the "parents" are filtered from the current population. The premise of the GA lies in the hypothesis that mating two good solutions can produce an even better one [79]. Children randomly inherit genes from their parents, and mutations are then applied to introduce new genes in the next generation.
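The fitness evaluation of one binary chromosome can be sketched as below. To keep the sketch dependency-free, a linear least-squares fit stands in for the SVR used by the authors; the MAE cost matches Table 4, and the function name and signature are assumptions.

```python
import numpy as np

def mask_fitness(mask, X_train, y_train, X_val, y_val):
    """MAE of a regressor trained on the feature subset encoded by the
    binary chromosome `mask` (sketch; least squares stands in for SVR)."""
    cols = np.flatnonzero(mask)
    if cols.size == 0:
        return np.inf                       # empty feature subset: invalid
    A = np.c_[X_train[:, cols], np.ones(len(X_train))]   # add an intercept
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    pred = np.c_[X_val[:, cols], np.ones(len(X_val))] @ coef
    return np.mean(np.abs(pred - y_val))    # MAE cost function (Table 4)
```

The GA then favors chromosomes whose selected columns yield the lowest validation MAE.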

4.4. Evolution of DLNN parameters using GA and parameters tuning process

It is universally challenging to find an optimal neural network architecture, and this problematic task has been the subject of broad, continuous discussion and intense research. To date, no universal rules exist for defining the proper number of hidden layers, the number of neurons in each hidden layer, or the functions connecting the neurons. Considering that in the DLNN algorithm various possibilities can be assembled into the final network structure, an exhaustive selection process becomes unachievable. To overcome this problem, the GA can be used to find the best DLNN architecture automatically. The mechanism can be summarized as follows.

Firstly, the genes inside the chromosome are determined. Four parameters are selected for investigation: (i) the network optimizer algorithm, (ii) the activation function of the hidden layers, (iii) the number of hidden layers, and (iv) the number of neurons in each hidden layer. As the number of neurons differs between hidden layers, more genes are required: each such gene contains the number of neurons in one hidden layer. Considering that the maximum number of hidden layers is P2, the maximum length of the chromosome is L = (3 + P2). In particular, the first three genes refer to the first three parameters of the model presented above. It is worth noticing that, in this case, each chromosome has a different length, depending on the corresponding number of hidden layers. The parameters used for the DLNN architecture are depicted in Fig 6: the network optimizer algorithm (P0), the activation function of the hidden layers (P1), the number of hidden layers (P2), and the number of neurons in each hidden layer (P3…PL).

Fig 6. Chromosome representation of the parameters selection process.

Fig 6

The fitness function is the DLNN model itself, evaluated with four cost functions, namely R2, IA, MAE, and RMSE. Detailed descriptions of these criteria are given in the next section. Given that the lengths of the chromosomes might differ, the mating process follows these principles:

  1. If the parents' chromosomes have the same length, the child randomly selects the number of hidden layers and the number of neurons from the father or the mother.

  2. If the parents' chromosomes have different lengths, two cases can occur. In the first case, the child takes the number of hidden layers from the parent with fewer genes; the genes are then selected randomly from the parents. In the second case, the child takes the number of hidden layers from the parent with more genes; the genes missing from the shorter parent can only be taken from the parent with the longer chromosome, and the other genes are taken randomly from either parent. The mating process is highlighted in Fig 7.
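The mating rules above can be sketched as a single crossover function. The gene layout follows Fig 6 (genes 0 to 2 hold P0 to P2, the rest hold neurons per hidden layer); the function name and the example gene values are illustrative assumptions.

```python
import random

def mate(father, mother):
    """Crossover for chromosomes of possibly different lengths (sketch)."""
    short, long_ = sorted([father, mother], key=len)
    # The child inherits the hidden-layer count of one parent at random
    n_layers = random.choice([len(short), len(long_)]) - 3
    # Shared gene positions: picked randomly from either parent
    child = [random.choice(pair) for pair in zip(short, long_)]
    if n_layers > len(short) - 3:
        # Genes missing from the shorter parent can only come from the longer
        child += long_[len(short):]
    child[2] = n_layers  # keep the layer-count gene (P2) consistent
    return child
```

The resulting chromosome always satisfies the length rule L = 3 + P2.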

Fig 7. The mating process with different chromosome lengths.

Fig 7

During the mutation process, a few children are selected, and in each a random gene is replaced with another random value within a given range. In particular, since the DLNN model has many parameters, the mutation rate is set at 50% of the children born in order to maximize the chance of finding the best genes. Finally, the parameters of the DLNN were fine-tuned by the GA through successive generations to find the best prediction performance. Table 3 summarizes the tuned parameters and their tuning ranges and options.

Table 3. Parameters of DLNN and their tuning ranges/options to be optimized by GA.

No. Parameter Explanation Range/Option
1 P0 Network optimizer algorithm Quasi-Newton, Stochastic gradient descent, Adam
2 P1 Activation function of hidden layers Identity, Logistic, Tanh, Relu
3 P2 Number of hidden layers 2–10
4 P3 Number of neurons in hidden layer 1 2–80
5 P4 Number of neurons in hidden layer 2 2–80
. . . .
. . . .
L-1 PL-1 Number of neurons in hidden layer (L-3) 2–80
L PL Number of neurons in hidden layer (L-2) 2–80

L = Length of the chromosome, L = (3 + P2).

4.5. Performance evaluation

In order to verify the effectiveness and performance of the ML algorithms, four different criteria were selected in this study, namely the root mean square error (RMSE), mean absolute error (MAE), coefficient of determination (R2), and Willmott's index of agreement (IA). The RMSE is the square root of the mean squared difference between the predicted outputs and the targets, whereas the MAE is the mean magnitude of the errors. For both RMSE and MAE, the closer the value to 0, the better the performance of the model. The criterion R2 measures the correlation between targets and outputs [80]. The values of R2 lie in the range (−∞, 1], where higher accuracy is obtained for values close to 1. The Index of Agreement (IA) was introduced by Willmott [81,82] and expresses the ratio of the mean square error to the potential error. The values of IA vary between 0 and 1, where 1 indicates perfect agreement and 0 indicates no agreement. These coefficients can be calculated using the following formulas [83,84]:

$\mathrm{MAE} = \frac{1}{k} \sum_{i=1}^{k} \left| v_i - \bar{v}_i \right|$ (7)

$\mathrm{RMSE} = \sqrt{\frac{1}{k} \sum_{i=1}^{k} \left( v_i - \bar{v}_i \right)^2}$ (8)

$R^2 = 1 - \frac{\sum_{i=1}^{k} \left( v_i - \bar{v}_i \right)^2}{\sum_{i=1}^{k} \left( v_i - \bar{v} \right)^2}$ (9)

$\mathrm{IA} = 1 - \frac{\sum_{i=1}^{k} \left( v_i - \bar{v}_i \right)^2}{\sum_{i=1}^{k} \left( \left| \bar{v}_i - \bar{v} \right| + \left| v_i - \bar{v} \right| \right)^2}$ (10)

where k is the number of samples, $v_i$ and $\bar{v}_i$ are the actual and predicted outputs, respectively, and $\bar{v}$ is the average of the actual values $v_i$.
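Eqs (7) to (10) translate directly into NumPy; in this sketch `v` is the vector of measured capacities and `v_hat` the model predictions (argument names are illustrative).

```python
import numpy as np

def mae(v, v_hat):
    """Eq (7): mean absolute error."""
    return np.mean(np.abs(v - v_hat))

def rmse(v, v_hat):
    """Eq (8): root mean squared error."""
    return np.sqrt(np.mean((v - v_hat) ** 2))

def r2(v, v_hat):
    """Eq (9): coefficient of determination."""
    return 1 - np.sum((v - v_hat) ** 2) / np.sum((v - v.mean()) ** 2)

def ia(v, v_hat):
    """Eq (10): Willmott's index of agreement."""
    num = np.sum((v - v_hat) ** 2)
    den = np.sum((np.abs(v_hat - v.mean()) + np.abs(v - v.mean())) ** 2)
    return 1 - num / den
```

A perfect prediction gives MAE = RMSE = 0 and R2 = IA = 1.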

5. Results and discussion

5.1. Feature selection

The results of the feature selection process using the GA model are presented in this section. The initialization parameters of the GA used in this study are given in Table 4. Fig 8 illustrates the evolution of the MAE value using the GA over 200 generations. It can be seen that the MAE value decreased progressively with the GA generations. The lowest MAE was 116.91 kN at the first generation and decreased to 95.54 kN at the 87th generation; this value remained unchanged from the 87th to the 200th generation. The optimum feature-selection chromosome was [0, 1, 1, 0, 0, 1, 0, 0, 0, 1]. This result suggested a new, more compact dataset, corresponding to [Z1, Z2, Zg, Nt]. Therefore, the compact dataset included 4 input variables, and the input space was reduced by 6 variables compared with the original dataset.
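For illustration, the reported optimum chromosome can be decoded into feature names using the column order of Table 1 (a small snippet that only reproduces the mapping stated above):

```python
# Column order of Table 1; a gene of 1 selects the corresponding feature
FEATURES = ["D", "Z1", "Z2", "Z3", "Zp", "Zg", "Zt", "Zm", "Nsh", "Nt"]
best = [0, 1, 1, 0, 0, 1, 0, 0, 0, 1]
selected = [f for f, bit in zip(FEATURES, best) if bit == 1]
print(selected)  # ['Z1', 'Z2', 'Zg', 'Nt']
```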

Table 4. GA feature selection initialization parameters.

Parameters Value and Description
Number of population 25
Number of generation 200
Mating pool size 10
Mutation rate 0.5
Fitness function Support Vector Regression (SVR)
Cost function MAE
Data used Training/ Validation dataset

Fig 8. Feature selection using the GA model.

Fig 8

5.2. Optimization of DLNN architecture

The evolutionary results of the GA-DLNN model in predicting the pile bearing capacity are evaluated in this section. The initialization parameters of GA-DLNN used in this study are given in Table 5. Fig 9 illustrates the evolution of the GA-DLNN model through 200 generations with 4 and 10 input variables. A summary of the best predictive capability of the models is presented in Table 6. For the sake of comparison, and to highlight the performance of the reduced input space, three different scenarios were considered. The first used the 4-input space with GA-DLNN, denoted the 4-input GA-DLNN model. The second used the initial input space with GA-DLNN, denoted the 10-input GA-DLNN model. The last used the 4 input variables but with plain DLNN as the predictor, denoted the 4-input DLNN model.

Table 5. GA-DLNN initialization parameters.

Parameters Value and Description
Number of population 25
Number of generation 200
Mating pool size 24
Mutation rate 0.5
Fitness function DLNN
Cost function R2, MAE, RMSE, IA
Data used Training/ Validation dataset

Fig 9. Parameter tuning using the GA-DLNN model with 4 and 10 inputs.


Table 6. Summary of best prediction capability of models.

Model R2 MAE (kN) RMSE (kN) IA Normalized time
4-input GA-DLNN 0.923 75.927 95.118 0.981 0.7
10-input GA-DLNN 0.918 75.838 97.092 0.980 1.0
4-input DLNN 0.858 90.785 123.788 0.967 -

It can be seen that the 4-input GA-DLNN model achieved the best accuracy: its best generation yielded R2 = 0.923, MAE = 75.927 kN, RMSE = 95.118 kN and IA = 0.981. By comparison, the 4-input DLNN model, corresponding to the first generation, produced only intermediate precision (R2 = 0.858, MAE = 90.785 kN, RMSE = 123.788 kN and IA = 0.967).

The results also show that the 4-input GA-DLNN model performs slightly better than the 10-input GA-DLNN model, whose most efficient generation yielded R2 = 0.918, MAE = 75.838 kN, RMSE = 97.092 kN and IA = 0.980. Moreover, the analysis time over 200 generations is much lower for the 4-input model than for the 10-input model, with normalized times of 0.7 and 1.0, respectively.
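
Of the four criteria used throughout these tables, the Index of Agreement is Willmott's IA [81, 82], which is bounded in [0, 1] with 1 indicating perfect agreement. A compact NumPy implementation of the standard formula is:

```python
import numpy as np

def index_of_agreement(y_true, y_pred):
    """Willmott's Index of Agreement: 1 - sum((O-P)^2) / sum((|P-Obar| + |O-Obar|)^2)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    num = np.sum((y_true - y_pred) ** 2)
    mean = y_true.mean()
    den = np.sum((np.abs(y_pred - mean) + np.abs(y_true - mean)) ** 2)
    return 1.0 - num / den
```

A perfect prediction gives IA = 1.0, and values near the 0.98 reported for the GA-DLNN models indicate predictions closely tracking the measured capacities.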

The optimum parameters of the models are shown in Table 7. All three models selected the same network optimization algorithm (Quasi-Newton); the number of hidden layers ranges from 2 to 4, and the number of neurons per hidden layer varies widely, from 9 to 80. The selected activation function differs between models: the 4-input GA-DLNN model chose the logistic function, while the other two chose relu.

Table 7. The optimum parameter of models.

Parameter 4-input GA-DLNN 10-input GA-DLNN 4-input DLNN
Network optimizer algorithm Quasi-Newton Quasi-Newton Quasi-Newton
Activation function of hidden layers logistic relu relu
Number of hidden layers 2 4 3
Number neurons in each hidden layer (33, 80) (74, 17, 24, 12) (9, 50, 29)

5.3. Predictive capability of the models

Fig 10 shows a visual comparison of measured Pu values and the predictions of representative ML models. The performance of the ML models was tested on all three datasets: training, validation and testing. Two representative GA-DLNN models were selected based on the best performance reached during model evolution (Fig 9), corresponding to 4 and 10 input variables. A 4-input DLNN model, having the best fitness value in the first generation, was chosen for comparison with the two optimal models to demonstrate the effectiveness of model evolution. The predictive capability of the models is also summarized in Table 8.

Fig 10. Measured and predicted values of axial bearing capacity of pile using the models: 4-input GA-DLNN model for training (a), validation (b), testing dataset (c); 10-input GA-DLNN model for training (d), validation (e), testing dataset (f); 4-input DLNN model for training (g), validation (h), testing dataset (i).

Table 8. Predictive capability of the models.

Dataset Cost function 4-input GA-DLNN 10-input GA-DLNN 4-input DLNN
Training R2 0.944 0.927 0.910
MAE 64.235 72.744 75.929
RMSE 83.593 94.873 105.884
IA 0.985 0.981 0.976
Validation R2 0.923 0.918 0.858
MAE 75.927 75.838 90.785
RMSE 95.118 97.092 123.788
IA 0.981 0.980 0.967
Testing R2 0.887 0.844 0.809
MAE 86.573 93.074 92.867
RMSE 110.176 132.490 142.896
IA 0.969 0.956 0.947

From a statistical standpoint, the performance of ML algorithms should be fully evaluated. As mentioned above, 60% of the dataset was randomly selected to train the ML models, and the performance of such models can be affected by the random selection of the training data. Therefore, a total of 1000 simulations were performed next, taking into account the effect of random splitting of the dataset. The results are shown in Fig 11 and Tables 9–12. It can be seen that the performance of the 4-input GA-DLNN model improved after tuning the parameters of the DLNN model and outperformed the best model of the first generation (4-input DLNN): on the training set, the average R2 value increased from 0.919 to 0.932; on the validation set, from 0.884 to 0.899; and, most markedly, on the testing set, from 0.777 to 0.882. Compared with the 10-input GA-DLNN model, the R2 values are similar in training and validation, and the main difference appears on the testing set, where the 4-input GA-DLNN model gives better results (R2 = 0.882) than the 10-input GA-DLNN model (R2 = 0.800). On the testing set, the SD of the 4-input GA-DLNN model is the smallest (SD = 0.0082), compared with the 10-input GA-DLNN and 4-input DLNN models (SD = 0.0351 and 0.0718, respectively), indicating that the 4-input GA-DLNN model is more stable.
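
The repeated random-split experiment can be sketched as below. This is an assumed reconstruction: the loop reuses the best 4-input architecture from Table 7 (here via scikit-learn's MLPRegressor with "lbfgs" standing in for Quasi-Newton), and the seeding scheme and 60/20/20 partitioning are illustrative choices rather than the paper's exact procedure.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score

def repeated_split_r2(X, y, n_runs=1000, hidden=(33, 80)):
    """Distribution of test-set R2 over n_runs random 60/20/20 train/val/test splits."""
    scores = []
    for seed in range(n_runs):
        # 60% training, 40% held out
        X_tr, X_rest, y_tr, y_rest = train_test_split(
            X, y, train_size=0.6, random_state=seed)
        # split the remainder into 20% validation and 20% testing
        X_val, X_te, y_val, y_te = train_test_split(
            X_rest, y_rest, test_size=0.5, random_state=seed)
        model = MLPRegressor(solver="lbfgs", activation="logistic",
                             hidden_layer_sizes=hidden, max_iter=300)
        model.fit(X_tr, y_tr)
        scores.append(r2_score(y_te, model.predict(X_te)))
    s = np.asarray(scores)
    return {"mean": s.mean(), "min": s.min(), "max": s.max(), "sd": s.std()}
```

The four summary statistics returned here correspond to the Average, Min, Max and SD columns of Tables 9–12.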

Fig 11. Predictive capability of the models with 1000 simulations.


Table 9. Summary of the 1000 simulations using R2 criteria.

Model Dataset Average Min Max SD
4-input GA-DLNN Training 0.932 0.917 0.945 0.0073
Validation 0.899 0.868 0.934 0.0138
Testing 0.882 0.831 0.905 0.0082
4-input DLNN Training 0.919 0.905 0.928 0.0038
Validation 0.884 0.872 0.905 0.0040
Testing 0.777 0.514 0.902 0.0718
10-input GA-DLNN Training 0.924 0.907 0.932 0.0052
Validation 0.918 0.895 0.930 0.0054
Testing 0.800 0.671 0.890 0.0351

Table 12. Summary of the 1000 simulations using MAE criteria.

Model Dataset Average Min Max SD
4-input GA-DLNN Training 68.211 61.977 72.091 1.7106
Validation 85.937 72.163 95.921 3.3777
Testing 87.459 78.075 95.510 2.2845
4-input DLNN Training 73.960 66.832 78.319 2.2208
Validation 91.629 80.999 98.967 2.9166
Testing 96.997 79.877 127.320 5.6667
10-input GA-DLNN Training 69.458 64.106 76.256 1.6571
Validation 81.074 74.631 93.133 2.8590
Testing 93.507 83.376 105.197 3.1047

Table 10. Summary of the 1000 simulations using IA criteria.

Model Dataset Average Min Max SD
4-input GA-DLNN Training 0.982 0.978 0.986 0.0020
Validation 0.973 0.964 0.982 0.0038
Testing 0.969 0.955 0.974 0.0021
4-input DLNN Training 0.978 0.975 0.981 0.0010
Validation 0.968 0.964 0.973 0.0012
Testing 0.941 0.676 0.975 0.0215
10-input GA-DLNN Training 0.983 0.979 0.986 0.0009
Validation 0.978 0.970 0.981 0.0017
Testing 0.949 0.919 0.970 0.0088

Table 11. Summary of the 1000 simulations using RMSE criteria.

Model Dataset Average Min Max SD
4-input GA-DLNN Training 91.537 82.268 101.353 4.9118
Validation 113.764 91.962 130.376 7.8739
Testing 109.965 98.902 131.414 3.7948
4-input DLNN Training 100.444 95.014 106.750 2.3025
Validation 122.405 112.008 128.483 2.0521
Testing 150.528 99.600 299.543 27.3845
10-input GA-DLNN Training 89.901 82.692 98.562 2.3062
Validation 102.720 95.497 118.908 3.4459
Testing 143.002 107.493 180.817 12.7996

Table 13 presents selected research results on ML applications in foundation engineering. The results of this study, as well as those of previous studies, show that ML techniques predicting foundation load mostly reach R2 values of 0.8 to 0.9 on the testing dataset. However, because these studies use different datasets, a direct comparison between their results is unwarranted. A study spanning multiple datasets would be needed to produce a generalized model for foundation engineering.

Table 13. Comparison with other studies.

Author Model Foundation type Number of samples R2 RMSE
Momeni et al. [56] ANFIS Thin-walls 150 0.875 0.048
ANN 0.71 0.529
Momeni et al. [85] GPR Piles 296 0.84 -
Kulkarni et al. [86] GA-ANN Rock-socketed piles 132 0.86 0.0093
Jahed Armaghani et al. [87] ANN 0.808 0.135
PSO-ANN 0.918 0.063
The present study GA-DLNN Piles 472 0.882 109.965

6. Conclusions

The main achievement of this study is an efficient GA-DLNN hybrid model for predicting pile load capacity. The model has the ability to self-evolve to find the optimal network structure: the optimal number of hidden layers is treated as a variable and discovered during the model's evolution, in addition to the other important parameters. Furthermore, an evolutionary model was developed to reduce the number of input variables while preserving the accuracy of the regression results.

The results showed that, on the training dataset, all three models (4-input GA-DLNN, 10-input GA-DLNN and 4-input DLNN) predicted well, with the 4-input GA-DLNN model leading. On the validation dataset, the 4-input GA-DLNN model gave results similar to the 10-input GA-DLNN model and outperformed the 4-input DLNN model with satisfactory accuracy (R2 = 0.923, MAE = 75.927 kN, RMSE = 95.118 kN, IA = 0.981 for 4-input GA-DLNN, compared with R2 = 0.918, MAE = 75.838 kN, RMSE = 97.092 kN, IA = 0.980 for 10-input GA-DLNN and R2 = 0.858, MAE = 90.785 kN, RMSE = 123.788 kN, IA = 0.967 for 4-input DLNN). Meanwhile, the time cost of the 4-input GA-DLNN model is much lower than that of the 10-input GA-DLNN hybrid model (normalized times of 0.7 and 1.0, respectively). On the testing data, the predictive capability of the 4-input GA-DLNN model proved superior to the other two models: over 1000 simulations, the average R2 values were 0.882, 0.800 and 0.777 for the 4-input GA-DLNN, 10-input GA-DLNN and 4-input DLNN models, respectively. In addition, the oscillation range (minimum to maximum) of the R2 value of the 4-input GA-DLNN model is smaller than that of the other two models, indicating the model's stability.

The research also shows that the best results were obtained by GA-DLNN with 2 to 4 hidden layers. The numbers of neurons in the hidden layers differ between models and are distributed in a complex way across the layers. This suggests that a DLNN model with 2, 3 or 4 hidden layers may be optimal for predicting the bearing capacity of driven piles, but the number of neurons in each hidden layer should be selected by evolutionary methods to obtain high performance from the DLNN model. The results obtained from the evolution of the DLNN model by GA show that the activation function of the hidden layers is mainly one of two types, relu or logistic, and that the Quasi-Newton optimization algorithm is the most suitable for predicting the bearing capacity of piles.

Supporting information

S1 File

(CSV)

S1 Appendix

(DOCX)

Data Availability

All relevant data are within the paper and supporting information files.

Funding Statement

The authors received no specific funding for this work.

References

  • 1.Drusa M., Gago F., Vlček J., Contribution to Estimating Bearing Capacity of Pile in Clayey Soils, Civil and Environmental Engineering. 12 (2016) 128–136. 10.1515/cee-2016-0018. [DOI] [Google Scholar]
  • 2.Shooshpasha I., Hasanzadeh A., Taghavi A., Prediction of the axial bearing capacity of piles by SPT-based and numerical design methods, International Journal of GEOMATE. 4 (2013) 560–564. [Google Scholar]
  • 3.Birid K.C., Evaluation of Ultimate Pile Compression Capacity from Static Pile Load Test Results, in: Abu-Farsakh M., Alshibli K., Puppala A. (Eds.), Advances in Analysis and Design of Deep Foundations, Springer International Publishing, Cham, 2018: pp. 1–14. 10.1007/978-3-319-61642-1_1. [DOI] [Google Scholar]
  • 4.Kozłowski W., Niemczynski D., Methods for Estimating the Load Bearing Capacity of Pile Foundation Using the Results of Penetration Tests—Case Study of Road Viaduct Foundation, Procedia Engineering. 161 (2016) 1001–1006. 10.1016/j.proeng.2016.08.839. [DOI] [Google Scholar]
  • 5.Bond A.J., Schuppener B., Scarpelli G., Orr T.L.L., Dimova S., Nikolova B., et al. , European Commission, Joint Research Centre, Institute for the Protection and the Security of the Citizen, Eurocode 7: geotechnical design worked examples., Publications Office, Luxembourg, 2013. http://dx.publications.europa.eu/10.2788/3398 (accessed December 12, 2019). [Google Scholar]
  • 6.Bouafia A., Derbala A., Assessment of SPT-based method of pile bearing capacity–analysis of a database, in: Proceedings of the International Workshop on Foundation Design Codes and Soil Investigation in View of International Harmonization and Performance-Based Design, 2002: pp. 369–374. [Google Scholar]
  • 7.Meyerhof G.G., Bearing Capacity and Settlement of Pile Foundations, Journal of the Geotechnical Engineering Division. 102 (1976) 197–228. [Google Scholar]
  • 8.Bazaraa A.R., Kurkur M.M., N-values used to predict settlements of piles in Egypt, in: Use of In Situ Tests in Geotechnical Engineering, ASCE, 1986: pp. 462–474. [Google Scholar]
  • 9.Robert Y., A few comments on pile design, Can. Geotech. J. 34 (1997) 560–567. 10.1139/t97-024. [DOI] [Google Scholar]
  • 10.Shioi Y and Fukui J, Application of N-value to design of foundations in Japan, Proceeding of the 2nd ESOPT. (1982) 159–164. [Google Scholar]
  • 11.Shariatmadari N., ESLAMI A.A., KARIM P.F.M., Bearing capacity of driven piles in sands from SPT–applied to 60 case histories, Iranian Journal of Science and Technology Transaction B—Engineering. 32 (2008) 125–140. [Google Scholar]
  • 12.Lopes F.R and Laprovitera H, On the prediction of the bearing capacity of bored piles from dynamic penetration tests., Proceedings of Deep Foundations on Bored and Au-Ger Piles BAP’88, Van Impe (Ed). (1988) 537–540. [Google Scholar]
  • 13.Decourt L., Prediction of load-settlement relationships for foundations on the basis of the SPT, Ciclo de Conferencias Internationale, Leonardo Zeevaert, UNAM, Mexico. (1985) 85–104. [Google Scholar]
  • 14.Architectural Institute of Japan, (AIJ), Recommendations for Design of building foundation, Architectural Institute of Japan, Tokyo, 2004.
  • 15.Pham T.A., Ly H.-B., Tran V.Q., Giap L.V., Vu H.-L.T., Duong H.-A.T., Prediction of Pile Axial Bearing Capacity Using Artificial Neural Network and Random Forest, Applied Sciences. 10 (2020) 1871 10.3390/app10051871. [DOI] [Google Scholar]
  • 16.Pham B.T., Nguyen M.D., Dao D.V., Prakash I., Ly H.-B., Le T.-T., et al. , Development of artificial intelligence models for the prediction of Compression Coefficient of soil: An application of Monte Carlo sensitivity analysis, Science of The Total Environment. 679 (2019) 172–184. 10.1016/j.scitotenv.2019.05.061 [DOI] [PubMed] [Google Scholar]
  • 17.Asteris P.G., Ashrafian A., Rezaie-Balf M., Prediction of the compressive strength of self-compacting concrete using surrogate models, 1. 24 (2019) 137–150. [Google Scholar]
  • 18.Asteris P.G., Moropoulou A., Skentou A.D., Apostolopoulou M., Mohebkhah A., Cavaleri L., et al. , Stochastic Vulnerability Assessment of Masonry Structures: Concepts, Modeling and Restoration Aspects, Applied Sciences. 9 (2019) 243. [Google Scholar]
  • 19.Hajihassani M., Abdullah S.S., Asteris P.G., Armaghani D.J., A gene expression programming model for predicting tunnel convergence, Applied Sciences. 9 (2019) 4650. [Google Scholar]
  • 20.Huang L., Asteris P.G., Koopialipoor M., Armaghani D.J., Tahir M.M., Invasive Weed Optimization Technique-Based ANN to the Prediction of Rock Tensile Strength, Applied Sciences. 9 (2019) 5372. [Google Scholar]
  • 21.Le L.M., Ly H.-B., Pham B.T., Le V.M., Pham T.A., Nguyen D.-H., et al. , Hybrid Artificial Intelligence Approaches for Predicting Buckling Damage of Steel Columns Under Axial Compression, Materials. 12 (2019) 1670 10.3390/ma12101670 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Ly H.-B., Pham B.T., Dao D.V., Le V.M., Le L.M., Le T.-T., Improvement of ANFIS Model for Prediction of Compressive Strength of Manufactured Sand Concrete, Applied Sciences. 9 (2019) 3841 10.3390/app9183841. [DOI] [Google Scholar]
  • 23.Ly H.-B., Le L.M., Duong H.T., Nguyen T.C., Pham T.A., Le T.-T., et al. , Hybrid Artificial Intelligence Approaches for Predicting Critical Buckling Load of Structural Members under Compression Considering the Influence of Initial Geometric Imperfections, Applied Sciences. 9 (2019) 2258 10.3390/app9112258. [DOI] [Google Scholar]
  • 24.Ly H.-B., Le T.-T., Le L.M., Tran V.Q., Le V.M., Vu H.-L.T., et al. , Development of Hybrid Machine Learning Models for Predicting the Critical Buckling Load of I-Shaped Cellular Beams, Applied Sciences. 9 (2019) 5458 10.3390/app9245458. [DOI] [Google Scholar]
  • 25.Ly H.-B., Le L.M., Phi L.V., Phan V.-H., Tran V.Q., Pham B.T., et al. , Development of an AI Model to Measure Traffic Air Pollution from Multisensor and Weather Data, Sensors. 19 (2019) 4941 10.3390/s19224941 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Xu H., Zhou J., Asteris P. G., Jahed Armaghani D., Tahir M.M., Supervised Machine Learning Techniques to the Prediction of Tunnel Boring Machine Penetration Rate, Applied Sciences. 9 (2019) 3715 10.3390/app9183715. [DOI] [Google Scholar]
  • 27.Chen H., Asteris P.G., Jahed Armaghani D., Gordan B., Pham B.T., Assessing Dynamic Conditions of the Retaining Wall: Developing Two Hybrid Intelligent Models, Applied Sciences. 9 (2019) 1042 10.3390/app9061042. [DOI] [Google Scholar]
  • 28.Nguyen H.-L., Le T.-H., Pham C.-T., Le T.-T., Ho L.S., Le V.M., et al. , Development of Hybrid Artificial Intelligence Approaches and a Support Vector Machine Algorithm for Predicting the Marshall Parameters of Stone Matrix Asphalt, Applied Sciences. 9 (2019) 3172 10.3390/app9153172. [DOI] [Google Scholar]
  • 29.Nguyen H.-L., Pham B.T., Son L.H., Thang N.T., Ly H.-B., Le T.-T., et al. , Adaptive Network Based Fuzzy Inference System with Meta-Heuristic Optimizations for International Roughness Index Prediction, Applied Sciences. 9 (2019) 4715 10.3390/app9214715. [DOI] [Google Scholar]
  • 30.Asteris P.G., Nozhati S., Nikoo M., Cavaleri L., Nikoo M., Krill herd algorithm-based neural network in structural seismic reliability evaluation, Mechanics of Advanced Materials and Structures. 26 (2019) 1146–1153. 10.1080/15376494.2018.1430874. [DOI] [Google Scholar]
  • 31.Asteris P.G., Armaghani D.J., Hatzigeorgiou G.D., Karayannis C.G., Pilakoutas K., Predicting the shear strength of reinforced concrete beams using Artificial Neural Networks, Computers and Concrete. 24 (2019) 469–488. 10.12989/cac.2019.24.5.469. [DOI] [Google Scholar]
  • 32.Asteris P.G., Apostolopoulou M., Skentou A.D., Moropoulou A., Application of artificial neural networks for the prediction of the compressive strength of cement-based mortars, Computers and Concrete. 24 (2019) 329–345. 10.12989/cac.2019.24.4.329. [DOI] [Google Scholar]
  • 33.Asteris P.G., Kolovos K.G., Self-compacting concrete strength prediction using surrogate models, Neural Comput & Applic. 31 (2019) 409–424. 10.1007/s00521-017-3007-7. [DOI] [Google Scholar]
  • 34.Asteris P.G., Mokos V.G., Concrete compressive strength using artificial neural networks, Neural Comput & Applic. (2019). 10.1007/s00521-019-04663-2. [DOI] [Google Scholar]
  • 35.Dao D.V., Ly H.-B., Trinh S.H., Le T.-T., Pham B.T., Artificial Intelligence Approaches for Prediction of Compressive Strength of Geopolymer Concrete, Materials. 12 (2019) 983 10.3390/ma12060983. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Dao D.V., Trinh S.H., Ly H.-B., Pham B.T., Prediction of Compressive Strength of Geopolymer Concrete Using Entirely Steel Slag Aggregates: Novel Hybrid Artificial Intelligence Approaches, Applied Sciences. 9 (2019) 1113 10.3390/app9061113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Qi C., Ly H.-B., Chen Q., Le T.-T., Le V.M., Pham B.T., Flocculation-dewatering prediction of fine mineral tailings using a hybrid machine learning approach, Chemosphere. 244 (2020) 125450 10.1016/j.chemosphere.2019.125450 [DOI] [PubMed] [Google Scholar]
  • 38.Goh A.T.C., Back-propagation neural networks for modeling complex systems, Artificial Intelligence in Engineering. 9 (1995) 143–151. 10.1016/0954-1810(94)00011-S [DOI] [Google Scholar]
  • 39.Goh Anthony T. C., Kulhawy Fred H., Chua C. G., Bayesian Neural Network Analysis of Undrained Side Resistance of Drilled Shafts, Journal of Geotechnical and Geoenvironmental Engineering. 131 (2005) 84–93. 10.1061/(ASCE)1090-0241(2005)131:1(84) [DOI] [Google Scholar]
  • 40.Shahin M.A., Jaksa M.B., Neural network prediction of pullout capacity of marquee ground anchors, Computers and Geotechnics. 32 (2005) 153–163. 10.1016/j.compgeo.2005.02.003. [DOI] [Google Scholar]
  • 41.Shahin M.A., Load–settlement modeling of axially loaded steel driven piles using CPT-based recurrent neural networks, Soils and Foundations. 54 (2014) 515–522. 10.1016/j.sandf.2014.04.015. [DOI] [Google Scholar]
  • 42.Shahin M.A., State-of-the-art review of some artificial intelligence applications in pile foundations, Geoscience Frontiers. 7 (2016) 33–44. 10.1016/j.gsf.2014.10.002. [DOI] [Google Scholar]
  • 43.Shahin M.A., Intelligent computing for modeling axial capacity of pile foundations, Canadian Geotechnical Journal. 47 (2010) 230–243. [Google Scholar]
  • 44.Nawari N.O., Liang R., Nusairat J., Artificial intelligence techniques for the design and analysis of deep foundations, Electronic Journal of Geotechnical Engineering. 4 (1999) 1–21. [Google Scholar]
  • 45.Momeni E., Nazir R., Armaghani D.J., Maizir H., Application of Artificial Neural Network for Predicting Shaft and Tip Resistances of Concrete Piles, Earth Sciences Research Journal. 19 (2015) 85–93. 10.15446/esrj.v19n1.38712. [DOI] [Google Scholar]
  • 46.Nhu V.-H., Hoang N.-D., Duong V.-B., Vu H.-D., Tien Bui D., A hybrid computational intelligence approach for predicting soil shear strength for urban housing construction: a case study at Vinhomes Imperia project, Hai Phong city (Vietnam), Engineering with Computers. 36 (2020) 603–616. 10.1007/s00366-019-00718-z. [DOI] [Google Scholar]
  • 47.Pham B.T., Qi C., Ho L.S., Nguyen-Thoi T., Al-Ansari N., Nguyen M.D., et al A Novel Hybrid Soft Computing Model Using Random Forest and Particle Swarm Optimization for Estimation of Undrained Shear Strength of Soil, Sustainability. 12 (2020) 2218 10.3390/su12062218. [DOI] [Google Scholar]
  • 48.Momeni E., Nazir R., Armaghani D.J., Maizir H., Prediction of pile bearing capacity using a hybrid genetic algorithm-based ANN, Measurement. 57 (n.d.) 122–131. [Google Scholar]
  • 49.Hossain D., Capi G., Jindai M., Optimizing Deep Learning Parameters Using Genetic Algorithm for Object Recognition and Robot Grasping, 16 (2018) 6. [Google Scholar]
  • 50.Liang X., Nguyen D., Jiang S., Generalizability issues with deep learning models in medicine and their potential solutions: illustrated with Cone-Beam Computed Tomography (CBCT) to Computed Tomography (CT) image conversion, ArXiv:2004.07700 [Physics]. (2020). http://arxiv.org/abs/2004.07700 (accessed April 17, 2020). [Google Scholar]
  • 51.Sommers G.M., Andrade M.F.C., Zhang L., Wang H., Car R., Raman Spectrum and Polarizability of Liquid Water from Deep Neural Networks, ArXiv:2004.07369 [Cond-Mat, Physics:Physics]. (2020). http://arxiv.org/abs/2004.07369 (accessed April 17, 2020). 10.1039/d0cp01893g [DOI] [PubMed] [Google Scholar]
  • 52.Rojas F., Maurin L., Dünner R., Pichara K., Classifying CMB time-ordered data through deep neural networks, Monthly Notices of the Royal Astronomical Society. (2020) staa1009. 10.1093/mnras/staa1009. [DOI] [Google Scholar]
  • 53.Ezzat D., ell Hassanien A., Ella H.A., GSA-DenseNet121-COVID-19: a Hybrid Deep Learning Architecture for the Diagnosis of COVID-19 Disease based on Gravitational Search Optimization Algorithm, ArXiv:2004.05084 [Cs, Eess]. (2020). http://arxiv.org/abs/2004.05084 (accessed April 17, 2020). [Google Scholar]
  • 54.Bagińska M., Srokosz P.E., The Optimal ANN Model for Predicting Bearing Capacity of Shallow Foundations trained on Scarce Data, KSCE J Civ Eng. 23 (2019) 130–137. 10.1007/s12205-018-2636-4. [DOI] [Google Scholar]
  • 55.Marto A., Hajihassani M., Momeni E., Bearing Capacity of Shallow Foundation’s Prediction through Hybrid Artificial Neural Networks, AMM. 567 (2014) 681–686. 10.4028/www.scientific.net/AMM.567.681. [DOI] [Google Scholar]
  • 56.Momeni E., Armaghani D.J., Fatemi S.A., Nazir R., Prediction of bearing capacity of thin-walled foundation: a simulation approach, Engineering with Computers. 34 (2018) 319–327. 10.1007/s00366-017-0542-x. [DOI] [Google Scholar]
  • 57.Bagińska M., Srokosz P.E., The Optimal ANN Model for Predicting Bearing Capacity of Shallow Foundations trained on Scarce Data, KSCE J Civ Eng. 23 (2019) 130–137. 10.1007/s12205-018-2636-4. [DOI] [Google Scholar]
  • 58.Teh C. I., Wong K. S., Goh A. T. C., Jaritngam S., Prediction of Pile Capacity Using Neural Networks, Journal of Computing in Civil Engineering. 11 (1997) 129–138. 10.1061/(ASCE)0887-3801(1997)11:2(129) [DOI] [Google Scholar]
  • 59.Pooya Nejad F., Jaksa M.B., Kakhi M., McCabe B.A., Prediction of pile settlement using artificial neural networks based on standard penetration test data, Computers and Geotechnics. 36 (2009) 1125–1133. 10.1016/j.compgeo.2009.04.003. [DOI] [Google Scholar]
  • 60.Hastie T., Tibshirani R., Friedman J., The Elements of Statistical Learning–Data Mining, Inference, and Prediction, n.d. [Google Scholar]
  • 61.CFA Institute, CFA Program Curriculum 2020 Level II Volumes 1–6 Box Set | Wiley, Wiley, 2019. https://www.wiley.com/en-vn/CFA+Program+Curriculum+2020+Level+II+Volumes+1+6+Box+Set-p-9781946442956 (accessed July 7, 2020).
  • 62.Zemouri R., Omri N., Fnaiech F., Zerhouni N., Fnaiech N., A new growing pruning deep learning neural network algorithm (GP-DLNN), Neural Comput & Applic. (2019). 10.1007/s00521-019-04196-8. [DOI] [Google Scholar]
  • 63.Lipták B.G., ed., Process Control Instrument Engineer, Elsevier, 1995. https://www.sciencedirect.com/book/9780750622554/process-control#book-info (accessed July 7, 2020). [Google Scholar]
  • 64.Van Der Malsburg C., Frank Rosenblatt: Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms, in: Palm G., Aertsen A. (Eds.), Brain Theory, Springer, Berlin, Heidelberg, 1986: pp. 245–248. 10.1007/978-3-642-70911-1_20. [DOI] [Google Scholar]
  • 65.Holland J.H., P. of P. and of E.E. and Holland C.S.J.H., S.L. in Holland H.R.M., Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, MIT Press, 1992. [Google Scholar]
  • 66.A fast and elitist multiobjective genetic algorithm: NSGA-II—IEEE Journals & Magazine, (n.d.). https://ieeexplore.ieee.org/abstract/document/996017 (accessed April 25, 2020).
  • 67.Houck C.R., Joines J., Kay M.G., A genetic algorithm for function optimization: a Matlab implementation, Ncsu-Ie Tr. 95 (1995) 1–10. [Google Scholar]
  • 68.Liu L., Moayedi H., Rashid A.S.A., Rahman S.S.A., Nguyen H., Optimizing an ANN model with genetic algorithm (GA) predicting load-settlement behaviours of eco-friendly raft-pile foundation (ERP) system, Engineering with Computers. 36 (2020) 421–433. [Google Scholar]
  • 69.Hegazy T., Kassab M., Resource optimization using combined simulation and genetic algorithms, Journal of Construction Engineering and Management. 129 (2003) 698–705. [Google Scholar]
  • 70.Moayedi H., Raftari M., Sharifi A., Jusoh W.A.W., Rashid A.S.A., Optimization of ANFIS with GA and PSO estimating α ratio in driven piles, Engineering with Computers. 36 (2020) 227–238. [Google Scholar]
  • 71.Momeni E., Nazir R., Armaghani D.J., Maizir H., Prediction of pile bearing capacity using a hybrid genetic algorithm-based ANN, Measurement. 57 (2014) 122–131. [Google Scholar]
  • 72.Ardalan H., Eslami A., Nariman-Zadeh N., Shaft resistance of driven piles based on CPT and CPTu results using GMDH-type neural networks and genetic algorithms, in: The 12th International Conference of International Association for Computer Methods and Advances in Geomechanics (IACMAG), Citeseer, 2008: pp. 1850–1858. [Google Scholar]
  • 73.Luo Z., Hasanipanah M., Amnieh H.B., Brindhadevi K., Tahir M.M., GA-SVR: a novel hybrid data-driven model to simulate vertical load capacity of driven piles, Engineering with Computers. (2019) 1–9. [Google Scholar]
  • 74.Ardalan H., Eslami A., Nariman-Zadeh N., Piles shaft capacity from CPT and CPTu data by polynomial neural networks and genetic algorithms, Computers and Geotechnics. 36 (2009) 616–625. [Google Scholar]
  • 75.Zemouri R., Omri N., Fnaiech F., Zerhouni N., Fnaiech N., A new growing pruning deep learning neural network algorithm (GP-DLNN), Neural Comput & Applic. (2019). 10.1007/s00521-019-04196-8. [DOI] [Google Scholar]
  • 76.Dimiduk D.M., Holm E.A., Niezgoda S.R., Perspectives on the Impact of Machine Learning, Deep Learning, and Artificial Intelligence on Materials, Processes, and Structures Engineering, Integr Mater Manuf Innov. 7 (2018) 157–172. 10.1007/s40192-018-0117-8. [DOI] [Google Scholar]
  • 77.Abusamra H., A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data of Glioma, Procedia Computer Science. 23 (2013) 5–14. 10.1016/j.procs.2013.10.003. [DOI] [Google Scholar]
  • 78.Hopgood A.A., Intelligent Systems for Engineers and Scientists, CRC Press, 2012. [Google Scholar]
  • 79.Ting C.-K., On the Mean Convergence Time of Multi-parent Genetic Algorithms Without Selection, in: Capcarrère M.S., Freitas A.A., Bentley P.J., Johnson C.G., Timmis J. (Eds.), Advances in Artificial Life, Springer, Berlin, Heidelberg, 2005: pp. 403–412. 10.1007/11553090_41. [DOI] [Google Scholar]
  • 80.Chai T., Draxler R.R., Root mean square error (RMSE) or mean absolute error (MAE)?–Arguments against avoiding RMSE in the literature, Geoscientific Model Development. 7 (2014) 1247–1250. [Google Scholar]
  • 81.Willmott C.J., Wicks D.E., An Empirical Method for the Spatial Interpolation of Monthly Precipitation within California, Physical Geography. 1 (1980) 59–73. 10.1080/02723646.1980.10642189. [DOI] [Google Scholar]
  • 82.Willmott C.J., On the Validation of Models, Physical Geography. 2 (1981) 184–194. 10.1080/02723646.1981.10642213. [DOI] [Google Scholar]
  • 83.Dao D.V., Trinh S.H., Ly H.-B., Pham B.T., Prediction of compressive strength of geopolymer concrete using entirely steel slag aggregates: Novel hybrid artificial intelligence approaches, Applied Sciences. 9 (2019) 1113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Nguyen H.-L., Le T.-H., Pham C.-T., Le T.-T., Ho L.S., Le V.M., et al. , Development of Hybrid Artificial Intelligence Approaches and a Support Vector Machine Algorithm for Predicting the Marshall Parameters of Stone Matrix Asphalt, Applied Sciences. 9 (2019) 3172. [Google Scholar]
  • 85.Momeni E., Dowlatshahi M.B., Omidinasab F., Maizir H., Armaghani D.J., Gaussian Process Regression Technique to Estimate the Pile Bearing Capacity, Arab J Sci Eng. 45 (2020) 8255–8267. 10.1007/s13369-020-04683-4. [DOI] [Google Scholar]
  • 86.Kulkarni R. U., Dewaikar D. M., Indian Institute of Technology Bombay, Prediction of Interpreted Failure Loads of Rock-Socketed Piles in Mumbai Region using Hybrid Artificial Neural Networks with Genetic Algorithm, IJERT. V6 (2017) IJERTV6IS060196. 10.17577/IJERTV6IS060196. [DOI] [Google Scholar]
  • 87.Jahed Armaghani D., Shoib R.S.N.S.B.R., Faizi K., Rashid A.S.A., Developing a hybrid PSO–ANN model for estimating the ultimate bearing capacity of rock-socketed piles, Neural Comput & Applic. 28 (2017) 391–405. 10.1007/s00521-015-2072-z. [DOI] [Google Scholar]

Decision Letter 0

Le Hoang Son

6 Oct 2020

PONE-D-20-26359

Design Deep Neural Network Architecture using a Genetic Algorithm for Estimation of Pile Bearing Capacity

PLOS ONE

Dear Dr. Tran,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Nov 20 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Le Hoang Son, Ph.D

Academic Editor

PLOS ONE

Journal requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. In your Data Availability statement, it is unclear why you have selected the option 'No - some restrictions will apply'. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For more information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

3. Thank you for stating the following financial disclosure:

"The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

At this time, please address the following queries:

  1. Please clarify the sources of funding (financial or material support) for your study. List the grants or organizations that supported your study, including funding received from your institution.

  2. State what role the funders took in the study. If the funders had no role in your study, please state: “The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.”

  3. If any authors received a salary from any of your funders, please state which authors and which funders.

  4. If you did not receive any funding for this study, please state: “The authors received no specific funding for this work.”

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

4.  We note that Figure 1 in your submission contains map images which may be copyrighted. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For these reasons, we cannot publish previously copyrighted maps or satellite images created using proprietary data, such as Google software (Google Maps, Street View, and Earth). For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.

We require you to either (1) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (2) remove the figures from your submission:

4.1.    You may seek permission from the original copyright holder of Figure 1 to publish the content specifically under the CC BY 4.0 license. 

We recommend that you contact the original copyright holder with the Content Permission Form (http://journals.plos.org/plosone/s/file?id=7c09/content-permission-form.pdf) and the following text:

“I request permission for the open-access journal PLOS ONE to publish XXX under the Creative Commons Attribution License (CCAL) CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties. Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form.”

Please upload the completed Content Permission Form or other proof of granted permissions as an "Other" file with your submission.

In the figure caption of the copyrighted figure, please include the following text: “Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year].”

4.2.    If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder’s requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information. If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only.

The following resources for replacing copyrighted map figures may be helpful:

USGS National Map Viewer (public domain): http://viewer.nationalmap.gov/viewer/

The Gateway to Astronaut Photography of Earth (public domain): http://eol.jsc.nasa.gov/sseop/clickmap/

Maps at the CIA (public domain): https://www.cia.gov/library/publications/the-world-factbook/index.html and https://www.cia.gov/library/publications/cia-maps-publications/index.html

NASA Earth Observatory (public domain): http://earthobservatory.nasa.gov/

Landsat: http://landsat.visibleearth.nasa.gov/

USGS EROS (Earth Resources Observatory and Science (EROS) Center) (public domain): http://eros.usgs.gov/#

Natural Earth (public domain): http://www.naturalearthdata.com/

5. We note you have included a table to which you do not refer in the text of your manuscript. Please ensure that you refer to Table 9, 10, 11 in your text; if accepted, production will need this reference to link the reader to the Table.

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: No

Reviewer #2: No

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Review Comments to the Author

Reviewer #1

I have read the paper entitled "Design Deep Neural Network Architecture using a Genetic Algorithm for Estimation of Pile Bearing Capacity". In essence, the paper suggests a deep ANN-based predictive model for pile bearing capacity. It is interesting that authors used GA for reducing the number of features from 10 to 4. The paper is well written and well organized. Although compared to the previous publications, slight contribution was observed, presenting new sets of real data is always of interest as it can constitute common sense. Hence, firstly authors are requested to present at least 100 sets of data in the appendix. Further comments are presented in the following lines:

2. Include VAF performance index.

3. Enhance the literature review considerably by providing a table of previous AI-based works in the field of foundation engineering, including deep foundations, shallow foundations, and thin-walled foundations. Below are some recommendations; however, the authors do not have to cite them if they find them irrelevant. The implemented soft computing technique, type of foundation, dataset size, and R or R2 should be highlighted in this table.

4. It should be clearly highlighted in the introduction that in what aspect the presented paper is different from other studies (like implementation of deep learning)

5. despite AI advantages, limitations of these methods should be clearly highlighted.

6. A competitor like conventional BP-ANN is needed for comparison purposes or the prediction performance of the proposed AI-based predictive model should be checked against other works.

7. checking the English is suggested.

Marto, A., Hajihassani, M., & Momeni, E. (2014). Bearing Capacity of Shallow Foundation's Prediction through Hybrid Artificial Neural Networks. In Applied Mechanics and Materials (Vol. 567, pp. 681-686). Trans Tech Publications Ltd.

Momeni, E., Armaghani, D. J., Fatemi, S. A., & Nazir, R. (2018). Prediction of bearing capacity of thin-walled foundation: a simulation approach. Engineering with Computers, 34(2), 319-327.

Momeni, E., Dowlatshahi, M. B., Omidinasab, F., Maizir, H., & Armaghani, D. J. (2020). Gaussian Process Regression Technique to Estimate the Pile Bearing Capacity. Arabian Journal for Science and Engineering, 1-13.

Nazir, R., Momeni, E., Marsono, K., & Maizir, H. (2015). An artificial neural network approach for prediction of bearing capacity of spread foundations in sand. Jurnal Teknologi, 72(3).

Rezaei, H., Nazir, R., & Momeni, E. (2016). Bearing capacity of thin-walled shallow foundations: an experimental and artificial intelligence-based study. Journal of Zhejiang University-SCIENCE A, 17(4), 273-285.

=======================

Reviewer #2

Introduction: As there are plenty of studies involving the GA optimized DNNs in this field I strongly advise to explain the novelty clearly and justify the need for this particular research.

Section 2.2. Data preparation: line 2 "[...] all the factors affecting the pile bearing capacity were considered.". I suggest to put it that way "all the known factors" as all the factors affecting the bearing capacity might not be discovered yet.

Section 4.2. Optimization of DLNN Architecture: line 10 "[...] model performed well better performance"?

Section 4.3. Predictive Capability of the Models: In Tab. 7. you compare the "predictive capability of the models" on three datasets (training, validation and testing). Low error achieved on the training and validation dataset does not mean that the model will predict accurately (i.e. perform good on testing set). When the function fits the training data very well the model's predictions can often be not so accurate (overfitting), cause the model has lower generalization ability. Therefore the predictive capability of the model can only be measured with the error obtained on the testing dataset.

Conclusions: I suggest pointing out the main achievement of this study, maybe mentioning possible applications of the developed model and future research perspectives.

=======================

Reviewer #3

The manuscript describes a technically sound scientific study with data that supports the conclusions. The experiments were performed rigorously with appropriate controls (four control conditions: R2, IA, RMSE and MAE), replication and sample size (1000 replicates, 3 structure configurations). On the basis of the obtained data, appropriate conclusions were drawn. The statistical analysis was performed appropriately and rigorously, although the presentation of the results shows a lack of consistency in the data:

- Figure 11 shows the R values whose R2 equivalents do not match the values summarized in Table 7.

- The results of the analyzes presented in Tables 9, 10 and 11 are identical and should contain collected values according to different three criteria.

The authors provided graphical access to all the data underlying the findings in their manuscript. The manuscript is clearly presented and written in standard English, but there are some minor typing errors in the text, e.g.:

- page 12: “The initialization parameters of GA used in this study are given in Tables 3.” (should be: “… in Table 3.”).

- page 21: “On the validation data set, the 4-input GA-DLNN model gave similar results to the 10-input GA-DLNN model and outperformed the 4-input GA-DLNN model with satisfactory accuracy…” (should be: “… the 4-input DLNN model with satisfactory accuracy…”).

The authors of the manuscript made every effort to ensure that the final version of the text was of the highest possible scientific and editorial level, but the reviewer believes that the combination of graphs presented in Figures 9 and 10 would allow for an easier comparative analysis of the results. The aim of the graphical presentation of the results in Figure 12 was a comparative analysis - according to the reviewer, changing the vertical scale (starting from higher values) will allow to emphasize the differences between the examined structures and the types of tests performed.

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Dec 17;15(12):e0243030. doi: 10.1371/journal.pone.0243030.r002

Author response to Decision Letter 0


8 Nov 2020

RESPONSES TO THE ACADEMIC EDITOR'S AND REVIEWERS' COMMENTS

I. Responses to academic editor

Comment 1: Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

Response:

We thank the academic editor for these remarks. We have corrected our manuscript to follow the style requirements and resubmitted it with the revised version.

Comment 2: In your Data Availability statement, it is unclear why you have selected the option 'No - some restrictions will apply'. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For more information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

Response:

We thank the academic editor for these remarks. We have uploaded the required data, which can be found in the Appendix section of the paper.

Comment 3: Thank you for stating the following financial disclosure:

"The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

At this time, please address the following queries:

a. Please clarify the sources of funding (financial or material support) for your study. List the grants or organizations that supported your study, including funding received from your institution.

b. State what role the funders took in the study. If the funders had no role in your study, please state: “The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.”

c. If any authors received a salary from any of your funders, please state which authors and which funders.

d. If you did not receive any funding for this study, please state: “The authors received no specific funding for this work.”

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

Response:

We have added the statement “The authors received no specific funding for this work.” to the cover letter.

Comment 4: We note that Figure 1 in your submission contains map images which may be copyrighted. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For these reasons, we cannot publish previously copyrighted maps or satellite images created using proprietary data, such as Google software (Google Maps, Street View, and Earth). For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.

We require you to either (1) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (2) remove the figures from your submission:

4.1. You may seek permission from the original copyright holder of Figure 1 to publish the content specifically under the CC BY 4.0 license.

We recommend that you contact the original copyright holder with the Content Permission Form (http://journals.plos.org/plosone/s/file?id=7c09/content-permission-form.pdf) and the following text:

“I request permission for the open-access journal PLOS ONE to publish XXX under the Creative Commons Attribution License (CCAL) CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties. Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form.”

Please upload the completed Content Permission Form or other proof of granted permissions as an "Other" file with your submission.

In the figure caption of the copyrighted figure, please include the following text: “Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year].”

4.2. If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder’s requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information. If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only.

The following resources for replacing copyrighted map figures may be helpful:

USGS National Map Viewer (public domain): http://viewer.nationalmap.gov/viewer/

The Gateway to Astronaut Photography of Earth (public domain): http://eol.jsc.nasa.gov/sseop/clickmap/

Maps at the CIA (public domain): https://www.cia.gov/library/publications/the-world-factbook/index.html and https://www.cia.gov/library/publications/cia-maps-publications/index.html

NASA Earth Observatory (public domain): http://earthobservatory.nasa.gov/

Landsat: http://landsat.visibleearth.nasa.gov/

USGS EROS (Earth Resources Observatory and Science (EROS) Center) (public domain): http://eros.usgs.gov/#

Natural Earth (public domain): http://www.naturalearthdata.com/

Response:

We thank the academic editor for this remark. We have replaced Figure 1 with a public-domain image, specifically from the CIA maps at:

https://www.cia.gov/library/publications/the-world-factbook/index.html

Comment 5: We note you have included a table to which you do not refer in the text of your manuscript. Please ensure that you refer to Table 9, 10, 11 in your text; if accepted, production will need this reference to link the reader to the Table.

Response:

We agree with you and have corrected these errors. All changes are marked in red in the revised manuscript.

II. Responses to Reviewer #1

General comment: I have read the paper entitled "Design Deep Neural Network Architecture using a Genetic Algorithm for Estimation of Pile Bearing Capacity". In essence, the paper suggests a deep ANN-based predictive model for pile bearing capacity. It is interesting that authors used GA for reducing the number of features from 10 to 4. The paper is well written and well organized. Although compared to the previous publications, slight contribution was observed, presenting new sets of real data is always of interest as it can constitute common sense.

Response:

We thank Reviewer #1 for his nice and constructive comments, which help us in improving the quality of our work.

Comment 1: Hence, firstly authors are requested to present at least 100 sets of data in the appendix

Response: Yes, we agree and have added our data, as requested, in the appendix.

Comment 2: Include VAF performance index.

Response:

We have already used four performance indicators in our study: R2, MAE, RMSE, and IA. We agree that the VAF index would also be an appropriate measure; however, since several indicators have already been used in this study, we will consider including VAF in upcoming studies.

Comment 3: Enhance the literature review considerably by providing a table of previous AI-based works in the field of foundation engineering, including deep foundations, shallow foundations, and thin-walled foundations. Below are some recommendations; however, the authors do not have to cite them if they find them irrelevant. The implemented soft computing technique, type of foundation, dataset size, and R or R2 should be highlighted in this table.

Response:

We agree with the reviewer's comment. We found that supplementing the manuscript with previous research results was worthwhile; in particular, we have added Table 12. The following content has been added to the revised manuscript:

Table 12. Comparison with other studies

| Author | Model | Foundation type | Number of samples | R2 | RMSE |
|---|---|---|---|---|---|
| Momeni et al. [1] | ANFIS | Thin-walls | 150 | 0.875 | 0.048 |
|  | ANN |  |  | 0.71 | 0.529 |
| Momeni et al. [2] | GPR | Piles | 296 | 0.84 | - |
| Kulkarni et al. [3] | GA-ANN | Rock-socketed piles | 132 | 0.86 | 0.0093 |
| Jahed Armaghani et al. [4] | ANN |  |  | 0.808 | 0.135 |
|  | PSO-ANN |  |  | 0.918 | 0.063 |
| The present study | GA-DNN | Piles | 472 | 0.882 | 109.965 |

Table 12 presents some research results on ML applications in foundation engineering. The results of this study, as well as those of previous studies, show that ML techniques applied to foundation engineering mostly reach an R2 of 0.8 to 0.9 when predicting foundation loads.

Comment 4: It should be clearly highlighted in the introduction that in what aspect the presented paper is different from other studies (like implementation of deep learning)

Response:

We agree with the reviewer's comment. We have added Section 2, “Significance of the research study”, to our revised manuscript. The following content has been added:

The numerical or experimental methods in the existing literature still have some limitations, such as limited dataset sizes (Marto et al. [55] with 40 samples; Momeni et al. [45] with 36 samples; Momeni et al. [56] with 150 samples; Bagińska and Srokosz [57] with 50 samples; Teh et al. [58] with 37 samples), insufficient refinement of ML approaches, or failure to fully consider the key parameters that affect the model's predictions.

Accordingly, the contribution of the present work lies in the following: (i) a large dataset of 472 experimental tests; (ii) a reduction of the input variables from 10 to 4, which helps the model achieve more accurate results with a shorter training time; and (iii) the automatic design of the optimal architecture for the DLNN model, in which all key parameters are considered: the number of hidden layers, the number of neurons in each hidden layer, the activation function, and the training algorithm. The number of hidden layers is not fixed but is selected through crossover between parents with different chromosome lengths. In addition, the randomness in the order of the training data is considered, in order to assess the stability of the models' predictions on the training, validation, and testing sets.

Comment 5: Despite AI advantages, limitations of these methods should be clearly highlighted.

Response:

We thank you for this very interesting comment. We have already added some limitations of machine learning method to our revise script. The contents have been added in the revised manuscript as follow:

Despite the recent successes of machine learning, this method has some limitations to keep in mind: it requires large amounts of hand-crafted, structured training data and cannot learn in real time. In addition, ML models still lack the ability to generalize to conditions other than those encountered during training. Therefore, an ML model predicts correctly only within a certain data range and is not generalizable to all cases.

Comment 6: A competitor like conventional BP-ANN is needed for comparison purposes or the prediction performance of the proposed AI-based predictive model should be checked against other works.

Response:

Thank you for this interesting comment. Back Propagation (BP) is a gradient-descent optimization algorithm commonly used to find the optimal weights and biases of a neural network during training. The Genetic Algorithm is indeed an optimization algorithm that does not use gradient descent; however, in this study, we did not use the genetic algorithm to train the model but rather to automate the choice of the network architecture (namely: the number of hidden layers, the number of neurons in each hidden layer, the training algorithm, and the activation function for hidden neurons). Therefore, we do not believe a comparison between the GA-DLNN model and a BP-ANN (BP-DLNN) model would be meaningful.
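This division of labour (a GA searching over architectures, while a gradient-based algorithm trains the weights of each candidate) can be illustrated with a minimal sketch. All names, parameter ranges, and the toy fitness function below are hypothetical stand-ins, not the study's actual implementation; in the real pipeline, the fitness of a chromosome would be the validation performance of a DLNN trained with the encoded settings.

```python
import random

ACTIVATIONS = ["relu", "tanh", "sigmoid"]
OPTIMIZERS = ["adam", "sgd", "rmsprop"]

def random_chromosome(rng, max_layers=5, max_neurons=20):
    """A chromosome encodes one candidate DLNN architecture: a
    variable-length list of hidden-layer sizes plus an activation
    function and a training algorithm."""
    n_layers = rng.randint(1, max_layers)
    return {
        "layers": [rng.randint(1, max_neurons) for _ in range(n_layers)],
        "activation": rng.choice(ACTIVATIONS),
        "optimizer": rng.choice(OPTIMIZERS),
    }

def crossover(rng, a, b):
    """Cross parents with possibly different chromosome lengths: the
    child takes a prefix of one layer list and a suffix of the other,
    so the number of hidden layers is itself evolved."""
    cut_a = rng.randint(0, len(a["layers"]))
    cut_b = rng.randint(0, len(b["layers"]))
    layers = a["layers"][:cut_a] + b["layers"][cut_b:]
    return {
        "layers": layers or [rng.randint(1, 20)],  # avoid empty networks
        "activation": rng.choice([a["activation"], b["activation"]]),
        "optimizer": rng.choice([a["optimizer"], b["optimizer"]]),
    }

def mutate(rng, c, rate=0.2, max_neurons=20):
    """With small probability, resize one hidden layer."""
    if rng.random() < rate:
        i = rng.randrange(len(c["layers"]))
        c["layers"][i] = rng.randint(1, max_neurons)
    return c

def evolve(fitness, generations=30, pop_size=20, seed=0):
    """Elitist GA: keep the best half, refill with mutated offspring."""
    rng = random.Random(seed)
    pop = [random_chromosome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]
        children = [mutate(rng, crossover(rng, rng.choice(elite), rng.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return max(pop, key=fitness)

# Toy fitness standing in for "validation R2 of a trained DLNN":
# it favours three hidden layers of about 10 neurons with ReLU.
def toy_fitness(c):
    score = -abs(len(c["layers"]) - 3)
    score -= sum(abs(n - 10) for n in c["layers"]) / 10.0
    return score + (1.0 if c["activation"] == "relu" else 0.0)

best = evolve(toy_fitness)
```

In this sketch, the chromosome length (number of hidden layers) changes only through crossover of unequal parents, mirroring the variable-length encoding described in the response; a BP-style trainer would sit inside the fitness function, which is why the GA and BP are complementary rather than competing.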

Comment 7: Checking the English is suggested.

Response: Thank you; we will check the English carefully.

III. Responses to Reviewer #2

General comment: Introduction: As there are plenty of studies involving the GA optimized DNNs in this field I strongly advise to explain the novelty clearly and justify the need for this particular research.

Response:

We agree with the reviewer's comment. We have added Section 2, “Significance of the research study”, to our revised manuscript. The following content has been added:

The numerical or experimental methods in the existing literature still have some limitations, such as limited dataset sizes (Marto et al. [55] with 40 samples; Momeni et al. [45] with 36 samples; Momeni et al. [56] with 150 samples; Bagińska and Srokosz [57] with 50 samples; Teh et al. [58] with 37 samples), insufficient refinement of ML approaches, or failure to fully consider the key parameters that affect the model's predictions.

Accordingly, the contribution of the present work lies in the following: (i) a large dataset of 472 experimental tests; (ii) a reduction of the input variables from 10 to 4, which helps the model achieve more accurate results with a shorter training time; and (iii) the automatic design of the optimal architecture for the DLNN model, in which all key parameters are considered: the number of hidden layers, the number of neurons in each hidden layer, the activation function, and the training algorithm. The number of hidden layers is not fixed but is selected through crossover between parents with different chromosome lengths. In addition, the randomness in the order of the training data is considered, in order to assess the stability of the models' predictions on the training, validation, and testing sets.

Comment 1: Section 2.2. Data preparation: line 2 "[...] all the factors affecting the pile bearing capacity were considered.". I suggest to put it that way "all the known factors" as all the factors affecting the bearing capacity might not be discovered yet.

Response:

Thank you for your comment. We found it to be very helpful.

Comment 2: Section 4.2. Optimization of DLNN Architecture: line 10 "[...] model performed well better performance"?

Response:

Thank you for your comment. That was a mistake, and we have fixed it in the revised manuscript.

Comment 3: Section 4.3. Predictive Capability of the Models: In Tab. 7. you compare the "predictive capability of the models" on three datasets (training, validation and testing). Low error achieved on the training and validation dataset does not mean that the model will predict accurately (i.e. perform good on testing set). When the function fits the training data very well the model's predictions can often be not so accurate (over fitting), cause the model has lower generalization ability. Therefore, the predictive capability of the model can only be measured with the error obtained on the testing dataset.

Response:

Thank you for your comment. We agree that the predictive capability of the model can only be measured by the error obtained on the testing dataset, and that a model fitted too closely to the training data may overfit. Among our three DLNN models, the 4-input GA-DLNN model gives the best results on the training and validation sets as well as on the testing set. In our opinion, this model is not severely overfitted; nevertheless, in upcoming studies we will tune the model carefully to obtain better results on the testing set and thus a more general model.
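The reviewer's point can be illustrated numerically. The sketch below uses purely hypothetical measured/predicted values (not the paper's data) and an arbitrary threshold; it only shows the kind of comparison involved: a testing error much larger than the training error signals low generalization ability.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error, one of the quality criteria used in the paper."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Hypothetical measured vs. predicted pile capacities on two subsets.
rmse_train = rmse([100.0, 120.0, 150.0], [101.0, 119.0, 152.0])
rmse_test = rmse([110.0, 140.0], [118.0, 129.0])

# A testing error far above the training error suggests overfitting;
# the 1.5x factor here is an illustrative choice, not a standard rule.
overfitting_suspected = rmse_test > 1.5 * rmse_train
```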

Comment 4: Conclusions: I suggest pointing out the main achievement of this study, maybe mentioning possible applications of the developed model and future research perspectives.

Response:

Thank you for your comment. We have already added this content in Section 2.

IV. Responses to Reviewer #3

General comment: The manuscript describes a technically sound scientific study with data that supports the conclusions. The experiments were performed rigorously with appropriate controls (four control conditions: R2, IA, RMSE and MAE), replication and sample size (1000 replicates, 3 structure configurations). On the basis of the obtained data, appropriate conclusions were drawn. The statistical analysis was performed appropriately and rigorously, although the presentation of the results shows a lack of consistency in the data:

Response:

We thank Reviewer #3 for his nice and constructive comments, which help us in improving the quality of our work.

Comment 1: Figure 11 shows the R values whose R2 equivalents do not match the values summarized in Table 7.

Response: Thank you for your comment. That was a mistake, and we have fixed it in the revised manuscript.

Comment 2: The results of the analyzes presented in Tables 9, 10 and 11 are identical and should contain collected values according to different three criteria.

The authors provided graphical access to all the data underlying the findings in their manuscript. The manuscript is clearly presented and written in standard English, but there are some minor typing errors in the text, e.g.:

Response: Thank you for your comment. That was a mistake, and we have fixed it in the revised manuscript.

Comment 3: page 12: “The initialization parameters of GA used in this study are given in Tables 3.” (should be: “… in Table 3.”).

Response: Thank you for your comment. That was a mistake, and we have fixed it in the revised manuscript.

Comment 4: page 21: “On the validation data set, the 4-input GA-DLNN model gave similar results to the 10-input GA-DLNN model and outperformed the 4-input GA-DLNN model with satisfactory accuracy…” (should be: “… the 4-input DLNN model with satisfactory accuracy…”).

Response: Thank you for your comment. That was a mistake, and we have fixed it in the revised manuscript.

Comment 5: The authors of the manuscript made every effort to ensure that the final version of the text was of the highest possible scientific and editorial level, but the reviewer believes that the combination of graphs presented in Figures 9 and 10 would allow for an easier comparative analysis of the results. The aim of the graphical presentation of the results in Figure 12 was a comparative analysis - according to the reviewer, changing the vertical scale (starting from higher values) will allow to emphasize the differences between the examined structures and the types of tests performed.

Response:

Thank you for your comment. We found it very helpful and have combined Figures 9 and 10.

(1) Please ensure that you refer to Figure 4 in your text as, if accepted, production will need this reference to link the reader to the figure.

Figure 4 is now referenced in the text of the manuscript.


Decision Letter 1

Le Hoang Son

16 Nov 2020

Design Deep Neural Network Architecture using a Genetic Algorithm for Estimation of Pile Bearing Capacity

PONE-D-20-26359R1

Dear Dr. Pham,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Le Hoang Son, Ph.D

Academic Editor

PLOS ONE

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Reviewer #1: Authors have addressed all the comments properly and the paper can be accepted. The paper is now more interesting. The significance as well as the limitation of the work is now highlighted. The literature review is enhanced and the result of this study is compared with other relevant works.

Reviewer #2: (No Response)

Reviewer #3: The revised version of the work meets the reviewer's expectations and will certainly find great interest among readers dealing with this type of research.

The reviewer believes that minor editorial errors (such as table numbering) will be removed during the publication process.

Acceptance letter

Le Hoang Son

3 Dec 2020

PONE-D-20-26359R1

Design deep neural network architecture using a genetic algorithm for estimation of pile bearing capacity

Dear Dr. Pham:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Prof. Le Hoang Son

Academic Editor

PLOS ONE

