Abstract
Local Linear Regression (LLR) is a nonparametric regression model applied in the modeling phase of Response Surface Methodology (RSM). LLR does not make reference to any fixed parametric model; hence it is flexible and can capture local trends in the data that may be too complicated for the OLS. However, besides the small sample sizes and sparse data that characterize RSM, the performance of the LLR model deteriorates as the number of explanatory variables considered in the study increases. This phenomenon, popularly referred to as the curse of dimensionality, accounts for the scanty application of LLR in RSM. In this paper, we propose a novel locally adaptive bandwidths selector that, unlike fixed bandwidths and existing locally adaptive bandwidths selectors, takes into account both the number of explanatory variables in the study and their individual values at each data point. Single and multiple response problems from the literature and simulated data are used to compare the performance of the LLR utilizing the proposed selector with those of the OLS, the fixed-bandwidth LLR, and the LLR utilizing an existing locally adaptive selector. Neural network activation functions such as ReLU, Leaky-ReLU, SELU, and SPOCU were also considered and gave a remarkable improvement in the loss function (Mean Squared Error) over the regression models on all three data sets.
KEYWORDS: Desirability function, locally adaptive bandwidths selector, Local Linear Regression model, SPOCU activation function, Response Surface Methodology
1. Introduction
Response Surface Methodology (RSM) is a collection of mathematical and statistical techniques applied in the modeling and analysis of data in which a response is influenced by one or more explanatory variables, [5]. There are three distinct phases in RSM, namely, the Design of Experiment Phase, the Modeling Phase, and the Optimization Phase, see [6].
In the Modeling Phase of RSM, a fundamental assumption is that the relationship between the response variable and explanatory variables may be represented as:
$$y_i = f(x_{i1}, x_{i2}, \ldots, x_{ik}) + \varepsilon_i, \qquad i = 1, 2, \ldots, n \tag{1}$$
where the mean function $f$ denotes the true but unknown relationship between the response variable and the $k$ explanatory variables, the $\varepsilon_i$ are random error terms assumed to have a normal distribution with mean zero and constant variance, and $n$ is the sample size, see [27,33].
The OLS and the LLR are existing regression models applied in the estimation of the unknown function $f$ in (1) [3,18]. The OLS model is applied in the estimation of the unknown parameters (coefficients) of the parametric (polynomial) model that the experimenter assumes adequate to approximate $f$ in (1), see [33].
1.1. The parametric regression model
$$y_i = \beta_0 + \sum_{j=1}^{k}\beta_j x_{ij} + \sum_{j=1}^{k}\beta_{jj} x_{ij}^2 + \sum_{j<l}\beta_{jl} x_{ij} x_{il} + \varepsilon_i \tag{2}$$
The OLS estimate of the response in the $i$th data point is given as:
$$\hat{y}_i^{\mathrm{OLS}} = \mathbf{x}_i'\left(\mathbf{X}'\mathbf{X}\right)^{-1}\mathbf{X}'\mathbf{y} \tag{3}$$
where $\mathbf{y}$ is an $n \times 1$ vector of responses, $\mathbf{X}$ is the $n \times p$ model matrix, $p$ is the number of model parameters (coefficients), $\mathbf{X}'$ is the transpose of the matrix $\mathbf{X}$, and $\mathbf{x}_i'$ is the $i$th row vector of the matrix $\mathbf{X}$, see [29].
In matrix notation, the vector of OLS estimated response is expressed as:
$$\hat{\mathbf{y}}^{\mathrm{OLS}} = \mathbf{X}\left(\mathbf{X}'\mathbf{X}\right)^{-1}\mathbf{X}'\mathbf{y} = \mathbf{H}^{\mathrm{OLS}}\mathbf{y} \tag{4}$$
where the vector $\mathbf{h}_i^{\mathrm{OLS}\prime}$ is the $i$th row of the OLS Hat matrix $\mathbf{H}^{\mathrm{OLS}} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$.
The OLS model requires several assumptions to be met for valid interpretation of its parameter estimates. Furthermore, it performs poorly if the assumed polynomial model is inadequate for the data, also see [33].
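For illustration, the fit in (3) and (4) can be sketched in a few lines of numpy; the design and response values below are hypothetical stand-ins, not data from the paper:

```python
import numpy as np

# Hypothetical coded design: a 2^2 factorial plus three center runs.
x1 = np.array([-1.0, 1.0, -1.0, 1.0, 0.0, 0.0, 0.0])
x2 = np.array([-1.0, -1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
y = np.array([88.5, 85.8, 86.3, 80.4, 90.2, 90.9, 91.3])

# Model matrix for a first-order model with interaction: columns 1, x1, x2, x1*x2.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])

# Hat matrix H = X (X'X)^{-1} X' and fitted responses y_hat = H y, as in (4).
H = X @ np.linalg.inv(X.T @ X) @ X.T
y_hat = H @ y
print(np.round(y_hat, 3))
```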
1.2. The Local Linear Regression model
The LLR model is a nonparametric regression version of the weighted least squares model, also see [12,14,23]. The LLR estimate, $\hat{y}_i^{\mathrm{LLR}}$, of $y_i$ is given as:
$$\hat{y}_i^{\mathrm{LLR}} = \tilde{\mathbf{x}}_i'\left(\tilde{\mathbf{X}}'\mathbf{W}_i\tilde{\mathbf{X}}\right)^{-1}\tilde{\mathbf{X}}'\mathbf{W}_i\,\mathbf{y} = \mathbf{h}_i^{\mathrm{LLR}\prime}\mathbf{y} \tag{5}$$
where $\tilde{\mathbf{x}}_i' = (1, x_{i1}, \ldots, x_{ik})$ is the $i$th row of the LLR model matrix $\tilde{\mathbf{X}}$, $x_{il}$, $l = 1, 2, \ldots, k$, denotes the value of the $l$th explanatory variable in the $i$th data point, and $\mathbf{W}_i = \mathrm{diag}(w_{i1}, w_{i2}, \ldots, w_{in})$ is a diagonal weights matrix whose entries are the kernel weights of the $n$ data points relative to the $i$th data point.
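As a sketch (not the authors' code), the LLR fitted values in (5) can be computed for a single explanatory variable as follows; the simplified Gaussian kernel is defined in Section 1.3 below, and the data and global bandwidth here are hypothetical:

```python
import numpy as np

def llr_fit(x, y, b):
    """Local linear regression fitted values using the simplified
    Gaussian kernel K(u) = exp(-u**2) and a global bandwidth b."""
    n = len(x)
    X = np.column_stack([np.ones(n), x])          # local linear model matrix
    y_hat = np.empty(n)
    for i in range(n):
        w = np.exp(-((x - x[i]) / b) ** 2)        # kernel weights at x_i
        W = np.diag(w)                            # diagonal weights matrix W_i
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
        y_hat[i] = X[i] @ beta                    # x_i'(X'W_i X)^{-1} X'W_i y
    return y_hat

x = np.linspace(0.0, 1.0, 11)                     # coded explanatory variable
rng = np.random.default_rng(1)
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=11)
print(np.round(llr_fit(x, y, b=0.25), 3))
```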
1.3. Bandwidths for nonparametric regression model
The bandwidth is the most crucial parameter in nonparametric regression and estimation procedures because the choice made determines the shape of the fitted curve, see [14,22,30]. If $b_{il} = b$ for all $i$ and all $l$, we say that smoothing in the LLR model is done using a fixed or global bandwidth; otherwise the $b_{il}$, $i = 1, 2, \ldots, n$, $l = 1, 2, \ldots, k$, are referred to as locally adaptive bandwidths. For a single explanatory variable, the kernel weights are given as:

$$w_{ij} = K\left(\frac{x_j - x_i}{b_i}\right), \qquad K(u) = \exp(-u^2),$$

where $K$ is the simplified Gaussian kernel function and the $b_i$, $i = 1, 2, \ldots, n$, are the bandwidths (smoothing parameters), see [28,29].
In a situation where more than one explanatory variable is used in the model matrix $\tilde{\mathbf{X}}$, the kernel weight is a product kernel given as:
$$w_{ij} = \prod_{l=1}^{k} K\left(\frac{x_{jl} - x_{il}}{b_{il}}\right) \tag{6}$$
Each value of $b_{il}$, $i = 1, 2, \ldots, n$ and $l = 1, 2, \ldots, k$, may be thought of as an entry in an $n \times k$ matrix, say $\mathbf{B} = (b_{il})$. The entries of the matrix $\mathbf{B}$ are referred to as locally adaptive bandwidths.
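A minimal sketch of the product-kernel weights in (6) driven by a bandwidth matrix $\mathbf{B}$; the design points and bandwidth entries are hypothetical:

```python
import numpy as np

def product_kernel_weights(X, B, i):
    """Weights w_ij = prod_l K((x_jl - x_il) / b_il), with K(u) = exp(-u**2),
    using row i of the n-by-k bandwidth matrix B."""
    U = (X - X[i]) / B[i]                 # scaled differences for every point j
    return np.exp(-U ** 2).prod(axis=1)   # product over the k variables

# Hypothetical coded design (n = 4, k = 2) and bandwidth matrix B.
X = np.array([[0.15, 0.15], [0.85, 0.15], [0.15, 0.85], [0.50, 0.50]])
B = np.full_like(X, 0.30)   # a fixed-bandwidth B; an adaptive B varies by entry
print(np.round(product_kernel_weights(X, B, i=3), 4))
```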
In RSM, the matrix $\mathbf{B}$ comprising the vectors of optimal bandwidths is obtained from the minimization of the Penalized Prediction Error Sum of Squares ($\mathrm{PRESS}^{**}$):
$$\mathrm{PRESS}^{**} = \frac{\mathrm{PRESS}}{n - \mathrm{trace}(\mathbf{H}^{\mathrm{LLR}}) + (n - (k+1))\,\dfrac{\mathrm{SSE}_{\max} - \mathrm{SSE}_b}{\mathrm{SSE}_{\max}}} \tag{7}$$
where $\mathrm{SSE}_{\max}$ is the maximum Sum of Squared Errors, obtained as the bandwidths $b_{il}$, $i = 1, 2, \ldots, n$, $l = 1, 2, \ldots, k$, all tend to infinity, $\mathrm{SSE}_b$ is the Sum of Squared Errors of the LLR fit, $\mathrm{trace}(\mathbf{H}^{\mathrm{LLR}})$ is the trace of the LLR Hat matrix, and $\mathrm{PRESS} = \sum_{i=1}^{n}(y_i - \hat{y}_{i,-i})^2$ is the delete-one cross-validation estimate, where $\hat{y}_{i,-i}$ is the estimate of $y_i$ with the $i$th observation left out, see [33].
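A sketch of the criterion in (7) for a linear smoother with hat matrix $\mathbf{H}$; the delete-one shortcut $e_i/(1 - h_{ii})$ is a standard approximation for the PRESS of a linear smoother, and the toy hat matrix in the usage lines is hypothetical:

```python
import numpy as np

def press_star_star(y, H, sse_max, k):
    """PRESS** in (7): PRESS penalized by the effective degrees of freedom
    and by SSE_b relative to SSE_max (the SSE as all bandwidths -> infinity)."""
    n = len(y)
    resid = y - H @ y
    sse_b = np.sum(resid ** 2)
    press = np.sum((resid / (1.0 - np.diag(H))) ** 2)   # delete-one residuals
    penalty = n - np.trace(H) + (n - (k + 1)) * (sse_max - sse_b) / sse_max
    return press / penalty

# Toy usage with a 3-point smoother matrix (purely illustrative):
H = np.array([[0.6, 0.4, 0.0], [0.3, 0.4, 0.3], [0.0, 0.4, 0.6]])
y = np.array([1.0, 2.0, 1.5])
print(round(press_star_star(y, H, sse_max=2.0, k=1), 4))
```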
The LLR model is flexible and can capture local trends which may be overlooked by the OLS model. However, its performance is generally poor when applied in studies that involve several explanatory variables. This poor performance is referred to as the 'curse of dimensionality' in the nonparametric regression literature [13].
1.4. Genetic algorithm
Once the data have been modeled, the resulting fitted curve is used to determine the setting of the explanatory variables that optimizes the response based on the production requirement. This task summarizes the aim of the Optimization Phase of RSM, also see [20,24]. In this paper, we perform all the optimization tasks using the Genetic Algorithm (GA) Optimization toolbox available in the Matlab software.
The GA was introduced by Holland, see [19]. The GA procedure is based on natural selection and other genetic concepts, including population, chromosomes, selection, crossover, and mutation [7,16]. The GA is an evolutionary optimization tool that can be applied to a variety of problems that are not well suited for standard optimization algorithms, including problems in which the objective function lacks a closed-form expression (such as the LLR model) or is discontinuous, non-differentiable, stochastic, or highly nonlinear, see [2,29,32,35] and Figure 1.
Figure 1.
A basic GA flowchart.
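All optimization in this paper is done with Matlab's GA toolbox; purely to illustrate the loop in Figure 1, the following minimal Python GA sketch minimizes a toy objective (the population size, selection scheme, mutation scale, and objective are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_objective(pop):
    """Hypothetical function to minimize over [0, 1]^2."""
    return (pop[:, 0] - 0.7) ** 2 + (pop[:, 1] - 0.5) ** 2

pop = rng.uniform(0.0, 1.0, size=(40, 2))          # initial population
for _ in range(100):                               # generations
    fitness = toy_objective(pop)
    parents = pop[np.argsort(fitness)[:20]]        # selection: keep the best half
    mates = parents[rng.permutation(20)]
    children = 0.5 * (parents + mates)             # arithmetic crossover
    children += rng.normal(0.0, 0.05, children.shape)   # mutation
    pop = np.clip(np.vstack([parents, children]), 0.0, 1.0)

best = pop[np.argmin(toy_objective(pop))]
print(np.round(best, 3))                           # approaches (0.7, 0.5)
```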
For multiple response studies that involve $m$ responses, it is essential to obtain an optimal setting of the explanatory variables that simultaneously optimizes all the responses with respect to their individual production requirements, see [18,31,34]. The most popular criterion applied in the optimization of multiple responses is the Desirability function, see [1,8,15].
1.5. Desirability function
Based on the production requirement of a response, the Desirability function transforms the estimated response, $\hat{y}$, into a scalar measure, $d$, that lies in the interval [0, 1].
If the response is of the nominal-the-better (NTB) type, where the acceptable value of the response lies between an upper limit, $U$, and a lower limit, $L$, then $d$ is given as:
$$d = \begin{cases} \left(\dfrac{\hat{y} - L}{\phi - L}\right)^{s}, & L \le \hat{y} \le \phi, \\[4pt] \left(\dfrac{\hat{y} - U}{\phi - U}\right)^{t}, & \phi \le \hat{y} \le U, \\[4pt] 0, & \text{otherwise}, \end{cases} \tag{8}$$
where $\phi$ is the target value of the response.
If the objective is to maximize the response, $d$ is given by a one-sided transformation as:
$$d = \begin{cases} 0, & \hat{y} < L, \\[2pt] \left(\dfrac{\hat{y} - L}{\phi - L}\right)^{r}, & L \le \hat{y} \le \phi, \\[4pt] 1, & \hat{y} > \phi, \end{cases} \tag{9}$$
where $\phi$ is interpreted as a large enough value of the response.
If the objective is to minimize the response, $d$ is given by a one-sided transformation as:
$$d = \begin{cases} 1, & \hat{y} < \phi, \\[2pt] \left(\dfrac{\hat{y} - U}{\phi - U}\right)^{r}, & \phi \le \hat{y} \le U, \\[4pt] 0, & \hat{y} > U, \end{cases} \tag{10}$$
where $\phi$ is the small enough value of the response.
In all cases, $s$, $t$, and $r$ are the parameters that control the shape of the desirability function, enabling the user to accommodate nonlinear desirability functions. However, for RSM data, the values of $s$, $t$, and $r$ are taken to be 1, see [6,18].
1.6. The overall desirability
The overall objective of the Desirability criterion is to obtain the setting of the explanatory variables that maximizes the geometric mean ($D$) of all the $m$ individual desirability measures, given as:
$$D = \left(d_1 \times d_2 \times \cdots \times d_m\right)^{1/m} \tag{11}$$
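A minimal sketch of the one-sided transformations in (9) and (10) and the overall desirability in (11), with the shape parameters set to 1 as is done for RSM data; the estimated responses and limits below are hypothetical:

```python
import numpy as np

def d_maximize(y_hat, L, phi, r=1.0):
    """One-sided desirability (9): larger-the-better, lower limit L, target phi."""
    if y_hat <= L:
        return 0.0
    if y_hat >= phi:
        return 1.0
    return ((y_hat - L) / (phi - L)) ** r

def d_minimize(y_hat, U, phi, r=1.0):
    """One-sided desirability (10): smaller-the-better, target phi, upper limit U."""
    if y_hat <= phi:
        return 1.0
    if y_hat >= U:
        return 0.0
    return ((U - y_hat) / (U - phi)) ** r

def overall_D(d):
    """Overall desirability (11): geometric mean of the individual measures."""
    return float(np.prod(d) ** (1.0 / len(d)))

# Hypothetical estimated responses and requirements:
d = [d_maximize(78.8, L=77.0, phi=80.0), d_minimize(3230.0, U=3400.0, phi=3100.0)]
print(round(overall_D(d), 4))
```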
The remainder of the paper is organized as follows: a review of existing locally adaptive bandwidths selectors in RSM concludes the current section. In Section 2, the proposed locally adaptive bandwidths selector is presented along with an algorithm. Using three examples and simulated data, Section 3 compares the results of the LLR utilizing the bandwidths from the proposed locally adaptive bandwidths selector with those from the OLS, the LLR utilizing a fixed bandwidth, and the LLR utilizing the bandwidths from the existing locally adaptive bandwidths selector, as well as results from neural network activation functions such as the SPOCU activation function. The paper concludes in Section 4.
1.7. A review of locally adaptive bandwidths for data from RSM
Locally adaptive bandwidths perform better than their fixed counterparts because of their comparatively better sensitivity to local trends and patterns in the data; also see [4,11,36]. Locally adaptive bandwidths selectors are modeled as functions of local information at each data point. Such local information includes the values of the explanatory variables, or of the response, $y_i$, or both, allowing for different degrees of smoothing at each data point, see [4].
The authors of [9] presented a data-driven locally adaptive bandwidths selector given as:
(12)

where $n$ is the sample size, $y_i$ is the response at the $i$th data point, and $T$ is a tuning parameter.
A drawback of the locally adaptive bandwidths selected by (12) is that they tend to cluster around a small range of values in the interval (0, 1), with a very small difference between the largest and the smallest bandwidths, see [10]. In order to address the clustering of the bandwidths selected by (12), the authors of [10] presented a locally adaptive bandwidths selector given as:
(13)

where the selector combines a fixed optimal bandwidth with a quantity, such as an ordered statistic of the residuals, that reflects the inadequacies in the OLS estimates of the response, together with tuning parameters. The tuning parameters in (12) and (13) are chosen based on the minimization of the $\mathrm{PRESS}^{**}$ criterion in (7), see [10].
The selector in (13) performs very well, giving outstanding results when applied to problems from the literature. However, the curse of dimensionality in the LLR originates from the number of explanatory variables, which is considered in neither (12) nor (13).
The idea that motivates this paper is that a bandwidths selector that assigns a unique bandwidth to each data point for each of the explanatory variables can proffer stronger remedial measures against the curse of dimensionality than one which ignores such vital information about the data.
2. Methodology
We propose a locally adaptive bandwidths selector that incorporates important information in RSM data, namely, the value of each explanatory variable at each data point and the number of explanatory variables in the study.
The mathematical procedure for the modeling of the proposed locally adaptive bandwidths selector is as follows:
Denote the value of the bandwidth at the $i$th data point and for the $l$th explanatory variable as $b_{il}$, and assume that $b_{il}$ is proportional to the value of a weight function $w$ of $x_{il}$:

(14)

(15)

where $T$ is the constant of proportionality: $b_{il}$ may require scaling either upward or downward by the value of the weight function in order to achieve the optimum smoothing requirement in the $i$th data point.
An important attribute of a weight function is the ability to assign a relatively smaller weight to a relatively larger $x_{il}$, and vice versa, according to the smoothing requirement of the data, see [10]. For instance, for two values $x_{il}$ and $x_{jl}$ we may require either $w(x_{il}) \ge w(x_{jl})$ or $w(x_{il}) \le w(x_{jl})$. Mathematically, one of the ways (15) can incorporate this attribute is to express it as:
(16)

where $Z$ is a real number and the exponent '2' guards against negative weights that could arise in some data points from the difference $(Z - x_{il})$. $Z$ plays two key roles: one, for a fixed $Z$, it ensures that no data point is assigned a zero weight; two, it ensures that the attribute described above is embedded and accomplished in the proposed bandwidths selector.
In order to avoid the clustering of bandwidths, we proceed to obtain the optimal value of $Z$ that ensures that the difference between the largest bandwidth and the smallest one in the interval (0, 1) is as large as possible.
From Equation (16) we have:

(17)

Setting $x_{il} = 0$ and $x_{il} = 1$ in (17) gives:

(18)

(19)

Let $R$ denote the range of the weight function over the coded interval. By the Mean Value Theorem, we have:

(20)

(21)

Subtracting Equation (18) from (19) and dividing the result by the length of the interval, we have:

(22)

The left-hand sides of Equations (21) and (22) are equal, so we can write:

(23)

Differentiating Equation (17) with respect to $x_{il}$, we have:

(24)

(25)

Equating Equations (23) and (25), we have:

(26)

(27)

(28)

Therefore, the value of $Z$ obtained in (28) is the optimal value in [0, 1] that guarantees minimum clustering of the locally adaptive bandwidths from (16). Substituting this value into (16) gives the proposed selector:

(29)
The matrix, $\mathbf{B}$, of the locally adaptive optimal bandwidths from Equation (29) is obtained at optimally selected values of the constant of proportionality and of $Z$ (hereafter referred to as $T_1$ and $T_2$, respectively), based on the minimization of the $\mathrm{PRESS}^{**}$ criterion in (7).
The optimal values of the tuning parameters ($T_1$ and $T_2$ for the proposed bandwidths selector in (29); those of [10] in (13)) and the locally adaptive optimal bandwidths for $k$ explanatory variables are presented in Tables 1 and 2.
Table 1.
Optimal values of the tuning parameters $T_1$ and $T_2$ and the bandwidths of the proposed bandwidths selector.
1 | ( ) | |||
2 | ||||
N |
Table 2.
Optimal values of the tuning parameters and the bandwidths from the selector of [10].
1 | ( ) | |||
2 | ||||
N |
Unlike the bandwidths from the proposed bandwidths selector, the bandwidths from the selector of [10] satisfy $b_{i1} = b_{i2} = \cdots = b_{ik}$, since its tuning parameters are the same for all the $k$ explanatory variables; see Figures 2 and 3.
Figure 2.
Plot of bandwidths against explanatory variables for fixed values of T2 less than 1.
Figure 3.
Plot of bandwidths against explanatory variables for fixed values of T2 greater than 1.
From the plots in Figures 2 and 3, we observe that as $T_2$ increases from 0.05 through 0.25, the value of the bandwidth increases as the value of the explanatory variable increases from 0 to 1. At larger values of $T_2$ we notice the beginning of a new trend, which culminates in a parabolic curve. For these values of $T_2$, the vertices of the parabolas show a gradual shift across the coded range of the explanatory variable. In the plots with pseudo-parabolas, the local bandwidth decreases as $x$ increases from 0 to the data point where the vertex of the particular plot is located, and thereafter increases as $x$ increases. Elsewhere, the bandwidth increases as $x$ increases, and the reverse is displayed in the plots for the remaining values of $T_2$. At one value of $T_2$, we have a horizontal curve that indicates a fixed or global bandwidth for all the data points. In addition, no data point is assigned a negative bandwidth. These observations are graphical confirmations of the set objectives regarding the modeling of the proposed locally adaptive bandwidths selector.
2.1. Algorithm: Leave-one-out cross-validation technique for selecting locally adaptive bandwidths for LLR model
Step 1: define the bandwidths $b_{il}$ for the $i$th location and the $l$th of the $k$ explanatory variables.

Step 2: obtain a set of acceptable values of the bandwidths (for RSM data, the interval (0, 1)) within which the locally adaptive bandwidths are located.

Step 3: define the leave-one-out cross-validation over all candidate sets of bandwidths across the complete range of the set in Step 2.

Step 4: define the $\mathrm{PRESS}^{**}$ criterion in (7) for selecting locally adaptive bandwidths on the interval (0, 1), and obtain the estimated response at location $i$, leaving out the $i$th observation, for each candidate set of locally adaptive bandwidths.

Step 5: obtain $\mathrm{SSE}_{\max}$ as the bandwidths tend to infinity.

Step 6: obtain the value of the criterion for the particular candidate set of locally adaptive bandwidths.

Step 7: lastly, report the optimal locally adaptive bandwidths: the candidate set, at convergence, that minimizes the $\mathrm{PRESS}^{**}$ criterion [13].
Step 8: Stop.
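Generically, the algorithm amounts to a leave-one-out search over candidate bandwidths. The sketch below uses a single explanatory variable, hypothetical data, a coarse grid, and the plain leave-one-out SSE in place of the full $\mathrm{PRESS}^{**}$ criterion:

```python
import numpy as np

def loocv_sse(x, y, b):
    """Leave-one-out SSE for a single-variable LLR fit with bandwidth b."""
    n, sse = len(x), 0.0
    for i in range(n):
        keep = np.arange(n) != i
        xt, yt = x[keep], y[keep]
        w = np.exp(-((xt - x[i]) / b) ** 2)          # simplified Gaussian kernel
        X = np.column_stack([np.ones(n - 1), xt])
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * yt))
        sse += (y[i] - (beta[0] + beta[1] * x[i])) ** 2
    return sse

# Hypothetical data; search a coarse grid of candidate bandwidths in (0, 1).
rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 11)
y = np.cos(np.pi * x) + 0.05 * rng.normal(size=11)
grid = np.linspace(0.05, 0.95, 19)
b_opt = min(grid, key=lambda b: loocv_sse(x, y, b))
print(round(float(b_opt), 2))
```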
3. Application
In the first part of this section, a single response problem and two multiple response problems are used to compare the performance of the proposed locally adaptive bandwidths selector with that of the locally adaptive selector of [10]. The aim of the comparisons is to identify the bandwidths selector that has more capacity to reduce the curse of dimensionality in the LLR model applied to RSM data. The goodness-of-fit statistics used for comparison include the Sum of Squared Errors (SSE), the Mean Squared Error (MSE), the Coefficient of Determination ($R^2$), the Adjusted Coefficient of Determination ($R^2_{\mathrm{adj}}$), the $\mathrm{PRESS}^{**}$ criterion given in (7), and $\mathrm{PRESS}^{*} = \mathrm{PRESS}/(n - \mathrm{trace}(\mathbf{H}))$, where $\mathrm{PRESS} = \sum_{i=1}^{n}(y_i - \hat{y}_{i,-i})^2$, $\hat{y}_{i,-i}$ is the leave-one-out estimate of $y_i$, $\mathrm{trace}(\mathbf{H})$ is the trace of the Hat matrix, and $\mathbf{H}$ refers to that of any of the regression models, OLS or LLR. In the second part, we use simulated data to further compare the respective performances.
The SSE and MSE indicate how close the estimated responses are to their observed values. A measure of the amount of variability present in the data that is explained by the regression model is given by the $R^2$ and $R^2_{\mathrm{adj}}$. The $\mathrm{PRESS}^{*}$ and $\mathrm{PRESS}^{**}$ give a measure of the model's predictive accuracy.
The results from the LLR utilizing a fixed bandwidth, the LLR utilizing the locally adaptive bandwidths of [10], and the LLR utilizing those from the proposed locally adaptive bandwidths selector are designated $\mathrm{LLR}_{\mathrm{FB}}$, $\mathrm{LLR}_{\mathrm{AB}}$, and $\mathrm{LLR}_{\mathrm{PAB}}$, respectively, where the subscript PAB stands for Proposed Adaptive Bandwidths selector.
3.1. Single response chemical process data
The problem of the study, as given in [10,27,29], was to relate chemical yield ($y$) to temperature ($x_1$) and time ($x_2$), with the aim of maximizing the chemical yield. The data, obtained using the Central Composite Design (CCD), are given in Table 3.
Table 3.
Single response chemical process data generated from the Central Composite Design.
i | $x_1$ | $x_2$ | $y$
---|---|---|---
1 | −1 | −1 | 88.55
2 | 1 | −1 | 85.80 |
3 | −1 | 1 | 86.29 |
4 | 1 | 1 | 80.44 |
5 | −1.414 | 0 | 85.50 |
6 | 1.414 | 0 | 85.39 |
7 | 0 | −1.414 | 86.22 |
8 | 0 | 1.414 | 85.70 |
9 | 0 | 0 | 90.21 |
10 | 0 | 0 | 90.85 |
11 | 0 | 0 | 91.31 |
Source: see [27].
3.2. Transformation of data from Central Composite Design
Following nonparametric regression procedures in RSM, the values of the explanatory variables are coded between 0 and 1. The data collected via a CCD are transformed by the mathematical relation:
$$v_i = \frac{x_i - \min(\mathbf{x}^{\mathrm{old}})}{\max(\mathbf{x}^{\mathrm{old}}) - \min(\mathbf{x}^{\mathrm{old}})} \tag{30}$$
where $v_i$ is the transformed value, $x_i$ is the target value to be transformed in the vector $\mathbf{x}^{\mathrm{old}}$ containing the old coded values, and $\min(\mathbf{x}^{\mathrm{old}})$ and $\max(\mathbf{x}^{\mathrm{old}})$ are the minimum and maximum values in the vector $\mathbf{x}^{\mathrm{old}}$, respectively [27].
The natural or coded variables in Table 3 can be transformed to the explanatory variables in Table 4 using Equation (30).
Table 4.
The transformed single response chemical process data.
i | $x_1$ | $x_2$ | $y$
---|---|---|---
1 | 0.1464 | 0.1464 | 88.55
2 | 0.8536 | 0.1464 | 85.80 |
3 | 0.1464 | 0.8536 | 86.29 |
4 | 0.8536 | 0.8536 | 80.44 |
5 | 0.0000 | 0.5000 | 85.50 |
6 | 1.0000 | 0.5000 | 85.39 |
7 | 0.5000 | 0.0000 | 86.22 |
8 | 0.5000 | 1.0000 | 85.70 |
9 | 0.5000 | 0.5000 | 90.21 |
10 | 0.5000 | 0.5000 | 90.85 |
11 | 0.5000 | 0.5000 | 91.31 |
Source: see [27].
For example, for location 1, the coded value $x_1 = -1$ is transformed as $v = (-1 - (-1.414))/(1.414 - (-1.414)) = 0.1464$;

for location 2, the coded value $x_1 = 1$ gives $v = (1 + 1.414)/2.828 = 0.8536$;

and for location 6, the coded value $x_1 = 1.414$ gives $v = (1.414 + 1.414)/2.828 = 1.0000$.

Repeating the process up to location 11, and likewise for $x_2$, we obtain the entries for the explanatory variables $x_1$ and $x_2$, respectively, in Table 4.
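A small numpy sketch of the transformation in (30); it reproduces the $x_1$ column of Table 4 from the coded $x_1$ column of Table 3:

```python
import numpy as np

def to_unit_interval(v):
    """Transformation (30): map coded CCD levels onto [0, 1]."""
    v = np.asarray(v, dtype=float)
    return (v - v.min()) / (v.max() - v.min())

x1_coded = np.array([-1, 1, -1, 1, -1.414, 1.414, 0, 0, 0, 0, 0])
print(np.round(to_unit_interval(x1_coded), 4))
# -> [0.1464 0.8536 0.1464 0.8536 0. 1. 0.5 0.5 0.5 0.5 0.5]
```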
The proposed locally adaptive optimal bandwidths are presented in Table 5, and the goodness-of-fit statistics are presented in Table 6.
Table 5.
Proposed optimal tuning parameters and optimal locally adaptive bandwidths for the single response chemical process data.
i | $b_{i1}$ | $b_{i2}$
---|---|---
1 | 0.2672 | 0.1826
2 | 0.0597 | 0.1826 |
3 | 0.2672 | 0.1446 |
4 | 0.0597 | 0.1446 |
5 | 0.3288 | 0.0006 |
6 | 0.0353 | 0.0006 |
7 | 0.1448 | 0.3534 |
8 | 0.1448 | 0.2996 |
9 | 0.1448 | 0.0006 |
10 | 0.1448 | 0.0006 |
11 | 0.1448 | 0.0006 |
Table 6.
Comparison of the goodness-of-fit statistics of each method for the single response chemical process data.
Method | $b$ | $n-\mathrm{tr}(\mathbf{H})$ | MSE | SSE | $R^2$ | $R^2_{\mathrm{adj}}$ | PRESS | $\mathrm{PRESS}^{*}$ | $\mathrm{PRESS}^{**}$
---|---|---|---|---|---|---|---|---|---
OLS | – | 5.000 | 3.1600 | 15.8182 | 0.8388 | 0.6777 | 109.5179 | 21.9036 | 21.9036
$\mathrm{LLR}_{\mathrm{FB}}$ | 0.5200 | 5.6509 | 5.7000 | 32.2355 | 0.6717 | 0.4190 | 93.2835 | 16.5076 | 8.9508
$\mathrm{LLR}_{\mathrm{AB}}$ | * | 2.9261 | 0.5974 | 1.7481 | 0.9822 | 0.9391 | 46.0765 | 15.7467 | 4.2858
$\mathrm{LLR}_{\mathrm{PAB}}$ | * | 2.0537 | 0.3947 | 0.8106 | 0.9917 | 0.9598 | 45.2734 | 22.0443 | 4.5398
Generally, the results in Table 6 show that the $\mathrm{LLR}_{\mathrm{PAB}}$ performs better than the OLS, $\mathrm{LLR}_{\mathrm{FB}}$, and $\mathrm{LLR}_{\mathrm{AB}}$ in terms of the SSE, MSE, $R^2$, $R^2_{\mathrm{adj}}$, and PRESS statistics, whereas the $\mathrm{LLR}_{\mathrm{AB}}$ performs better than the OLS, $\mathrm{LLR}_{\mathrm{FB}}$, and $\mathrm{LLR}_{\mathrm{PAB}}$ in terms of $\mathrm{PRESS}^{*}$ and $\mathrm{PRESS}^{**}$. In Table 6, '*' indicates that the locally adaptive bandwidths of the AB of [10] and the PAB, respectively, vary by data point (see Table 5).
Figure 4 shows the residual plots for the different models; on average, the $\mathrm{LLR}_{\mathrm{PAB}}$ gives the best representation of how well the regression model estimates the mean function in Equation (1).
Figure 4.
Graph of Model Residuals for Single Response Chemical Process Data.
From Table 7, the $\mathrm{LLR}_{\mathrm{PAB}}$ provides the best chemical yield over the OLS, $\mathrm{LLR}_{\mathrm{FB}}$, and $\mathrm{LLR}_{\mathrm{AB}}$, and its setting of the two explanatory variables gives the best process satisfaction.
Table 7.
Comparison of optimization results for the single chemical process data.
Approach | $x_1$ | $x_2$ | $\hat{y}$
---|---|---|---
OLS | 0.43930 | 0.43610 | 90.9780
$\mathrm{LLR}_{\mathrm{FB}}$ | 0.40140 | 0.39438 | 88.3509
$\mathrm{LLR}_{\mathrm{AB}}$ | 0.40771 | 0.42312 | 91.1278
$\mathrm{LLR}_{\mathrm{PAB}}$ | 0.7272 | 0.5000 | 92.6823
3.3. The multiple response chemical process data
This problem is analyzed in [17,18]. The aim of the study is to find the setting of the explanatory variables $x_1$ and $x_2$ (representing reaction time and temperature, respectively) that would simultaneously optimize three quality measures of a chemical solution, $y_1$, $y_2$, and $y_3$ (representing yield, viscosity, and molecular weight, respectively).
Based on the process requirements, a CCD was conducted to establish the designed experiment and the observed responses, as presented in Table 8.
Table 8.
i | $x_1$ | $x_2$ | $y_1$ | $y_2$ | $y_3$
---|---|---|---|---|---
1 | −1 | –1 | 76.5 | 62 | 2940 |
2 | 1 | –1 | 78.0 | 66 | 3680 |
3 | −1 | 1 | 77.0 | 60 | 3470 |
4 | 1 | 1 | 79.5 | 59 | 3890 |
5 | −1.414 | 0 | 75.6 | 71 | 3020 |
6 | 1.414 | 0 | 78.4 | 68 | 3360 |
7 | 0 | –1.414 | 77.0 | 57 | 3150 |
8 | 0 | 1.414 | 78.5 | 58 | 3630 |
9 | 0 | 0 | 79.9 | 72 | 3480 |
10 | 0 | 0 | 80.3 | 69 | 3200 |
11 | 0 | 0 | 80.0 | 68 | 3410 |
12 | 0 | 0 | 79.7 | 70 | 3290 |
13 | 0 | 0 | 79.8 | 71 | 3500 |
The values of the explanatory variables are transformed by the relation in Equation (30) and coded between 0 and 1, as given in Table 9.
Table 9.
The transformed multiple response chemical process data.
i | $x_1$ | $x_2$ | $y_1$ | $y_2$ | $y_3$
---|---|---|---|---|---
1 | 0.1464 | 0.1464 | 76.5 | 62 | 2940
2 | 0.8536 | 0.1464 | 78.0 | 66 | 3680 |
3 | 0.1464 | 0.8536 | 77.0 | 60 | 3470 |
4 | 0.8536 | 0.8536 | 79.5 | 59 | 3890 |
5 | 0.0000 | 0.5000 | 75.6 | 71 | 3020 |
6 | 1.0000 | 0.5000 | 78.4 | 68 | 3360 |
7 | 0.5000 | 0.0000 | 77.0 | 57 | 3150 |
8 | 0.5000 | 1.0000 | 78.5 | 58 | 3630 |
9 | 0.5000 | 0.5000 | 79.9 | 72 | 3480 |
10 | 0.5000 | 0.5000 | 80.3 | 69 | 3200 |
11 | 0.5000 | 0.5000 | 80.0 | 68 | 3410 |
12 | 0.5000 | 0.5000 | 79.7 | 70 | 3290 |
13 | 0.5000 | 0.5000 | 79.8 | 71 | 3500 |
The process requirements for each response are as follows:
Maximize $y_1$ with lower limit $L = 78.5$ and target value $\phi = 80$;

$y_2$ should take a value in the range $L = 62$ to $U = 68$, with target value $\phi = 65$;

Minimize $y_3$ with upper limit $U = 3300$ and target value $\phi = 3100$.
The real values of the explanatory variables are transformed to values in the interval [0, 1] by the mathematical relation in Equation (30). This is a standard procedure for nonparametric regression models, see [29,33]. The data are presented in Table 9. The OLS is applied to get estimates of the parameters of a full second-order polynomial model specified for the three responses.
The optimal values of the tuning parameters of both the proposed bandwidths selector and the selector of [10] for each response are presented in Table 10. Table 11 presents the optimal bandwidths from each bandwidths selector. Table 12 presents the goodness-of-fit statistics.
Table 10.
Optimal values of tuning parameters of the proposed locally adaptive bandwidths selector and [10] for the multiple response chemical process data.
Proposed tuning parameters for locally adaptive bandwidth selector | Edionwe et al. (2016) | ||||||
---|---|---|---|---|---|---|---|
$y_1$ | 1.0031 | 5.9998 | 1.2694 | 1.0541 | 0.5123 | 0.0797 | 6.0399
$y_2$ | 1.5240 | 4.4808 | 0.4149 | 7.6721 | 0.4847 | 0.0959 | 2.4438
$y_3$ | 0.9987 | 3.2763 | 0.6603 | 3.6044 | 1.0000 | 0.0896 | 4.8181
Table 11.
Optimal locally adaptive bandwidths from each selector for the multiple response chemical process data.
| Proposed locally adaptive bandwidths selector |||||| Edionwe et al. (2016) bandwidths selector ||
---|---|---|---|---|---|---|---|---|---
i | $b_{i1}(y_1)$ | $b_{i2}(y_1)$ | $b_{i1}(y_2)$ | $b_{i2}(y_2)$ | $b_{i1}(y_3)$ | $b_{i2}(y_3)$ | $b_i(y_1)$ | $b_i(y_2)$ | $b_i(y_3)$
1 | 0.2269 | 0.1655 | 0.3328 | 0.0960 | 0.2070 | 0.1393 | 0.4041 | 0.1106 | 0.6669 |
2 | 0.1284 | 0.1655 | 0.1460 | 0.0960 | 0.0573 | 0.1393 | 0.2781 | 0.0881 | 0.1755 |
3 | 0.2269 | 0.1218 | 0.3328 | 0.0627 | 0.2070 | 0.0457 | 0.3621 | 0.1219 | 0.3149 |
4 | 0.1284 | 0.1218 | 0.1460 | 0.0627 | 0.0573 | 0.0457 | 0.1521 | 0.1276 | 0.0360 |
5 | 0.2508 | 0.0008 | 0.3810 | 0.0784 | 0.2497 | 0.0862 | 0.4797 | 0.0599 | 0.6138 |
6 | 0.1115 | 0.0008 | 0.1168 | 0.0784 | 0.0379 | 0.0862 | 0.2445 | 0.0768 | 0.3880 |
7 | 0.1741 | 0.3174 | 0.2299 | 0.1037 | 0.1205 | 0.1651 | 0.3621 | 0.1389 | 0.5275 |
8 | 0.1741 | 0.2555 | 0.2299 | 0.0567 | 0.1205 | 0.0327 | 0.2361 | 0.1332 | 0.2087 |
9 | 0.1741 | 0.0008 | 0.2299 | 0.0784 | 0.1205 | 0.0862 | 0.1185 | 0.0542 | 0.3083 |
10 | 0.1741 | 0.0008 | 0.2299 | 0.0784 | 0.1205 | 0.0862 | 0.0849 | 0.0712 | 0.4943 |
11 | 0.1741 | 0.0008 | 0.2299 | 0.0784 | 0.1205 | 0.0862 | 0.1101 | 0.0768 | 0.3548 |
12 | 0.1741 | 0.0008 | 0.2299 | 0.0784 | 0.1205 | 0.0862 | 0.1353 | 0.0655 | 0.4345 |
13 | 0.1741 | 0.0008 | 0.2299 | 0.0784 | 0.1205 | 0.0862 | 0.1269 | 0.0599 | 0.2950 |
Table 12.
Model goodness of fits statistics for the multi-response chemical process data.
Response | Model | $n-\mathrm{tr}(\mathbf{H})$ | MSE | SSE | PRESS | $\mathrm{PRESS}^{*}$ | $R^2$ (%) | $R^2_{\mathrm{adj}}$ (%)
---|---|---|---|---|---|---|---|---
$y_1$ | OLS | 7.0000 | 0.3361 | 2.3525 | 0.4962 | 0.0709 | 98.27 | 97.04
 | $\mathrm{LLR}_{\mathrm{FB}}$ | 7.4717 | 0.5686 | 8.4888 | 4.7536 | 0.6362 | 83.46 | 73.44
 | $\mathrm{LLR}_{\mathrm{AB}}$ | 4.7777 | 0.2063 | 3.0144 | 0.3103 | 0.0649 | 98.92 | 97.29
 | $\mathrm{LLR}_{\mathrm{PAB}}$ | 4.0144 | 0.0481 | 0.6687 | 0.2165 | 0.0539 | 99.25 | 97.75
$y_2$ | OLS | 7.0000 | 28.8726 | 202.1082 | 36.2242 | 5.1749 | 89.98 | 82.81
 | $\mathrm{LLR}_{\mathrm{FB}}$ | 7.2576 | 22.0691 | 330.8149 | 80.2383 | 11.0558 | 77.79 | 63.27
 | $\mathrm{LLR}_{\mathrm{AB}}$ | 4.0000 | 9.2024 | 126.2331 | 10.0000 | 2.5000 | 97.23 | 91.70
 | $\mathrm{LLR}_{\mathrm{PAB}}$ | 4.0009 | 8.8531 | 121.4495 | 10.0000 | 2.4994 | 97.23 | 91.70
$y_3$ | OLS | 7.0000 | 159,080 | 1,113,600 | 207,870 | 29,696 | 75.90 | 58.68
 | $\mathrm{LLR}_{\mathrm{FB}}$ | 9.2798 | 56,513 | 588,010 | 243,460 | 26,235 | 71.77 | 63.50
 | $\mathrm{LLR}_{\mathrm{AB}}$ | 5.8380 | 40,779 | 508,170 | 92,621 | 15,865 | 89.26 | 77.93
 | $\mathrm{LLR}_{\mathrm{PAB}}$ | 4.0000 | 26,504 | 307,560 | 65,720 | 16,430 | 92.38 | 77.14
The results presented in Table 12 show that the $\mathrm{LLR}_{\mathrm{PAB}}$, either exclusively or jointly, gives the best results in terms of all the statistics for $y_1$ and $y_2$. For $y_3$, the $\mathrm{LLR}_{\mathrm{PAB}}$ gives the best results in four out of the seven statistics for comparison. Interestingly, the $\mathrm{LLR}_{\mathrm{PAB}}$ gives the best MSE and SSE for all the responses; see also Figure 5.
Figure 5.
Graphs of model residuals for the multiple response chemical process data.
Figure 5 shows that the residuals of two of the existing models overlap, while those from the $\mathrm{LLR}_{\mathrm{PAB}}$, for the most part, are seen to lie closer to the zero residual line than those from the existing models. Furthermore, quite unlike the curves of the existing models, we observe that approximately the same number of the $\mathrm{LLR}_{\mathrm{PAB}}$ residuals lie above and below the zero residual line in all the curves. This is indicative of the fact that the $\mathrm{LLR}_{\mathrm{PAB}}$ gives the curves of best fit.
The optimization solutions in Table 13 show that the $\mathrm{LLR}_{\mathrm{PAB}}$ provides the settings of the explanatory variables that give the highest desirability measure.
Table 13.
Model optimal solution based on the Desirability function for multi-response chemical process data.
Model | $x_1$ | $x_2$ | $\hat{y}_1$ | $\hat{y}_2$ | $\hat{y}_3$ | $d_1$ | $d_2$ | $d_3$ | $D$ (%)
---|---|---|---|---|---|---|---|---|---
OLS | 0.4449 | 0.2226 | 78.7616 | 66.4827 | 3229.9 | 0.1744 | 0.5058 | 0.3504 | 31.3800
$\mathrm{LLR}_{\mathrm{FB}}$ | 0.4481 | 0.3709 | 78.5537 | 66.7908 | 3290.8 | 0.0358 | 0.4031 | 0.0461 | 8.7200
$\mathrm{LLR}_{\mathrm{AB}}$ | 0.5155 | 0.3467 | 78.6965 | 65.0328 | 3285.9 | 0.1310 | 0.9891 | 0.0703 | 20.8837
$\mathrm{LLR}_{\mathrm{PAB}}$ | 1.0000 | 0.6472 | 79.6033 | 64.0137 | 3212.7 | 0.7355 | 0.6712 | 0.4367 | 59.9647
3.4. The Minced Fish Quality Data
The Minced Fish Quality Data is presented in [31,33]. The problem seeks the setting of three explanatory variables, $x_1$ (washing temperature), $x_2$ (washing time), and $x_3$ (washing ratio of water volume to sample weight), that would optimize four aspects of quality of minced fish, namely, springiness ($y_1$), thiobarbituric acid number ($y_2$), cooking loss ($y_3$), and whiteness index ($y_4$).
Based on the process requirements, a CCD was conducted to establish the designed experiment and the observed responses, as presented in Table 14.
Table 14.
The Minced Fish Quality Data generated through CCD [33].
i | $x_1$ | $x_2$ | $x_3$ | $y_1$ | $y_2$ | $y_3$ | $y_4$
---|---|---|---|---|---|---|---
1 | −1 | −1 | −1 | 1.83 | 29.31 | 29.50 | 50.36 |
2 | 1 | −1 | −1 | 1.73 | 39.32 | 19.40 | 48.16 |
3 | −1 | 1 | −1 | 1.85 | 25.16 | 25.70 | 50.72 |
4 | 1 | 1 | −1 | 1.67 | 40.18 | 27.10 | 49.69 |
5 | −1 | −1 | 1 | 1.86 | 29.82 | 21.40 | 50.09 |
6 | 1 | −1 | 1 | 1.77 | 32.20 | 24.00 | 50.61 |
7 | −1 | 1 | 1 | 1.88 | 22.01 | 19.60 | 50.36 |
8 | 1 | 1 | 1 | 1.66 | 40.02 | 25.10 | 50.42 |
9 | −1.682 | 0 | 0 | 1.81 | 33.00 | 24.20 | 29.31 |
10 | 1.682 | 0 | 0 | 1.37 | 51.59 | 30.60 | 50.67 |
11 | 0 | −1.682 | 0 | 1.85 | 20.35 | 20.90 | 48.75 |
12 | 0 | 1.682 | 0 | 1.92 | 20.53 | 18.90 | 52.70 |
13 | 0 | 0 | −1.682 | 1.88 | 23.85 | 23.00 | 50.19 |
14 | 0 | 0 | 1.682 | 1.90 | 20.16 | 21.20 | 50.86 |
15 | 0 | 0 | 0 | 1.89 | 21.72 | 18.50 | 50.84 |
16 | 0 | 0 | 0 | 1.88 | 21.21 | 18.60 | 50.93 |
17 | 0 | 0 | 0 | 1.87 | 21.55 | 16.80 | 50.98 |
The values of the explanatory variables are transformed by the relation in Equation (30), which codes them between 0 and 1, as given in Table 15.
Table 15.
The transformed Minced Fish Quality Data [33].
i | $x_1$ | $x_2$ | $x_3$ | $y_1$ | $y_2$ | $y_3$ | $y_4$
---|---|---|---|---|---|---|---
1 | 0.2030 | 0.2030 | 0.2030 | 1.83 | 29.31 | 29.50 | 50.36
2 | 0.7970 | 0.2030 | 0.2030 | 1.73 | 39.32 | 19.40 | 48.16 |
3 | 0.2030 | 0.7970 | 0.2030 | 1.85 | 25.16 | 25.70 | 50.72 |
4 | 0.7970 | 0.7970 | 0.2030 | 1.67 | 40.18 | 27.10 | 49.69 |
5 | 0.2030 | 0.2030 | 0.7970 | 1.86 | 29.82 | 21.40 | 50.09 |
6 | 0.7970 | 0.2030 | 0.7970 | 1.77 | 32.20 | 24.00 | 50.61 |
7 | 0.2030 | 0.7970 | 0.7970 | 1.88 | 22.01 | 19.60 | 50.36 |
8 | 0.7970 | 0.7970 | 0.7970 | 1.66 | 40.02 | 25.10 | 50.42 |
9 | 0.0000 | 0.5000 | 0.5000 | 1.81 | 33.00 | 24.20 | 29.31 |
10 | 1.0000 | 0.5000 | 0.5000 | 1.37 | 51.59 | 30.60 | 50.67 |
11 | 0.5000 | 0.0000 | 0.5000 | 1.85 | 20.35 | 20.90 | 48.75 |
12 | 0.5000 | 1.0000 | 0.5000 | 1.92 | 20.53 | 18.90 | 52.70 |
13 | 0.5000 | 0.5000 | 0.0000 | 1.88 | 23.85 | 23.00 | 50.19 |
14 | 0.5000 | 0.5000 | 1.0000 | 1.90 | 20.16 | 21.20 | 50.86 |
15 | 0.5000 | 0.5000 | 0.5000 | 1.89 | 21.72 | 18.50 | 50.84 |
16 | 0.5000 | 0.5000 | 0.5000 | 1.88 | 21.21 | 18.60 | 50.93 |
17 | 0.5000 | 0.5000 | 0.5000 | 1.87 | 21.55 | 16.80 | 50.98 |
The process requirements for each response given in [33] are as follows:
Maximize $y_1$ with lower bound $L = 1.70$ and target value $\phi = 1.92$;

Minimize $y_2$ with target value $\phi = 20.16$ and upper bound $U = 21.00$;

Minimize $y_3$ with target value $\phi = 16.80$ and upper bound $U = 20.00$;

Maximize $y_4$ with lower bound $L = 45.00$ and target value $\phi = 50.98$.
The polynomials specified for the response variables include the intercept and selected linear, quadratic, and interaction terms; the models for $y_1$ and $y_2$ each involve a single explanatory variable, that for $y_3$ involves two, and that for $y_4$ involves all three (the exact terms are given in [33]). The OLS is used to get the estimates of the parameters of these polynomials.
The optimal values of the tuning parameters of both the proposed bandwidths selector and the selector of [10] for each response are presented in Table 16. Table 17 presents the optimal bandwidths from each of the bandwidths selectors. The models' goodness-of-fit statistics are presented in Table 18.
Table 16.
Optimal values of the tuning parameters of the proposed bandwidths selector and [10] for the Minced Fish Quality Data.
Proposed tuning parameters for locally adaptive bandwidth selector | Edionwe et al. (2016) | ||||||||
---|---|---|---|---|---|---|---|---|---|
0.6575 | – | – | – | – | 0.1463 | 0.8441 | 9.3384 | ||
– | – | 0.4363 | 0.0000 | 7.8641 | |||||
0.5371 | 0.0841 | 14.4996 | |||||||
– | – | – | – | 0.1197 | 0.5210 | 10.6354 |
Table 17.
Optimal locally adaptive bandwidths from each selector for the Minced Fish Quality Data.
| Proposed locally adaptive bandwidths selector ||||||| Edionwe et al. (2016) bandwidths selector |||
---|---|---|---|---|---|---|---|---|---|---|---
i | $b_i(y_1)$ | $b_i(y_2)$ | $b_{i1}(y_3)$ | $b_{i2}(y_3)$ | $b_{i1}(y_4)$ | $b_{i2}(y_4)$ | $b_{i3}(y_4)$ | $b_i(y_1)$ | $b_i(y_2)$ | $b_i(y_3)$ | $b_i(y_4)$
1 | 0.0443 | 0.1815 | 0.7673 | 0.2522 | 0.3200 | 0.2847 | 0.1399 | 0.0803 | 0.2044 | 0.1337 | 0.0747 |
2 | 0.1293 | 0.0857 | 0.7673 | 0.1334 | 0.3200 | 0.2847 | 0.0405 | 0.0806 | 0.2742 | 0.6098 | 0.0751 |
3 | 0.0443 | 0.1815 | 0.2631 | 0.2522 | 0.1686 | 0.2847 | 0.1399 | 0.0802 | 0.1755 | 0.3128 | 0.0746 |
4 | 0.1293 | 0.0857 | 0.2631 | 0.1334 | 0.1686 | 0.2847 | 0.0405 | 0.0808 | 0.2802 | 0.2468 | 0.0748 |
5 | 0.0443 | 0.1815 | 0.7673 | 0.2522 | 0.3200 | 0.1599 | 0.1399 | 0.0802 | 0.2080 | 0.5155 | 0.0747 |
6 | 0.1293 | 0.0857 | 0.7673 | 0.1334 | 0.3200 | 0.1599 | 0.0405 | 0.0805 | 0.2246 | 0.3929 | 0.0746 |
7 | 0.0443 | 0.1815 | 0.2631 | 0.2522 | 0.1686 | 0.1599 | 0.1399 | 0.0801 | 0.1535 | 0.6003 | 0.0747 |
8 | 0.1293 | 0.0857 | 0.2631 | 0.1334 | 0.1686 | 0.1599 | 0.0405 | 0.0808 | 0.2791 | 0.3411 | 0.0746 |
9 | 0.1644 | 0.2224 | 0.4823 | 0.3013 | 0.2383 | 0.2178 | 0.1876 | 0.0803 | 0.2301 | 0.3835 | 0.0787 |
10 | 0.3074 | 0.0611 | 0.4823 | 0.1014 | 0.2383 | 0.2178 | 0.0202 | 0.0818 | 0.3598 | 0.0818 | 0.0746 |
11 | 0.0055 | 0.1292 | 1.0000 | 0.1881 | 0.3829 | 0.2178 | 0.0827 | 0.0802 | 0.1419 | 0.5391 | 0.0750 |
12 | 0.0055 | 0.1292 | 0.1512 | 0.1881 | 0.1278 | 0.2178 | 0.0827 | 0.0800 | 0.1432 | 0.6333 | 0.0742 |
13 | 0.0055 | 0.1292 | 0.4823 | 0.1881 | 0.2383 | 0.3356 | 0.0827 | 0.0801 | 0.1663 | 0.4401 | 0.0747 |
14 | 0.0055 | 0.1292 | 0.4823 | 0.1881 | 0.2383 | 0.1254 | 0.0827 | 0.0800 | 0.1406 | 0.5249 | 0.0746 |
15 | 0.0055 | 0.1292 | 0.4823 | 0.1881 | 0.2383 | 0.2178 | 0.0827 | 0.0801 | 0.1515 | 0.6522 | 0.0746 |
16 | 0.0055 | 0.1292 | 0.4823 | 0.1881 | 0.2383 | 0.2178 | 0.0827 | 0.0801 | 0.1479 | 0.6475 | 0.0745 |
17 | 0.0055 | 0.1292 | 0.4823 | 0.1881 | 0.2383 | 0.2178 | 0.0827 | 0.0801 | 0.1503 | 0.7323 | 0.0745 |
Table 18.
Model goodness of fits statistics for the Minced Fish Quality Data.
Response | Model | $n-\mathrm{tr}(\mathbf{H})$ | MSE | SSE | PRESS | $\mathrm{PRESS}^{*}$ | $R^2$ (%) | $R^2_{\mathrm{adj}}$ (%)
---|---|---|---|---|---|---|---|---
$y_1$ | OLS | 14.0000 | 0.0042 | 0.0582 | 0.0231 | 0.0017 | 92.13 | 91.00
 | $\mathrm{LLR}_{\mathrm{FB}}$ | 12.1398 | 0.0026 | 0.0681 | 0.0126 | 0.0010 | 95.70 | 94.33
 | $\mathrm{LLR}_{\mathrm{AB}}$ | 12.0000 | 0.0008 | 0.0216 | 0.0123 | 0.0010 | 95.79 | 94.39
 | $\mathrm{LLR}_{\mathrm{PAB}}$ | 12.0000 | 0.0019 | 0.0491 | 0.0123 | 0.0010 | 95.79 | 94.39
$y_2$ | OLS | 12.0000 | 19.5097 | 234.1166 | 90.9033 | 7.5753 | 93.39 | 91.18
 | $\mathrm{LLR}_{\mathrm{FB}}$ | 11.2152 | 36.4407 | 786.71166 | 245.3568 | 21.8771 | 82.15 | 74.53
 | $\mathrm{LLR}_{\mathrm{AB}}$ | 8.1282 | 16.7007 | 359.9569 | 38.7168 | 4.7633 | 97.18 | 94.45
 | $\mathrm{LLR}_{\mathrm{PAB}}$ | 8.2177 | 7.4867 | 162.1354 | 37.8103 | 4.6011 | 97.25 | 94.64
$y_3$ | OLS | 9.0000 | 20.2719 | 182.4468 | 41.1338 | 4.5704 | 84.06 | 71.66
 | $\mathrm{LLR}_{\mathrm{FB}}$ | 8.3794 | 17.0573 | 287.0907 | 82.1622 | 9.8053 | 68.16 | 39.21
 | $\mathrm{LLR}_{\mathrm{AB}}$ | 5.8585 | 11.5001 | 203.8490 | 20.4613 | 3.4926 | 92.07 | 78.35
 | $\mathrm{LLR}_{\mathrm{PAB}}$ | 2.0443 | 8.0901 | 120.7925 | 2.0489 | 1.0023 | 99.21 | 93.79
$y_4$ | OLS | 14.0000 | 48.9101 | 684.7407 | 198.8048 | 14.2003 | 54.13 | 47.57
 | $\mathrm{LLR}_{\mathrm{FB}}$ | 12.0308 | 17.1477 | 454.5609 | 12.2623 | 1.0193 | 97.17 | 96.24
 | $\mathrm{LLR}_{\mathrm{AB}}$ | 12.0000 | 14.0842 | 372.9912 | 12.1387 | 1.0116 | 97.20 | 96.27
 | $\mathrm{LLR}_{\mathrm{PAB}}$ | 12.0001 | 8.8590 | 234.6134 | 12.1387 | 1.0116 | 97.20 | 96.27
From the results in Table 18, we observe that the $\mathrm{LLR}_{\mathrm{PAB}}$ performs quite as well as the $\mathrm{LLR}_{\mathrm{FB}}$ and the $\mathrm{LLR}_{\mathrm{AB}}$ in both $y_1$ and $y_2$. This is due to the fact that the models for both $y_1$ and $y_2$ involve a single explanatory variable. However, the $\mathrm{LLR}_{\mathrm{PAB}}$ outperforms the OLS, $\mathrm{LLR}_{\mathrm{FB}}$, and $\mathrm{LLR}_{\mathrm{AB}}$ in $y_3$ and $y_4$, which depend on two and three explanatory variables, respectively. Again, the $\mathrm{LLR}_{\mathrm{PAB}}$ gives the best $R^2$ and $R^2_{\mathrm{adj}}$ in three out of the four responses, coming a close second in the remaining one.
The results in Table 19 show that the $\mathrm{LLR}_{\mathrm{PAB}}$ provides the setting of the explanatory variables that gives the highest desirability measure of 100%; see Figure 6.
Table 19.
Model optimal solution via the Desirability function in the Minced Fish Quality Data.
Model | $x_1$ | $x_2$ | $x_3$ | $\hat{y}_1$ | $\hat{y}_2$ | $\hat{y}_3$ | $\hat{y}_4$ | $d_1$ | $d_2$ | $d_3$ | $d_4$ | $D$ (%)
---|---|---|---|---|---|---|---|---|---|---|---|---
OLS | 0.3764 | 1.0000 | 0.7155 | 1.9071 | 19.4993 | 17.2185 | 50.3018 | 0.9415 | 1.00 | 0.8692 | 0.8866 | 92.29
$\mathrm{LLR}_{\mathrm{FB}}$ | 0.8078 | 0.2375 | 0.9573 | 1.6877 | 36.7371 | 24.7076 | 49.7628 | 0.000 | 0.00 | 0.0000 | 0.7965 | 0.00
$\mathrm{LLR}_{\mathrm{AB}}$ | 0.4318 | 1.0000 | 0.5673 | 1.8775 | 18.9436 | 19.6005 | 50.6611 | 0.8068 | 1.00 | 0.1248 | 0.9467 | 55.57
$\mathrm{LLR}_{\mathrm{PAB}}$ | 0.5711 | 0.4481 | 0.6094 | 2.0825 | 20.0918 | 16.7583 | 51.0266 | 1.0000 | 1.00 | 1.0000 | 1.0000 | 100.00
Figure 6.
Graphs of model residuals for the multiple response Minced Fish Quality Data.
The plots in Figure 6 show that the residuals of the existing models largely overlap. However, the $\mathrm{LLR}_{\mathrm{PAB}}$ residuals are seen to lie the closest to the zero residual line among the models. Again, for all the curves, approximately the same number of residuals lie below and above the zero residual line, indicative of the fact that the $\mathrm{LLR}_{\mathrm{PAB}}$ gives the curves of best fit.
3.5. Simulation study
In the examples given in Sections 3.3 and 3.4, it was shown that the goodness-of-fit statistics and the optimal solutions of the $\mathrm{LLR}_{\mathrm{PAB}}$ were either better than or highly competitive with the results from the OLS, $\mathrm{LLR}_{\mathrm{FB}}$, and $\mathrm{LLR}_{\mathrm{AB}}$. In this subsection, we compare the performances of the respective regression models via simulated data. Each Monte Carlo simulation comprises 1000 data sets based on the following underlying polynomial models:
Model 1:
Model 2: ;
Model 3: ;
Model 4:
Model 5:
Model 6: ,
where the $x_{i1}$, $x_{i2}$, and $x_{i3}$ are the values of the explanatory variables, the $\varepsilon_i$ are error terms, which are normally distributed with mean zero and variance 1, and $\gamma$ represents a misspecification parameter. The values of the explanatory variables are presented in Tables 20 and 21.
Table 20.
The CCD for the Simulating Data for Models 1–4.
i | $x_1$ | $x_2$
---|---
1 | 0.8536 | 0.8536
2 | 0.1464 | 0.8536 |
3 | 0.8536 | 0.1464 |
4 | 0.1464 | 0.1464 |
5 | 1.0000 | 0.5000 |
6 | 0.0000 | 0.5000 |
7 | 0.5000 | 1.0000 |
8 | 0.5000 | 0.0000 |
9 | 0.5000 | 0.5000 |
10 | 0.5000 | 0.5000 |
11 | 0.5000 | 0.5000 |
12 | 0.5000 | 0.5000 |
13 | 0.5000 | 0.5000 |
Table 21.
The CCD for the Simulating Data for Models 5 and 6.
i | $x_1$ | $x_2$ | $x_3$
---|---|---|---
1 | 0.2030 | 0.2030 | 0.2030
2 | 0.7970 | 0.2030 | 0.2030 |
3 | 0.2030 | 0.7970 | 0.2030 |
4 | 0.7970 | 0.7970 | 0.2030 |
5 | 0.2030 | 0.2030 | 0.7970 |
6 | 0.7970 | 0.2030 | 0.7970 |
7 | 0.2030 | 0.7970 | 0.7970 |
8 | 0.7970 | 0.7970 | 0.7970 |
9 | 0.0000 | 0.5000 | 0.5000 |
10 | 1.0000 | 0.5000 | 0.5000 |
11 | 0.5000 | 0.0000 | 0.5000 |
12 | 0.5000 | 1.0000 | 0.5000 |
13 | 0.5000 | 0.5000 | 0.0000 |
14 | 0.5000 | 0.5000 | 1.0000 |
15 | 0.5000 | 0.5000 | 0.5000 |
16 | 0.5000 | 0.5000 | 0.5000 |
17 | 0.5000 | 0.5000 | 0.5000 |
The goal of the simulation study is to demonstrate the performance of each of the regression models when applied to studies that involve one, two, or three explanatory variables, respectively. The Average Sum of Squared Errors (AVESSE) of each model for each degree of model misspecification is presented in Table 22.
Table 22.
Comparison of the AVESSE of each method for each model in the simulation studies.
Model | $\gamma$ | OLS | $\mathrm{LLR}_{\mathrm{FB}}$ | $\mathrm{LLR}_{\mathrm{AB}}$ | $\mathrm{LLR}_{\mathrm{PAB}}$
---|---|---|---|---|---
(1) | 0.00 | 9.8961 | 8.3371 | 8.3220 | 8.3133 |
0.50 | 22.5001 | 8.4606 | 8.4105 | 8.4204 | |
1.00 | 48.7471 | 8.4817 | 8.4120 | 8.4310 | |
(2) | 0.00 | 9.8769 | 8.1392 | 8.2887 | 8.2679 |
0.50 | 16.2334 | 8.4989 | 8.2899 | 8.2973 | |
1.00 | 30.5292 | 9.4051 | 9.1337 | 8.9398 | |
(3) | 0.00 | 6.9849 | 68.9816 | 6.3277 | 4.0700 |
0.50 | 18.0887 | 61.6146 | 14.4455 | 4.7940 | |
1.00 | 51.0910 | 99.0211 | 15.1152 | 5.1912 | |
(4) | 0.00 | 7.0210 | 34.0919 | 13.6632 | 4.0198 |
0.50 | 13.7667 | 41.8323 | 20.9044 | 7.0169 | |
1.00 | 39.1912 | 72.1624 | 38.9560 | 8.9640 | |
(5) | 0.00 | 7.0113 | 28.9237 | 6.2117 | 5.8945 |
0.50 | 125.2006 | 254.4773 | 12.5466 | 6.3215 | |
1.00 | 479.6291 | 747.5212 | 71.8911 | 26.041 | |
(6) | 0.00 | 7.2458 | 37.5407 | 7.9100 | 4.7715 |
0.50 | 44.1519 | 64.3340 | 12.1213 | 5.8219 | |
1.00 | 155.2220 | 173.5006 | 22.1993 | 8.8906 |
The values of the AVESSE of the $\mathrm{LLR}_{\mathrm{FB}}$, $\mathrm{LLR}_{\mathrm{AB}}$, and $\mathrm{LLR}_{\mathrm{PAB}}$ for Models 1 and 2 are approximately the same, and better than the AVESSE of the OLS model, across all the degrees of model misspecification. For Models 3 and 4 through Models 5 and 6, where the curse of dimensionality is most intense, the $\mathrm{LLR}_{\mathrm{PAB}}$ gives the best AVESSE. Furthermore, while the AVESSE of the $\mathrm{LLR}_{\mathrm{PAB}}$ is fairly stable as $\gamma$ increases from 0 to 1 across Models 3 through 6, the AVESSE of the OLS, the $\mathrm{LLR}_{\mathrm{FB}}$, and the $\mathrm{LLR}_{\mathrm{AB}}$ deteriorates quite rapidly.
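For illustration, the Monte Carlo structure of the study can be sketched as follows; the mean function, its coefficients, and the misspecification term are hypothetical stand-ins, since the exact forms of Models 1 through 6 are not reproduced above, and only an OLS fit is shown:

```python
import numpy as np

rng = np.random.default_rng(0)

def ols_fit(X, y):
    """Full second-order OLS fit (k = 2) returning fitted values."""
    M = np.column_stack([np.ones(len(y)), X, X ** 2, X[:, :1] * X[:, 1:]])
    return M @ np.linalg.lstsq(M, y, rcond=None)[0]

def avesse(X, fit, n_sim=1000, gamma=0.5):
    """Average SSE over n_sim simulated data sets with N(0, 1) errors and a
    hypothetical misspecification term scaled by gamma."""
    x1, x2 = X[:, 0], X[:, 1]
    mean = 5 + 2 * x1 - 3 * x2 + x1 * x2 + gamma * np.exp(2 * x1)
    sse = 0.0
    for _ in range(n_sim):
        y = mean + rng.normal(size=len(x1))
        sse += np.sum((y - fit(X, y)) ** 2)
    return sse / n_sim

# A transformed CCD like Table 20: factorial, axial, and five center points.
X = np.array([[0.1464, 0.1464], [0.8536, 0.1464], [0.1464, 0.8536],
              [0.8536, 0.8536], [0.0, 0.5], [1.0, 0.5], [0.5, 0.0],
              [0.5, 1.0]] + [[0.5, 0.5]] * 5)
print(round(avesse(X, ols_fit), 3))
```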
3.6. Neural network computing and application
The applications of neural networks cut across multidisciplinary studies, ranging from neuroscience to theoretical statistical physics. More importantly, one of the most significant theoretical and applied topics in neural networks and computing is the choice of adequate activation functions, because activation functions capture the nonlinearity in data and hence form the core of both deep and shallow learning with different architectures [21].
We give the mathematical relation of the SPOCU activation function as presented in the literature and apply it to the three RSM data sets (the single response chemical process data, the multiple response chemical process data, and the Minced Fish Quality Data).
3.6.1. Scaled polynomial constant unit (SPOCU) activation function
The SPOCU activation function is given by:

$$s(x) = \alpha\, h\!\left(\frac{x}{\gamma} + \beta\right) - \alpha\, h(\beta) \tag{31}$$

and the generator:

$$h(x) = \begin{cases} r(c), & x \ge c, \\ r(x), & 0 \le x < c, \\ 0, & x < 0, \end{cases} \qquad r(x) = x^{3}\left(x^{5} - 2x^{4} + 2\right), \tag{32}$$

with $\alpha, \gamma > 0$, $\beta \in (0, 1)$, and $1 \le c < \infty$; $r(c)$ and $s(x)$ tend to infinity as $c \to \infty$, see [21].
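A minimal sketch of (31) and (32); the parameter values below are illustrative defaults, not the tuned values reported in [21]:

```python
import numpy as np

def r(x):
    """Polynomial generator r(x) = x^3 (x^5 - 2x^4 + 2), as in (32)."""
    return x ** 3 * (x ** 5 - 2 * x ** 4 + 2)

def spocu(x, alpha=1.0, beta=0.5, gamma=1.0, c=1.0):
    """SPOCU: s(x) = alpha * h(x/gamma + beta) - alpha * h(beta), with
    h(u) = 0 for u < 0, r(u) for 0 <= u < c, and r(c) for u >= c."""
    def h(u):
        u = np.asarray(u, dtype=float)
        return np.where(u < 0, 0.0, np.where(u < c, r(u), r(c)))
    return alpha * h(x / gamma + beta) - alpha * h(beta)

# s(0) = 0 by construction; positive inputs saturate once x/gamma + beta >= c.
print(np.round(spocu(np.array([-2.0, -0.5, 0.0, 0.5, 2.0])), 4))
```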
The performance statistics obtained from the activation functions ReLU, Leaky-ReLU, SELU, and SPOCU were all adequate, but the SPOCU activation function shows the most satisfactory results in terms of a smaller mean squared error (MSE) than ReLU, Leaky-ReLU, and SELU. The results are presented in Tables 23, 24, and 25 and Figures 7, 8, and 9, respectively.
Figure 7.
Graph of Model Loss function (MSE) via neural network computing for single response chemical process data.
Figure 8.
Graph of Model Loss function (MSE) via neural network computing for multi-response chemical process data.
Figure 9.
Graph of Model Loss function (MSE) via neural network computing for multi-response Minced Fish Quality Data.
4. Conclusion
Quality is one of the most important factors that inform a consumer's preference for one product among several competing products. Consequently, improving the quality of a product is a key strategy for business growth, enhanced competitiveness, and high returns on investment, also see [6,27].
In the early stage of the design of a new product, research teams run experiments and build regression models in order to identify the setting of the explanatory variables that optimize responses related to the quality of the new product. This series of activities is referred to as product qualification in the manufacturing circles, see [25,26].
Once a product has been qualified, its recipe, which includes the identified optimal setting of the explanatory variables, is used to produce the product on a large scale for the intended consumers. The reliability of the optimal setting of the explanatory variables depends on how well the regression model fits the data, also see [6,18]. A regression model that gives a relatively low Prediction Error Sum of Squares, with a comparatively high $R^2$, provides statistically more reliable optimal solutions, see [29,33].
In this paper, we proposed a new locally adaptive bandwidths selector for smoothing RSM data. The proposed bandwidth selector is applied in the LLR model for fitting simulated data and three problems in the literature. The results of the goodness of fits and optimal solutions obtained show that the LLR regression model utilizing the proposed bandwidths selector performs better than the OLS, the fixed bandwidth LLR, and the LLR that utilizes the locally adaptive bandwidths selected by the existing locally adaptive bandwidths selector proposed by [10].
Data consisting of two or three explanatory variables are commonplace in RSM. This creates a problem, referred to as the curse of dimensionality, for the LLR model, which normally thrives in modeling data that involve only a single explanatory variable. However, the results from the three examples and the simulated data show that the model benefits more from bandwidths selected by the proposed locally adaptive bandwidths selector, which takes into account both the number and the values of the explanatory variables at each data point, than it does from fixed bandwidths and from bandwidths selected by the existing locally adaptive bandwidths selector.
Neural network activation functions such as ReLU, Leaky-ReLU, SELU, and SPOCU were considered, with a remarkable improvement in the loss function (MSE) over the regression models utilized in the three RSM data sets. Among the four activation functions, SPOCU was shown to work satisfactorily on the variety of problems over ReLU, Leaky-ReLU, and SELU activation functions; see Tables 23, 24, and 25, respectively.
Table 23.
Loss function (MSE) via neural network computing for single response chemical process data.
Activation function | Loss ($y$)
---|---|
ReLU | 0.0062 |
Leaky-ReLU | 0.0062 |
SELU | 0.0062 |
SPOCU | 0.0062 |
Table 24.
Loss function (MSE) via neural network computing for multi-response chemical process data.
Activation function | Loss ($y_1$) | Loss ($y_2$) | Loss ($y_3$)
---|---|---|---|
ReLU | 0.0329 | 0.0391 | 0.0762 |
Leaky-ReLU | 0.0074 | 0.0277 | 0.0764 |
SELU | 0.0086 | 0.0277 | 0.0762 |
SPOCU | 0.0074 | 0.0277 | 0.0762 |
Table 25.
Loss function (MSE) via neural network computing for multi-response Minced Fish Quality Data.
Activation function | Loss ($y_1$) | Loss ($y_2$) | Loss ($y_3$) | Loss ($y_4$)
---|---|---|---|---|
ReLU | 0.000682 | 0.0186 | 0.0079 | 0.0000233 |
Leaky-ReLU | 0.00073 | 0.0000981 | 0.0079 | 0.0000237 |
SELU | 0.0102 | 0.000203 | 0.0083 | 0.0000259 |
SPOCU | 0.000682 | 0.000148 | 0.0079 | 0.0000232 |
Acknowledgements
I am obliged to my PhD supervisor, Prof. J. I. Mbegbu, for his tutelage. Thanks to Dr E. Edionwe for his relentless contributions.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Declaration of Interest statement
‘None’
References
- 1. Adalarasan R. and Santhanakumar M., Response surface methodology and desirability analysis for optimizing μWEDM parameters for Al6351/20% Al2O3 composite. Int. J. ChemTech Res. 7 (2015), pp. 2625–2631.
- 2. Alvarez M.J., Izarbe L., Viles E., and Tanco M., The use of genetic algorithm in response surface methodology. J. Qual. Technol. Quant. Manag. 6 (2009), pp. 295–309.
- 3. Anderson-Cook C.M. and Prewitt K., Some guidelines for using nonparametric models for modeling data from response surface designs. J. Mod. Appl. Stat. Methods 4 (2005), pp. 106–119.
- 4. Atkeson C.G., Moore A.W., and Schaal S., Locally weighted learning. Artif. Intell. Rev. 11 (1997), pp. 11–73.
- 5. Box G.E.P. and Wilson B., On the experimental attainment of optimum conditions. J. R. Stat. Soc. B 13 (1951), pp. 1–45.
- 6. Castillo D.E., Process Optimization: A Statistical Method, Springer International Series in Operations Research and Management Science, New York, 2007.
- 7. Chen Y. and Ye K., Bayesian hierarchical modelling on dual response surfaces in partially replicated designs. J. Qual. Technol. Quant. Manag. 6 (2009), pp. 371–389.
- 8. Derringer G. and Suich R., Simultaneous optimization of several response variables. J. Qual. Technol. 12 (1980), pp. 214–219.
- 9. Edionwe E. and Mbegbu J.I., Local bandwidths for improving the performance statistics of model robust regression 2. J. Mod. Appl. Stat. Methods 13 (2014), pp. 506–527.
- 10. Edionwe E., Mbegbu J.I., and Chinwe R., A new function for generating local bandwidths for semi-parametric MRR2 model in response surface methodology. J. Qual. Technol. 48 (2016), pp. 388–404.
- 11. Fan J. and Gijbels I., Data-driven bandwidth selection in local polynomial fitting: A variable bandwidth and spatial adaptation. J. R. Stat. Soc. Ser. B 57 (1995), pp. 371–394.
- 12. Fan J. and Gijbels I., Local Polynomial Modeling and its Applications, Chapman and Hall, London, 1996.
- 13. Geenens G., Curse of dimensionality and related issues in nonparametric functional regression. Stat. Surv. 5 (2011), pp. 30–43.
- 14. Hardle W., Muller M., Sperlich S., and Werwatz A., Nonparametric and Semiparametric Models: An Introduction, Springer-Verlag, Berlin, 2005.
- 15. Harrington E.C., The desirability function. Ind. Qual. Control 21 (1965), pp. 494–498.
- 16. Heredia-Langner A., Montgomery D.C., Carlyle W.M., and Borer C.M., Model robust optimal designs: A genetic algorithm method. J. Qual. Technol. 36 (2004), pp. 263–279.
- 17. He Z., Wang J., Oh J., and Park S.H., Robust optimization for multiple responses using response surface methodology. Appl. Stoch. Models Bus. Ind. 26 (2009), pp. 157–171.
- 18. He Z., Zhu P.E., and Park S.H., A robust desirability function for multi-response surface optimization. Eur. J. Oper. Res. 221 (2012), pp. 241–247.
- 19. Holland J., Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, 1975.
- 20. Johnson R.T. and Montgomery D.C., Choice of second-order response surface designs for logistics and Poisson regression models. Int. J. Exp. Des. Process Optim. 1 (2009), pp. 2–23.
- 21. Kiselak J., Lu Y., Svihra J., Szepe P., and Stehlik M., "SPOCU": scaled polynomial constant unit activation function. Neural Comput. Appl. (2020), doi: 10.1007/s00521-020-05182-1.
- 22. Kohler M.A., Schindler A., and Sperlich S., A review and comparison of bandwidth selection methods for kernel regression. Int. Stat. Rev. 82 (2014), pp. 243–274.
- 23. Mays J.E., Birch J.B., and Starnes B.A., Model robust regression: Combining parametric, nonparametric, and semi-parametric models. J. Nonparametr. Stat. 13 (2001), pp. 245–277.
- 24. Mondal A. and Datta A.K., Investigation of the process parameters using response surface methodology on the quality of crustless bread baked in a water-spraying oven. J. Food Process Eng. 34 (2011), pp. 1819–1837.
- 25. Montgomery D.C., Introduction to Statistical Quality Control, 7th ed., John Wiley & Sons, New York, 2009.
- 26. Myers R.H., Response surface methodology – Current status and future directions. J. Qual. Technol. 31 (1999), pp. 30–44.
- 27. Myers R., Montgomery D.C., and Anderson-Cook C.M., Response Surface Methodology: Process and Product Optimization Using Designed Experiments, Wiley, Toronto, ON, 2009.
- 28. Nadaraya E.A., On estimating regression. J. Theory Probab. Appl. 9 (1964), pp. 141–142.
- 29. Pickle S.M., Robinson T.J., Birch J.B., and Anderson-Cook C.M., A semi-parametric model to robust parameter design. J. Stat. Plan. Inference 138 (2008), pp. 114–131.
- 30. Sestelo M., Villanueva N.M., Meira-Machado L., and Roca-Pardinas J., An R package for nonparametric estimation and inference in life sciences. J. Stat. Softw. 82 (2017), pp. 1–27.
- 31. Shah K.H., Montgomery D.C., and Carlyle W.M., Response surface modelling and optimization in multi-response experiments using seemingly unrelated regressions. Qual. Eng. 16 (2004), pp. 387–397.
- 32. Thongsook S., Borkowski J.J., and Budsaba K., Using a genetic algorithm to generate Ds-optimal designs with bounded D-efficiencies for mixture experiments. J. Thail. Stat. 12 (2014), pp. 191–205.
- 33. Wan W. and Birch J.B., A semi-parametric technique for multi-response optimization. J. Qual. Reliab. Eng. Int. 27 (2011), pp. 47–59.
- 34. Wu C.F.J. and Hamada M.S., Experiments: Planning, Analysis and Parameter Design Optimization, John Wiley & Sons, New York, 2000.
- 35. Yeniay O., Comparative study of algorithms for response surface optimization. J. Math. Comput. Appl. 19 (2014), pp. 93–104.
- 36. Zheng Q., Gallagher C., and Kulasekera K.B., Adaptively weighted kernel regression. J. Nonparametr. Stat. 25 (2013), pp. 855–872.