Author manuscript; available in PMC: 2020 Jan 1.
Published in final edited form as: IEEE Access. 2019 Jan 1;7:7133–7146. doi: 10.1109/ACCESS.2018.2890593

Finding High-Dimensional D-Optimal Designs for Logistic Models via Differential Evolution

WEINAN XU, WENG KEE WONG, KAY CHEN TAN, JIANXIN XU
PMCID: PMC6497399  NIHMSID: NIHMS1519531  PMID: 31058044

Abstract

D-optimal designs are frequently used in controlled experiments to obtain the most accurate estimate of model parameters at minimal cost. Finding them can be a challenging task, especially when there are many factors in a nonlinear model. As the number of factors becomes large and the factors interact with one another, there are many more variables to optimize, and the D-optimal design problem becomes high-dimensional and non-separable. Consequently, premature convergence issues arise: candidate solutions get trapped in local optima, and the classical gradient-based optimization approaches for finding D-optimal designs rarely succeed. We propose a specially designed version of differential evolution (DE), a representative gradient-free optimization approach, to solve such high-dimensional optimization problems. The proposed DE uses a new novelty-based mutation strategy to explore the various regions of the search space. Each region is explored differently from the previously explored regions, so the diversity of the population is preserved. The proposed novelty-based mutation strategy is combined with two common DE mutation strategies to balance exploration and exploitation at the early and medium stages of the evolution. Additionally, we adapt the control parameters of DE as the evolution proceeds. Using logistic models with several factors on various design spaces as examples, our simulation results show that our algorithm can find D-optimal designs efficiently and that it outperforms its competitors. As an application, we apply our algorithm to re-design a 10-factor car refueling experiment with discrete and continuous factors and selected pairwise interactions. Our proposed algorithm consistently outperformed the other algorithms and found a more efficient D-optimal design for the problem.

Keywords: Approximate design, design efficiency, generalized linear model, high-dimensional, non-separable, sensitivity function

I. Introduction

OPTIMAL design problems frequently arise in scientific investigations when we want to obtain the most accurate statistical inference at minimal cost. For example, D-optimal designs are commonly used to estimate parameters in the statistical model by minimizing the volume of the confidence ellipsoid of the parameters. When the model is nonlinear, the design criterion contains the unknown model parameters, which we want to estimate. Nominal values for the parameters are required to replace the unknown parameters before optimization and the resulting optimal design is termed locally optimal [1], [2] because it depends on the nominal values. Nominal values for the parameters may come from an expert’s opinion or from a pilot study. The locally D-optimal design is then implemented to generate data to estimate the model parameters and the estimated parameters become the nominal values in the next step. The expectation is that after a couple of iterations, the estimates will become stable.

In the statistical literature, the optimal design is usually found from theory, and when the model is nonlinear, there are usually only one or two factors. The theoretical approach encounters mathematical difficulties when the nonlinear model has several factors or the design criterion becomes complicated. Under such situations, our experience is that classical numerical optimization techniques fail to find the locally optimal design or become very inefficient. This is because as the number of factors in the model increases, the number of parameters in the model also increases. Consequently, the number of design points for the optimal design increases, resulting in substantially more variables to optimize. The design problem thus quickly becomes high-dimensional, and also non-separable when factors interact with one another. Premature convergence can become a severe issue since solutions can easily get trapped in local optima.

Nature-inspired metaheuristic algorithms are now increasingly applied to solve a large variety of complicated optimization problems [3], [4]. Particle Swarm Optimization (PSO) is one such algorithm [5], [6], [7], and it has recently been used to solve various optimal design problems in the statistical literature [8], [9], [10]. However, the D-optimal design problems in these papers have only 3 or fewer factors in the statistical model, so premature convergence may not be an issue. Since PSO exerts selective pressure toward the current best solutions, termed gbest and pbest, our experience is that models with 4 or more factors can cause PSO to converge prematurely and make PSO less effective [11], [12].

Differential Evolution (DE) belongs to the family of gradient-free evolutionary algorithms. Mutation, crossover and selection are its three fundamental operations [13], [14]. One advantage that DE has over other evolutionary algorithms is that it has fewer control parameters [15], [16], [17], and it works well on numerical optimization problems [18], [19], [20], [21]. Compared with PSO, DE can moderately alleviate the premature convergence issue [13] since most DE mutation strategies do not exert selective pressure toward the current best solution [22], [23], [24], [25], [26]. However, among the DE variants proposed for solving high-dimensional problems, there is no specially designed mechanism to explore novel regions of the search space and to preserve the diversity of the population.

To circumvent the above issues, and motivated by novelty search methods [27], [28], which can escape from local optima by trying novel solutions for efficient exploration, we propose a new novelty-based mutation strategy. At the start of the evolution, a portion of the individuals are randomly selected as novelty-based individuals, whose aim is to explore individuals that are potentially novel. For each novelty-based individual, we sample several difference vectors to be added to the current individual. Among these sampled difference vectors, we select the one with the largest angle from the difference vector used in the previous generation. Each novelty-based individual thus explores a region of the search space different from the region explored in the previous generation, so novel solutions can be obtained. As the evolution proceeds, various regions of the search space are explored and the diversity of the population is enhanced. The novelty-based mutation strategy is combined with two common mutation strategies, ‘DE/rand/2’ and ‘DE/current-to-rand/1’. These two mutation strategies balance exploration and exploitation well at the early and medium stages of evolution compared with other mutation strategies [29]. When the individuals obtained from these two mutation strategies converge, the novelty-based individuals can provide information about recently explored regions of the search space so that the convergent individuals can both exploit their current region and explore more regions of the search space.

We apply the proposed algorithm to generate locally D-optimal designs for logistic models with several factors, with and without interactions, on various design spaces. Logistic regression models have a binary response with one or more factors and are among the most frequently used models in scientific investigations across many disciplines. Using a broad simulation study, we show that our proposed algorithm consistently outperforms several of its top competitors. As an application, we implement our DE-based algorithm to re-design a 10-factor car refueling experiment with both discrete and continuous factors, with and without factor interactions.

The remainder of this paper is organized as follows. Section II introduces the statistical background and locally D-optimal designs for logistic regression models. It also reviews previous applications of PSO to optimal design problems and gives a literature review of DE algorithms. In Section III, we propose a new DE algorithm, NovDE, and in Section IV, we apply it to construct and study properties of D-optimal designs on various design spaces. In Section V, we apply the proposed algorithm to generate D-optimal designs for a ten-factor car refueling experiment, with and without factor interactions, where the experiment has mixed factors. Section VI concludes with a summary of our work.

II. Background

A. Locally D-Optimal Designs for Logistic Regression Models

A generalized linear model is commonly used to study the mean of a response variable Y as a function of n independent variables [1]. We focus on models with a binary response variable even though the methodology proposed herein applies more generally. Let E(Y_l) = μ_l and let η_l = r^T(x_l)β be the linear predictor, where r(x) is a user-selected regression function that depends on the n factors. Additionally, let g(.) be a monotonic function such that g(μ_l) = η_l [30]. Some common choices for the regression function are r(x)^T = {1, x_1, … , x_n} (additive model) or r(x)^T = {1, x_1, … , x_n, x_1x_2, … , x_{n−1}x_n} (model with all pairwise interaction terms). We assume Y_k is independent of Y_l if l ≠ k, and the design space is a user-selected compact set containing all allowable combinations of the factor levels at which the response may be observed.

For the logistic model, we have

$g(\mu_l) = \log\left(\frac{\mu_l}{1-\mu_l}\right) = \eta_l$ (1)

Our goal is to find an optimal set of factor levels x_1, … , x_L to estimate the vector of parameters β in the linear predictor [2], [31] when we are given resources to take N observations. This means that we determine the optimal number of support points required, i.e. the value of L, the best choices of the support points x_1, … , x_L from a given design space, and the optimal number of replicates n_i at x_i, i = 1, … , L, subject to n_1 + … + n_L = N. The upshot is that we have a constrained optimization problem where some of the variables to be optimized are positive integers constrained to sum to N.

Following [31], the worth of an L-point design ξ with n_l replicates at x_l is determined by its Fisher information matrix defined by

$I_\xi = \sum_{l=1}^{L} n_l \,\Upsilon(\eta_l)\, r(x_l) r(x_l)^T,$ (2)

where $\Upsilon(\eta_l) = \left(\frac{d\mu_l}{d\eta_l}\right)^2 \Big/ \big(\mu_l(1-\mu_l)\big)$. For the logistic regression model, the link function is the logit function in (1) and

$\Upsilon(\eta_l) = \frac{1}{2 + e^{\eta_l} + e^{-\eta_l}} = \frac{e^{\eta_l}}{(1+e^{\eta_l})^2}.$ (3)

A locally D-optimal design maximizes the log-determinant of the Fisher information matrix Iξ in (2), or equivalently minimizes the generalized variance of the estimates of the parameters. Thus, D-optimal designs provide the most accurate estimates of all the model parameters in β. Clearly, Iξ depends on β and so nominal values for β are required before optimization. Frequently, the nominal values for β come from prior experiences or a pilot study [32].

We focus on approximate designs obtained by replacing each n_l by w_l = n_l/N, the proportion of the total observations to be taken at x_l. More generally, we allow each w_l to take on any value between 0 and 1, and doing so turns the problem into a convex optimization problem where convex optimization tools can be used to find and verify the optimality of a design. Designs with weights w_l that sum to unity are called approximate designs.
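To make this concrete, the following minimal sketch (in Python, for illustration only; the paper's implementation is in MATLAB, and the function name and the additive regression function r(x) = (1, x_1, … , x_n) are illustrative assumptions) evaluates the log-determinant of the information matrix in (2) with the logistic weight (3) for an approximate design.

```python
import numpy as np

def logdet_information(points, weights, beta):
    """Log-determinant of the Fisher information matrix (2) for an approximate
    design of a logistic model with additive linear predictor r(x) = (1, x)."""
    points = np.atleast_2d(points)                       # (L, n) support points
    weights = np.asarray(weights, dtype=float)           # (L,) weights summing to 1
    r = np.hstack([np.ones((points.shape[0], 1)), points])
    eta = r @ np.asarray(beta, dtype=float)              # linear predictors eta_l
    ups = np.exp(eta) / (1.0 + np.exp(eta)) ** 2         # Upsilon(eta_l) as in (3)
    info = (r * (weights * ups)[:, None]).T @ r          # sum_l w_l Ups(eta_l) r r^T
    sign, logdet = np.linalg.slogdet(info)
    return logdet if sign > 0 else -np.inf               # singular designs get -inf
```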

For D-optimality, the design criterion is −log|I_ξ|, and this is a convex function over the space of all approximate designs on the given compact design space of interest [1]. Following [33], the approximate design ξ* is locally D-optimal among all designs if and only if, for all x in the design space, the following checking condition is satisfied:

$\frac{e^{\beta^T r(x)}}{\left(1+e^{\beta^T r(x)}\right)^2}\, r(x)^T I_{\xi^*}^{-1} r(x) - k \le 0$ (4)

with equality at each support point of ξ*. Here k is the dimension of β and the left-hand side of (4) is sometimes called the sensitivity function.

Often, the worth of a design ξ is measured by its efficiency relative to the optimal design ξ* [1]. For D-optimality, the D-efficiency of a design ξ is

$\left(\frac{\det(I_\xi)}{\det(I_{\xi^*})}\right)^{1/k}.$

If its D-efficiency is near 1, ξ is close to ξ*. If the theoretical optimal design ξ* is unknown, the proximity of a design ξ to ξ* can be determined from convex analysis theory. Specifically, its D-efficiency is at least exp(−θ/k), where θ is the maximum positive value of the sensitivity function across the entire design space [34]. If the D-efficiency lower bound is close to 1, the design ξ is close to the D-optimal design ξ*.
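As an illustration of how the checking condition (4) and the bound exp(−θ/k) can be evaluated numerically, here is a hedged sketch that computes the sensitivity function over a user-supplied set of candidate points (a grid or random sample of the design space); the additive regression function and the function names are assumptions carried over from the previous sketch.

```python
import numpy as np

def d_efficiency_lower_bound(points, weights, beta, candidates):
    """Evaluate the sensitivity function in (4) at a set of candidate points and
    return the lower bound exp(-theta/k) on the D-efficiency of the design."""
    beta = np.asarray(beta, dtype=float)
    r_sup = np.hstack([np.ones((len(points), 1)), np.atleast_2d(points)])
    eta = r_sup @ beta
    ups = np.exp(eta) / (1.0 + np.exp(eta)) ** 2
    info = (r_sup * (np.asarray(weights, dtype=float) * ups)[:, None]).T @ r_sup
    info_inv = np.linalg.inv(info)
    r_cand = np.hstack([np.ones((len(candidates), 1)), np.atleast_2d(candidates)])
    eta_c = r_cand @ beta
    ups_c = np.exp(eta_c) / (1.0 + np.exp(eta_c)) ** 2
    # sensitivity function: Ups(eta) * r(x)^T I^{-1} r(x) - k, for each candidate x
    sens = ups_c * np.einsum('ij,jk,ik->i', r_cand, info_inv, r_cand) - len(beta)
    theta = max(float(sens.max()), 0.0)                  # maximum positive value of (4)
    return float(np.exp(-theta / len(beta)))
```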

B. Fundamentals of Differential Evolution

Differential Evolution (DE) [13] was proposed by Storn and Price in 1995. It is a population-based optimization algorithm that searches for the optimum iteratively. DE is simple to implement and performs well on various types of optimization problems. Compared with other evolutionary algorithms (EAs), the space complexity of DE is low [14] and the number of control parameters in DE is small [15], [16], [17]. There are two control parameters in DE: a scaling factor F for mutation and a crossover rate CR for the crossover operation. The parameter F controls the convergence speed and the parameter CR affects both the convergence and the diversity of the population [13], [35], [36].

To fix ideas, suppose f(X) is the given objective function and we want to minimize it over a user-selected D-dimensional space comprising the decision variables. DE has three main operations: mutation, crossover and selection. Each solution of generation g is represented by Xi,g, where i is the index of the corresponding solution. Sometimes Xi,g is referred to as the target vector, which needs to be updated for the next generation g + 1. Mutation generates a mutant vector Vi,g, followed by a crossover which then generates a trial vector Ui,g based on both Vi,g and Xi,g. The next step is Selection, where a decision is made whether to update the solution Xi,g+1 from Ui,g or Xi,g based on their objective function values. Some details for the three operations follow.

1). Mutation:

Each target vector Xi,g generates a new individual, called the mutant vector Vi,g and some frequently used mutation strategies are listed below.

“DE/rand/1”:

$V_{i,g} = X_{r_1,g} + F(X_{r_2,g} - X_{r_3,g})$ (5)

“DE/rand-to-best/2”:

$V_{i,g} = X_{i,g} + F(X_{best,g} - X_{i,g}) + F(X_{r_1,g} - X_{r_2,g}) + F(X_{r_3,g} - X_{r_4,g})$ (6)

“DE/rand/2”:

$V_{i,g} = X_{r_1,g} + F(X_{r_2,g} - X_{r_3,g}) + F(X_{r_4,g} - X_{r_5,g})$ (7)

“DE/current-to-rand/1”:

$V_{i,g} = X_{i,g} + K(X_{r_1,g} - X_{i,g}) + F(X_{r_2,g} - X_{r_3,g})$ (8)

In (8), K is randomly generated from [0, 1]. The subscripts r_1 to r_5 of X in (5)–(8) denote random individuals selected from the population pool.
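For concreteness, a small illustrative sketch of the four mutation strategies (5)–(8) follows, for a population stored as an (N, D) array; the function name and interface are illustrative assumptions.

```python
import numpy as np

def mutate(pop, i, strategy, F, best=None):
    """Return a mutant vector V_{i,g} using one of the strategies (5)-(8).
    pop is an (N, D) population array and i the index of the target vector."""
    N = pop.shape[0]
    # draw five distinct random indices different from i, as in (5)-(8)
    r1, r2, r3, r4, r5 = np.random.choice(
        [j for j in range(N) if j != i], size=5, replace=False)
    x = pop[i]
    if strategy == 'DE/rand/1':                  # (5)
        return pop[r1] + F * (pop[r2] - pop[r3])
    if strategy == 'DE/rand-to-best/2':          # (6), needs the current best vector
        return x + F * (best - x) + F * (pop[r1] - pop[r2]) + F * (pop[r3] - pop[r4])
    if strategy == 'DE/rand/2':                  # (7)
        return pop[r1] + F * (pop[r2] - pop[r3]) + F * (pop[r4] - pop[r5])
    if strategy == 'DE/current-to-rand/1':       # (8), K drawn uniformly from [0, 1]
        K = np.random.rand()
        return x + K * (pop[r1] - x) + F * (pop[r2] - pop[r3])
    raise ValueError('unknown strategy: ' + strategy)
```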

2). Crossover:

The crossover operation is employed after mutation. In crossover, the mutant vector V_{i,g} is recombined with the original individual X_{i,g} to form the trial vector U_{i,g}. The two crossover schemes in DE are binomial crossover and exponential crossover. Binomial crossover is commonly used in DE to determine the trial vector as follows [24]:

$u^j_{i,g} = \begin{cases} v^j_{i,g}, & \text{if } \mathrm{rand}(0,1) \le Cr \text{ or } j = j_{rand} \\ x^j_{i,g}, & \text{otherwise} \end{cases}$ (9)

where jrand ∈ {1, 2, 3, … , D} is a randomly selected index to ensure that the trial vector Ui,g can get at least one variable from the mutant vector Vi,g. The notation rand(0,1) is a uniform random number from the interval [0,1] and Cr is the pre-specified crossover rate.
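A minimal sketch of the binomial crossover (9) is given below; the vector interface is an illustrative assumption.

```python
import numpy as np

def binomial_crossover(x, v, cr):
    """Binomial crossover (9): each coordinate is copied from the mutant vector v
    with probability cr; coordinate j_rand is always taken from v."""
    D = len(x)
    j_rand = np.random.randint(D)
    mask = np.random.rand(D) <= cr
    mask[j_rand] = True                 # guarantee at least one mutant coordinate
    return np.where(mask, v, x)
```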

An exponential crossover is another way to implement a crossover [37]. An integer z is randomly generated from [1, D]. Another integer L, i.e. the length of the block of decision variables to be mutated, is determined as follows:

L = 0
WHILE (rand(0,1) ≤ Cr AND L ≤ D) DO (L = L + 1)

If L ≥1, the trial vector Ui,g is generated as follows:

$u^j_{i,g} = \begin{cases} v^j_{i,g}, & \text{for } j = z, z+1, \ldots, z+L-1 \\ x^j_{i,g}, & \text{otherwise} \end{cases}$ (10)

If L = 0, then Ui,g is identical to Xi,g.
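The exponential crossover can be sketched as below; wrapping the indices z, z + 1, … , z + L − 1 around the end of the vector is a common convention that we assume here, and the block length is capped at D.

```python
import numpy as np

def exponential_crossover(x, v, cr):
    """Exponential crossover (10): copy a contiguous block of length L, starting
    at a random position z, from the mutant vector v into the target vector x."""
    D = len(x)
    z = np.random.randint(D)
    L = 0
    while np.random.rand() <= cr and L < D:     # determine block length L (capped at D)
        L += 1
    u = np.array(x, copy=True)
    for j in range(L):                          # if L = 0, u stays identical to x
        u[(z + j) % D] = v[(z + j) % D]         # indices wrap around (assumption)
    return u
```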

3). Selection:

Selection is the last step to determine whether the trial vector Ui,g survives to enter the next generation based on the objective function value f (Ui,g).

The selection operation in DE is described below:

$X_{i,g+1} = \begin{cases} U_{i,g}, & \text{if } f(U_{i,g}) \le f(X_{i,g}) \\ X_{i,g}, & \text{otherwise} \end{cases}$ (11)
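Putting the three operations together, a minimal self-contained ‘DE/rand/1/bin’ loop might look like the following sketch (illustrative only; the box constraints, default parameter values and stopping rule are assumptions).

```python
import numpy as np

def de_rand_1_bin(f, low, high, N=50, F=0.5, CR=0.9, max_gen=1000):
    """Minimal 'DE/rand/1/bin' minimiser combining mutation (5), binomial
    crossover (9) and greedy selection (11) over the box [low, high]^D."""
    low, high = np.asarray(low, float), np.asarray(high, float)
    D = low.size
    pop = low + np.random.rand(N, D) * (high - low)       # random initial population
    fit = np.array([f(x) for x in pop])
    for _ in range(max_gen):
        for i in range(N):
            r1, r2, r3 = np.random.choice([j for j in range(N) if j != i],
                                          size=3, replace=False)
            v = pop[r1] + F * (pop[r2] - pop[r3])          # mutation (5)
            j_rand = np.random.randint(D)
            mask = np.random.rand(D) <= CR
            mask[j_rand] = True
            u = np.clip(np.where(mask, v, pop[i]), low, high)   # crossover (9)
            fu = f(u)
            if fu <= fit[i]:                               # selection (11)
                pop[i], fit[i] = u, fu
    b = int(np.argmin(fit))
    return pop[b], fit[b]
```

For instance, maximizing the log-determinant in (2) corresponds to minimizing its negative, so f could decode an individual into a design and return −log det I_ξ.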

C. Literature Review of Differential Evolution

1). The Adaptation Scheme of Control Parameters:

The success of DE in solving a specific problem crucially depends on the appropriate choice of mutation strategies and the associated control parameter values. Many DE studies propose mutation strategies that stay fixed for the entire evolution process, but a few, such as SaDE [29], adapt the choice of mutation strategy based on the successful experiences of previous generations.

In terms of control parameter adaptation schemes, most DE studies adapt F and CR based on a pre-defined distribution whose mean depends on the successful F or CR values in previous generations. In [38], a new crossover method, Multiple Exponential Recombination (MER), that combines the advantages of binomial and exponential crossover was proposed for non-separable problems, where the decision variables depend on one another [39]. It has been shown both theoretically and empirically that, for the same value of CR, MER can yield improved performance. Hence, it is promising to embed MER into the control parameter adaptation scheme when solving non-separable optimization problems.

2). High-dimensional Problems:

For solving high-dimensional problems, DE algorithms fall into either a cooperative coevolution (CC) framework or a non-cooperative framework [40]. A CC-based framework partitions either the population into subpopulations or the decision variables into subcomponents; the optimization is parallel within each subgroup and centralized across the entire group. In [41], DECC-DML adopted the CC framework and proposed a new partition strategy called delta grouping. To emphasize the interaction between variables in the same group, the improvement interval of interacting variables in different groups is limited. DECC-DML was efficient in solving non-separable problems with one group of rotated variables, but not when there are multiple groups. In [42], DCDE applied the CC framework with a ring connection to enhance the interactions among variables in different groups so that exploration and exploitation can be balanced; DCDE was capable of solving some non-separable and multimodal high-dimensional problems. In [43], DDE-AMS solved high-dimensional problems via a distributed differential evolution with adaptive mergence and split of subpopulations; the mergence and split operators made full use of the population resources to solve the problems cooperatively and efficiently. For non-CC frameworks, most DE studies focused on adding adaptive mechanisms or proposing new mutation strategies; in [44], F, CR, the population size and the mutation strategies were all adapted, and in [40], a new triangular mutation strategy was proposed.

III. Proposed Algorithm: NovDE

A. Overview

Since the D-optimal design problems in this paper are high-dimensional and non-separable, premature convergence can be a severe issue, with solutions easily getting trapped in local optima. Compounding the problem is that most state-of-the-art DE methods do not have a special mechanism to preserve the diversity of the solutions, so the issue of premature convergence is not completely solved. For mutation strategies such as ‘DE/rand-to-best/2’, solutions tend to be close to the current best region, limiting the exploration capability at the early stage. For mutation strategies such as ‘DE/rand/2’ or ‘DE/current-to-rand/1’, solutions tend to be close to each other at the early or medium stage of the evolution. Thus, to circumvent premature convergence of DE-based algorithms for solving high-dimensional and non-separable optimization problems, a mechanism for exploring various regions of the search space should be specially designed and combined with other DE mutation strategies.

Assume that there are n factors in the model and denote an L-point design by ξ = ([x_{11}, x_{12}, … , x_{1n}, p_1], … , [x_{l1}, x_{l2}, … , x_{ln}, p_l], … , [x_{L1}, x_{L2}, … , x_{Ln}, p_L]), where p_l is the proportion of the total observations to be taken at the l-th design point [x_{l1}, x_{l2}, … , x_{ln}]. It follows that each individual X_{i,g} in the current generation g with population index i is constructed as X_{i,g} = (x_{11}, x_{12}, … , x_{1n}, p_1, … , x_{l1}, x_{l2}, … , x_{ln}, p_l, … , x_{L1}, x_{L2}, … , x_{Ln}, p_L). For an additive model with no interactions among the factors, the dimension D of X_{i,g} is (n + 1)L.
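Under this encoding, decoding an individual back into a design can be sketched as follows; the paper does not spell out how the weights are kept nonnegative and normalized to sum to one, so the normalization below is an illustrative assumption. Such a decoder can be combined with the information-matrix sketch of Section II to form the objective function.

```python
import numpy as np

def decode_design(individual, n, L):
    """Decode an individual of length (n+1)*L into L support points (L x n)
    and L weights, following the encoding described above."""
    design = np.asarray(individual, dtype=float).reshape(L, n + 1)
    points = design[:, :n]                      # factor levels of each support point
    weights = np.abs(design[:, n])              # raw weights p_1, ..., p_L
    weights = weights / weights.sum()           # normalise so the weights sum to one
    return points, weights
```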

We propose a new DE-based algorithm, denoted NovDE, that solves our complex optimization problems using a novelty-based mutation strategy. At the start of the evolution process, a group of individuals is randomly selected to be novelty-based individuals. To preserve the diversity of solutions, various regions of the search space are explored by these novelty-based individuals. Fig. 1 shows the difference vector d_{i,g−1}, which is the difference between the trial vector U_{i,g−1} and the target vector X_{i,g−1} in the previous generation g − 1. For the current generation g and a user-selected value of m, m difference vectors d^1_{i,g}, … , d^m_{i,g} are sampled. Fig. 1 displays the computed angle θ_s between the sampled difference vector d^s_{i,g} in the current generation g and the difference vector d_{i,g−1} in the previous generation g − 1, where s = 1, 2, … , m. We then add the difference vector d_{i,g}, the sample with the largest angle to d_{i,g−1} among the m samples, to the target vector X_{i,g} to generate the mutant vector V_{i,g}. The large angle between d_{i,g} and d_{i,g−1} encourages each novelty-based individual to explore a region of the search space entirely different from the one explored in the previous generation g − 1. As the evolution proceeds, the novelty-based individuals gradually explore various novel regions of the search space through this efficient exploration, and the diversity of solutions is preserved. The proposed novelty-based mutation strategy is combined with ‘DE/rand/2’ and ‘DE/current-to-rand/1’ since they can balance exploration and exploitation at the early or medium stage of evolution [29]. If the individuals obtained from ‘DE/rand/2’ and ‘DE/current-to-rand/1’ are close to each other, the novelty-based individuals can provide information about the recently explored regions of the search space to those convergent individuals. The convergent individuals can then either exploit their current region of the search space or explore more regions of the search space.

Fig. 1:

The operation of the novelty-based mutation strategy. The target vector is X_{i,g}, and the difference vector from the previous generation is d_{i,g−1}. In the current generation, the m sampled difference vectors are d^1_{i,g}, … , d^m_{i,g} and θ_s is the computed angle between d^s_{i,g} and d_{i,g−1}, where s = 1, … , m. The d^s_{i,g} with the largest angle θ_s is selected as d_{i,g}, and the mutant vector V_{i,g} is generated from X_{i,g} and d_{i,g}.
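A hedged sketch of the novelty-based mutation for one individual is given below; the paper does not specify how the m candidate difference vectors are sampled, so Gaussian perturbations are used here purely for illustration.

```python
import numpy as np

def novelty_mutation(x, d_prev, F=0.5, m=10, scale=1.0):
    """Sample m candidate difference vectors, keep the one with the largest angle
    to the previous difference vector d_prev, and form V = X + F * d (step 15)."""
    best_d, best_angle = None, -1.0
    for _ in range(m):
        d = np.random.normal(scale=scale, size=x.shape)   # candidate difference vector
        cos = np.dot(d, d_prev) / (np.linalg.norm(d) * np.linalg.norm(d_prev) + 1e-12)
        angle = float(np.arccos(np.clip(cos, -1.0, 1.0)))
        if angle > best_angle:                            # keep the most "novel" direction
            best_d, best_angle = d, angle
    return x + F * best_d
```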

The D-optimality criterion is a function of the information matrix in (2), where the coordinates of each x_l occupy consecutive positions in the decision vector. The term r(x_l)r(x_l)^T in (2) multiplies these consecutive variables by one another, so physically proximate decision variables are strongly correlated and the problem is non-separable. According to [38], the crossover method MER can solve non-separable problems more efficiently than the binomial or exponential crossover method for the same CR rate. Further, MER updates consecutive variables together, which suits the structure of the decision variables in our problem. Thus, MER is selected as the crossover method for our problem.

We adapt the control parameters F and CR to find locally D-optimal designs. The adaptation of F is the same as in the state-of-the-art adaptive DE algorithm SaDE [29], where the F value for each individual is generated from N(0.5, 0.3). In this way, the value of F falls in the range [−0.4, 1.4] with probability 0.997, which covers exploration capability when F is large and exploitation capability when F is small [29]. Because the novelty-based individuals explore various regions of the search space, F is not required to be adaptive for them, so it is fixed at 0.5. Our adaptation method for CR in NovDE is new. A First-in-First-out (FIFO) memory CRpool_k with a fixed size is used for each strategy k, and the memory size for strategy k is proportional to the number of individuals assigned to strategy k. CRmean_k is the mean of the successful CR values stored in the CRpool_k memory and is updated from those stored values. This adaptation method updates the distribution of CR more frequently, based on the solutions in the current evolution stage. In NovDE, the CR value for each individual under strategy k is generated from N(CRmean_k, 0.1). The initial value of CRmean_k is set to 0.7, since a larger CR value encourages exploration, which is desirable at the start of the evolution.
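The CR adaptation can be sketched as a small FIFO memory per strategy, as below; the class and method names are illustrative, and the clipping of CRmean_k to [0.1, 0.9] follows the bounds stated later in the experimental setup.

```python
import numpy as np
from collections import deque

class CRMemory:
    """FIFO memory of successful CR values for one mutation strategy k;
    CRmean_k is the mean of the stored values, kept within [0.1, 0.9]."""
    def __init__(self, size, cr_mean=0.7):
        self.pool = deque(maxlen=size)          # oldest successful CR drops out first
        self.cr_mean = cr_mean                  # start at 0.7 to encourage exploration

    def sample_cr(self):
        """Draw a CR value for one individual from N(CRmean_k, 0.1)."""
        return float(np.clip(np.random.normal(self.cr_mean, 0.1), 0.0, 1.0))

    def record_success(self, cr):
        """Store a CR value whose trial vector replaced its target vector."""
        self.pool.append(cr)
        self.cr_mean = float(np.clip(np.mean(self.pool), 0.1, 0.9))
```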

B. Algorithm Structure

The proposed algorithm NovDE is displayed in Algorithm 1. In NovDE, three mutation strategies, ‘DE/rand/2’, ‘DE/current-to-rand/1’ and the proposed novelty-based DE mutation, are employed to generate the mutant vector V_{i,g}. Individuals in the population are assigned to these three groups, each employing one of the mutation strategies, based on the pre-defined ratios p1 and p2. Steps 9 to 16 present the proposed novelty-based DE mutation. For each novelty-based individual X_{i,g}, F is fixed at 0.5. m difference vectors d^1_{i,g}, … , d^m_{i,g} are sampled, where the value of m is user-selected. For each d^s_{i,g} in the sample, s = 1, … , m, the angle between d^s_{i,g} and the difference vector d_{i,g−1} from the last generation is computed and denoted θ_s. The d^s_{i,g} with the largest θ_s is denoted d_{i,g}. Then, in step 15, the mutant vector V_{i,g} is generated from the target vector X_{i,g} and the difference vector d_{i,g}.

For the adaptation of CR, a first-in-first-out memory CRpool_k of size LP_k is established for each mutation strategy k. CRpool_k stores the CR values that made the trial vector U_{i,g} successfully replace the target vector X_{i,g} under strategy k. CRmean_k is computed as the mean of the elements in CRpool_k, and the CR value for each individual is generated from N(CRmean_k, 0.1). The crossover method is MER. After the crossover operation, each novelty-based individual updates d_{i,g}, to be used in the next generation.

IV. Empirical Study

In this section, we evaluate the performance of the proposed algorithm NovDE for finding locally D-optimal designs for logistic models on various design spaces with several factors. Specifically, we compare NovDE with six state-of-the-art variants of the DE algorithm. ‘DE/rand/2/bin’ [13] and SaDE [29] are effective in handling general numerical optimization problems; SaDE+MER [38] is effective in solving non-separable optimization problems; JADE [23] is an effective DE variant because of its control parameter adaptation scheme; ANDE [40] and DDE-AMS [43] are effective in solving high-dimensional optimization problems. To validate the effectiveness of the novelty-based mutation, we also compare against the novelty-based mutation combined with the conventional crossover (i.e. binomial crossover), termed NovDE-Bin. We compare the algorithms using logistic models on various design spaces with seven continuous factors and five sets of nominal values. The design space of each factor is first selected to be the prototype interval [−1, 1] before we vary the design space to [−3, 3], followed by the interval [0, 3]. We next describe the details of our experimental setup for comparing the eight algorithms.

A. Experimental Setup

1) Population size is 100.

2) The preset upper bound on the number of support points L is 100.

3) The dimension D of the problem to be optimized for seven factors without interactions is 800 (= (7 + 1) × 100). The dimension for each support point is 8, comprising the 7 factor levels and the proportion of observations taken at that support point.

4) The evolution process terminates if and when a design with at least 99.99% D-efficiency is found. Otherwise, the process terminates when the maximum number of generations we specify is reached. For all our experiments, we set the maximum number of generations to 20000.

Algorithm 1 NovDE
Require: Target vector X_{i,g} = (x^1_{i,g}, x^2_{i,g}, … , x^D_{i,g}), population size N, p1 = 0.45, p2 = 0.9, sample size m, CRpool_k with size LP_k, where k represents the k-th mutation strategy.
Ensure: Trial vector U_{i,g} = (u^1_{i,g}, u^2_{i,g}, … , u^D_{i,g}).
1: if i ≤ p1·N then
2:   F is generated from N(0.5, 0.3).
3:   X_{i,g} performs ‘DE/rand/2’ to generate the mutant vector V_{i,g}.
4: end if
5: if p1·N < i ≤ p2·N then
6:   F is generated from N(0.5, 0.3).
7:   X_{i,g} performs ‘DE/current-to-rand/1’ to generate the mutant vector V_{i,g}.
8: end if
9: if i > p2·N then
10:   F is fixed to be 0.5.
11:   d_{i,g−1} is the difference between the trial vector U_{i,g−1} and the target vector X_{i,g−1} in the previous generation g − 1.
12:   Sample m difference vectors d^1_{i,g}, … , d^m_{i,g}.
13:   Compute the angle θ_s between d^s_{i,g} and d_{i,g−1}, where s = 1, 2, … , m.
14:   d_{i,g} is the one with the largest θ_s.
15:   The mutant vector V_{i,g} = X_{i,g} + F·d_{i,g}.
16: end if
17: CR is generated from N(CRmean_k, 0.1) for the chosen mutation strategy k.
18: The trial vector U_{i,g} is generated based on MER and the CR rate.
19: if i > p2·N then
20:   d_{i,g} = U_{i,g} − X_{i,g}.
21: end if
22: if f(U_{i,g}) < f(X_{i,g}) then
23:   Record the CR value into the corresponding CRpool_k.
24:   Perform the first-in-first-out operation once the size of CRpool_k exceeds LP_k.
25:   Update CRmean_k as the mean value of the elements in CRpool_k.
26: end if

5) The maximum number of runs is 30.

6) For ‘DE/rand/2/bin’, we set F = 0.5 and CR = 0.9 based on the suggested settings in [13].

7) For ANDE, we follow the recommendations in [40] and generate F1, F2, and F3 from the uniform distribution on [0, 1]. We select CR according to [40] and set LP to 10% of the maximum number of generations.

8) For SaDE and SaDE+MER, we set LP = 20 and the initial value of pk = 0.25 for each strategy. The initial CRm for each strategy is 0.5, and the value of F is generated from the normal distribution N(0.5, 0.3) based on [29]. For SaDE+MER, we set T = 10 based on [38].

9) For JADE, we set p = 0.05 and c = 0.1, with μCR = 0.5 and μF = 0.5 as their initial values, as indicated in [23].

10) For DDE-AMS, we use 4 sub-populations and set Up = 25, T = 80, Dr = 0.3, φ = 0.05, F = 0.5 and CR = 0.9 based on [43].

11) For NovDE and NovDE-Bin, we set p1 = 0.45, p2 = 0.9 and the sample size m = 10. The initial CRmean_k for each strategy is 0.7 to encourage exploration at the start of the evolution. The upper bound of CRmean_k is 0.9 and the lower bound is 0.1. For both ‘DE/rand/2’ and ‘DE/current-to-rand/1’, we set LP = 50, and for the novelty-based mutation strategy, we set LP = 10. We generate the values of F for both ‘DE/rand/2’ and ‘DE/current-to-rand/1’ from the normal distribution N(0.5, 0.3), and set F = 0.5 for the novelty-based mutation strategy.

12) We generate each of the nominal values in the vector of 8 coefficients βT = (β0, β1, β2, β3, β4, β5, β6, β7) in an additive 7-factor logistic regression model randomly from the interval [−1,1] without loss of generality. In this experiment, we generate five parameter sets and they are as follows: β1 = (0.6294, 0.8116, −0.7460, 0.8268, 0.2647, −0.8049, −0.4430, 0.0938), β2 = (−0.6710, 0.8256, −0.9221, 0.8348, 0.0538, 0.8664, 0.9186, 0.7741), β3 = (−0.4926, −0.6280, −0.3283, 0.4378, 0.5283, −0.6120, −0.6837, −0.2061), β4 = (−0.4336, 0.3501, −0.8301, 0.3295, 0.0853, 0.5650, 0.0870, 0.1688), β5 = (0.8379, −0.5372, 0.1537, −0.1094, −0.2925, 0.2599, −0.8201, −0.8402).

13) The program is implemented in MATLAB R2017b.

14) In this paper, the D-efficiency lower bound criterion is applied to evaluate the optimality of the generated design ξ, and “DE/rand/1/bin” with F = 0.5 and CR = 0.9 is used to find the maximum positive value θ of the sensitivity function. We recall that this value is used to compute the D-efficiency lower bound of the design ξ, which is exp(−θ/k), where k is the dimension of β. In what follows, if a design has at least 95% D-efficiency, we accept the design as close enough to the optimum.

B. Results and Discussions

We compare the performance of the proposed NovDE with NovDE-Bin and six competitive DE-based algorithms using 3 different design spaces to validate that NovDE is an effective DE variant for solving high-dimensional optimal design problems. Since the optimal designs of the logistic model under the various sets of nominal values and design spaces are unknown, the average of the objective function values obtained in 30 runs is taken as one performance indicator. Since the aim is to maximize the log-determinant, larger objective function values indicate better performance of an algorithm. Another performance indicator is the success rate, which is the percentage of runs in which the generated design has at least 95% D-efficiency. To judge whether the proposed NovDE algorithm outperforms each of the other seven DE-based algorithms in a statistically significant way, we employ a nonparametric statistical test, the Wilcoxon rank-sum test [45], at the 5% significance level. In Tables I-III, for each algorithm, the numbers in the upper line of each cell are the mean and standard deviation of the objective values, and the number in the bottom line is the success rate of the algorithm. The best values of the means and success rates are in bold, and entries with * indicate that NovDE significantly outperforms the other algorithm based on the Wilcoxon rank-sum test at the 0.05 significance level.

For each design space, there are 5 different settings with nominal values β1 to β5. Hence, for the three design spaces, there are 15 different settings in total. Across these 15 settings, when NovDE is compared with the other seven DE algorithms, NovDE ranks first in 9 out of 15 in terms of the mean of the objective function values. Furthermore, in these 9 cases, excluding NovDE-Bin, NovDE significantly outperforms the other six DE algorithms in 6 out of 9. NovDE also ranks first in 10 out of 15 in terms of the success rate. These empirical results suggest that, since the novelty-based mutation strategy combined with the MER crossover has the advantages of superior exploration capability [27] and of maintaining the dependent variable structure [38], NovDE handles non-separable problems well and has a higher chance of avoiding being trapped in local optima. Thus, NovDE is more effective in generating locally D-optimal designs than the other seven algorithms.

To validate the effectiveness of the novelty-based mutation alone, NovDE-Bin, the novelty-based mutation with the conventional binomial crossover, is included in the comparisons as well. For the 15 different settings, NovDE-Bin ranks first in 3 out of 15, and second in 8 out of 15, in terms of the mean of the objective function values. NovDE-Bin also ranks first in 8 out of 15 in terms of the success rate. These empirical results suggest that the novelty-based mutation strategy provides better exploration capability and can prevent solutions from being trapped in local optima, consistent with the advantages of novelty search methods as illustrated in [27].

To give a clearer picture of the performance differences between NovDE and the other DE algorithms, Fig. 2 plots the change of the best-of-run objective function values over generations for each DE algorithm. The plots in Fig. 2 are based on the nominal parameter β3; plots based on the other 4 sets of nominal values showed a similar pattern. We observe from Tables I-III and Fig. 2 that both NovDE and NovDE-Bin clearly outperform ‘DE/rand/2/bin’ in all of the settings. Although ‘DE/rand/2/bin’ converges faster than NovDE, it suffers from premature convergence, so its solutions tend to become close to each other and its exploration capability deteriorates. The better performance of both NovDE-Bin and NovDE validates that exploration is important for solving our high-dimensional non-separable problem, which has many local optima. Furthermore, the novel information collected from exploration can be provided to the individuals generated by ‘DE/rand/2’ and ‘DE/current-to-rand/1’ to enhance both exploration and exploitation. As shown in Fig. 2, NovDE has the best converged objective function values, close to the global optimum, on the various design spaces.

Fig. 2:

Best-of-run objective function values over generations, averaged over 30 independent runs, for X = [−1, 1]^7, X = [−3, 3]^7 and X = [0, 3]^7, respectively. The nominal parameter is β3.

TABLE I:

Performances of NovDE, NovDE-Bin and six competitors for finding locally D-optimal designs on [−1, 1]^7 using 5 sets of nominal values. In each cell, the numbers in the upper line are the mean and standard deviation of the values of the objective function over 30 runs, and the number in the bottom line is the success rate. For each set of nominal values, the best values of the mean and success rates are in bold. Entries with * indicate that NovDE significantly outperforms the other algorithm based on the Wilcoxon rank-sum test.

Algorithm β1 β2 β3 β4 β5
NovDE −13.2018 (0.0079)
86.67%
−13.5845 (0.0153)
90%
−12.8106 (0.0054)
73.33%
−12.6012 (0.0041)
83.33%
−13.0221 (0.0093)
90%
NovDE-Bin −13.2028 (0.0077)*
86.67%
−13.5958 (0.0526)*
90%
−12.8116 (0.0049)
70%
−12.5946 (0.0044)
96.67%
−13.0224 (0.0242)
86.67%
DE/rand/2/bin −13.2274 (0.0153)*
3.33%
−13.5973 (0.0092)*
0%
−12.8459 (0.0128)*
3.33%
−12.6412 (0.0253)*
3.33%
−13.0650 (0.0205)*
0%
ANDE −13.2340 (0.0146)*
3.33%
−13.6009 (0.0197)*
6.67%
−12.8460 (0.0095)*
0%
−12.6330 (0.0144)*
3.33%
−13.0543 (0.0163)*
3.33%
SaDE −13.2030 (0.0060)*
73.33%
−13.5778 (0.0052)
93.33%
−12.8195 (0.0106)*
23.33%
−12.6006 (0.0078)
96.67%
−13.0505 (0.0651)*
73.33%
SaDE+MER −13.2032 (0.0071)*
80%
−13.5761 (0.0012)
96.67%
−12.8186 (0.0057)*
53.33%
−12.6022 (0.0052)
90%
−13.0233 (0.0059)*
76.67%
JADE −13.2035 (0.0021)*
30%
−13.5799 (0.0021)
73.33%
−12.8156 (0.0033)*
53.33%
−12.5958 (0.0072)
40%
−13.0248 (0.0039)*
33.33%
DDE-AMS −13.2390 (0.0096)*
3.33%
−13.5961 (0.0186)*
3.33%
−12.8690 (0.0154)*
0%
−12.6258 (0.0169)*
6.67%
−13.0549 (0.0204)*
0%

TABLE III:

Performances of NovDE, NovDE-Bin and six competitors for finding locally D-optimal designs on [0, 3]^7 using 5 sets of nominal values. In each cell, the numbers in the upper line are the mean and standard deviation of the values of the objective function over 30 runs, and the number in the bottom line is the success rate. For each set of nominal values, the best values of the mean and success rates are in bold. Entries with * indicate that NovDE significantly outperforms the other algorithm based on the Wilcoxon rank-sum test.

Algorithm β1 β2 β3 β4 β5
NovDE −8.2056 (0.0091)
83.33%
−11.0117 (0.0023)
100%
−9.3156 (0.0191)
26.67%
−7.6625 (0.0183)
83.33%
−9.1025 (0.0124)
76.67%
NovDE-Bin −8.2064 (0.0047)
80%
−11.0134 (0.0027)
100%
−9.3188 (0.0314)
23.33%
−7.6508 (0.0038)
96.67%
−9.1076 (0.0171)
80%
DE/rand/2/bin −8.2344 (0.0196)*
3.33%
−11.0293 (0.0121)*
16.67%
−9.3562 (0.0193)*
3.33%
−7.6977 (0.0184)*
6.67%
−9.1549 (0.0194)*
3.33%
ANDE −8.2430 (0.0194)*
3.33%
−11.0249 (0.0055)*
46.67%
−9.3500 (0.0256)*
0%
−7.6861 (0.0163)*
0%
−9.1510 (0.0135)*
3.33%
SaDE −8.2147 (0.0145)*
66.67%
−11.0141 (0.0019)*
90%
−9.3282 (0.0213)*
10%
−7.6571 (0.0063)
86.67%
−9.1066 (0.0128)*
46.67%
SaDE+MER −8.2067 (0.0052)
70%
−11.0139 (0.0030)*
86.67%
−9.3348 (0.0270)*
16.67%
−7.6594 (0.0121)
76.67%
−9.1098 (0.0147)*
56.67%
JADE −8.2085 (0.0019)*
20%
−11.0330 (0.0093)*
20%
−9.3213 (0.0207)*
6.67%
−7.6517 (0.0040)
33.33%
−9.1095 (0.0032)*
36.67%
DDE-AMS −8.2212 (0.0145)*
3.33%
−11.0239 (0.0114)*
26.67%
−9.3939 (0.0279)*
0%
−7.7216 (0.0250)*
0%
−9.1529 (0.0377)*
3.33%

Since CRmean represents the overall CR values of the individuals under the different strategies, it is instructive to plot the CRmean values against generations for each design space. Fig. 3 plots the CRmean values for the median run using β3 as nominal values, for the same reason explained earlier. In Fig. 3, we observe that the CRmean values for ‘DE/rand/2’ and ‘DE/current-to-rand/1’ converge to 0.1, the lower bound of CRmean in NovDE. The variation of the CRmean values for the novelty-based strategy shows distinct patterns under the different design spaces. When X = [−1, 1]^7 and X = [−3, 3]^7, CRmean converges to 0.1, the lower bound of CRmean in NovDE; under these two design spaces, the decrease of the CRmean values indicates that the exploration capability becomes restricted as the evolution proceeds, and for X = [−3, 3]^7 the CRmean values converge faster. When X = [0, 3]^7, CRmean converges to around 0.9, the upper bound of CRmean in NovDE; the increase of the CRmean values indicates that the exploration capability is enhanced as the evolution proceeds. Thus, for different design spaces, the CRmean of the novelty-based strategy adapts the exploration capability accordingly.

Fig. 3:

Adaptation behavior of the CRmean values in NovDE for the median run among the 30 runs, for X = [−1, 1]^7, X = [−3, 3]^7 and X = [0, 3]^7, respectively. The nominal parameter is β3.

Table IV to Table VI present the support points of the locally D-optimal designs when β3 is the set of nominal values. Interestingly, each support point of these locally D-optimal designs has at most one factor level at a non-extreme value. This observation may provide an impetus for further study using analytical tools.

TABLE IV:

NovDE-generated locally D-optimal design for the logistic model with seven variables when the vector of nominal values for the parameters is β3 = (β0, β1, β2, β3, β4, β5, β6, β7)T = (−0.4926, −0.6280, −0.3283, 0.4378, 0.5283, −0.6120, −0.6837, −0.2061)T, and X = [−1, 1]7.

Support point X1 X2 X3 X4 X5 X6 X7 Pi
1 1 −1 1 −1 −1 −1 −1 0.0230
2 1 −1 1 −1 −1 −1 1 0.0160
3 1 1 1 1 1 −1 1 0.0255
4 1 1 −1 1 1 −1 −1 0.0223
5 1 −1 1 −1 −1 1 1 0.0152
6 −1 −1 1 1 1 1 1 0.0212
7 −1 1 1 −1 1 −1 1 0.0269
8 1 −1 1 −1 1 −1 −1 0.0101
9 1 −1 1 −1 −1 1 −1 0.0269
10 −1 1 1 −1 −1 1 −1 0.0117
11 1 −1 −1 1 1 −1 1 0.0269
12 −1 −1 1 1 1 1 1 0.0142
13 −1 1 1 1 1 −1 1 0.0219
14 1 1 1 −1 −1 −1 1 0.0182
15 1 1 −1 1 −1 −1 1 0.0183
16 −1 −1 1 1 1 1 −1 0.0199
17 −1 −1 −1 −1 −1 −1 1 0.0269
18 −1 −1 −1 −1 1 −1 −1 0.0101
19 −1 −1 1 −1 1 1 −1 0.0269
20 −1 1 1 −1 −1 −1 −1 0.0163
21 1 −1 1 1 −1 −1 1 0.0102
22 −1 1 −1 −1 1 −1 −1 0.0269
23 −1 1 −1 1 −1 −1 1 0.0241
24 −1 −1 −1 −1 −1 −1 1 0.0213
25 −1 −1 −1 −1 −1 1 1 0.0269
26 −1 −1 1 −1 1 −1 −1 0.0269
27 1 −1 −1 1 −1 −1 −1 0.0269
28 −1 1 −1 −1 −1 1 −1 0.0184
29 −1 −1 −1 1 −1 1 −1 0.0269
30 1 −1 −1 1 −1 1 −1 0.0161
31 1 −1 1 1 −1 1 −1 0.0165
32 1 −1 −1 −1 −1 −1 1 0.0143
33 −1 1 −1 −1 1 −1 −1 0.0269
34 −1 −1 1 1 1 1 −1 0.0124
35 −1 1 1 1 1 −1 −1 0.0269
36 −1 1 1 −1 −1 1 1 0.0204
37 1 1 1 1 −1 1 −1 0.0269
38 −1 1 −1 1 1 −1 −1 0.0103
39 1 −1 1 1 1 −1 1 0.0269
40 −1 1 −1 1 −1 1 1 0.0152
41 1 1 −1 1 −1 −1 −1 0.0150
42 1 1 1 −1 −1 −1 −1 0.0260
43 −1 −1 −1 −1 −1 −1 1 0.0144
44 −1 1 −1 1 −1 1 −1 0.0260
45 −1 −1 −1 1 1 1 1 0.0269
46 1 1 1 1 −1 1 1 0.0203
47 −1 1 1 1 −1 1 1 0.0245
48 1 −1 −1 1 1 −1 −1 0.0269

TABLE VI:

NovDE-generated locally D-optimal design for the logistic model with seven variables when the vector of nominal values for the parameters is β3 = (β0, β1, β2, β3, β4, β5, β6, β7)T = (−0.4926, −0.6280, −0.3283, 0.4378, 0.5283, −0.6120, −0.6837, −0.2061)T, and X = [0, 3]7.

Support point X1 X2 X3 X4 X5 X6 X7 Pi
1 3 0 3 3 3 0 0 0.0232
2 0 3 3 3 3 0 0 0.0262
3 0 3 3 0 0 0 3 0.0151
4 0 3 3 3 0 3 0 0.0270
5 3 3 3 3 0 0 0 0.0191
6 3 0 3 0 0 0 0 0.0113
7 0 0 3 3 0 3 0 0.0103
8 3 0 3 0 0 0 0 0.0215
9 0 3 3 0 0 0 3 0.0220
10 3 0 0 3 0 0 0 0.0106
11 3 0 3 3 0 0 3 0.0366
12 0 0 0 3 3 0 3 0.0389
13 0 3 3 3 0 0 0 0.0318
14 0 3 0 3 0 0 3 0.0309
15 0 0 3 3 0 3 3 0.0171
16 0 0 3 3 3 0 3 0.0315
17 0 3 3 0 0 0 3 0.0385
18 0 0 0 0 0 0 0 0.0367
19 0 0 3 0 3 0 0 0.0319
20 0 0 3 3 3 0 0 0.0323
21 0 0 0 3 3 0 0 0.0128
22 0 0 0 3 0 0 3 0.0217
23 3 3 3 3 0 0 0 0.0333
24 0 0 3 3 3 0 3 0.0114
25 0 0 0 3 0 3 0 0.0389
26 0 0 0 0 0 0 0 0.0107
27 0 0 3 0 0 3 0 0.0271
28 0 0 3 0 0 0 0 0.0380
29 0 3 3 3 3 0 0 0.0263
30 3 0 3 3 0 0 3 0.0298
31 0 0 3 3 0 3 0 0.0346
32 3 0 0 3 0 0 0 0.0316
33 0 3 0 3 0 0 0 0.0312
34 0 0 3 0 0 0 0 0.0131
35 0 0 0 3 0 0 3 0.0225
36 0 3 0 3 0 0 0 0.0138
37 0 3 3 3 0 3 3 0.0300
38 3 0 3 3 0 0 0 0.0100
39 0 0 3 3 3 3 0 0.0118
40 0 0 3 0 0 0 3 0.0146
41 0 0 3 3 0 3 3 0.0242

V. Car Refueling Experiment

We now apply the proposed NovDE algorithm to re-design a ten-factor experiment to test a vision-based car refueling system [46]. The investigators were interested in whether a computer-controlled nozzle could be inserted correctly into the gas pipe, so the response variable in the study is binary. Table VII lists the ten factors. Four factors are discrete, each with two levels −1 or +1, and six factors are continuous. Table VII shows the range of values for each continuous factor; the ranges vary considerably. The proposed NovDE algorithm is applied to find a locally D-optimal design for this high-dimensional nonlinear model with mixed factors using the vector of nominal values β = (β0, β1, β2, β3, β4, β5, β6, β7, β8, β9, β10)T = (3, 0.5, 0.75, 1.25, 0.8, 0.5, 0.8, −0.4, −1, 2.65, 0.65) from the literature [46].

TABLE VII:

Factor types and levels for the car refueling experiment.

Type Factor Level
Low High
Discrete Ring Type White paper Reflective
Lighting Room lighting 2 flood lights and room lights
Sharpen No Yes
Smooth No Yes
Continuous Lighting angle from 50 degrees to 90 degrees
Gas-cap angle (Z axis) from 30 degrees to 55 degrees
Gas-cap angle (Y axis skew) from 0 degrees to 10 degrees
Car distance from 18 in. to 48 in.
Reflective ring thickness from 0.125 in. to 0.425 in.
Threshold step value from 5 to 15

Design issues for this ten-factor experiment were also considered in [47] but without interaction terms. In practice, the binary response likely depends on joint changes of two or more of the factors, suggesting that interaction terms should be in the model. To fix ideas, we include five pairwise interactions in the model and believe that this is the first design work for such a high-dimensional logistic model. Previous attempts using common algorithms, such as the multiplicative and modified Fedorov-Wynn algorithms, did not converge. The vector of nominal values for the model with the five selected pairwise interactions is β = (β0, β1, β2, β3, β4, β5, β6, β7, β8, β9, β10, β1,9, β2,5, β3,4, β6,7, β8,10)T = (3, 0.5, 0.75, 1.25, 0.8, 0.5, 0.8, −0.4, −1, 2.65, 0.65, 0.01, −0.02, 0.03, −0.04, 0.05)T based on the literature [46].

Some of the tuning parameters used to find the locally D-optimal designs are the population size, maximum number of generations and maximum number of support points. For the model without factor interactions, the population size is 100, and the maximum number of generations is 10000. The maximum number of support points L is set to 100 so the dimension D of the problem is 1100 (=(10 + 1) × 100). The dimension for each support point is 11, which includes the number of factors (i.e. 10) and the dimension of the corresponding portion of observations taken at each support point (i.e. 1). For the model with factor interactions, the population size is 100, and the maximum number of generations is 20000. The maximum number of support points L is 100 so the dimension D of the problem is 1600 (=(10 + 1 + 5) × 100).

Due to the number of factors in this study, it is hard to construct and visually inspect the high-dimensional sensitivity function of the generated design to confirm its optimality. One option is to generate 1,000,000 random vectors within the design space and check whether the sensitivity function is positive at these points. One may repeat this procedure, and if no positive value is found and the sensitivity function is zero at the support points of the generated design, we conclude that we have found an optimal design. Otherwise, we apply “DE/rand/1/bin” with F = 0.5 and CR = 0.9 to find the maximum positive value θ of the function and compute the D-efficiency lower bound, defined as exp(−θ/k), where k is the dimension of the model parameter β. Since the variables of this problem are mixed, the variation of the D-efficiency lower bound is large. In what follows, if a design has at least 90% D-efficiency, we accept the design as close enough to the optimum.
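The random-sampling check described above requires candidate points from the mixed design space; a hedged sketch of how such candidates could be drawn is given below (the factor ordering and ranges follow Table VII only for illustration, and the sensitivity function itself would also need the interaction columns of r(x) when the model contains them).

```python
import numpy as np

def random_candidates(n_samples, disc_levels, cont_bounds):
    """Draw random candidate points of the mixed design space: each discrete
    factor from its level set, each continuous factor uniformly in its range."""
    cols = []
    for levels in disc_levels:                      # e.g. [(-1, 1)] * 4 for Table VII
        cols.append(np.random.choice(levels, size=n_samples))
    for lo, hi in cont_bounds:                      # e.g. [(50, 90), (30, 55), ...]
        cols.append(lo + (hi - lo) * np.random.rand(n_samples))
    return np.column_stack(cols)

# Example: 1,000,000 candidates with 4 two-level factors and 6 continuous factors
# candidates = random_candidates(1_000_000, [(-1, 1)] * 4,
#     [(50, 90), (30, 55), (0, 10), (18, 48), (0.125, 0.425), (5, 15)])
```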

A. Without Factor Interactions

Table VIII compares the mean of the locally D-optimal objective values and the success rates of NovDE with those of NovDE-Bin and the other six differential evolution algorithms. The Wilcoxon rank-sum test [45] is again conducted at the 5% significance level. In Table VIII, both the mean objective value and the success rate of NovDE-Bin are the highest, and NovDE ranks second. Both NovDE-Bin and NovDE significantly outperform all the other six algorithms. Thus, our empirical results validate the effectiveness of the novelty-based exploration for the car refueling experiment. By extension, our work suggests that NovDE and NovDE-Bin are effective for finding locally D-optimal designs for high-dimensional non-separable problems with mixed variables on various design spaces. In problems with mixed factors, the design space is less complex than in problems with only continuous factors, so the solutions are more likely to be close to one another at an early evolution stage. It is therefore even more crucial to handle premature convergence for problems with mixed factors, and NovDE and NovDE-Bin show their advantages in handling premature convergence and preserving the diversity of the solutions. As a result, NovDE performs even better than it does on D-optimal design problems with only continuous factors. Table IX lists the locally D-optimal design for the car refueling experiment, with 12 support points. The design criterion value is −35.9178. A direct calculation shows that the D-efficiency lower bound for the generated design is 94.60%. This is not surprising, even though we set the acceptance threshold at 90% for this problem. One reason is that the algorithm is not monotonic, in the sense that it does not necessarily produce increasingly more efficient designs with each iteration. Another reason is that a design with higher D-efficiency may exist in a continuous design space rather than the mixed design space. Based on Table IX, the pattern observed earlier still holds: for each support point, at most one factor value is not at the boundary of the design space. This is consistent with the observation in Section IV.

TABLE VIII:

Comparisons of the performance of NovDE, NovDE-Bin and six competitors for the car refueling experiment without factor interactions. The best values of the mean and success rates are in bold. Entries with * indicate that NovDE significantly outperforms the other algorithm based on the Wilcoxon rank-sum test.

Algorithm Success Rate Mean (std)
NovDE 90% −35.9390 (0.1060)
NovDE-Bin 90% −35.9180 (0.0035)
DE/rand/2/bin 13.33% −37.4963 (0.8482)*
ANDE 26.67% −36.7708 (0.8923)*
SaDE 70% −35.9480 (0.0485)*
SaDE+MER 46.67% −36.2242 (0.7181)*
JADE 70% −35.9645 (0.0911)*
DDE-AMS 10% −39.5935 (0.5128)*

TABLE IX:

NovDE-generated locally D-optimal design for car refueling experiment without factor interactions.

Support point Ring Type Lighting Sharpen Smooth Lighting Angle Z axis Y axis skew Car Dist. Ring Thick. Threshold Step-size Pi
1 −1 −1 −1 1 50 30 10 48 0.1250 5 0.0909
2 −1 −1 −1 −1 50 30 4.1991 48 0.1250 5 0.0909
3 −1 −1 −1 −1 50 30 10 48 0.1250 8.5698 0.0909
4 −1 −1 −1 −1 50 30 10 48 0.1250 5 0.0807
5 −1 −1 −1 −1 54.6407 30 10 48 0.1250 5 0.0909
6 −1 1 −1 −1 50 30 10 48 0.1250 5 0.0909
7 1 −1 −1 −1 50 30 10 48 0.1250 5 0.0752
8 1 −1 −1 −1 50 30 10 48 0.4250 5 0.0397
9 −1 −1 −1 −1 50 32.9005 10 48 0.1250 5 0.0909
10 −1 −1 −1 −1 50 30 10 45.6796 0.1250 5 0.0909
11 −1 −1 −1 −1 50 30 10 48 0.4250 5 0.0772
12 −1 −1 1 −1 50 30 10 48 0.1250 5 0.0909

B. With Factor Interactions

It seems realistic that there are factor interactions between Ring type and Reflective ring thickness, Lighting and Lighting angle, Sharpen and Smooth, Gas-cap angle (Z axis) and Gas-cap angle (Y axis skew), and Car distance and Threshold step value, respectively. The first two interactions are between a discrete factor and a continuous factor, the third is between two discrete factors, and the last two are between two continuous factors. In practice, the researcher uses subject-matter information to specify interaction terms in the model and implements a parsimonious model. Our conjecture is that interaction terms were ignored in earlier design work for such a model in order to simplify the design construction.

Table X compares the mean of the locally D-optimal objective values and the success rates of NovDE with those of NovDE-Bin and the other six differential evolution algorithms. The Wilcoxon rank-sum test [45] is again conducted at the 5% significance level. In Table X, both the mean objective value and the success rate of NovDE are the highest, with NovDE-Bin second. NovDE significantly outperforms all the other six algorithms. Although the overall outperformance of NovDE and NovDE-Bin relative to the other six algorithms is less dramatic than when the model has no interaction terms, our results still show that NovDE is effective in solving non-separable, high-dimensional locally D-optimal design problems with mixed factors on various design spaces. In particular, NovDE handles premature convergence and non-separability well in complex optimization problems. The proposed NovDE algorithm can produce optimal designs for a more realistic situation and so represents an advancement. Table XI shows the optimal design for the car refueling experiment with five pairwise factor interactions. There are 18 support points, and the design criterion value is −71.4284. The design has a D-efficiency of 95% or higher. Interestingly, Table XI shows that a support point can have one or more factors supported other than at its extreme values. This violates the pattern mentioned earlier and shows that as the model gets more complicated, the structure of the optimal design also becomes harder to characterize and understand.

TABLE X:

Comparisons of the performance of NovDE, NovDE-Bin and six competitors for the car refueling experiment with factor interactions. The best values of the mean and success rates are in bold. Entries with * indicate that NovDE significantly outperforms the other algorithm based on the Wilcoxon rank-sum test.

Algorithm Success Rate Mean (std)
NovDE 80% −71.5401 (0.3365)
NovDE-Bin 70% −71.5640 (0.3453)
DE/rand/2/bin 0% −74.4495 (1.4082)*
ANDE 23.33% −72.0390 (0.6751)*
SaDE 3.33% −71.7135 (0.3182)*
SaDE+MER 20% −71.6072 (0.1860)*
JADE 30% −71.5843 (0.3562)*
DDE-AMS 6.67% −71.7058 (0.3136)*

TABLE XI:

NovDE-generated locally D-optimal design for the car refueling experiment with five pairwise factor interactions.

Support point Ring Type Lighting Sharpen Smooth Lighting Angle Z axis Y axis skew Car Dist. Ring Thick. Threshold Step-size Pi
1 −1 1 −1 −1 50 35.5152 10 48 0.1250 5 0.0625
2 −1 1 1 1 50 30 10 48 0.1250 5 0.0625
3 −1 −1 −1 −1 50 30 10 48 0.1250 5 0.0466
4 −1 1 −1 −1 50 30 10 48 0.1250 5 0.0511
5 −1 1 −1 −1 50 34.5716 8.8571 48 0.1250 5 0.0625
6 −1 1 −1 −1 50 30 8.6212 48 0.1250 5 0.0625
7 −1 1 −1 −1 50 30 10 48 0.4250 5 0.0486
8 −1 1 −1 −1 50 30 10 45.1640 0.1250 5.6974 0.0625
9 1 1 −1 −1 50 30 10 48 0.4250 5 0.0625
10 −1 1 −1 −1 50 30 10 48 0.1250 5.7233 0.0625
11 −1 1 1 −1 50 30 10 48 0.1250 5 0.0625
12 −1 1 −1 −1 54.5960 30 10 48 0.1250 5 0.0625
13 1 −1 −1 −1 50 30 10 48 0.1250 5 0.0242
14 1 1 −1 −1 50 30 10 48 0.1250 5 0.0499
15 −1 −1 −1 −1 54.1003 30 10 48 0.1250 5 0.0625
16 −1 −1 −1 −1 50 30 10 48 0.4250 5 0.0296
17 −1 1 −1 1 50 30 10 48 0.1250 5 0.0625
18 −1 1 −1 −1 50 30 10 45.0586 0.1250 5 0.0625

VI. Conclusion

We propose a DE-based algorithm, NovDE, to search for locally D-optimal designs for logistic models with multiple factors, where the factors may or may not interact with one another. We employ a new novelty-based mutation strategy to explore various regions of the search space so that the diversity of the population is preserved. The new novelty-based mutation strategy is combined with ‘DE/rand/2’ and ‘DE/current-to-rand/1’, which balance exploration and exploitation at the early or medium stage of the evolution, so that both the convergence and the diversity of the population are enhanced and premature convergence issues are alleviated. We have demonstrated that NovDE provides the best objective function values and success rates compared with the other DE-related evolutionary algorithms in our study. NovDE also outperforms the others in finding a highly efficient D-optimal design for the ten-factor car refueling study, where the logistic model has discrete and continuous factors, some of which interact with one another. Our empirical results also show that the distribution of the support points of optimal designs for models with interaction terms is more complex than for models without interaction terms.

We focused on logistic models, which are the most commonly used models for binary responses in practice. We expect our proposed algorithm to work for other link functions as well, including cases where the response is continuous and there are many mixed factors. Our future work includes testing the capability of the proposed algorithm on these problems and on multiple-objective optimal design problems.
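As a pointer to how such extensions might look, the sketch below evaluates the log-determinant D-criterion of an approximate design under a logistic model with main effects only; moving to another link function would only change the weight function applied to each linear predictor. The parameterization follows standard GLM design theory rather than the authors' implementation, and the two-factor example at the end is hypothetical.

    import numpy as np

    def logistic_log_det_criterion(points, weights, beta):
        # log|M(xi)| for an approximate design with support `points` and weights `weights`
        # under a logistic model with linear predictor eta = beta0 + beta[1:] . x.
        points = np.asarray(points, dtype=float)     # shape (n, k)
        weights = np.asarray(weights, dtype=float)   # shape (n,), nonnegative, summing to 1
        F = np.hstack([np.ones((points.shape[0], 1)), points])   # rows are f(x) = (1, x)
        eta = F @ np.asarray(beta, dtype=float)
        w = np.exp(eta) / (1.0 + np.exp(eta)) ** 2   # GLM weight for the logit link
        M = (F * (weights * w)[:, None]).T @ F       # M(xi) = sum_i p_i w_i f(x_i) f(x_i)^T
        sign, logdet = np.linalg.slogdet(M)
        return logdet if sign > 0 else -np.inf       # guard against singular designs

    # Hypothetical two-factor example with nominal values beta = (0.5, -1.0, 1.0).
    pts = [[-3, -3], [3, -3], [-3, 3], [3, 3]]
    wts = [0.25, 0.25, 0.25, 0.25]
    print(logistic_log_det_criterion(pts, wts, beta=[0.5, -1.0, 1.0]))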

TABLE II:

Performances of NovDE, NovDE-Bin and six competitors for finding locally D-optimal designs on [−3, 3]7 using 5 sets of nominal values. In each cell, the first entry is the mean (standard deviation) of the objective function values over 30 independent runs and the second entry is the success rate. For each set of nominal values, the best values of the mean and success rate are in bold. Entries marked with * indicate that NovDE significantly outperforms the corresponding algorithm based on Wilcoxon rank-sum tests.

Algorithm | β1 | β2 | β3 | β4 | β5
NovDE | −0.1052 (0.0048), 100% | −0.4441 (0.0038), 80% | 0.5343 (0.0128), 46.67% | 0.7487 (0.0076), 93.33% | 0.3678 (0.0028), 90%
NovDE-Bin | −0.1054 (0.0049), 93.33% | −0.4438 (0.0036), 90% | 0.5384 (0.0177), 40% | 0.7476 (0.0051), 93.33% | 0.3645 (0.0055), 90%
DE/rand/2/bin | −0.1209 (0.0135)*, 43.33% | −0.4581 (0.0221)*, 23.33% | 0.4879 (0.0283)*, 0% | 0.7140 (0.0127)*, 6.67% | 0.3396 (0.0061)*, 16.67%
ANDE | −0.1131 (0.0080)*, 53.33% | −0.4561 (0.0113)*, 40% | 0.4940 (0.0207)*, 3.33% | 0.7097 (0.0295)*, 3.33% | 0.3412 (0.0138)*, 6.67%
SaDE | −0.1033 (0.0027), 80% | −0.4435 (0.0040), 90% | 0.5234 (0.0155)*, 40% | 0.7457 (0.0109), 86.67% | 0.3641 (0.0070), 73.33%
SaDE+MER | −0.1022 (0.0032), 100% | −0.4440 (0.0030), 86.67% | 0.5240 (0.0143)*, 36.67% | 0.7474 (0.0067), 90% | 0.3661 (0.0061), 83.33%
JADE | −0.1031 (0.0025), 50% | −0.4436 (0.0038), 46.67% | 0.4979 (0.0274), 10% | 0.7467 (0.0058), 83.33% | 0.3667 (0.0031), 83.33%
DDE-AMS | −0.1241 (0.0320)*, 30% | −0.4590 (0.0304)*, 16.67% | 0.4278 (0.0350)*, 0% | 0.7130 (0.0197)*, 3.33% | 0.3547 (0.0098)*, 33.33%

TABLE V:

NovDE-generated locally D-optimal design for the logistic model with seven variables when the vector of nominal values for the parameters is β3 = (β0, β1, β2, β3, β4, β5, β6, β7)T = (−0.4926, −0.6280, −0.3283, 0.4378, 0.5283, −0.6120, −0.6837, −0.2061)T, and X = [−3, 3]7.

Support point X1 X2 X3 X4 X5 X6 X7 Pi
1 −3 −3 −3 −3 3 −3 −3 0.0311
2 3 3 3 3 −3 3 3 0.0100
3 −3 −3 −3 −3 3 −3 3 0.0347
4 3 −3 3 3 3 −2.9971 3 0.0480
5 −3 −3 3 −3 3 2.9217 −3 0.0292
6 3 −3 −3 −3 −3 −3 −3 0.0100
7 −3 3 −3 −3 2.0891 −3 −3 0.0107
8 −3 −3 3 −3 −2.7545 3 3 0.0468
9 3 3 3 3 3 −3 −3 0.0187
10 3 3 3 3 −3 3 −3 0.0298
11 −3 −3 3 3 3 3 3 0.0409
12 3 −3 −3 3 −3 3 −3 0.0403
13 3 3 3 −3 −3 −3 3 0.0100
14 3 3 −3 3 −3 −3 3 0.0100
15 −3 3 3 3 3 3 −3 0.0385
16 3 −3 −3 −3 −3 −3 −3 0.0260
17 3 3 3 −3 −3 −3 −3 0.0517
18 −3 −3 −3 3 3 3 −3 0.0338
19 3 −3 3 −3 3 −3 −3 0.0375
20 −3 3 −3 −3 −3 −3 3 0.0303
21 3 3 −3 3 −3 −3 3 0.0258
22 −2.9190 3 −3 3 −3 3 −3 0.0451
23 −3 3 3 −3 −3 3 3 0.0431
24 3 −3 −3 3 3 −3 −3 0.0100
25 3 −3 −3 −3 −3 −3 3 0.0316
26 −3 −3 −3 −3 −3 3 −3 0.0136
27 −3 3 3 −3 3 −3 −3 0.0356
28 −3 3 −3 3 −3 3 3 0.0162
29 3 −3 3 3 −3 3 3 0.0438
30 −3 3 3 −3 −3 3 3 0.0305
31 3 −3 −3 3 −3 3 −3 0.0140
32 −3 3 −3 3 3 −3 3 0.0517
33 3 3 3 3 3 −3 3 0.0512

ACKNOWLEDGMENT

This work was partially supported by the National Natural Science Foundation of China (Grant No. 61876162/F060604) and the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. CityU11202418). The research of Wong was partially supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number R01GM107639. The contents of this paper are solely the responsibility of the authors and do not necessarily represent the official views of the National Institutes of Health. Wong also wishes to acknowledge the support of the Institute for Mathematical Sciences at the National University of Singapore for hosting the Particle Swarm and Evolutionary Computation Workshop in February 2018, co-chaired by K. C. Tan and W. K. Wong, where parts of this paper were discussed and written.

Biography


Weinan Xu received the B.E. degree in electrical engineering from the National University of Singapore in 2015. She is currently pursuing the Ph.D. degree at the National University of Singapore. Her current research interests include many-objective optimization, differential evolution, constrained optimization problems, and applications of evolutionary computation.


Weng Kee Wong is a full Professor with the Department of Biostatistics in the Fielding School of Public Health at the University of California, Los Angeles. He currently serves as an Associate Editor for Statistical Methods in Medical Research, Communications in Statistics, Test, and the Journal of Statistical Planning and Inference. He is an elected member of the Delta Omega Public Health Honor Society, an elected member of the International Statistical Institute, a Fellow of the American Statistical Association, a Fellow of the Institute of Mathematical Statistics, and a Fellow of the American Association for the Advancement of Science. He has published more than 200 refereed articles in medical and statistics journals, edited 2 books, and written a design monograph.


Kay Chen Tan (SM’08-F’14) is a full Professor with the Department of Computer Science, City University of Hong Kong. He is the Editor-in-Chief of the IEEE Transactions on Evolutionary Computation, was the Editor-in-Chief of the IEEE Computational Intelligence Magazine (2010-2013), and currently serves as an Editorial Board member of more than 20 journals. He is an elected member of the IEEE CIS AdCom (2017-2019). He has published more than 200 refereed articles and 6 books. He is a Fellow of the IEEE.


Jian-Xin Xu (M’92-SM’98-F’11) received the Ph.D. degree from the University of Tokyo, Tokyo, Japan, in 1989. In 1991, he joined the National University of Singapore. He is a Fellow of the IEEE. He has published more than 200 journal papers and 6 books in the field of systems and control, and has supervised 30 Ph.D. students and 15 research associates. Over the years, his research has been supported by nearly 30 research grants. His current research interests include learning theory, intelligent systems and control, nonlinear and robust control, robotics, and precision motion control.

References

[1] Silvey S, Optimal Design: An Introduction to the Theory for Parameter Estimation, vol. 1, Springer Science & Business Media, 2013.
[2] Chernoff H, "Locally optimal designs for estimating parameters," The Annals of Mathematical Statistics, pp. 586–602, 1953.
[3] Whitacre JM, "Recent trends indicate rapid growth of nature-inspired optimization in academia and industry," Computing, vol. 93, no. 2–4, pp. 121–133, 2011.
[4] Whitacre JM, "Survival of the flexible: explaining the recent popularity of nature-inspired optimization within a rapidly evolving world," Computing, vol. 93, no. 2, pp. 135–146, 2011.
[5] Kennedy J, "Particle swarm optimization," in Encyclopedia of Machine Learning, pp. 760–766, Springer, 2011.
[6] Clerc M, Particle Swarm Optimization, vol. 93, John Wiley & Sons, 2010.
[7] Poli R, Kennedy J, and Blackwell T, "Particle swarm optimization," Swarm Intelligence, vol. 1, no. 1, pp. 33–57, 2007.
[8] Qiu J, Chen R-B, Wang W, and Wong WK, "Using animal instincts to design efficient biomedical studies via particle swarm optimization," Swarm and Evolutionary Computation, vol. 18, pp. 1–10, 2014.
[9] Wong WK, Chen R-B, Huang C-C, and Wang W, "A modified particle swarm optimization technique for finding optimal designs for mixture models," PloS One, vol. 10, no. 6, p. e0124720, 2015.
[10] Chen R-B, Chang S-P, Wang W, Tung H-C, and Wong WK, "Minimax optimal designs via particle swarm optimization methods," Statistics and Computing, vol. 25, no. 5, pp. 975–988, 2015.
[11] Yang Y and Pedersen JO, "A comparative study on feature selection in text categorization," in ICML, vol. 97, pp. 412–420, 1997.
[12] Chen W-N, Zhang J, Lin Y, Chen N, Zhan Z-H, Chung HS-H, Li Y, and Shi Y-H, "Particle swarm optimization with an aging leader and challengers," IEEE Transactions on Evolutionary Computation, vol. 17, no. 2, pp. 241–258, 2013.
[13] Storn R and Price K, "Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces," Journal of Global Optimization, vol. 11, no. 4, pp. 341–359, 1997.
[14] Das S and Suganthan PN, "Differential evolution: a survey of the state-of-the-art," IEEE Transactions on Evolutionary Computation, vol. 15, no. 1, pp. 4–31, 2011.
[15] Keshk M, Singh H, and Abbass H, "Automatic estimation of differential evolution parameters using hidden Markov models," Evolutionary Intelligence, pp. 1–17, 2018.
[16] Kramer O, "Evolutionary self-adaptation: a survey of operators and strategy parameters," Evolutionary Intelligence, vol. 3, no. 2, pp. 51–65, 2010.
[17] Boukhari N, Debbat F, Monmarché N, and Slimane M, "A study on self-adaptation in the evolutionary strategy algorithm," in Computational Intelligence and Its Applications: 6th IFIP TC 5 International Conference, CIIA 2018, Oran, Algeria, May 8–10, 2018, Proceedings 6, pp. 150–160, Springer, 2018.
[18] Qing A, "Dynamic differential evolution strategy and applications in electromagnetic inverse scattering problems," IEEE Transactions on Geoscience and Remote Sensing, vol. 44, no. 1, pp. 116–125, 2006.
[19] Talukder A, Kirley M, and Buyya R, "Multiobjective differential evolution for scheduling workflow applications on global grids," Concurrency and Computation: Practice and Experience, vol. 21, no. 13, pp. 1742–1756, 2009.
[20] Zhang C, Lim P, Qin A, and Tan KC, "Multiobjective deep belief networks ensemble for remaining useful life estimation in prognostics," IEEE Transactions on Neural Networks and Learning Systems, 2017.
[21] Zhang C, Tan KC, and Ren R, "Training cost-sensitive deep belief networks on imbalance data problems," in Neural Networks (IJCNN), 2016 International Joint Conference on, pp. 4362–4367, IEEE, 2016.
[22] Brest J, Greiner S, Boskovic B, Mernik M, and Zumer V, "Self-adapting control parameters in differential evolution: a comparative study on numerical benchmark problems," IEEE Transactions on Evolutionary Computation, vol. 10, no. 6, pp. 646–657, 2006.
[23] Zhang J and Sanderson AC, "JADE: adaptive differential evolution with optional external archive," IEEE Transactions on Evolutionary Computation, vol. 13, no. 5, pp. 945–958, 2009.
[24] Das S, Abraham A, Chakraborty UK, and Konar A, "Differential evolution using a neighborhood-based mutation operator," IEEE Transactions on Evolutionary Computation, vol. 13, no. 3, pp. 526–553, 2009.
[25] Epitropakis MG, Tasoulis DK, Pavlidis NG, Plagianakos VP, and Vrahatis MN, "Enhancing differential evolution utilizing proximity-based mutation operators," IEEE Transactions on Evolutionary Computation, vol. 15, no. 1, pp. 99–119, 2011.
[26] Islam SM, Das S, Ghosh S, Roy S, and Suganthan PN, "An adaptive differential evolution algorithm with novel mutation and crossover strategies for global numerical optimization," IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 42, no. 2, pp. 482–500, 2012.
[27] Risi S, Vanderbleek SD, Hughes CE, and Stanley KO, "How novelty search escapes the deceptive trap of learning to learn," in Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation, pp. 153–160, ACM, 2009.
[28] Lehman J and Stanley KO, "Evolving a diversity of virtual creatures through novelty search and local competition," in Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation, pp. 211–218, ACM, 2011.
[29] Qin AK and Suganthan PN, "Self-adaptive differential evolution algorithm for numerical optimization," in Evolutionary Computation, 2005. The 2005 IEEE Congress on, vol. 2, pp. 1785–1791, IEEE, 2005.
[30] McCulloch CE, "Generalized linear models," Journal of the American Statistical Association, vol. 95, no. 452, pp. 1320–1324, 2000.
[31] Ford I, Torsney B, and Wu CJ, "The use of a canonical form in the construction of locally optimal designs for non-linear problems," Journal of the Royal Statistical Society, Series B (Methodological), pp. 569–583, 1992.
[32] Stufken J and Yang M, "Optimal designs for generalized linear models," Design and Analysis of Experiments, Special Designs and Applications, vol. 3, p. 137, 2012.
[33] Kiefer J and Wolfowitz J, "Optimum designs in regression problems," The Annals of Mathematical Statistics, pp. 271–294, 1959.
[34] Pázman A, Foundations of Optimum Experimental Design, vol. 14, Springer, 1986.
[35] Zaharie D, "Control of population diversity and adaptation in differential evolution algorithms," in Proc. of MENDEL, vol. 9, pp. 41–46, 2003.
[36] Abbass HA, "The self-adaptive Pareto differential evolution algorithm," in Evolutionary Computation, 2002. CEC'02. Proceedings of the 2002 Congress on, vol. 1, pp. 831–836, IEEE, 2002.
[37] Zhao S-Z and Suganthan PN, "Empirical investigations into the exponential crossover of differential evolutions," Swarm and Evolutionary Computation, vol. 9, pp. 27–36, 2013.
[38] Qiu X, Tan KC, and Xu J-X, "Multiple exponential recombination for differential evolution," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 995–1006, 2017.
[39] Hansen N, Ros R, Mauny N, Schoenauer M, and Auger A, "Impacts of invariance in search: when CMA-ES and PSO face ill-conditioned and non-separable problems," Applied Soft Computing, vol. 11, no. 8, pp. 5755–5769, 2011.
[40] Mohamed AW and Almazyad AS, "Differential evolution with novel mutation and adaptive crossover strategies for solving large scale global optimization problems," Applied Computational Intelligence and Soft Computing, vol. 2017, 2017.
[41] Omidvar MN, Li X, and Yao X, "Cooperative co-evolution with delta grouping for large scale non-separable function optimization," in Evolutionary Computation (CEC), 2010 IEEE Congress on, pp. 1–8, IEEE, 2010.
[42] Tang R, "Decentralizing and coevolving differential evolution for large-scale global optimization problems," Applied Intelligence, vol. 47, no. 4, pp. 1208–1223, 2017.
[43] Ge Y-F, Yu W-J, Lin Y, Gong Y-J, Zhan Z-H, Chen W-N, and Zhang J, "Distributed differential evolution based on adaptive mergence and split for large-scale optimization," IEEE Transactions on Cybernetics, vol. 48, no. 7, pp. 2166–2180, 2018.
[44] Brest J, Zamuda A, Fister I, and Maučec MS, "Large scale global optimization using self-adaptive differential evolution algorithm," in Evolutionary Computation (CEC), 2010 IEEE Congress on, pp. 1–8, IEEE, 2010.
[45] Wilcoxon F, "Individual comparisons by ranking methods," Biometrics Bulletin, vol. 1, no. 6, pp. 80–83, 1945.
[46] Grimshaw SD, Collings BJ, Larsen WA, and Hurt CR, "Eliciting factor importance in a designed experiment," Technometrics, vol. 43, no. 2, pp. 133–146, 2001.
[47] Lukemire J, Mandal A, and Wong WK, "d-QPSO: a quantum-behaved particle swarm technique for finding D-optimal designs with discrete and continuous factors and a binary response," Technometrics, pp. 1–27, 2018 (just-accepted).
