. 2017 Feb 6;67(2):361–399. doi: 10.1007/s10589-017-9894-9

An SQP method for mathematical programs with vanishing constraints with strong convergence properties

Matúš Benko 1,, Helmut Gfrerer 1
PMCID: PMC5397537  PMID: 28479672

Abstract

We propose an SQP algorithm for mathematical programs with vanishing constraints which solves at each iteration a quadratic program with linear vanishing constraints. The algorithm is based on the newly developed concept of Q-stationarity (Benko and Gfrerer in Optimization 66(1):61–92, 2017). We demonstrate how QM-stationary solutions of the quadratic program can be obtained. We show that all limit points of the sequence of iterates generated by the basic SQP method are at least M-stationary and by some extension of the method we also guarantee the stronger property of QM-stationarity of the limit points.

Keywords: SQP method, Mathematical programs with vanishing constraints, Q-stationarity, QM-stationarity

Introduction

Consider the following mathematical program with vanishing constraints (MPVC)

\[
\begin{aligned}
\min_{x\in\mathbb{R}^n}\ \ & f(x)\\
\text{subject to}\ \ & h_i(x)=0, && i\in E,\\
& g_i(x)\le 0, && i\in I,\\
& H_i(x)\ge 0,\quad G_i(x)H_i(x)\le 0, && i\in V,
\end{aligned}
\tag{1}
\]

with continuously differentiable functions $f$, $h_i\ (i\in E)$, $g_i\ (i\in I)$, $G_i,H_i\ (i\in V)$ and finite index sets $E$, $I$ and $V$.

Theoretically, MPVCs can be viewed as standard nonlinear optimization problems, but due to the vanishing constraints, many of the standard constraint qualifications of nonlinear programming are violated at any feasible point x¯ with Hi(x¯)=Gi(x¯)=0 for some iV. On the other hand, by introducing slack variables, MPVCs may be reformulated as so-called mathematical programs with complementarity constraints (MPCCs), see [7]. However, this approach is also not satisfactory as it has turned out that MPCCs are in fact even more difficult to handle than MPVCs. This makes it necessary, both from a theoretical and numerical point of view, to consider special tailored algorithms for solving MPVCs. Recent numerical methods follow different directions. A smoothing-continuation method and a regularization approach for MPCCs are considered in [6, 10] and a combination of these techniques, a smoothing-regularization approach for MPVCs is investigated in [2]. In [3, 8] the relaxation method has been suggested in order to deal with the inherent difficulties of MPVCs.

In this paper, we carry over a well known SQP method from nonlinear programming to MPVCs. We proceed in a similar manner as in [4], where an SQP method for MPCCs was introduced by Benko and Gfrerer. The main task of our method is to solve in each iteration step a quadratic program with linear vanishing constraints, a so-called auxiliary problem. Then we compute the next iterate by reducing a certain merit function along some polygonal line which is given by the solution procedure for the auxiliary problem. To solve the auxiliary problem we exploit the new concept of QM-stationarity introduced in the recent paper by Benko and Gfrerer [5]. QM-stationarity is in general stronger than M-stationarity and it turns out to be very suitable for a numerical approach as it allows to handle the program with vanishing constraints without relying on enumeration techniques. Surprisingly, we compute at least a QM-stationary solution of the auxiliary problem just by means of quadratic programming by solving appropriate convex subproblems.

Next we study the convergence of the SQP method. We show that every limit point of the generated sequence is at least M-stationary. Moreover, we consider the extended version of our SQP method, where at each iterate a correction of the iterate is made to prevent the method from converging to undesired points. Consequently we show that under some additional assumptions all limit points are at least QM-stationary. Numerical tests indicate that our method behaves very reliably.

A short outline of this paper is as follows. In Sect. 2 we recall the basic stationarity concepts for MPVCs as well as the recently developed concepts of Q- and QM-stationarity. In Sect. 3 we describe an algorithm based on quadratic programming for solving the auxiliary problem occurring in every iteration of our SQP method. We prove the finiteness and summarize some other properties of this algorithm. In Sect. 4 we propose the basic SQP method. We describe how the next iterate is computed by means of the solution of the auxiliary problem and we consider the convergence of the overall algorithm. In Sect. 5 we consider the extended version of the overall algorithm and we discuss its convergence. Section 6 is a summary of numerical results we obtained by implementing our basic algorithm in MATLAB and by testing it on a subset of test problems considered in the thesis of Hoheisel [7].

In what follows we use the following notation. Given a set $M$ we denote by $\mathcal{P}(M):=\{(M_1,M_2)\,|\,M_1\cup M_2=M,\ M_1\cap M_2=\emptyset\}$ the collection of all partitions of $M$. Further, for a real number $a$ we use the notation $(a)_+:=\max(0,a)$, $(a)_-:=\min(0,a)$. For a vector $u=(u_1,u_2,\dots,u_m)^T\in\mathbb{R}^m$ we define $|u|,(u)_+,(u)_-$ componentwise, i.e. $|u|:=(|u_1|,|u_2|,\dots,|u_m|)^T$, etc. Moreover, for $u\in\mathbb{R}^m$ and $1\le p\le\infty$ we denote the $\ell_p$ norm of $u$ by $\|u\|_p$, and we use the notation $\|u\|:=\|u\|_2$ for the standard $\ell_2$ norm. Finally, given a sequence $y^k\in\mathbb{R}^m$, a point $y\in\mathbb{R}^m$ and an infinite set $K\subseteq\mathbb{N}$, we write $y^k\overset{K}{\to}y$ instead of $\lim_{k\to\infty,\,k\in K}y^k=y$.
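For illustration, the componentwise operations $(u)_+$ and $(u)_-$ can be sketched in plain Python (the function names `pos` and `neg` are ours, not the paper's):

```python
def pos(u):
    """Componentwise (u)_+ := max(0, u_i)."""
    return [max(0.0, ui) for ui in u]

def neg(u):
    """Componentwise (u)_- := min(0, u_i)."""
    return [min(0.0, ui) for ui in u]

u = [1.5, -2.0, 0.0]
print(pos(u))  # [1.5, 0.0, 0.0]
print(neg(u))  # [0.0, -2.0, 0.0]
```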

Stationary points for MPVCs

Given a point x¯ feasible for (1) we define the following index sets

\[
\begin{aligned}
I_g(\bar x)&:=\{i\in I\,|\,g_i(\bar x)=0\},\\
I_{0+}(\bar x)&:=\{i\in V\,|\,H_i(\bar x)=0,\ G_i(\bar x)>0\}, &
I_{0-}(\bar x)&:=\{i\in V\,|\,H_i(\bar x)=0,\ G_i(\bar x)<0\},\\
I_{+0}(\bar x)&:=\{i\in V\,|\,H_i(\bar x)>0,\ G_i(\bar x)=0\}, &
I_{00}(\bar x)&:=\{i\in V\,|\,H_i(\bar x)=0,\ G_i(\bar x)=0\},\\
&& I_{+-}(\bar x)&:=\{i\in V\,|\,H_i(\bar x)>0,\ G_i(\bar x)<0\}.
\end{aligned}
\tag{2}
\]
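A hypothetical helper illustrating the classification (2): given the values $g_i(\bar x)$, $H_i(\bar x)$, $G_i(\bar x)$ at a feasible point, each $i\in V$ falls into exactly one of the five sets partitioning $V$ (the exact-zero tests are our simplification; in floating point a tolerance would be needed):

```python
def index_sets(g_vals, H_vals, G_vals):
    """Classify indices according to (2) at a feasible point."""
    sets = {"Ig": [], "I0+": [], "I0-": [], "I+0": [], "I00": [], "I+-": []}
    for i, gi in enumerate(g_vals):
        if gi == 0.0:
            sets["Ig"].append(i)
    for i, (Hi, Gi) in enumerate(zip(H_vals, G_vals)):
        if Hi == 0.0:                                # H_i(x) = 0
            key = "I0+" if Gi > 0 else ("I0-" if Gi < 0 else "I00")
        else:                                        # H_i(x) > 0, so by
            key = "I+0" if Gi == 0.0 else "I+-"      # feasibility G_i(x) <= 0
        sets[key].append(i)
    return sets
```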

In contrast to nonlinear programming there exist a lot of stationarity concepts for MPVCs.

Definition 2.1

Let x¯ be feasible for (1). Then x¯ is called

  1. Weakly stationary, if there are multipliers $\lambda_i^g\ (i\in I)$, $\lambda_i^h\ (i\in E)$, $\lambda_i^G,\lambda_i^H\ (i\in V)$ such that
    \[
    \nabla f(\bar x)^T+\sum_{i\in E}\lambda_i^h\nabla h_i(\bar x)^T+\sum_{i\in I}\lambda_i^g\nabla g_i(\bar x)^T+\sum_{i\in V}\bigl[-\lambda_i^H\nabla H_i(\bar x)^T+\lambda_i^G\nabla G_i(\bar x)^T\bigr]=0
    \tag{3}
    \]
    and
    \[
    \begin{aligned}
    &\lambda_i^g g_i(\bar x)=0,\ i\in I,\qquad \lambda_i^H H_i(\bar x)=0,\ i\in V,\qquad \lambda_i^G G_i(\bar x)=0,\ i\in V,\\
    &\lambda_i^g\ge 0,\ i\in I,\qquad \lambda_i^H\ge 0,\ i\in I_{0-}(\bar x),\qquad \lambda_i^G\ge 0,\ i\in I_{00}(\bar x)\cup I_{+0}(\bar x).
    \end{aligned}
    \tag{4}
    \]
  2. M-stationary, if it is weakly stationary and
    \[ \lambda_i^H\lambda_i^G=0,\quad i\in I_{00}(\bar x). \tag{5} \]
  3. Q-stationary with respect to $(\beta_1,\beta_2)$, where $(\beta_1,\beta_2)$ is a given partition of $I_{00}(\bar x)$, if there exist two multipliers $\bar\lambda=(\bar\lambda^h,\bar\lambda^g,\bar\lambda^H,\bar\lambda^G)$ and $\underline\lambda=(\underline\lambda^h,\underline\lambda^g,\underline\lambda^H,\underline\lambda^G)$, both fulfilling (3) and (4), such that
    \[
    \bar\lambda_i^G=0,\ \ \underline\lambda_i^H\ge 0,\ \underline\lambda_i^G\ge 0,\ \ i\in\beta_1;\qquad
    \bar\lambda_i^H\ge 0,\ \bar\lambda_i^G\ge 0,\ \ \underline\lambda_i^G=0,\ \ i\in\beta_2.
    \tag{6}
    \]
  4. Q-stationary, if there is some partition $(\beta_1,\beta_2)\in\mathcal{P}(I_{00}(\bar x))$ such that $\bar x$ is Q-stationary with respect to $(\beta_1,\beta_2)$.

  5. QM-stationary, if it is Q-stationary and at least one of the multipliers λ¯ and λ̲ fulfills M-stationarity condition (5).

  6. S-stationary, if it is weakly stationary and
    \[ \lambda_i^H\ge 0,\ \ \lambda_i^G=0,\quad i\in I_{00}(\bar x). \]

The concepts of Q-stationarity and QM-stationarity were introduced in the recent paper by Benko and Gfrerer [5], whereas the other stationarity concepts are very common in the literature, see e.g. [1, 7, 8]. The following implications hold:

\[
\begin{aligned}
\text{S-stationarity}&\Longrightarrow\text{Q-stationarity with respect to every }(\beta_1,\beta_2)\in\mathcal{P}(I_{00}(\bar x))\\
&\Longrightarrow\text{Q-stationarity w.r.t. }(\emptyset,I_{00}(\bar x))\Longrightarrow\text{QM-stationarity}\\
&\Longrightarrow\text{M-stationarity}\Longrightarrow\text{weak stationarity}.
\end{aligned}
\]

The first implication follows from the fact that the multiplier corresponding to S-stationarity fulfills the requirements for both $\bar\lambda$ and $\underline\lambda$. The third implication holds because for $(\beta_1,\beta_2)=(\emptyset,I_{00}(\bar x))$ the multiplier $\underline\lambda$ fulfills (5), since $\underline\lambda_i^G=0$ for $i\in I_{00}(\bar x)$.

Note that the S-stationarity conditions are nothing else than the Karush-Kuhn-Tucker conditions for the problem (1). As we will demonstrate in the next theorems, a local minimizer is S-stationary only under some comparatively stronger constraint qualification, while it is QM-stationary under very weak constraint qualifications. Before stating the theorems we recall some common definitions.

Denoting

\[
F_i(x):=(-H_i(x),G_i(x))^T,\ i\in V,\qquad P:=\{(a,b)\in\mathbb{R}_-\times\mathbb{R}\,|\,ab\ge 0\},
\tag{7}
\]
\[
\mathcal{F}(x):=\bigl(h(x)^T,g(x)^T,F(x)^T\bigr)^T,\qquad D:=\{0\}^{|E|}\times\mathbb{R}_-^{|I|}\times P^{|V|},
\tag{8}
\]

we see that problem (1) can be rewritten as

\[
\min f(x)\quad\text{subject to}\quad x\in\Omega_V:=\{x\in\mathbb{R}^n\,|\,\mathcal{F}(x)\in D\}.
\]

Recall that the contingent (also tangent) cone to a closed set ΩRm at uΩ is defined by

\[
T_\Omega(u):=\{d\in\mathbb{R}^m\,|\,\exists (d^k)\to d,\ (\tau_k)\downarrow 0:\ u+\tau_k d^k\in\Omega\ \forall k\}.
\]

The linearized cone to $\Omega_V$ at $\bar x\in\Omega_V$ is then defined as $T^{\mathrm{lin}}_{\Omega_V}(\bar x):=\{d\in\mathbb{R}^n\,|\,\nabla\mathcal{F}(\bar x)d\in T_D(\mathcal{F}(\bar x))\}$.

Further recall that x¯ΩV is called B-stationary if

\[
\nabla f(\bar x)d\ge 0\quad\forall d\in T_{\Omega_V}(\bar x).
\]

Every local minimizer is known to be B-stationary.

Definition 2.2

Let $\bar x$ be feasible for (1), i.e. $\bar x\in\Omega_V$. We say that the generalized Guignard constraint qualification (GGCQ) holds at $\bar x$ if the polar cone of $T_{\Omega_V}(\bar x)$ equals the polar cone of $T^{\mathrm{lin}}_{\Omega_V}(\bar x)$.

Theorem 2.1

(cf. [5, Theorem 8]) Assume that GGCQ is fulfilled at the point $\bar x\in\Omega_V$. If $\bar x$ is B-stationary, then $\bar x$ is Q-stationary for (1) with respect to every partition $(\beta_1,\beta_2)\in\mathcal{P}(I_{00}(\bar x))$, and it is also QM-stationary.

Theorem 2.2

(cf. [5, Theorem 8]) If $\bar x$ is Q-stationary with respect to a partition $(\beta_1,\beta_2)\in\mathcal{P}(I_{00}(\bar x))$ such that for every $j\in\beta_1$ there exists some $z^j$ fulfilling

\[
\begin{aligned}
&\nabla h(\bar x)z^j=0,\qquad \nabla g_i(\bar x)z^j=0,\ i\in I_g(\bar x),\qquad \nabla G_i(\bar x)z^j=0,\ i\in I_{+0}(\bar x),\\
&\nabla G_i(\bar x)z^j\ \begin{cases}\ge 0, & i\in\beta_1,\\ \le 0, & i\in\beta_2,\end{cases}\qquad
\nabla H_i(\bar x)z^j=0,\ i\in\bigl(I_{0-}(\bar x)\cup I_{00}(\bar x)\cup I_{0+}(\bar x)\bigr)\setminus\{j\},\\
&\nabla H_j(\bar x)z^j=-1
\end{aligned}
\tag{9}
\]

and there is some z¯ such that

\[
\begin{aligned}
&\nabla h(\bar x)\bar z=0,\qquad \nabla g_i(\bar x)\bar z=0,\ i\in I_g(\bar x),\qquad \nabla G_i(\bar x)\bar z=0,\ i\in I_{+0}(\bar x),\\
&\nabla G_i(\bar x)\bar z\ \begin{cases}\ge 0, & i\in\beta_1,\\ \le -1, & i\in\beta_2,\end{cases}\qquad
\nabla H_i(\bar x)\bar z=0,\ i\in I_{0-}(\bar x)\cup I_{00}(\bar x)\cup I_{0+}(\bar x),
\end{aligned}
\tag{10}
\]

then x¯ is S-stationary and consequently also B-stationary.

Note that these two theorems together also imply that a local minimizer $\bar x\in\Omega_V$ is S-stationary provided GGCQ is fulfilled at $\bar x$ and there exists a partition $(\beta_1,\beta_2)\in\mathcal{P}(I_{00}(\bar x))$ such that for every $j\in\beta_1$ there exists $z^j$ fulfilling (9) and $\bar z$ fulfilling (10).

Moreover, note that (9) and (10) are fulfilled for every partition $(\beta_1,\beta_2)\in\mathcal{P}(I_{00}(\bar x))$ e.g. if the gradients of the active constraints are linearly independent. On the other hand, in the special case of the partition $(\emptyset,I_{00}(\bar x))\in\mathcal{P}(I_{00}(\bar x))$, these conditions read as the requirement that the system

\[
\begin{aligned}
&\nabla h(\bar x)\bar z=0,\qquad \nabla g_i(\bar x)\bar z=0,\ i\in I_g(\bar x),\qquad \nabla G_i(\bar x)\bar z=0,\ i\in I_{+0}(\bar x),\\
&\nabla G_i(\bar x)\bar z\le -1,\ i\in I_{00}(\bar x),\qquad \nabla H_i(\bar x)\bar z=0,\ i\in I_{0-}(\bar x)\cup I_{00}(\bar x)\cup I_{0+}(\bar x)
\end{aligned}
\]

has a solution. This resembles the well-known Mangasarian–Fromovitz constraint qualification (MFCQ) of nonlinear programming and seems to be a rather weak and often fulfilled assumption.

Finally, we recall the definitions of normal cones. The regular normal cone to a closed set ΩRm at uΩ can be defined as the polar cone to the tangent cone by

\[
\widehat N_\Omega(u):=(T_\Omega(u))^\circ=\{z\in\mathbb{R}^m\,|\,\langle z,d\rangle\le 0\ \forall d\in T_\Omega(u)\}.
\]

The limiting normal cone to a closed set ΩRm at uΩ is given by

\[
N_\Omega(u):=\{z\in\mathbb{R}^m\,|\,\exists u^k\to u,\ z^k\to z\ \text{with}\ u^k\in\Omega,\ z^k\in\widehat N_\Omega(u^k)\ \forall k\}.
\tag{11}
\]

In the case when $\Omega$ is a convex set, the regular and the limiting normal cone coincide with the classical normal cone of convex analysis, i.e.

\[
\widehat N_\Omega(u)=N_\Omega(u)=\{z\in\mathbb{R}^m\,|\,\langle z,v-u\rangle\le 0\ \forall v\in\Omega\}.
\tag{12}
\]

Well-known is also the following description of the limiting normal cone

\[
N_\Omega(u)=\{z\in\mathbb{R}^m\,|\,\exists u^k\to u,\ z^k\to z\ \text{with}\ u^k\in\Omega,\ z^k\in N_\Omega(u^k)\ \forall k\}.
\tag{13}
\]

We conclude this section with the following characterization of M- and Q-stationarity via the limiting normal cone. Straightforward calculations yield that

\[
N_P(F_i(\bar x))=
\begin{cases}
\mathbb{R}_+\times\{0\} & \text{if }i\in I_{0-}(\bar x),\\
(\mathbb{R}\times\{0\})\cup(\{0\}\times\mathbb{R}_+) & \text{if }i\in I_{00}(\bar x),\\
\mathbb{R}\times\{0\} & \text{if }i\in I_{0+}(\bar x),\\
\{0\}\times\mathbb{R}_+ & \text{if }i\in I_{+0}(\bar x),\\
\{0\}\times\{0\} & \text{if }i\in I_{+-}(\bar x),
\end{cases}
\]
\[
N_{P_1}(F_i(\bar x))=\mathbb{R}\times\{0\}\ \ \text{if }i\in I_{0+}(\bar x)\cup I_{00}(\bar x)\cup I_{0-}(\bar x),
\]
\[
N_{P_2}(F_i(\bar x))=
\begin{cases}
\mathbb{R}_+\times\mathbb{R}_+ & \text{if }i\in I_{00}(\bar x),\\
N_P(F_i(\bar x)) & \text{if }i\in I_{0-}(\bar x)\cup I_{+0}(\bar x)\cup I_{+-}(\bar x),
\end{cases}
\]

and hence the M-stationarity conditions (4) and (5) can be replaced by

\[
(\lambda^h,\lambda^g,\lambda^H,\lambda^G)\in N_D(\mathcal{F}(\bar x))=\mathbb{R}^{|E|}\times\{u\in\mathbb{R}_+^{|I|}\,|\,\langle u,g(\bar x)\rangle=0\}\times\prod_{i\in V}N_P(F_i(\bar x))
\tag{14}
\]

and the Q-stationarity conditions (4) and (6) can be replaced by

\[
(\bar\lambda^h,\bar\lambda^g,\bar\lambda^H,\bar\lambda^G)\in\mathbb{R}^{|E|}\times\{u\in\mathbb{R}_+^{|I|}\,|\,\langle u,g(\bar x)\rangle=0\}\times\prod_{i\in V}\nu_i^{\beta_1,\beta_2}(\bar x),
\tag{15}
\]
\[
(\underline\lambda^h,\underline\lambda^g,\underline\lambda^H,\underline\lambda^G)\in\mathbb{R}^{|E|}\times\{u\in\mathbb{R}_+^{|I|}\,|\,\langle u,g(\bar x)\rangle=0\}\times\prod_{i\in V}\nu_i^{\beta_2,\beta_1}(\bar x),
\tag{16}
\]

where for $(\beta_1,\beta_2)\in\mathcal{P}(I_{00}(\bar x))$ we define

\[
\nu_i^{\beta_1,\beta_2}(\bar x):=
\begin{cases}
N_{P_1}(F_i(\bar x)) & \text{if }i\in I_{0+}(\bar x)\cup\beta_1,\\
N_{P_2}(F_i(\bar x)) & \text{if }i\in I_{0-}(\bar x)\cup I_{+0}(\bar x)\cup I_{+-}(\bar x)\cup\beta_2.
\end{cases}
\]

Note also that for every $i\in V$ we have

\[
\nu_i^{I_{00}(\bar x),\emptyset}(\bar x)\subseteq N_P(F_i(\bar x)).
\tag{17}
\]

Solving the auxiliary problem

In this section, we describe an algorithm for solving quadratic problems with vanishing constraints of the type

\[
\mathrm{QPVC}(\rho)\qquad
\begin{aligned}
\min_{(s,\delta)\in\mathbb{R}^{n+1}}\ \ &\tfrac12 s^TBs+\nabla f\,s+\rho\bigl(\tfrac12\delta^2+\delta\bigr)\\
\text{subject to}\ \ &(1-\delta)h_i+\nabla h_i\,s=0, && i\in E,\\
&(1-\theta_i^g\delta)g_i+\nabla g_i\,s\le 0, && i\in I,\\
&(1-\theta_i^H\delta)H_i+\nabla H_i\,s\ge 0,\\
&\bigl((1-\theta_i^G\delta)G_i+\nabla G_i\,s\bigr)\bigl((1-\theta_i^H\delta)H_i+\nabla H_i\,s\bigr)\le 0, && i\in V,\\
&-\delta\le 0.
\end{aligned}
\tag{18}
\]

Here the vector $\theta=(\theta^g,\theta^G,\theta^H)\in\{0,1\}^{|I|+2|V|}=:\mathcal{B}$ is chosen at the beginning of the algorithm such that some feasible point is known in advance, e.g. $(s,\delta)=(0,1)$. The parameter $\rho$ has to be chosen sufficiently large and acts like a penalty parameter forcing $\delta$ to be near zero at the solution. $B$ is a symmetric positive definite $n\times n$ matrix, $\nabla f,\nabla h_i,\nabla g_i,\nabla G_i,\nabla H_i$ denote row vectors in $\mathbb{R}^n$ and $h_i,g_i,G_i,H_i$ are real numbers. Note that this problem is a special case of problem (1), and consequently the definitions of Q- and QM-stationarity as well as the definition of the index sets (2) remain valid.

It turns out to be much more convenient to operate with a more general notation. Let us denote by $F_i:=(-H_i,G_i)^T$ a vector in $\mathbb{R}^2$, by $\nabla F_i:=(-\nabla H_i^T,\nabla G_i^T)^T$ a $2\times n$ matrix, and by $P_1:=\{0\}\times\mathbb{R}$ and $P_2:=\mathbb{R}_-^2$ two subsets of $\mathbb{R}^2$. Note that for $P$ given by (7) it holds that $P=P_1\cup P_2$. Problem (18) can now be equivalently rewritten in the form

\[
\mathrm{QPVC}(\rho)\qquad
\begin{aligned}
\min_{(s,\delta)\in\mathbb{R}^{n+1}}\ \ &\tfrac12 s^TBs+\nabla f\,s+\rho\bigl(\tfrac12\delta^2+\delta\bigr)\\
\text{subject to}\ \ &(1-\delta)h_i+\nabla h_i\,s=0, && i\in E,\\
&(1-\theta_i^g\delta)g_i+\nabla g_i\,s\le 0, && i\in I,\\
&\delta(\theta_i^HH_i,-\theta_i^GG_i)^T+F_i+\nabla F_i\,s\in P, && i\in V,\\
&-\delta\le 0.
\end{aligned}
\tag{19}
\]

For a given feasible point (s,δ) for the problem QPVC(ρ) we define the following index sets

\[
\begin{aligned}
I_1(s,\delta)&:=\{i\in V\,|\,\delta(\theta_i^HH_i,-\theta_i^GG_i)^T+F_i+\nabla F_i\,s\in P_1\setminus P_2\}=I_{0+}(s,\delta),\\
I_2(s,\delta)&:=\{i\in V\,|\,\delta(\theta_i^HH_i,-\theta_i^GG_i)^T+F_i+\nabla F_i\,s\in P_2\setminus P_1\}=I_{+0}(s,\delta)\cup I_{+-}(s,\delta),\\
I_0(s,\delta)&:=\{i\in V\,|\,\delta(\theta_i^HH_i,-\theta_i^GG_i)^T+F_i+\nabla F_i\,s\in P_1\cap P_2\}=I_{0-}(s,\delta)\cup I_{00}(s,\delta),
\end{aligned}
\]

where the index sets I0+(s,δ),I+0(s,δ),I+-(s,δ),I0-(s,δ),I00(s,δ) are given by (2).

Further, consider the distance function d defined by

\[
d(x,A):=\inf_{y\in A}\|x-y\|_1,
\]

for $x\in\mathbb{R}^2$ and $A\subseteq\mathbb{R}^2$. The following proposition summarizes some well-known properties of $d$.

Proposition 3.1

Let $x\in\mathbb{R}^2$ and $A\subseteq\mathbb{R}^2$.

  1. Let $B\subseteq\mathbb{R}^2$; then
    \[ d(x,A\cup B)=\min\{d(x,A),d(x,B)\}. \tag{20} \]
    In particular,
    \[ d(x,P_1)=(x_1)_++(-x_1)_+,\qquad d(x,P_2)=(x_1)_++(x_2)_+,\qquad d(x,P)=(x_1)_++(\min\{-x_1,x_2\})_+. \tag{21} \]
  2. $d(\cdot,A):\mathbb{R}^2\to\mathbb{R}_+$ is Lipschitz continuous with Lipschitz modulus $L=1$ and consequently
    \[ d(x,A)\le d(x+y,A)+\|y\|_1. \tag{22} \]
  3. $d(\cdot,A):\mathbb{R}^2\to\mathbb{R}_+$ is convex, provided $A$ is convex.
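The closed forms (21) and the union rule (20) are easy to check numerically; a minimal sketch (the function names are ours):

```python
def d_P1(x):
    """l1 distance of x in R^2 to P1 = {0} x R, cf. (21)."""
    return max(x[0], 0.0) + max(-x[0], 0.0)   # = |x1|

def d_P2(x):
    """l1 distance to P2 = R_- x R_-."""
    return max(x[0], 0.0) + max(x[1], 0.0)

def d_P(x):
    """l1 distance to P = P1 u P2."""
    return max(x[0], 0.0) + max(min(-x[0], x[1]), 0.0)

x = (-1.0, 2.0)
assert d_P(x) == min(d_P1(x), d_P2(x))  # property (20)
```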

Due to the disjunctive structure of the auxiliary problem we can subdivide it into several QP-pieces. For every partition (V1,V2)P(V) we define the convex quadratic problem

\[
\mathrm{QP}(\rho,V_1)\qquad
\begin{aligned}
\min_{(s,\delta)\in\mathbb{R}^{n+1}}\ \ &\tfrac12 s^TBs+\nabla f\,s+\rho\bigl(\tfrac12\delta^2+\delta\bigr)\\
\text{subject to}\ \ &(1-\delta)h_i+\nabla h_i\,s=0, && i\in E,\\
&(1-\theta_i^g\delta)g_i+\nabla g_i\,s\le 0, && i\in I,\\
&\delta(\theta_i^HH_i,-\theta_i^GG_i)^T+F_i+\nabla F_i\,s\in P_1, && i\in V_1,\\
&\delta(\theta_i^HH_i,-\theta_i^GG_i)^T+F_i+\nabla F_i\,s\in P_2, && i\in V_2,\\
&-\delta\le 0.
\end{aligned}
\tag{23}
\]

Since $(V_1,V_2)$ forms a partition of $V$, it is sufficient to specify $V_1$; $V_2$ is then given by $V_2=V\setminus V_1$.

At the solution $(s,\delta)$ of $\mathrm{QP}(\rho,V_1)$ there are a corresponding multiplier $\lambda(\rho,V_1)=(\lambda^h,\lambda^g,\lambda^H,\lambda^G)$ and a number $\lambda^\delta\ge 0$ with $\lambda^\delta\delta=0$ fulfilling the KKT conditions:

\[
Bs+\nabla f^T+\sum_{i\in E}\lambda_i^h\nabla h_i^T+\sum_{i\in I}\lambda_i^g\nabla g_i^T+\sum_{i\in V}\nabla F_i^T\lambda_i^F=0,
\tag{24}
\]
\[
\rho(\delta+1)-\lambda^\delta-\sum_{i\in E}\lambda_i^h h_i-\sum_{i\in I}\lambda_i^g\theta_i^g g_i+\sum_{i\in V}(\theta_i^HH_i,-\theta_i^GG_i)\lambda_i^F=0,
\tag{25}
\]
\[
\lambda_i^g\bigl((1-\theta_i^g\delta)g_i+\nabla g_i\,s\bigr)=0,\quad \lambda_i^g\ge 0,\quad i\in I,
\tag{26}
\]
\[
\lambda_i^F\in N_{P_1}\bigl(\delta(\theta_i^HH_i,-\theta_i^GG_i)^T+F_i+\nabla F_i\,s\bigr),\quad i\in V_1,
\tag{27}
\]
\[
\lambda_i^F\in N_{P_2}\bigl(\delta(\theta_i^HH_i,-\theta_i^GG_i)^T+F_i+\nabla F_i\,s\bigr),\quad i\in V_2,
\tag{28}
\]

where $\lambda_i^F:=(\lambda_i^H,\lambda_i^G)^T$ for $i\in V$. Since $P_1$ and $P_2$ are convex sets, the above normal cones are given by (12).

The definition of the problem QP(ρ,V1) allows the following interpretation of Q-stationarity, which is a direct consequence of (15) and (16).

Lemma 3.1

A point $(s,\delta)$ is Q-stationary with respect to $(\beta_1,\beta_2)\in\mathcal{P}(I_{00}(s,\delta))$ for (19) if and only if it is the solution of the convex problems $\mathrm{QP}(\rho,I_1(s,\delta)\cup\beta_1)$ and $\mathrm{QP}(\rho,I_1(s,\delta)\cup\beta_2)$.

Moreover, since for $V_1=I_1(s,\delta)\cup I_{00}(s,\delta)$ the conditions (27), (28) read as $\lambda_i^F\in\nu_i^{I_{00}(s,\delta),\emptyset}(s,\delta)$, it follows from (17) that if a point $(s,\delta)$ is the solution of $\mathrm{QP}(\rho,I_1(s,\delta)\cup I_{00}(s,\delta))$, then it is M-stationary for (19).

Finally, let us denote by $\bar\delta(V_1)$ the optimal objective value of the problem

\[
\min_{(s,\delta)\in\mathbb{R}^{n+1}}\ \delta\quad\text{subject to the constraints of (23)}.
\tag{29}
\]

An outline of the algorithm for solving QPVC(ρ) is as follows.

Algorithm 3.1

(Solving the QPVC) Let $\zeta\in(0,1)$, $\bar\rho>1$ and $\rho>0$ be given.

  1. Initialize:
    • Set the starting point $(s^0,\delta^0):=(0,1)$ and define the vector $\theta$ by
      \[
      \theta_i^g:=\begin{cases}1 & \text{if }g_i>0,\\ 0 & \text{if }g_i\le 0,\end{cases}\qquad
      (\theta_i^H,\theta_i^G):=\begin{cases}(0,0) & \text{if }d(F_i,P)=0,\\ (1,0) & \text{if }0<d(F_i,P_1)\le d(F_i,P_2),\\ (0,1) & \text{if }0<d(F_i,P_2)<d(F_i,P_1),\end{cases}
      \tag{30}
      \]
      and set the partition $V_1^1:=I_1(s^0,\delta^0)$ and the counter of pieces $t:=0$.
    • Compute (s1,δ1) as the solution and λ1 as the corresponding multiplier of the convex problem QP(ρ,V11) and set t:=1.
    • If δ1>δ0, perform a restart: set ρ:=ρρ¯ and go to step 1.
  2. Improvement step:
    • While $(s^t,\delta^t)$ is not a solution of all four of the following convex problems
      \[
      \mathrm{QP}\bigl(\rho,\ I_1(s^t,\delta^t)\cup(I_{00}(s^t,\delta^t)\cap V_1^t)\bigr),\qquad
      \mathrm{QP}\bigl(\rho,\ I_1(s^t,\delta^t)\cup(I_{00}(s^t,\delta^t)\setminus V_1^t)\bigr),
      \tag{31}
      \]
      \[
      \mathrm{QP}\bigl(\rho,\ I_1(s^t,\delta^t)\bigr),\qquad
      \mathrm{QP}\bigl(\rho,\ I_1(s^t,\delta^t)\cup I_{00}(s^t,\delta^t)\bigr),
      \tag{32}
      \]
      repeat: compute $(s^{t+1},\delta^{t+1})$ as the solution and $\lambda^{t+1}$ as the corresponding multiplier of the first of these problems with $(s^{t+1},\delta^{t+1})\neq(s^t,\delta^t)$, set $V_1^{t+1}$ to the corresponding index set and increase the counter $t$ of pieces by 1.
    • If $\delta^t>\delta^{t-1}$, perform a restart: set $\rho:=\bar\rho\rho$ and go to step 1.
  3. Check for successful termination:
    • If δt<ζ set N:=t, stop the algorithm and return.
  4. Check the degeneracy:
    • If the non-degeneracy condition
      \[
      \min\bigl\{\bar\delta(I_1(s^t,\delta^t)),\ \bar\delta(I_1(s^t,\delta^t)\cup I_{00}(s^t,\delta^t))\bigr\}<\zeta
      \tag{33}
      \]
      is fulfilled, perform a restart: set $\rho:=\bar\rho\rho$ and go to step 1.
    • Else stop the algorithm because of degeneracy.
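The choice of $\theta$ in step 1 via (30) can be sketched as follows (a sketch under the notation above; `F_vals` holds the vectors $F_i=(-H_i,G_i)$, the distances are the closed forms (21), and the function name is ours):

```python
def choose_theta(g_vals, F_vals):
    """Choose theta so that (s, delta) = (0, 1) is feasible, cf. (30)."""
    theta_g = [1 if gi > 0 else 0 for gi in g_vals]
    theta_F = []
    for x1, x2 in F_vals:                   # (x1, x2) = (-H_i, G_i)
        d1 = max(x1, 0.0) + max(-x1, 0.0)   # d(F_i, P1)
        d2 = max(x1, 0.0) + max(x2, 0.0)    # d(F_i, P2)
        if min(d1, d2) == 0.0:              # d(F_i, P) = 0
            theta_F.append((0, 0))
        elif d1 <= d2:                      # 0 < d(F_i,P1) <= d(F_i,P2)
            theta_F.append((1, 0))
        else:                               # 0 < d(F_i,P2) < d(F_i,P1)
            theta_F.append((0, 1))
    return theta_g, theta_F
```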

The selection of the index sets in step 2 is motivated by Lemma 3.1, since if (s,δ) is the solution of convex problems (31), then it is Q-stationary and if (s,δ) is also the solution of convex problems (32), then it is even QM-stationary for problem (19).

We first summarize some consequences of the Initialization step.

Proposition 3.2

  1. The vector $\theta$ is chosen in such a way that for all $i\in V$ it holds that
    \[ \|(\theta_i^HH_i,-\theta_i^GG_i)^T\|_1=d(F_i,P). \tag{34} \]
  2. The partition $(V_1^1,V_2^1)$ is chosen in such a way that for $j=1,2$ it holds that
    \[ i\in V_j^1\quad\text{implies}\quad d(F_i,P)=d(F_i,P_j). \tag{35} \]

Proof

1. If $d(F_i,P)=0$ we have $(\theta_i^H,\theta_i^G)=(0,0)$ and (34) obviously holds. If $0<d(F_i,P_1)\le d(F_i,P_2)$ we have $(\theta_i^H,\theta_i^G)=(1,0)$ and we obtain

\[
\|(\theta_i^HH_i,-\theta_i^GG_i)^T\|_1=|H_i|=d(F_i,P_1)=d(F_i,P)
\]

by (21) and (20). Finally, if $0<d(F_i,P_2)<d(F_i,P_1)$ we have $H_i>0$, $G_i>0$, $(\theta_i^H,\theta_i^G)=(0,1)$ and thus

\[
\|(\theta_i^HH_i,-\theta_i^GG_i)^T\|_1=|G_i|=(-H_i)_++(G_i)_+=d(F_i,P_2)=d(F_i,P)
\]

follows again by (21) and (20).

2. If $(\theta_i^HH_i,-\theta_i^GG_i)^T+F_i\in P_j$ for some $i\in V$ and $j\in\{1,2\}$, by (22) and (34) we obtain

\[
d(F_i,P_j)\le\|(\theta_i^HH_i,-\theta_i^GG_i)^T\|_1=d(F_i,P)
\]

and consequently $d(F_i,P_j)=d(F_i,P)$, because of (20). Hence we conclude that $i\in I_j(s^0,\delta^0)\cup I_0(s^0,\delta^0)$ implies $d(F_i,P_j)=d(F_i,P)$ for $j=1,2$, and the statement now follows from the fact that $V_1^1=I_1(s^0,\delta^0)$ and $V_2^1=I_2(s^0,\delta^0)\cup I_0(s^0,\delta^0)$.

The following lemma plays a crucial part in proving the finiteness of the Algorithm 3.1.

Lemma 3.2

For each partition $(V_1,V_2)\in\mathcal{P}(V)$ there exists a positive constant $C_\rho(V_1)$ such that for every $\rho\ge C_\rho(V_1)$ the solution $(s,\delta)$ of $\mathrm{QP}(\rho,V_1)$ fulfills $\delta=\bar\delta(V_1)$.

Proof

Let (s(V1),δ(V1)) denote a solution of (29). Since δ(V1)=δ¯(V1), it follows that the problem

\[
\begin{aligned}
\min_{s\in\mathbb{R}^n}\ \ &\tfrac12 s^TBs+\nabla f\,s\\
\text{subject to}\ \ &(1-\bar\delta(V_1))h_i+\nabla h_i\,s=0, && i\in E,\\
&(1-\theta_i^g\bar\delta(V_1))g_i+\nabla g_i\,s\le 0, && i\in I,\\
&\bar\delta(V_1)(\theta_i^HH_i,-\theta_i^GG_i)^T+F_i+\nabla F_i\,s\in P_1, && i\in V_1,\\
&\bar\delta(V_1)(\theta_i^HH_i,-\theta_i^GG_i)^T+F_i+\nabla F_i\,s\in P_2, && i\in V_2
\end{aligned}
\tag{36}
\]

is feasible; we denote by $\bar s(V_1)$ the solution of this problem and by $\bar\lambda(V_1)$ the corresponding multiplier. Further, $(\bar s(V_1),\bar\delta(V_1))$ is a solution of (29), and we denote by $\lambda(V_1)$ the corresponding multiplier.

Then the triple $(\bar s(V_1),\bar\delta(V_1),\bar\lambda(V_1))$ fulfills (24) and (26)–(28). Moreover, the triple $(\bar s(V_1),\bar\delta(V_1),\lambda(V_1))$ fulfills (26)–(28) and

\[
\sum_{i\in E}\lambda(V_1)_i^h\nabla h_i^T+\sum_{i\in I}\lambda(V_1)_i^g\nabla g_i^T+\sum_{i\in V}\nabla F_i^T\lambda(V_1)_i^F=0,
\tag{37}
\]
\[
1-\lambda^\delta-\sum_{i\in E}\lambda(V_1)_i^h h_i-\sum_{i\in I}\lambda(V_1)_i^g\theta_i^g g_i+\sum_{i\in V}(\theta_i^HH_i,-\theta_i^GG_i)\lambda(V_1)_i^F=0
\tag{38}
\]

for some $\lambda^\delta\ge 0$ with $\lambda^\delta\bar\delta(V_1)=0$.

Let $C_\rho(V_1)$ be a positive constant such that for all $\rho\ge C_\rho(V_1)$ we have

\[
\alpha:=\rho(\bar\delta(V_1)+1)-\sum_{i\in E}\bar\lambda(V_1)_i^h h_i-\sum_{i\in I}\bar\lambda(V_1)_i^g\theta_i^g g_i+\sum_{i\in V}(\theta_i^HH_i,-\theta_i^GG_i)\bar\lambda(V_1)_i^F\ge 0
\]

and set $\tilde\lambda^\delta:=\alpha\lambda^\delta\ge 0$ and $\tilde\lambda:=\bar\lambda(V_1)+\alpha\lambda(V_1)$. We will now show that for such $\rho$ the point $(\bar s(V_1),\bar\delta(V_1))$ is the solution of $\mathrm{QP}(\rho,V_1)$.

Clearly, $\tilde\lambda^\delta\bar\delta(V_1)=\alpha\lambda^\delta\bar\delta(V_1)=0$, and the triple $(\bar s(V_1),\bar\delta(V_1),\tilde\lambda)$ fulfills (24) due to (37), and it fulfills (26)–(28) due to the convexity of the normal cones. Moreover, taking into account the definitions of $\alpha$, $\tilde\lambda^\delta$ and $\tilde\lambda$ together with (38), we obtain

\[
\rho(\bar\delta(V_1)+1)-\tilde\lambda^\delta-\sum_{i\in E}\tilde\lambda_i^h h_i-\sum_{i\in I}\tilde\lambda_i^g\theta_i^g g_i+\sum_{i\in V}(\theta_i^HH_i,-\theta_i^GG_i)\tilde\lambda_i^F=\alpha-\alpha\lambda^\delta-\alpha(1-\lambda^\delta)=0,
\]

showing also (25). Hence (s¯(V1),δ¯(V1)) is the solution of QP(ρ,V1) and the proof is complete.

We now formulate the main theorem of this section.

Theorem 3.1

  1. Algorithm 3.1 is finite.

  2. If the Algorithm 3.1 is not terminated because of degeneracy, then (sN,δN) is QM-stationary for the problem (19) and δN<ζ.

Proof

1. The algorithm is obviously finite unless we perform a restart and hence increase $\rho$. Thus we can assume that $\rho$ is sufficiently large, say

\[
\rho\ge C_\rho:=\max_{(V_1,V_2)\in\mathcal{P}(V)}C_\rho(V_1),
\]

with $C_\rho(V_1)$ given by the previous lemma. However, this means, taking into account also Proposition 3.3(1), that $(s^{t-1},\delta^{t-1})$ is feasible for the problem $\mathrm{QP}(\rho,V_1^t)$ for all $t$, hence $\delta^{t-1}\ge\bar\delta(V_1^t)$, and $(s^t,\delta^t)$ is the solution of $\mathrm{QP}(\rho,V_1^t)$, implying $\delta^t=\bar\delta(V_1^t)$ and consequently $\delta^t\le\delta^{t-1}$. Therefore we do not perform a restart in step 1 or step 2. On the other hand, since we enter steps 3 and 4 with $\delta^t=\bar\delta(I_1(s^t,\delta^t))=\bar\delta(I_1(s^t,\delta^t)\cup I_{00}(s^t,\delta^t))$, we either terminate the algorithm in step 3 with $\delta^t<\zeta$ if the non-degeneracy condition (33) is fulfilled, or we terminate the algorithm because of degeneracy in step 4. This finishes the proof.

2. The statement regarding stationarity follows easily from the fact that we enter step 3 of the algorithm only when $(s^N,\delta^N)$ is a solution of the problems (32), which means that it is also Q-stationary with respect to $(\emptyset,I_{00}(s^N,\delta^N))$ by Lemma 3.1. Thus $(s^N,\delta^N)$ is also QM-stationary for problem (19). The claim about $\delta^N$ follows from the assumption that Algorithm 3.1 is not terminated because of degeneracy.

We conclude this section with the following proposition that brings together the basic properties of the Algorithm 3.1.

Proposition 3.3

If the Algorithm 3.1 is not terminated because of degeneracy, then the following properties hold:

  1. For all $t=1,\dots,N$ the points $(s^{t-1},\delta^{t-1})$ and $(s^t,\delta^t)$ are feasible for the problem $\mathrm{QP}(\rho,V_1^t)$, and the point $(s^t,\delta^t)$ is also the solution of the convex problem $\mathrm{QP}(\rho,V_1^t)$.

  2. For all $t=1,\dots,N$ it holds that
    \[ 0\le\delta^t\le\delta^{t-1}\le 1. \tag{39} \]
  3. There exists a constant $C_t$, dependent only on the number of constraints, such that
    \[ N\le C_t. \tag{40} \]

Proof

1. By definitions of the problems QPVC(ρ) and QP(ρ,V1) it follows that a point (s,δ), feasible for QPVC(ρ), is feasible for QP(ρ,V1) if and only if

\[
I_1(s,\delta)\subseteq V_1\subseteq I_1(s,\delta)\cup I_0(s,\delta).
\tag{41}
\]

The point (s0,δ0) is clearly feasible for QP(ρ,V11) and similarly the point (st,δt) is feasible for QP(ρ,V1t+1) for all t=1,,N-1, since the partition V1t+1 is defined by one of the index sets of (31)-(32) and thus fulfills (41). However, feasibility of (st+1,δt+1) for QP(ρ,V1t+1), together with (st+1,δt+1) being the solution of QP(ρ,V1t+1), then follows from its definition.

2. Statement follows from δ0=1, from the fact that we perform a restart whenever δt>δt-1 occurs and from the constraint -δ0.

3. Since whenever the parameter $\rho$ is increased the algorithm goes to step 1 and thus the counter $t$ of the pieces is reset to 0, it follows that after the last time the algorithm enters step 1 we keep $\rho$ constant. All the index sets $V_1^t$ are pairwise different, implying that the maximum number of switches to a new piece is $2^{|V|}$.

The basic SQP algorithm for MPVC

An outline of the basic algorithm is as follows.

Algorithm 4.1

(Solving the MPVC)

  1. Initialization:
    • Select a starting point x0Rn together with a positive definite n×n matrix B0, a parameter ρ0>0 and constants ζ(0,1) and ρ¯>1.
    • Select positive penalty parameters $\sigma_{-1}=(\sigma_{-1}^h,\sigma_{-1}^g,\sigma_{-1}^F)$.
    • Set the iteration counter k:=0.
  2. Solve the Auxiliary problem:
    • Run Algorithm 3.1 with data $\zeta$, $\bar\rho$, $\rho:=\rho_k$, $B:=B_k$, $\nabla f:=\nabla f(x_k)$, $h_i:=h_i(x_k)$, $\nabla h_i:=\nabla h_i(x_k)$, $i\in E$, etc.
    • If the Algorithm 3.1 stops because of degeneracy, stop the Algorithm 4.1 with an error message.
    • If the final iterate sN is zero, stop the Algorithm 4.1 and return xk as a solution.
  3. Next iterate:
    • Compute new penalty parameters σk.
    • Set xk+1:=xk+sk where sk is a point on the polygonal line connecting the points s0,s1,,sN such that an appropriate merit function depending on σk is decreased.
    • Set ρk+1:=ρ, the final value of ρ in Algorithm 3.1.
    • Update Bk to get positive definite matrix Bk+1.
    • Set k:=k+1 and go to step 2.

Remark 4.1

We terminate Algorithm 4.1 only in the two cases above. In the first case, no sufficient reduction of the violation of the constraints can be achieved. The second case occurs only by chance, when the current iterate is a QM-stationary solution. Normally, the algorithm produces an infinite sequence of iterates and we must include a stopping criterion for convergence. Such a criterion could be that the violation of the constraints at some iterate is sufficiently small,

\[
\max\Bigl\{\max_{i\in E}|h_i(x_k)|,\ \max_{i\in I}(g_i(x_k))_+,\ \max_{i\in V}d(F_i(x_k),P)\Bigr\}\le\epsilon_C,
\]

where Fi is given by (7) and the expected decrease in our merit function is sufficiently small,

\[
\bigl(s_k^{N_k}\bigr)^TB_k\,s_k^{N_k}\le\epsilon_1,
\]

see Proposition 4.1 below.

The next iterate

Denote the outcome of Algorithm 3.1 at the k-th iterate by

\[
(s_k^t,\delta_k^t),\quad \lambda_k^t,\quad (V_{1,k}^t,V_{2,k}^t)\quad\text{for }t=0,\dots,N_k\qquad\text{and}\qquad \theta_k,\ \underline\lambda_k^{N_k},\ \bar\lambda_k^{N_k}.
\]

The new penalty parameters are computed by

\[
\sigma_{i,k}^h=\begin{cases}\xi_2\tilde\lambda_{i,k}^h & \text{if }\sigma_{i,k-1}^h<\xi_1\tilde\lambda_{i,k}^h,\\ \sigma_{i,k-1}^h & \text{else},\end{cases}\qquad
\sigma_{i,k}^g=\begin{cases}\xi_2\tilde\lambda_{i,k}^g & \text{if }\sigma_{i,k-1}^g<\xi_1\tilde\lambda_{i,k}^g,\\ \sigma_{i,k-1}^g & \text{else},\end{cases}\qquad
\sigma_{i,k}^F=\begin{cases}\xi_2\tilde\lambda_{i,k}^F & \text{if }\sigma_{i,k-1}^F<\xi_1\tilde\lambda_{i,k}^F,\\ \sigma_{i,k-1}^F & \text{else},\end{cases}
\tag{42}
\]

where

\[
\tilde\lambda_{i,k}^h=\max_t|\lambda_{i,k}^{h,t}|,\qquad \tilde\lambda_{i,k}^g=\max_t|\lambda_{i,k}^{g,t}|,\qquad \tilde\lambda_{i,k}^F=\max_t\|\lambda_{i,k}^{F,t}\|_\infty,
\tag{43}
\]

with the maximum being taken over $t\in\{1,\dots,N_k\}$ and $1<\xi_1<\xi_2$. Note that this choice of $\sigma_k$ ensures

\[
\sigma_k^h\ge\tilde\lambda_k^h,\qquad \sigma_k^g\ge\tilde\lambda_k^g,\qquad \sigma_k^F\ge\tilde\lambda_k^F.
\tag{44}
\]
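The update (42) can be sketched componentwise as follows (a sketch; the default values of $\xi_1,\xi_2$ are our own choice):

```python
def update_sigma(sigma_prev, lam_tilde, xi1=2.0, xi2=4.0):
    """Penalty update (42): raise sigma to xi2*lambda~ whenever it has
    fallen below xi1*lambda~ (1 < xi1 < xi2); otherwise keep it."""
    return [xi2 * lt if sp < xi1 * lt else sp
            for sp, lt in zip(sigma_prev, lam_tilde)]

sigma = update_sigma([1.0, 10.0], [1.0, 1.0])
# first component raised to 4.0, second kept; in particular (44) holds:
assert all(s >= lt for s, lt in zip(sigma, [1.0, 1.0]))
```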

The merit function

We look for the next iterate on the polygonal line connecting the points $s_k^0,s_k^1,\dots,s_k^{N_k}$. For each line segment $[s_k^{t-1},s_k^t]:=\{(1-\alpha)s_k^{t-1}+\alpha s_k^t\,|\,\alpha\in[0,1]\}$, $t=1,\dots,N_k$, we consider the functions

\[
\begin{aligned}
\phi_k^t(\alpha):={}&f(x_k+s)+\sum_{i\in E}\sigma_{i,k}^h|h_i(x_k+s)|+\sum_{i\in I}\sigma_{i,k}^g(g_i(x_k+s))_+\\
&+\sum_{i\in V_{1,k}^t}\sigma_{i,k}^F d(F_i(x_k+s),P_1)+\sum_{i\in V_{2,k}^t}\sigma_{i,k}^F d(F_i(x_k+s),P_2),\\
\hat\phi_k^t(\alpha):={}&f+\nabla f\,s+\tfrac12 s^TB_ks+\sum_{i\in E}\sigma_{i,k}^h|h_i+\nabla h_i\,s|+\sum_{i\in I}\sigma_{i,k}^g(g_i+\nabla g_i\,s)_+\\
&+\sum_{i\in V_{1,k}^t}\sigma_{i,k}^F d(F_i+\nabla F_i\,s,P_1)+\sum_{i\in V_{2,k}^t}\sigma_{i,k}^F d(F_i+\nabla F_i\,s,P_2),
\end{aligned}
\]

where $s=(1-\alpha)s_k^{t-1}+\alpha s_k^t$ and $f=f(x_k)$, $\nabla f=\nabla f(x_k)$, $h_i=h_i(x_k)$, $\nabla h_i=\nabla h_i(x_k)$, $i\in E$, etc., and we further denote

\[
r_{k,0}^t:=\hat\phi_k^t(0)-\hat\phi_k^1(0),\qquad r_{k,1}^t:=\hat\phi_k^t(1)-\hat\phi_k^1(0).
\tag{45}
\]
Lemma 4.1
  1. For every $t\in\{1,\dots,N_k\}$ the function $\hat\phi_k^t$ is convex.

  2. For every $t\in\{1,\dots,N_k\}$ the function $\hat\phi_k^t$ is a first order approximation of $\phi_k^t$, that is,
    \[ |\phi_k^t(\alpha)-\hat\phi_k^t(\alpha)|=o(\|s\|), \]
    where $s=(1-\alpha)s_k^{t-1}+\alpha s_k^t$.
Proof

1. By the convexity of $P_1$ and $P_2$ and Proposition 3.1(3), $\hat\phi_k^t$ is convex because it is a sum of convex functions.

2. By the Lipschitz continuity of the distance function with Lipschitz modulus $L=1$ we conclude

\[
|\phi_k^t(\alpha)-\hat\phi_k^t(\alpha)|\le|f(x_k+s)-f-\nabla f\,s-\tfrac12 s^TB_ks|+\sum_{i\in E}\sigma_{i,k}^h|h_i(x_k+s)-h_i-\nabla h_i\,s|+\sum_{i\in I}\sigma_{i,k}^g|g_i(x_k+s)-g_i-\nabla g_i\,s|+\sum_{i\in V}\sigma_{i,k}^F\|F_i(x_k+s)-F_i-\nabla F_i\,s\|_1
\]

and hence the assertion follows.

We state now the main result of this subsection. For the sake of simplicity we omit the iteration index k in this part.

Proposition 4.1

For every $t\in\{1,\dots,N_k\}$,

\[
\hat\phi^t(0)-\hat\phi^1(0)\le-\sum_{\tau=1}^{t-1}\tfrac12(s^\tau-s^{\tau-1})^TB(s^\tau-s^{\tau-1})\le 0,
\tag{46}
\]
\[
\hat\phi^t(1)-\hat\phi^1(0)\le-\sum_{\tau=1}^{t}\tfrac12(s^\tau-s^{\tau-1})^TB(s^\tau-s^{\tau-1})\le 0.
\tag{47}
\]
Proof

Fix $t\in\{1,\dots,N_k\}$ and note that

\[
\tfrac12(s^t)^TBs^t+\nabla f\,s^t=\sum_{\tau=1}^{t}\Bigl[\tfrac12(s^\tau)^TBs^\tau-\tfrac12(s^{\tau-1})^TBs^{\tau-1}+\nabla f(s^\tau-s^{\tau-1})\Bigr],
\]

because $s^0=0$. For $j=0,1$ consider $r_{1-j}^{t+j}$ defined by (45). We obtain

\[
\begin{aligned}
r_{1-j}^{t+j}={}&\sum_{\tau=1}^{t}\Bigl[\tfrac12(s^\tau)^TBs^\tau-\tfrac12(s^{\tau-1})^TBs^{\tau-1}+\nabla f(s^\tau-s^{\tau-1})\Bigr]+\sum_{i\in E}\sigma_i^h\bigl(|h_i+\nabla h_i\,s^t|-|h_i|\bigr)\\
&+\sum_{i\in I}\sigma_i^g\bigl((g_i+\nabla g_i\,s^t)_+-(g_i)_+\bigr)+\sum_{i\in V_1^{t+j}}\sigma_i^F d(F_i+\nabla F_i\,s^t,P_1)+\sum_{i\in V_2^{t+j}}\sigma_i^F d(F_i+\nabla F_i\,s^t,P_2)\\
&-\sum_{i\in V_1^{1}}\sigma_i^F d(F_i,P_1)-\sum_{i\in V_2^{1}}\sigma_i^F d(F_i,P_2).
\end{aligned}
\tag{48}
\]

Using that $(s^\tau,\delta^\tau)$ is the solution of $\mathrm{QP}(\rho,V_1^\tau)$ and multiplying the first order optimality condition (24) by $(s^\tau-s^{\tau-1})^T$ yields

\[
(s^\tau-s^{\tau-1})^T\Bigl(Bs^\tau+\nabla f^T+\sum_{i\in E}\lambda_i^{h,\tau}\nabla h_i^T+\sum_{i\in I}\lambda_i^{g,\tau}\nabla g_i^T+\sum_{i\in V}\nabla F_i^T\lambda_i^{F,\tau}\Bigr)=0.
\tag{49}
\]

Summing up the expression on the left hand side from τ=1 to t, subtracting it from the right hand side of (48) and taking into account the identity

\[
\tfrac12(s^\tau)^TBs^\tau-\tfrac12(s^{\tau-1})^TBs^{\tau-1}-(s^\tau-s^{\tau-1})^TBs^\tau=-\tfrac12(s^\tau-s^{\tau-1})^TB(s^\tau-s^{\tau-1}),
\]

we obtain for j=0,1

\[
\begin{aligned}
r_{1-j}^{t+j}={}&-\sum_{\tau=1}^{t}\tfrac12(s^\tau-s^{\tau-1})^TB(s^\tau-s^{\tau-1})\\
&+\sum_{i\in E}\Bigl[\sigma_i^h\bigl(|h_i+\nabla h_i\,s^t|-|h_i|\bigr)-\sum_{\tau=1}^{t}\lambda_i^{h,\tau}\nabla h_i(s^\tau-s^{\tau-1})\Bigr]\\
&+\sum_{i\in I}\Bigl[\sigma_i^g\bigl((g_i+\nabla g_i\,s^t)_+-(g_i)_+\bigr)-\sum_{\tau=1}^{t}\lambda_i^{g,\tau}\nabla g_i(s^\tau-s^{\tau-1})\Bigr]\\
&+\sum_{i\in V_1^{t+j}}\sigma_i^F d(F_i+\nabla F_i\,s^t,P_1)+\sum_{i\in V_2^{t+j}}\sigma_i^F d(F_i+\nabla F_i\,s^t,P_2)\\
&-\sum_{i\in V_1^{1}}\sigma_i^F d(F_i,P_1)-\sum_{i\in V_2^{1}}\sigma_i^F d(F_i,P_2)-\sum_{i\in V}\sum_{\tau=1}^{t}(\lambda_i^{F,\tau})^T\nabla F_i(s^\tau-s^{\tau-1}).
\end{aligned}
\tag{50}
\]

First, we claim that

\[
-\sum_{i\in V}\sum_{\tau=1}^{t}(\lambda_i^{F,\tau})^T\nabla F_i(s^\tau-s^{\tau-1})\le\sum_{i\in V}\tilde\lambda_i^F(1-\delta^t)d(F_i,P).
\tag{51}
\]

Consider $i\in V$ and $\tau\in\{1,\dots,t\}$ with $i\in V_1^\tau$. By the feasibility of $(s^\tau,\delta^\tau)$ and $(s^{\tau-1},\delta^{\tau-1})$ for $\mathrm{QP}(\rho,V_1^\tau)$ it follows that

\[
\delta^\tau(\theta_i^HH_i,-\theta_i^GG_i)^T+F_i+\nabla F_i\,s^\tau\in P_1,\qquad
\delta^{\tau-1}(\theta_i^HH_i,-\theta_i^GG_i)^T+F_i+\nabla F_i\,s^{\tau-1}\in P_1
\]

and hence from (27) and (12) we conclude

\[
-(\lambda_i^{F,\tau})^T\Bigl(\nabla F_i(s^\tau-s^{\tau-1})+(\delta^\tau-\delta^{\tau-1})(\theta_i^HH_i,-\theta_i^GG_i)^T\Bigr)\le 0
\]

and consequently

\[
-(\lambda_i^{F,\tau})^T\nabla F_i(s^\tau-s^{\tau-1})\le(\lambda_i^{F,\tau})^T(\delta^\tau-\delta^{\tau-1})(\theta_i^HH_i,-\theta_i^GG_i)^T\le\tilde\lambda_i^F(\delta^{\tau-1}-\delta^\tau)d(F_i,P)
\tag{52}
\]

follows by the Hölder inequality and (34).

Analogous argumentation yields (52) also for $i,\tau$ with $i\in V_2^\tau$, and since $(V_1^\tau,V_2^\tau)$ forms a partition of $V$, the claimed inequality (51) follows.

Further, we claim that for j=0,1 it holds that

\[
\sum_{i\in V_1^{t+j}}\sigma_i^F d(F_i+\nabla F_i\,s^t,P_1)+\sum_{i\in V_2^{t+j}}\sigma_i^F d(F_i+\nabla F_i\,s^t,P_2)\le\sum_{i\in V}\sigma_i^F\delta^t d(F_i,P).
\tag{53}
\]

From the feasibility of $(s^t,\delta^t)$ for either $\mathrm{QP}(\rho,V_1^t)$ or $\mathrm{QP}(\rho,V_1^{t+1})$ it follows for $i\in V_1^t\cup V_1^{t+1}$ that

\[
\delta^t(\theta_i^HH_i,-\theta_i^GG_i)^T+F_i+\nabla F_i\,s^t\in P_1
\]

and hence, using (34) and (22),

\[
\sigma_i^F d(F_i+\nabla F_i\,s^t,P_1)\le\sigma_i^F\delta^t\|(\theta_i^HH_i,-\theta_i^GG_i)^T\|_1=\sigma_i^F\delta^t d(F_i,P).
\tag{54}
\]

Again, for $i\in V_2^t$ or $i\in V_2^{t+1}$ it holds that $\sigma_i^F d(F_i+\nabla F_i\,s^t,P_2)\le\sigma_i^F\delta^t d(F_i,P)$ by analogous argumentation, and since $(V_1^t,V_2^t)$ and $(V_1^{t+1},V_2^{t+1})$ form partitions of $V$, the claimed inequality (53) follows.

Finally, we have

\[
-\sum_{i\in V_1^{1}}\sigma_i^F d(F_i,P_1)-\sum_{i\in V_2^{1}}\sigma_i^F d(F_i,P_2)=-\sum_{i\in V}\sigma_i^F d(F_i,P),
\tag{55}
\]

due to the fact that V11,V21 form a partition of V and (35).

Similar arguments as above show

\[
\begin{aligned}
\sigma_i^h\bigl(|h_i+\nabla h_i\,s^t|-|h_i|\bigr)-\sum_{\tau=1}^{t}\lambda_i^{h,\tau}\nabla h_i(s^\tau-s^{\tau-1})&\le(\sigma_i^h-\tilde\lambda_i^h)(\delta^t-1)|h_i|, && i\in E,\\
\sigma_i^g\bigl((g_i+\nabla g_i\,s^t)_+-(g_i)_+\bigr)-\sum_{\tau=1}^{t}\lambda_i^{g,\tau}\nabla g_i(s^\tau-s^{\tau-1})&\le(\sigma_i^g-\tilde\lambda_i^g)(\delta^t-1)(g_i)_+, && i\in I.
\end{aligned}
\]

Taking this into account and putting together (50), (51), (53) and (55) we obtain for j=0,1

\[
r_{1-j}^{t+j}\le-\sum_{\tau=1}^{t}\tfrac12(s^\tau-s^{\tau-1})^TB(s^\tau-s^{\tau-1})-\sum_{i\in V}(\sigma_i^F-\tilde\lambda_i^F)(1-\delta^t)d(F_i,P)-\sum_{i\in E}(\sigma_i^h-\tilde\lambda_i^h)(1-\delta^t)|h_i|-\sum_{i\in I}(\sigma_i^g-\tilde\lambda_i^g)(1-\delta^t)(g_i)_+
\]

and hence (46) and (47) follow by the monotonicity of $\delta^t$ and (44). This completes the proof.

Searching for the next iterate

We choose the next iterate as a point on the polygonal line connecting the points $s_k^0,\dots,s_k^{N_k}$. Each line segment $[s_k^{t-1},s_k^t]$ corresponds to a convex subproblem solved by Algorithm 3.1, and hence each line search function $\hat\phi_k^t$ corresponds to the usual $\ell_1$ merit function from nonlinear programming. This makes it technically more difficult to prove the convergence behavior stated in Proposition 4.2, which is the motivation for the following procedure.

First we parametrize the polygonal line connecting the points $s_k^0,\dots,s_k^{N_k}$ by its length as a curve $\hat s_k:[0,1]\to\mathbb{R}^n$ in the following way. We define $t_k(1):=N_k$; for every $\gamma\in[0,1)$ we denote by $t_k(\gamma)$ the smallest number $t$ such that $S_k^t>\gamma S_k^{N_k}$, and we set $\alpha_k(1):=1$,

\[
\alpha_k(\gamma):=\frac{\gamma S_k^{N_k}-S_k^{t_k(\gamma)-1}}{S_k^{t_k(\gamma)}-S_k^{t_k(\gamma)-1}},\qquad\gamma\in[0,1),
\]

where $S_k^0:=0$, $S_k^t:=\sum_{\tau=1}^{t}\|s_k^\tau-s_k^{\tau-1}\|$ for $t=1,\dots,N_k$. Then we define

\[
\hat s_k(\gamma)=s_k^{t_k(\gamma)-1}+\alpha_k(\gamma)\bigl(s_k^{t_k(\gamma)}-s_k^{t_k(\gamma)-1}\bigr).
\]

Note that $\|\hat s_k(\gamma)\|\le\gamma S_k^{N_k}$.
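The parametrization $\hat s_k$ can be sketched as follows (a sketch; `points` stands for $s_k^0,\dots,s_k^{N_k}$ and the helper name is ours):

```python
import math

def polyline_point(points, gamma):
    """Arc-length parametrization of the polygonal line through `points`,
    evaluated at gamma in [0, 1]; cf. the definition of s_hat_k."""
    seg = [math.dist(points[t - 1], points[t]) for t in range(1, len(points))]
    S = [0.0]                      # S^0 = 0, S^t = accumulated segment lengths
    for length in seg:
        S.append(S[-1] + length)
    if gamma >= 1.0:
        return list(points[-1])
    total = S[-1]
    # t_k(gamma): smallest t with S^t > gamma * S^{N_k}
    t = next(t for t in range(1, len(points)) if S[t] > gamma * total)
    alpha = (gamma * total - S[t - 1]) / (S[t] - S[t - 1])
    return [p + alpha * (q - p) for p, q in zip(points[t - 1], points[t])]
```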

In order to simplify the proof of Proposition 4.2, for γ[0,1] we further consider the following line search functions

\[
Y_k(\gamma):=\phi_k^{t_k(\gamma)}(\alpha_k(\gamma)),\qquad
\hat Y_k(\gamma):=\hat\phi_k^{t_k(\gamma)}(\alpha_k(\gamma)),\qquad
Z_k(\gamma):=(1-\alpha_k(\gamma))\hat\phi_k^{t_k(\gamma)}(0)+\alpha_k(\gamma)\hat\phi_k^{t_k(\gamma)}(1).
\tag{56}
\]

Now consider some sequence of positive numbers $\gamma_1^k=1,\gamma_2^k,\gamma_3^k,\dots$ with $1>\bar\gamma\ge\gamma_{j+1}^k/\gamma_j^k\ge\underline\gamma>0$ for all $j\in\mathbb{N}$. Consider the smallest $j$, denoted by $j(k)$, such that for some given constant $\xi\in(0,1)$ one has

\[
Y_k(\gamma_j^k)-Y_k(0)\le\xi\bigl(Z_k(\gamma_j^k)-Z_k(0)\bigr).
\tag{57}
\]

Then the new iterate is given by

\[
x_{k+1}:=x_k+\hat s_k\bigl(\gamma_{j(k)}^k\bigr).
\]

As can be seen from the proof of Lemma 4.5, this choice ensures a decrease in merit function Φ defined in the next subsection.
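The step-size search based on (57) can be sketched generically as follows (a sketch; `Y` and `Z` stand for the line search functions of (56), and the fixed shrink factor plays the role of the ratios $\gamma_{j+1}^k/\gamma_j^k$):

```python
def search_gamma(Y, Z, xi=0.1, shrink=0.5, max_iter=50):
    """Return the first trial gamma satisfying the acceptance test (57):
    actual decrease <= xi * model decrease (both are nonpositive)."""
    gamma = 1.0                      # gamma_1^k = 1
    for _ in range(max_iter):
        if Y(gamma) - Y(0.0) <= xi * (Z(gamma) - Z(0.0)):
            return gamma
        gamma *= shrink
    raise RuntimeError("no acceptable step length found")
```

For instance, with the toy models `Y = lambda g: g*g - g` and `Z = lambda g: -g` the test fails at `gamma = 1` and accepts `gamma = 0.5`.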

The following relations are direct consequences of the properties of ϕkt and ϕ^kt

\[
|Y_k(\gamma)-\hat Y_k(\gamma)|=o\bigl(\gamma S_k^{N_k}\bigr),\qquad \hat Y_k(\gamma)\le Z_k(\gamma),\qquad Z_k(\gamma)-Z_k(0)\le 0.
\tag{58}
\]

The last property holds due to Proposition 4.1 and

\[
Z_k(\gamma)-Z_k(0)=(1-\alpha_k(\gamma))r_{k,0}^{t_k(\gamma)}+\alpha_k(\gamma)r_{k,1}^{t_k(\gamma)},
\tag{59}
\]

which follows from $\alpha_k(0)=0$ and $S_k^{t_k(0)-1}=0$, whence $\hat\phi_k^{t_k(0)}(0)=\hat\phi_k^1(0)$. We recall that $r_{k,0}^t$ and $r_{k,1}^t$ are defined by (45).

Lemma 4.2

The new iterate xk+1 is well defined.

Proof

In order to show that the new iterate is well defined, we have to prove the existence of some $j$ such that (57) is fulfilled. Note that $S_k^{t_k(0)-1}=0$ and $S_k^{t_k(0)}>0$. There is some $\delta_k>0$ such that
\[
|Y_k(\gamma)-\hat Y_k(\gamma)|\le-(1-\xi)\,r_{k,1}^{t_k(0)}\,\frac{\gamma S_k^{N_k}}{S_k^{t_k(0)}}
\]
whenever $\gamma S_k^{N_k}\le\delta_k$. Since $\lim_{j\to\infty}\gamma_j^k=0$, we can choose $j$ sufficiently large to fulfill $\gamma_j^kS_k^{N_k}<\min\{\delta_k,S_k^{t_k(0)}\}$, and then $t_k(\gamma_j^k)=t_k(0)$ and $\alpha_k(\gamma_j^k)=\gamma_j^kS_k^{N_k}/S_k^{t_k(0)}$, since $S_k^{t_k(0)-1}=0$. This yields

\[
Y_k(\gamma_j^k)-\hat Y_k(\gamma_j^k)\le-(1-\xi)\,\alpha_k(\gamma_j^k)\,r_{k,1}^{t_k(\gamma_j^k)}.
\tag{60}
\]

Then, by the second property of (58) and by (59), taking into account $r_{k,0}^{t_k(\gamma_j^k)}\le 0$ by Proposition 4.1 and $Y_k(0)=Z_k(0)$, we obtain

\[
\begin{aligned}
Y_k(\gamma_j^k)-Y_k(0)&\le\hat Y_k(\gamma_j^k)-Y_k(0)-(1-\xi)\alpha_k(\gamma_j^k)r_{k,1}^{t_k(\gamma_j^k)}\\
&\le\xi\bigl(Z_k(\gamma_j^k)-Z_k(0)\bigr)+(1-\xi)\Bigl(Z_k(\gamma_j^k)-Z_k(0)-\alpha_k(\gamma_j^k)r_{k,1}^{t_k(\gamma_j^k)}\Bigr)\\
&=\xi\bigl(Z_k(\gamma_j^k)-Z_k(0)\bigr)+(1-\xi)\bigl(1-\alpha_k(\gamma_j^k)\bigr)r_{k,0}^{t_k(\gamma_j^k)}\\
&\le\xi\bigl(Z_k(\gamma_j^k)-Z_k(0)\bigr).
\end{aligned}
\]

Thus (57) is fulfilled for this j and the lemma is proved.

Convergence of the basic algorithm

We consider the behavior of Algorithm 4.1 when it does not prematurely stop and generates an infinite sequence of iterates

\[
x_k,\ B_k,\ (s_k^t,\delta_k^t),\ \lambda_k^t,\ (V_{1,k}^t,V_{2,k}^t),\ t=0,\dots,N_k\qquad\text{and}\qquad\theta_k,\ \underline\lambda_k^{N_k},\ \bar\lambda_k^{N_k}.
\]

Note that $\delta_k^{N_k}<\zeta$. We discuss the convergence behavior under the following assumption.

Assumption 1

  1. There exist constants $C_x,C_s,C_\lambda$ such that
    \[ \|x_k\|\le C_x,\qquad S_k^{N_k}\le C_s,\qquad \hat\lambda_k^h,\hat\lambda_k^g,\hat\lambda_k^F\le C_\lambda \]
    for all $k$, where $\hat\lambda_k^h:=\max_{i\in E}\{\tilde\lambda_{i,k}^h\}$, $\hat\lambda_k^g:=\max_{i\in I}\{\tilde\lambda_{i,k}^g\}$, $\hat\lambda_k^F:=\max_{i\in V}\{\tilde\lambda_{i,k}^F\}$.
  2. There exist constants $\bar C_B,\underline C_B$ such that $\underline C_B\le\lambda(B_k)$ and $\|B_k\|\le\bar C_B$ for all $k$, where $\lambda(B_k)$ denotes the smallest eigenvalue of $B_k$.

For our convergence analysis we need one more merit function,

$$\Phi_k(x):=f(x)+\sum_{i\in E}\sigma_{i,k}^h|h_i(x)|+\sum_{i\in I}\sigma_{i,k}^g\big(g_i(x)\big)^++\sum_{i\in V}\sigma_{i,k}^F\,d\big(F_i(x),P\big).$$

Lemma 4.3

For each $k$ and any $\gamma\in[0,1]$ it holds that

$$\Phi_k\big(x^k+\hat s_k(\gamma)\big)\le Y_k(\gamma)\quad\text{and}\quad\Phi_k(x^k)=Y_k(0). \quad (61)$$

Proof

The first claim follows from the definitions of $\Phi_k$ and $Y_k$ and the estimate

$$d\big(F_i(x^k+s),P_1\big),\,d\big(F_i(x^k+s),P_2\big)\;\ge\;\min\big\{d\big(F_i(x^k+s),P_1\big),d\big(F_i(x^k+s),P_2\big)\big\}=d\big(F_i(x^k+s),P\big),$$

which holds by (20). The second claim follows from (35).
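For concreteness, the merit function $\Phi_k$ can be evaluated as below. This is a sketch under the assumption that $F_i(x)=(H_i(x),G_i(x))$ and that $P=P_1\cup P_2$ with $P_1=\{0\}\times\mathbb R$ (the branch $H_i=0$) and $P_2=\mathbb R_+\times\mathbb R_-$ (the branch $H_i\ge0$, $G_i\le0$), with distances taken in the 1-norm; the function names and calling convention are illustrative:

```python
import numpy as np

def merit_phi(x, f, h, g, H, G, sig_h, sig_g, sig_F):
    """Evaluate the merit function Phi_k at x (sketch).
    f is a scalar callable; h, g, H, G return numpy arrays of constraint
    values; sig_* are the corresponding penalty-parameter vectors.
    Assumes F_i = (H_i, G_i), P1 = {0} x R, P2 = R_+ x R_-, 1-norm distances."""
    hv, gv, Hv, Gv = h(x), g(x), H(x), G(x)
    d_P1 = np.abs(Hv)                                  # 1-norm distance to P1
    d_P2 = np.maximum(-Hv, 0.0) + np.maximum(Gv, 0.0)  # 1-norm distance to P2
    d_P = np.minimum(d_P1, d_P2)                       # d(F_i(x), P) = min of both
    return (f(x) + sig_h @ np.abs(hv) + sig_g @ np.maximum(gv, 0.0)
            + sig_F @ d_P)
```

The `np.minimum` in the last distance is exactly the min operation that, as Remark 4.2 notes, makes a first-order model of $\Phi_k$ nonconvex.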

A simple consequence of the way the penalty parameters are defined in (42) is the following lemma.

Lemma 4.4

Under Assumption 1 there exists some $\bar k$ such that for all $k\ge\bar k$ the penalty parameters remain constant, $\bar\sigma:=\sigma_k$, and consequently $\Phi_k(x)=\Phi_{\bar k}(x)$.

Remark 4.2

Note that we do not use $\Phi_k$ for calculating the new iterate because its first-order approximation is in general not convex on the line segments connecting $s_k^{t-1}$ and $s_k^t$, due to the min operation involved.

Lemma 4.5

Assume that Assumption 1 is fulfilled. Then

$$\lim_{k\to\infty}\big(Y_k(\gamma_{j(k)}^k)-Y_k(0)\big)=0. \quad (62)$$

Proof

Take $\bar k$ from Lemma 4.4. Then for $k\ge\bar k$ we have

$$\Phi_{k+1}(x^{k+1})=\Phi_{\bar k}(x^{k+1})=\Phi_{\bar k}\big(x^k+\hat s_k(\gamma_{j(k)}^k)\big)=\Phi_k\big(x^k+\hat s_k(\gamma_{j(k)}^k)\big)\le Y_k(\gamma_{j(k)}^k)<Y_k(0)=\Phi_k(x^k)$$

and therefore $\Phi_{k+1}(x^{k+1})-\Phi_k(x^k)\le Y_k(\gamma_{j(k)}^k)-Y_k(0)<0$. Hence the sequence $\Phi_k(x^k)$ is monotonically decreasing and therefore convergent, because it is bounded below by Assumption 1. Hence

$$-\infty<\lim_{k\to\infty}\Phi_k(x^k)-\Phi_{\bar k}(x^{\bar k})=\sum_{k=\bar k}^\infty\big(\Phi_{k+1}(x^{k+1})-\Phi_k(x^k)\big)\le\sum_{k=\bar k}^\infty\big(Y_k(\gamma_{j(k)}^k)-Y_k(0)\big)$$

and the assertion follows.

Proposition 4.2

Assume that Assumption 1 is fulfilled. Then

$$\lim_{k\to\infty}\big(\hat Y_k(1)-\hat Y_k(0)\big)=0 \quad (63)$$

and consequently

$$\lim_{k\to\infty}s_k^{N_k}=0. \quad (64)$$

Proof

We prove (63) by contradiction. Assuming on the contrary that (63) does not hold, by taking into account $\hat Y_k(1)-\hat Y_k(0)\le0$ by Proposition 4.1, there exists a subsequence $K=\{k_1,k_2,\ldots\}$ such that $\hat Y_k(1)-\hat Y_k(0)\le\bar r<0$ for all $k\in K$. By passing to a subsequence we can assume that for all $k\in K$ we have $k\ge\bar k$ with $\bar k$ given by Lemma 4.4 and $N_k=\bar N$, where we have taken into account (40). By passing to a subsequence once more we can also assume that

$$\lim_{k\in K}S_k^t=\bar S^t,\qquad\lim_{k\in K}r_{k,1}^t=\bar r_1^t,\qquad\lim_{k\in K}r_{k,0}^t=\bar r_0^t,\qquad t\in\{1,\ldots,\bar N\},$$

where $r_{k,1}^t$ and $r_{k,0}^t$ are defined by (45). Note that $\bar r_1^{\bar N}\le\bar r<0$.

Let us first consider the case $\bar S^{\bar N}=0$. There exists $\delta>0$ such that $|Y_k(\gamma)-\hat Y_k(\gamma)|\le(\xi-1)\bar r_1^{\bar N}\gamma S_k^{\bar N}$ for all $k\in K$, whenever $\gamma S_k^{\bar N}\le\delta$. Since $\bar S^{\bar N}=0$ we can assume that $S_k^{\bar N}\le\min\{\delta,1/2\}$ and $r_{k,1}^{\bar N}\le\bar r_1^{\bar N}/2$ for all $k\in K$. Then

$$Y_k(1)-Y_k(0)\le r_{k,1}^{\bar N}+(\xi-1)\bar r_1^{\bar N}S_k^{\bar N}\le r_{k,1}^{\bar N}+(\xi-1)r_{k,1}^{\bar N}=\xi r_{k,1}^{\bar N}=\xi\big(Z_k(1)-Z_k(0)\big)\le\xi\,\frac{\bar r_1^{\bar N}}{2}<0$$

and this implies that for the next iterate we have $j(k)=1$ and hence $\gamma_{j(k)}^k=1$, contradicting (62).

Now consider the case $\bar S^{\bar N}\ne0$ and define the number $\bar\tau:=\max\{t\,|\,\bar S^t=0\}+1$. Note that Proposition 4.1 yields

$$r_{k,1}^t,\;r_{k,0}^{t+1}\le-\frac{\lambda(B_k)}{2}\sum_{\tau=1}^t\|s_k^\tau-s_k^{\tau-1}\|^2\le-\frac{\underline C_B}{2}\,\frac1t\Big(\sum_{\tau=1}^t\|s_k^\tau-s_k^{\tau-1}\|\Big)^2=-\frac{\underline C_B}{2}\,\frac1t\big(S_k^t\big)^2 \quad (65)$$

and therefore $\tilde r:=\max_{t>\bar\tau}\bar r^t<0$, where $\bar r^t:=\max\{\bar r_0^t,\bar r_1^t\}$. By passing to a subsequence we can assume that for every $t>\bar\tau$ and every $k\in K$ we have $r_{k,0}^t,r_{k,1}^t\le\bar r^t/2$, and also $r_{k,1}^{\bar\tau}\le\bar r_1^{\bar\tau}/2$; note that $\bar r_1^{\bar\tau}<0$ by (65), since $\bar S^{\bar\tau}>0$.

Now assume that for infinitely many $k\in K$ we have $\gamma_{j(k)}^kS_k^{\bar N}\ge S_k^{\bar\tau}$, i.e. $t_k(\gamma_{j(k)}^k)>\bar\tau$. Then we conclude

$$Y_k(\gamma_{j(k)}^k)-Y_k(0)\le\xi\big(Z_k(\gamma_{j(k)}^k)-Z_k(0)\big)=\xi\Big(\big(1-\alpha_k(\gamma_{j(k)}^k)\big)r_{k,0}^{t_k(\gamma_{j(k)}^k)}+\alpha_k(\gamma_{j(k)}^k)r_{k,1}^{t_k(\gamma_{j(k)}^k)}\Big)\le\xi\,\frac{\tilde r}{2}<0,$$

contradicting (62). Hence for all but finitely many $k\in K$, without loss of generality for all $k\in K$, we have $\gamma_{j(k)}^kS_k^{\bar N}<S_k^{\bar\tau}$.

There exists $\delta>0$ such that

$$|Y_k(\gamma)-\hat Y_k(\gamma)|\le\frac{|\bar r_1^{\bar\tau}|(1-\xi)\underline\gamma}{8\bar S^{\bar\tau}}\,\gamma S_k^{\bar N}\quad\text{for all }k\in K, \quad (66)$$

whenever $\gamma S_k^{\bar N}\le\delta$. By eventually choosing $\delta$ smaller we can assume $\delta\le\bar S^{\bar\tau}/2$, and by passing to a subsequence if necessary we can also assume that for all $k\in K$ we have

$$2S_k^{\bar\tau-1}/\underline\gamma\le\delta<S_k^{\bar\tau}\le2\bar S^{\bar\tau}. \quad (67)$$

Now, for each $k$, let the index $\tilde j(k)$ denote the smallest $j$ with $\gamma_j^kS_k^{\bar N}\le\delta$. It obviously holds that $\gamma_{\tilde j(k)-1}^kS_k^{\bar N}>\delta$ and by (67) we obtain

$$S_k^{\bar\tau-1}\le\underline\gamma\delta<\underline\gamma\,\gamma_{\tilde j(k)-1}^kS_k^{\bar N}\le\gamma_{\tilde j(k)}^kS_k^{\bar N}\le\delta<S_k^{\bar\tau},$$

implying $t_k(\gamma_{\tilde j(k)}^k)=\bar\tau$ and

$$\alpha_k\big(\gamma_{\tilde j(k)}^k\big)\ge\frac{\underline\gamma\delta-S_k^{\bar\tau-1}}{S_k^{\bar\tau}-S_k^{\bar\tau-1}}\ge\frac{\underline\gamma\delta}{4\bar S^{\bar\tau}}$$

by (67).

Taking this into account together with (66) and $\gamma_{\tilde j(k)}^kS_k^{\bar N}\le\delta$ we conclude

$$Y_k\big(\gamma_{\tilde j(k)}^k\big)-\hat Y_k\big(\gamma_{\tilde j(k)}^k\big)\le\frac{|\bar r_1^{\bar\tau}|(1-\xi)\underline\gamma}{8\bar S^{\bar\tau}}\,\gamma_{\tilde j(k)}^kS_k^{\bar N}\le-\frac{(1-\xi)\underline\gamma\delta}{4\bar S^{\bar\tau}}\,r_{k,1}^{\bar\tau}\le-(1-\xi)\,\alpha_k\big(\gamma_{\tilde j(k)}^k\big)\,r_{k,1}^{t_k(\gamma_{\tilde j(k)}^k)}.$$

Now we can proceed as in the proof of Lemma 4.2 to show that $\tilde j(k)$ fulfills (57).

However, this yields $\tilde j(k)\ge j(k)$ by the definition of $j(k)$, and hence $\gamma_{j(k)}^kS_k^{\bar N}\ge\gamma_{\tilde j(k)}^kS_k^{\bar N}\ge S_k^{\bar\tau-1}$, showing $t_k(\gamma_{j(k)}^k)=t_k(\gamma_{\tilde j(k)}^k)=\bar\tau$. But then we also have $\alpha_k(\gamma_{j(k)}^k)\ge\alpha_k(\gamma_{\tilde j(k)}^k)\ge\frac{\underline\gamma\delta}{4\bar S^{\bar\tau}}$ and from (57) we obtain

$$Y_k(\gamma_{j(k)}^k)-Y_k(0)\le\xi\big(Z_k(\gamma_{j(k)}^k)-Z_k(0)\big)\le\xi\,\alpha_k(\gamma_{j(k)}^k)\,r_{k,1}^{t_k(\gamma_{j(k)}^k)}\le\xi\,\frac{\underline\gamma\delta\,\bar r_1^{\bar\tau}}{8\bar S^{\bar\tau}}<0,$$

contradicting (62), and so (63) is proved. Condition (64) now follows from (63), because from (65) we conclude that $\hat Y_k(1)-\hat Y_k(0)\le-\frac{\underline C_B}{2}\frac{1}{N_k}\big(S_k^{N_k}\big)^2\le-\frac{\underline C_B}{2}\frac{1}{N_k}\big\|s_k^{N_k}\big\|^2$.

Now we are ready to state the main result of this section.

Theorem 4.1

Let Assumption 1 be fulfilled. Then every limit point of the sequence of iterates xk is at least M-stationary for problem (1).

Proof

Let $\bar x$ denote a limit point of the sequence $x^k$ and let $K$ denote a subsequence such that $\lim_{k\in K}x^k=\bar x$. Further let $\underline\lambda$ be a limit point of the bounded sequence $\underline\lambda_k^{N_k}$ and assume without loss of generality that $\lim_{k\in K}\underline\lambda_k^{N_k}=\underline\lambda$. First we show feasibility of $\bar x$ for problem (1) together with

$$\underline\lambda_i^g\ge0=\underline\lambda_i^gg_i(\bar x),\ i\in I\quad\text{and}\quad(\underline\lambda^H,\underline\lambda^G)\in N_{P^{|V|}}\big(F(\bar x)\big). \quad (68)$$

Consider $i\in I$. For all $k$ it holds that

$$0\ge\big(1-\theta_{i,k}^g\delta_k^{N_k}\big)g_i(x^k)+\nabla g_i(x^k)s_k^{N_k},\qquad\underline\lambda_{i,k}^{g,N_k}\ge0.$$

Since $0\le\delta_k^{N_k}\le\zeta$ and $\theta_{i,k}^g\in\{0,1\}$ we have $1\ge\big(1-\theta_{i,k}^g\delta_k^{N_k}\big)\ge1-\zeta$, and together with $s_k^{N_k}\to0$ by Proposition 4.2 we conclude

$$0\ge\limsup_{k\in K}\bigg(g_i(x^k)+\frac{\nabla g_i(x^k)s_k^{N_k}}{1-\theta_{i,k}^g\delta_k^{N_k}}\bigg)=g_i(\bar x),$$

$\underline\lambda_i^g\ge0$ and, by the complementarity conditions,

$$0=\lim_{k\in K}\underline\lambda_{i,k}^{g,N_k}\bigg(g_i(x^k)+\frac{\nabla g_i(x^k)s_k^{N_k}}{1-\theta_{i,k}^g\delta_k^{N_k}}\bigg)=\underline\lambda_i^gg_i(\bar x).$$

Hence $\underline\lambda_i^g\ge0=\underline\lambda_i^gg_i(\bar x)$. Similar arguments show that for every $i\in E$ we have

$$0=\lim_{k\in K}\bigg(h_i(x^k)+\frac{\nabla h_i(x^k)s_k^{N_k}}{1-\delta_k^{N_k}}\bigg)=h_i(\bar x).$$

Finally consider $i\in V$. Taking into account (22), (34) and $\delta_k^{N_k}\le\zeta$ we obtain

$$d\big(F_i(x^k),P\big)\le\big\|\delta_k^{N_k}\big(\theta_{i,k}^HH_i(x^k),-\theta_{i,k}^GG_i(x^k)\big)^T+\nabla F_i(x^k)s_k^{N_k}\big\|_1\le\zeta\,d\big(F_i(x^k),P\big)+\big\|\nabla F_i(x^k)s_k^{N_k}\big\|_1.$$

Hence, $\nabla F_i(x^k)s_k^{N_k}\to0$ by Proposition 4.2 implies

$$(1-\zeta)\,d\big(F_i(\bar x),P\big)=\lim_{k\in K}(1-\zeta)\,d\big(F_i(x^k),P\big)\le\lim_{k\in K}\big\|\nabla F_i(x^k)s_k^{N_k}\big\|_1=0,$$

showing the feasibility of $\bar x$. Moreover, the previous arguments also imply

$$\tilde F_i\big(x^k,s_k^{N_k},\delta_k^{N_k}\big):=\delta_k^{N_k}\big(\theta_{i,k}^HH_i(x^k),-\theta_{i,k}^GG_i(x^k)\big)^T+F_i(x^k)+\nabla F_i(x^k)s_k^{N_k}\;\xrightarrow{K}\;F_i(\bar x). \quad (69)$$

Taking into account (14), the fact that $\underline\lambda_k^{N_k}$ fulfills the M-stationarity conditions at $(s_k^{N_k},\delta_k^{N_k})$ for (19) yields

$$\big(\underline\lambda_k^{H,N_k},\underline\lambda_k^{G,N_k}\big)\in N_{P^{|V|}}\big(\tilde F(x^k,s_k^{N_k},\delta_k^{N_k})\big).$$

However, this together with $(\underline\lambda_k^{H,N_k},\underline\lambda_k^{G,N_k})\xrightarrow{K}(\underline\lambda^H,\underline\lambda^G)$, (69), and (13) yields $(\underline\lambda^H,\underline\lambda^G)\in N_{P^{|V|}}(F(\bar x))$ and consequently (68) follows.

Moreover, by the first-order optimality condition we have

$$B_ks_k^{N_k}+\nabla f(x^k)^T+\sum_{i\in E}\underline\lambda_{i,k}^{h,N_k}\nabla h_i(x^k)^T+\sum_{i\in I}\underline\lambda_{i,k}^{g,N_k}\nabla g_i(x^k)^T+\sum_{i\in V}\nabla F_i(x^k)^T\underline\lambda_{i,k}^{F,N_k}=0$$

for each $k$, and by passing to the limit, taking into account that $B_ks_k^{N_k}\to0$ by Proposition 4.2, we obtain

$$\nabla f(\bar x)^T+\sum_{i\in E}\underline\lambda_i^h\nabla h_i(\bar x)^T+\sum_{i\in I}\underline\lambda_i^g\nabla g_i(\bar x)^T+\sum_{i\in V}\nabla F_i(\bar x)^T\underline\lambda_i^F=0.$$

Hence, invoking (14) again, this together with the feasibility of $\bar x$ and (68) implies M-stationarity of $\bar x$ and the proof is complete.

The extended SQP algorithm for MPVC

In this section we investigate what can be done in order to secure QM-stationarity of the limit points. First, note that to prove M-stationarity of the limit points in Theorem 4.1 we only used that $(\underline\lambda_k^{H,N_k},\underline\lambda_k^{G,N_k})\in N_{P^{|V|}}(\tilde F(x^k,s_k^{N_k},\delta_k^{N_k}))$, i.e. it is sufficient to exploit only the M-stationarity of the solutions of the auxiliary problems. Further, recalling the comments after Lemma 3.1, the solution $(s,\delta)$ of $QP(\rho,I^1(s,\delta)\cup I^{00}(s,\delta))$ is M-stationary for the auxiliary problem. Thus, in Algorithm 3.1 for solving the auxiliary problem, it is sufficient to consider only the last of the four problems (31), (32). Moreover, the definition of the limiting normal cone (11) reveals that, in general, the limiting process destroys any stationarity stronger than M-stationarity, even S-stationarity.

Nevertheless, in practical situations it is likely that some assumption securing that a stronger stationarity is preserved in the limiting process may be fulfilled. E.g., let $\bar x$ be a limit point of $x^k$. If we assume that for all $k$ sufficiently large it holds that $I^{00}(\bar x)=I^{00}(s_k^{N_k},\delta_k^{N_k})$, then $\bar x$ is at least QM-stationary for (1). This follows easily, since now for all $i\in I^{00}(\bar x)$ it holds that $\underline\lambda_{i,k}^{G,N_k}=0$, $\bar\lambda_{i,k}^{H,N_k},\bar\lambda_{i,k}^{G,N_k}\ge0$ and consequently

$$\underline\lambda_i^G=\lim_{k\to\infty}\underline\lambda_{i,k}^{G,N_k}=0,\qquad\bar\lambda_i^H=\lim_{k\to\infty}\bar\lambda_{i,k}^{H,N_k}\ge0,\qquad\bar\lambda_i^G=\lim_{k\to\infty}\bar\lambda_{i,k}^{G,N_k}\ge0.$$

This observation suggests that, in order to obtain stronger stationarity of a limit point, the key is to correctly identify the bi-active index set at the limit point; this serves as the motivation for the extended version of our SQP method. Before discussing the extended version, we summarize some preliminary results.

Preliminary results

Let $a:\mathbb R^n\to\mathbb R^p$ and $b:\mathbb R^n\to\mathbb R^q$ be continuously differentiable. Given a vector $x\in\mathbb R^n$ we define the linear problem

$$\mathrm{LP}(x)\qquad\min_{d\in\mathbb R^n}\ \nabla f(x)d\quad\text{subject to}\quad\nabla a(x)d=0,\quad(b(x))^-+\nabla b(x)d\le0,\quad-1\le d\le1. \quad (70)$$

Note that $d=0$ is always feasible for this problem. Next we define a set $A$ by

$$A:=\{x\in\mathbb R^n\,|\,a(x)=0,\ b(x)\le0\}. \quad (71)$$

Let $\bar x\in A$ and recall that the Mangasarian-Fromovitz constraint qualification (MFCQ) holds at $\bar x$ if the matrix $\nabla a(\bar x)$ has full row rank and there exists a vector $d\in\mathbb R^n$ such that

$$\nabla a(\bar x)d=0,\qquad\nabla b_i(\bar x)d<0,\ i\in\mathcal I(\bar x):=\{i\in\{1,\ldots,q\}\,|\,b_i(\bar x)=0\}.$$

Moreover, for a matrix $M$ we denote by $\|M\|_p$ the induced norm given by

$$\|M\|_p:=\sup\big\{\|Mu\|_p\,\big|\,\|u\|_p\le1\big\} \quad (72)$$

and we omit the index $p$ in the case $p=2$.
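As a quick sanity check, the induced norms in (72) are available directly in numpy for $p\in\{1,2,\infty\}$ (the function name below is a hypothetical wrapper):

```python
import numpy as np

def op_norm(M, p=2):
    """Induced matrix p-norm (72): sup of ||M u||_p over ||u||_p <= 1.
    For p = 2 this is the largest singular value; for p = 1 the maximum
    absolute column sum; numpy computes these induced norms directly."""
    return np.linalg.norm(M, p)
```

For example, the matrix with columns $(3,4)^T$ and $(0,0)^T$ has 2-norm $5$.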

Lemma 5.1

Let $\bar x\in A$, assume that MFCQ holds at $\bar x$ and let $\bar d$ denote a solution of $\mathrm{LP}(\bar x)$. Then for every $\epsilon>0$ there exists $\delta>0$ such that if $\|x-\bar x\|\le\delta$ then

$$\nabla f(x)d\le\nabla f(\bar x)\bar d+\epsilon, \quad (73)$$

where $d$ denotes a solution of $\mathrm{LP}(x)$.

Proof

The classical result of Robinson (cf. [9, Corollary 1, Theorem 3]), together with MFCQ at $\bar x$, yields the existence of $\kappa>0$ and $\tilde\delta>0$ such that for every $x$ with $\|x-\bar x\|\le\tilde\delta$ there exists $\hat d$ with $\nabla a(x)\hat d=0$, $(b(x))^-+\nabla b(x)\hat d\le0$ and

$$\|\bar d-\hat d\|\le\kappa\max\big\{\|\nabla a(x)\bar d\|,\big\|\big((b(x))^-+\nabla b(x)\bar d\big)^+\big\|\big\}=:\nu.$$

Since $\|\hat d\|_\infty\le\|\hat d-\bar d\|_\infty+\|\bar d\|_\infty\le1+\nu$, by setting $\tilde d:=\hat d/(1+\nu)$ we obtain that $\tilde d$ is feasible for $\mathrm{LP}(x)$ and

$$\|\bar d-\tilde d\|\le\frac{1}{1+\nu}\big(\|\bar d-\hat d\|+\nu\|\bar d\|\big)\le\frac{(1+\sqrt n)\nu}{1+\nu}\le(1+\sqrt n)\nu.$$

Thus, taking into account $\nabla a(\bar x)\bar d=0$, $(b(\bar x))^-+\nabla b(\bar x)\bar d\le0$ and $\|\bar d\|_\infty\le1$, we obtain

$$\|\bar d-\tilde d\|\le(1+\sqrt n)\kappa\max\big\{\|\nabla a(x)-\nabla a(\bar x)\|,\|b(x)-b(\bar x)\|+\|\nabla b(x)-\nabla b(\bar x)\|\big\}.$$

Hence, given $\epsilon>0$, by continuity of the objective and constraint functions as well as their derivatives at $\bar x$, we can choose $\delta\le\tilde\delta$ such that for all $x$ with $\|x-\bar x\|\le\delta$ it holds that

$$\|\nabla f(x)-\nabla f(\bar x)\|_1\le\epsilon/2,\qquad\|\nabla f(x)\|\,\|\bar d-\tilde d\|\le\epsilon/2.$$

Consequently, we obtain

$$\nabla f(x)\tilde d\le\|\nabla f(x)\|\,\|\tilde d-\bar d\|+\|\nabla f(x)-\nabla f(\bar x)\|_1\|\bar d\|_\infty+\nabla f(\bar x)\bar d\le\nabla f(\bar x)\bar d+\epsilon$$

and since $\nabla f(x)d\le\nabla f(x)\tilde d$ by feasibility of $\tilde d$ for $\mathrm{LP}(x)$, the claim is proved.

Lemma 5.2

Let $\nu\in(0,1)$ be a given constant and, for a vector of positive parameters $\omega=(\omega^E,\omega^I)$, define the function

$$\varphi(x):=f(x)+\sum_{i\in\{1,\ldots,p\}}\omega_i^E|a_i(x)|+\sum_{i\in\{1,\ldots,q\}}\omega_i^I\big(b_i(x)\big)^+. \quad (74)$$

Further assume that there exist $\epsilon>0$ and a compact set $C$ such that for all $x\in C$ it holds that $\nabla f(x)d\le-\epsilon$, where $d$ denotes a solution of $\mathrm{LP}(x)$. Then there exists $\tilde\alpha>0$ such that

$$\varphi(x+\alpha d)-\varphi(x)\le\nu\alpha\nabla f(x)d \quad (75)$$

holds for all $x\in C$ and every $\alpha\in[0,\tilde\alpha]$.

Proof

The definition of $\varphi$, together with $u^+-v^+\le(u-v^+)^+$ for $u,v\in\mathbb R$, yields

$$\varphi(x+\alpha d)-\varphi(x)\le f(x+\alpha d)-f(x)+\|\omega\|_\infty\Big(\|a(x+\alpha d)-a(x)\|_1+\big\|\big(b(x+\alpha d)-(b(x))^+\big)^+\big\|_1\Big). \quad (76)$$

By uniform continuity of the derivatives of the constraint and objective functions on compact sets, there exists $\tilde\alpha>0$ such that for all $x\in C$ and every $h$ with $\|h\|\le\tilde\alpha$ we have

$$\|\nabla f(x+h)-\nabla f(x)\|_1,\ \|\omega\|_\infty\big(\|\nabla a(x+h)-\nabla a(x)\|_1+\|\nabla b(x+h)-\nabla b(x)\|_1\big)\le\frac{1-\nu}{2}\epsilon. \quad (77)$$

Hence, for all $x\in C$ and every $\alpha\in[0,\tilde\alpha]$ we obtain

$$f(x+\alpha d)-f(x)=\nu\alpha\nabla f(x)d+(1-\nu)\alpha\nabla f(x)d+\int_0^1\big(\nabla f(x+t\alpha d)-\nabla f(x)\big)\alpha d\,dt\le\nu\alpha\nabla f(x)d-(1-\nu)\alpha\epsilon+\frac{1-\nu}{2}\alpha\epsilon=\nu\alpha\nabla f(x)d-\frac{1-\nu}{2}\alpha\epsilon.$$

On the other hand, taking into account $\nabla a(x)d=0$, $\|d\|_\infty\le1$, (77) and

$$(b(x))^-+\alpha\nabla b(x)d=(1-\alpha)(b(x))^-+\alpha\big((b(x))^-+\nabla b(x)d\big)\le0,$$

we similarly obtain for all $x\in C$ and every $\alpha\in[0,\tilde\alpha]$

$$\|\omega\|_\infty\Big(\|a(x+\alpha d)-a(x)\|_1+\big\|\big(b(x+\alpha d)-(b(x))^+\big)^+\big\|_1\Big)\le\|\omega\|_\infty\bigg(\Big\|\int_0^1\big(\nabla a(x+t\alpha d)-\nabla a(x)\big)\alpha d\,dt\Big\|_1+\Big\|\int_0^1\big(\nabla b(x+t\alpha d)-\nabla b(x)\big)\alpha d\,dt\Big\|_1\bigg)\le\frac{1-\nu}{2}\alpha\epsilon.$$

Consequently, (75) follows from (76) and the proof is complete.

The extended version of Algorithm 4.1

For every vector $x\in\mathbb R^n$ and every partition $(W_1,W_2)\in\mathcal P(V)$ we define the linear problem

$$\mathrm{LP}(x,W_1)\qquad\min_{d\in\mathbb R^n}\nabla f(x)d\quad\text{subject to}\quad\nabla h_i(x)d=0,\ i\in E,\quad(g_i(x))^-+\nabla g_i(x)d\le0,\ i\in I,\quad\nabla F_i(x)d\in P_1,\ i\in W_1,\quad(F_i(x))^-+\nabla F_i(x)d\in P_2,\ i\in W_2,\quad-1\le d\le1. \quad (78)$$

Note that $d=0$ is always feasible for this problem and that the problem $\mathrm{LP}(x,W_1)$ coincides with the problem $\mathrm{LP}(x)$ with $a,b$ given by

$$a:=\big(h_i(x),\,i\in E,\;-H_i(x),\,i\in W_1\big)^T,\qquad b:=\big(g_i(x),\,i\in I,\;-H_i(x),\,i\in W_2,\;G_i(x),\,i\in W_2\big)^T. \quad (79)$$

The following proposition provides the motivation for introducing the problem LP(x,W1).

Proposition 5.1

Let $\bar x$ be feasible for (1). Then $\bar x$ is Q-stationary with respect to $(\beta_1,\beta_2)\in\mathcal P(I^{00}(\bar x))$ if and only if the solutions $\bar d^1$ and $\bar d^2$ of the problems $\mathrm{LP}(\bar x,I^{0+}(\bar x)\cup\beta_1)$ and $\mathrm{LP}(\bar x,I^{0+}(\bar x)\cup\beta_2)$ fulfill

$$\min\big\{\nabla f(\bar x)\bar d^1,\nabla f(\bar x)\bar d^2\big\}=0. \quad (80)$$

Proof

Feasibility of $d=0$ for $\mathrm{LP}(\bar x,I^{0+}(\bar x)\cup\beta_1)$ and $\mathrm{LP}(\bar x,I^{0+}(\bar x)\cup\beta_2)$ implies

$$\min\big\{\nabla f(\bar x)\bar d^1,\nabla f(\bar x)\bar d^2\big\}\le0.$$

Denote by $\tilde d^1$ and $\tilde d^2$ the solutions of $\mathrm{LP}(\bar x,I^{0+}(\bar x)\cup\beta_1)$ and $\mathrm{LP}(\bar x,I^{0+}(\bar x)\cup\beta_2)$ without the constraint $-1\le d\le1$, and denote these problems by $\widetilde{\mathrm{LP}}_1$ and $\widetilde{\mathrm{LP}}_2$. Clearly, we have

$$\min\big\{\nabla f(\bar x)\tilde d^1,\nabla f(\bar x)\tilde d^2\big\}\le\min\big\{\nabla f(\bar x)\bar d^1,\nabla f(\bar x)\bar d^2\big\}.$$

The dual problem of $\widetilde{\mathrm{LP}}_j$, $j=1,2$, is given by

$$\max_{\lambda\in\mathbb R^m}\ -\sum_{i\in I}\lambda_i^g\big(g_i(\bar x)\big)^--\sum_{i\in W_2^j}\Big(\lambda_i^H\big(-H_i(\bar x)\big)^-+\lambda_i^G\big(G_i(\bar x)\big)^-\Big)\quad\text{subject to (3) and}\quad\lambda_i^g\ge0,\ i\in I,\quad\lambda_i^H,\lambda_i^G\ge0,\ i\in W_2^j,\quad\lambda_i^G=0,\ i\in W_1^j, \quad (81)$$

where $\lambda=(\lambda^h,\lambda^g,\lambda^H,\lambda^G)$, $m=|E|+|I|+2|V|$, $W_1^j:=I^{0+}(\bar x)\cup\beta_j$, $W_2^j:=V\setminus W_1^j$.

Assume first that $\bar x$ is Q-stationary with respect to $(\beta_1,\beta_2)\in\mathcal P(I^{00}(\bar x))$. Then the multipliers $\bar\lambda,\underline\lambda$ from the definition of Q-stationarity are feasible for the dual problems of $\widetilde{\mathrm{LP}}_1$ and $\widetilde{\mathrm{LP}}_2$, respectively, both with objective value equal to zero. Hence, duality theory of linear programming yields that $\min\{\nabla f(\bar x)\tilde d^1,\nabla f(\bar x)\tilde d^2\}\ge0$ and consequently (80) follows.

On the other hand, if (80) is fulfilled, it follows that $\min\{\nabla f(\bar x)\tilde d^1,\nabla f(\bar x)\tilde d^2\}=0$ as well. Thus $d=0$ is an optimal solution of $\widetilde{\mathrm{LP}}_1$ and $\widetilde{\mathrm{LP}}_2$, and duality theory of linear programming yields that the solutions $\lambda^1$ and $\lambda^2$ of the dual problems exist and their objective values are both zero. However, this implies that for $j=1,2$ we have

$$\lambda_i^{g,j}g_i(\bar x)=0,\ i\in I,\qquad\lambda_i^{H,j}H_i(\bar x)=0,\quad\lambda_i^{G,j}G_i(\bar x)=0,\ i\in V$$

and consequently $\lambda^1$ fulfills the conditions of $\bar\lambda$ and $\lambda^2$ fulfills the conditions of $\underline\lambda$, showing that $\bar x$ is indeed Q-stationary with respect to $(\beta_1,\beta_2)$.

Now for each $k$ consider two partitions $(W_{1,k}^1,W_{2,k}^1),(W_{1,k}^2,W_{2,k}^2)\in\mathcal P(V)$ and let $d_k^1$ and $d_k^2$ denote the solutions of $\mathrm{LP}(x^k,W_{1,k}^1)$ and $\mathrm{LP}(x^k,W_{1,k}^2)$. Choose $d_k\in\{d_k^1,d_k^2\}$ such that

$$\nabla f(x^k)d_k=\min_{d\in\{d_k^1,d_k^2\}}\nabla f(x^k)d \quad (82)$$

and let $(W_{1,k},W_{2,k})\in\{(W_{1,k}^1,W_{2,k}^1),(W_{1,k}^2,W_{2,k}^2)\}$ denote the corresponding partition. Next we define the function $\varphi_k$ in the following way:

$$\varphi_k(x):=f(x)+\sum_{i\in E}\sigma_{i,k}^h|h_i(x)|+\sum_{i\in I}\sigma_{i,k}^g\big(g_i(x)\big)^++\sum_{i\in W_{1,k}}\sigma_{i,k}^F\,d\big(F_i(x),P_1\big)+\sum_{i\in W_{2,k}}\sigma_{i,k}^F\,d\big(F_i(x),P_2\big). \quad (83)$$

Note that the function $\varphi_k$ coincides with $\varphi$ for $a,b$ given by (79) with $(W_1,W_2):=(W_{1,k},W_{2,k})$ and $\omega=(\omega^E,\omega^I)$ given by

$$\omega^E:=\big(\sigma_{i,k}^h,\,i\in E,\;\sigma_{i,k}^F,\,i\in W_{1,k}\big),\qquad\omega^I:=\big(\sigma_{i,k}^g,\,i\in I,\;\sigma_{i,k}^F,\,i\in W_{2,k},\;\sigma_{i,k}^F,\,i\in W_{2,k}\big).$$

Proposition 5.2

For all $x\in\mathbb R^n$ it holds that

$$0\le\varphi_k(x)-\Phi_k(x)\le\sigma_k^F|V|\max\Big\{\max_{i\in W_{1,k}}d\big(F_i(x),P_1\big),\max_{i\in W_{2,k}}d\big(F_i(x),P_2\big)\Big\}. \quad (84)$$

Proof

Non-negativity of the distance function, together with (20), yields for every $i\in V$, $j=1,2$,

$$0\le d\big(F_i(x),P_j\big)-d\big(F_i(x),P\big)\le d\big(F_i(x),P_j\big).$$

Hence (84) now follows from

$$\sum_{j=1,2}\sum_{i\in W_{j,k}}\sigma_{i,k}^F\,d\big(F_i(x),P_j\big)\le\sigma_k^F|V|\max_{j=1,2}\max_{i\in W_{j,k}}d\big(F_i(x),P_j\big).$$

An outline of the extended algorithm is as follows.

Algorithm 5.1

(Solving the MPVC)

  1. Initialization:
    • Select a starting point $x^0\in\mathbb R^n$ together with a positive definite $n\times n$ matrix $B_0$, a parameter $\rho_0>0$ and constants $\zeta\in(0,1)$, $\bar\rho>1$ and $\mu\in(0,1)$.
    • Select positive penalty parameters $\sigma_{-1}=(\sigma_{-1}^h,\sigma_{-1}^g,\sigma_{-1}^F)$.
    • Set the iteration counter $k:=0$.
  2. Correction of the iterate:
    • Initialize the corrected iterate by $\tilde x^k:=x^k$.
    • Take some $(W_{1,k}^1,W_{2,k}^1),(W_{1,k}^2,W_{2,k}^2)\in\mathcal P(V)$, compute $d_k^1$ and $d_k^2$ as solutions of $\mathrm{LP}(x^k,W_{1,k}^1)$ and $\mathrm{LP}(x^k,W_{1,k}^2)$ and let $d_k$ be given by (82).
    • Consider a sequence of numbers $\alpha_k^{(1)}=1,\alpha_k^{(2)},\alpha_k^{(3)},\ldots$ with $1>\bar\alpha\ge\alpha_k^{(j+1)}/\alpha_k^{(j)}\ge\underline\alpha>0$.
    • If $\nabla f(x^k)d_k<0$, denote by $j(k)$ the smallest $j$ fulfilling either
      $$\Phi_k\big(x^k+\alpha_k^{(j)}d_k\big)-\Phi_k(x^k)\le\mu\alpha_k^{(j)}\nabla f(x^k)d_k \quad (85)$$
      or
      $$\alpha_k^{(j)}\le\frac{\Phi_k(x^k)-\varphi_k(x^k)}{\mu\nabla f(x^k)d_k}. \quad (86)$$
    • If $j(k)$ fulfills (85), set $\tilde x^k:=x^k+\alpha_k^{(j(k))}d_k$.
  3. Solve the auxiliary problem:
    • Run Algorithm 3.1 with data $\zeta,\bar\rho,\rho:=\rho_k$, $B:=B_k$, $f:=f(\tilde x^k)$, $h_i:=h_i(\tilde x^k)$, $\nabla h_i:=\nabla h_i(\tilde x^k)$, $i\in E$, etc.
    • If Algorithm 3.1 stops because of degeneracy, stop Algorithm 5.1 with an error message.
    • If the final iterate $s^N$ is zero, stop Algorithm 5.1 and return $\tilde x^k$ as a solution.
  4. Next iterate:
    • Compute new penalty parameters $\sigma_k$.
    • Set $x^{k+1}:=\tilde x^k+s^k$, where $s^k$ is a point on the polygonal line connecting the points $s^0,s^1,\ldots,s^N$ such that an appropriate merit function depending on $\sigma_k$ is decreased.
    • Set $\rho_{k+1}:=\rho$, the final value of $\rho$ in Algorithm 3.1.
    • Update $B_k$ to get a positive definite matrix $B_{k+1}$.
    • Set $k:=k+1$ and go to step 2.

Naturally, Remark 4.1 regarding the stopping criteria for Algorithm 4.1 applies to this algorithm as well.
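The overall control flow of Algorithm 5.1 can be sketched as a driver loop with the problem-specific steps injected as callables. This is a minimal skeleton, not the paper's implementation; all interface names are hypothetical:

```python
def extended_sqp(x0, solve_lp_pair, correction, solve_auxiliary, update,
                 max_iter=100):
    """High-level skeleton of the extended SQP loop (Algorithm 5.1).
    Hypothetical interfaces:
      solve_lp_pair(x)        -> (d, grad_f_d)  direction via the two LPs, cf. (82)
      correction(x, d, gfd)   -> x_tilde        line search on tests (85)/(86)
      solve_auxiliary(x_t)    -> s or None      inner iterations; None if s^N = 0
      update(x_t, s)          -> x_next         merit-function based next iterate"""
    x = x0
    for _ in range(max_iter):
        d, gfd = solve_lp_pair(x)
        x_t = correction(x, d, gfd) if gfd < 0.0 else x  # step 2
        s = solve_auxiliary(x_t)                         # step 3
        if s is None:            # final inner iterate s^N = 0: return x_tilde
            return x_t
        x = update(x_t, s)       # step 4
    return x
```

The skeleton makes explicit that the correction step is attempted only when the LP direction is a descent direction, $\nabla f(x^k)d_k<0$.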

Lemma 5.3

Index j(k) is well defined.

Proof

In order to show that $j(k)$ is well defined, we have to prove the existence of some $j$ such that either (85) or (86) is fulfilled. By (84) we know that $\Phi_k(x^k)-\varphi_k(x^k)\le0$. In case $\Phi_k(x^k)-\varphi_k(x^k)<0$, every $j$ sufficiently large clearly fulfills (86). On the other hand, if $\Phi_k(x^k)-\varphi_k(x^k)=0$, taking into account (84) we obtain

$$\Phi_k(x^k+\alpha d_k)-\Phi_k(x^k)\le\varphi_k(x^k+\alpha d_k)-\varphi_k(x^k).$$

However, Lemma 5.2 with $\nu:=\mu$ and $C:=\{x^k\}$ yields that if $\nabla f(x^k)d_k<0$ then there exists some $\tilde\alpha$ such that

$$\varphi_k(x^k+\alpha d_k)-\varphi_k(x^k)\le\mu\alpha\nabla f(x^k)d_k$$

holds for all $\alpha\in[0,\tilde\alpha]$, and thus (85) is fulfilled for every $j$ sufficiently large. This finishes the proof.
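The correction-step line search of tests (85) and (86) can be sketched as follows, assuming $\Phi_k$, $\varphi_k$ and the directional derivative $\nabla f(x^k)d_k$ are available as inputs (scalar $x$ and $d$ are used for simplicity; the function name is illustrative):

```python
def correction_step(x, d, Phi, phi, grad_f_d, mu=0.1, alpha_bar=0.5, j_max=60):
    """Correction-of-the-iterate line search (sketch): walk a geometric
    sequence alpha_1 = 1, alpha_{j+1} = alpha_bar * alpha_j and stop at the
    first alpha fulfilling (85) or (86).  Returns the corrected iterate:
    x + alpha*d if (85) holds, x unchanged if only (86) holds."""
    if grad_f_d >= 0.0:          # no descent direction: keep x_tilde = x
        return x
    alpha = 1.0
    for _ in range(j_max):
        if Phi(x + alpha * d) - Phi(x) <= mu * alpha * grad_f_d:   # test (85)
            return x + alpha * d
        if alpha <= (Phi(x) - phi(x)) / (mu * grad_f_d):           # test (86)
            return x
        alpha *= alpha_bar
    return x
```

Lemma 5.3 is precisely the statement that this loop terminates with one of the two tests satisfied.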

Convergence of the extended algorithm

We consider the behavior of Algorithm 5.1 when it does not stop prematurely and generates an infinite sequence of iterates

$$x^k,\;B_k,\;\theta_k,\;\underline\lambda_k^{N_k},\;\bar\lambda_k^{N_k},\;(s_k^t,\delta_k^t),\;\lambda_k^t,\;(V_{1,k}^t,V_{2,k}^t),\quad\text{and}\quad\tilde x^k,\;d_k^1,\;d_k^2,\;(W_{1,k}^1,W_{2,k}^1),\;(W_{1,k}^2,W_{2,k}^2).$$

Assumption 2

Let $\bar x$ be a limit point of the sequence of iterates $x^k$.

  1. The Mangasarian-Fromovitz constraint qualification (MFCQ) holds at $\bar x$ for the constraints $x\in A$, where $A$ is given by (71) and $a,b$ are given by (79) with $(W_1,W_2):=(I^{0+}(\bar x),V\setminus I^{0+}(\bar x))$ or $(W_1,W_2):=(I^{0+}(\bar x)\cup I^{00}(\bar x),V\setminus(I^{0+}(\bar x)\cup I^{00}(\bar x)))$.

  2. There exists a subsequence $K(\bar x)$ such that $\lim_{k\in K(\bar x)}x^k=\bar x$ and
    $$W_{1,k}^1=I^{0+}(\bar x),\qquad W_{1,k}^2=I^{0+}(\bar x)\cup I^{00}(\bar x)\qquad\text{for all }k\in K(\bar x).$$

Note that the Next iterate step of Algorithm 5.1 remains almost unchanged compared to the Next iterate step of Algorithm 4.1; we just consider the point $\tilde x^k$ instead of $x^k$. Consequently, most of the results from Subsections 4.1 and 4.2 remain valid, possibly after replacing $x^k$ by $\tilde x^k$ where needed, e.g. in Lemma 4.3. The only exception is the proof of Lemma 4.5, where we have to show that the sequence $\Phi_k(x^k)$ is monotonically decreasing. This now follows from (85), and hence Lemma 4.5 remains valid as well.

We now state the main result of this section.

Theorem 5.1

Let Assumptions 1 and 2 be fulfilled. Then every limit point of the sequence of iterates xk is at least QM-stationary for problem (1).

Proof

Let $\bar x$ denote a limit point of the sequence $x^k$ and let $K(\bar x)$ denote the subsequence from Assumption 2 (2.). Since

$$\|x^k-\tilde x^{k-1}\|\le S_{k-1}^{N_{k-1}}\to0,$$

we conclude that $\lim_{k\in K(\bar x)}\tilde x^{k-1}=\bar x$ and, by applying Theorem 4.1 to the sequence $\tilde x^{k-1}$, we obtain the feasibility of $\bar x$ for problem (1).

Next we consider $\bar d^1,\bar d^2$ as in Proposition 5.1 with $\beta_1:=\emptyset$ and $\beta_2:=I^{00}(\bar x)$, and without loss of generality we only consider $k\in K(\bar x)$, $k\ge\bar k$, where $\bar k$ is given by Lemma 4.4. We show by contradiction that the case $\min\{\nabla f(\bar x)\bar d^1,\nabla f(\bar x)\bar d^2\}<0$ cannot occur. Let us assume on the contrary that, say, $\nabla f(\bar x)\bar d^1<0$. Assumption 2 (2.) yields that $W_{1,k}^1=I^{0+}(\bar x)$, and the feasibility of $\bar x$ for (1) together with $I^{0+}(\bar x)\subset W_{1,k}^1\subset I^0(\bar x)$ implies $\bar x\in A$ for $A$ given by (71) and $a,b$ given by (79) with $(W_1,W_2):=(W_{1,k}^1,W_{2,k}^1)$. Taking into account Assumption 2 (1.), Lemma 5.1 then yields that for $\epsilon:=-\nabla f(\bar x)\bar d^1/2>0$ there exists $\delta$ such that for all $\|x^k-\bar x\|\le\delta$ we have $\nabla f(x^k)d_k\le\nabla f(x^k)d_k^1\le\nabla f(\bar x)\bar d^1/2=-\epsilon$, with $d_k$ given by (82).

Next we choose $\hat k$ such that for $k\ge\hat k$ it holds that $\|x^k-\bar x\|\le\delta$, and we set $\nu:=(1+\mu)/2$, $C:=\{x\,|\,\|x-\bar x\|\le\delta\}$. From Lemma 5.2 we obtain that

$$\varphi_k(x^k+\alpha d_k)-\varphi_k(x^k)\le\frac{1+\mu}{2}\alpha\nabla f(x^k)d_k \quad (87)$$

holds for all $\alpha\in[0,\tilde\alpha]$. Moreover, by choosing $\hat k$ larger if necessary we can assume that for all $i\in V$ we have

$$\|F_i(x^k)-F_i(\bar x)\|_1\le-\min\Big\{\frac{1-\mu}{2},\mu\underline\alpha\tilde\alpha\Big\}\frac{\nabla f(x^k)d_k}{\sigma_k^F|V|}. \quad (88)$$

For the partition $(W_{1,k},W_{2,k})\in\{(W_{1,k}^1,W_{2,k}^1),(W_{1,k}^2,W_{2,k}^2)\}$ corresponding to $d_k$ it holds that $I^{0+}(\bar x)\subset W_{1,k}\subset I^0(\bar x)$ and this, together with the feasibility of $\bar x$ for (1), implies $F_i(\bar x)\in P_j$, $i\in W_{j,k}$, for $j=1,2$. Therefore, taking into account (22), we obtain

$$\max\Big\{\max_{i\in W_{1,k}}d\big(F_i(x^k),P_1\big),\max_{i\in W_{2,k}}d\big(F_i(x^k),P_2\big)\Big\}\le\max_{i\in V}\|F_i(x^k)-F_i(\bar x)\|_1.$$

Consequently, (84) and (88) yield for all $\alpha>\underline\alpha\tilde\alpha$

$$\varphi_k(x^k)-\Phi_k(x^k)<-\min\Big\{\frac{1-\mu}{2},\mu\alpha\Big\}\nabla f(x^k)d_k.$$

Thus, from (87) and (84) we obtain for all $\alpha\in(\underline\alpha\tilde\alpha,\tilde\alpha]$

$$\Phi_k(x^k+\alpha d_k)-\Phi_k(x^k)\le\varphi_k(x^k+\alpha d_k)-\varphi_k(x^k)+\varphi_k(x^k)-\Phi_k(x^k)\le\mu\alpha\nabla f(x^k)d_k\quad\text{and}\quad\Phi_k(x^k)-\varphi_k(x^k)>\mu\alpha\nabla f(x^k)d_k.$$

Now consider $j$ with $\alpha_k^{(j-1)}>\tilde\alpha\ge\alpha_k^{(j)}$. We see that $\alpha_k^{(j)}\in(\underline\alpha\tilde\alpha,\tilde\alpha]$, since $\alpha_k^{(j)}\ge\underline\alpha\alpha_k^{(j-1)}>\underline\alpha\tilde\alpha$, and consequently $j$ fulfills (85) and violates (86). However, then we obtain for all $k\ge\hat k$

$$\Phi_k(x^{k+1})-\Phi_k(x^k)\le\mu\alpha_k^{(j(k))}\nabla f(x^k)d_k\le\mu\underline\alpha\tilde\alpha\,\nabla f(\bar x)\bar d^1/2<0,$$

a contradiction.

Hence it follows that the solutions $\bar d^1,\bar d^2$ fulfill $\min\{\nabla f(\bar x)\bar d^1,\nabla f(\bar x)\bar d^2\}=0$ and by Proposition 5.1 we conclude that $\bar x$ is Q-stationary with respect to $(\emptyset,I^{00}(\bar x))$ and consequently also QM-stationary for problem (1).

Finally, we discuss how to choose the partitions $(W_{1,k}^1,W_{2,k}^1)$ and $(W_{1,k}^2,W_{2,k}^2)$ such that Assumption 2 (2.) is fulfilled. Let us consider a sequence of nonnegative numbers $\epsilon_k$ converging to 0 such that for every limit point $\bar x$ with $\lim_{k\in K}x^k=\bar x$ it holds that

$$\lim_{k\in K}\frac{\epsilon_k}{\|x^k-\bar x\|}=\infty \quad (89)$$

and let us define

$$\tilde I_k^{0+}:=\big\{i\in V\,\big|\,|H_i(x^k)|\le\epsilon_k<G_i(x^k)\big\},\qquad\tilde I_k^{00}:=\big\{i\in V\,\big|\,|H_i(x^k)|\le\epsilon_k,\ |G_i(x^k)|\le\epsilon_k\big\},\qquad\tilde I_k^{0-}:=\big\{i\in V\,\big|\,|H_i(x^k)|\le\epsilon_k<-G_i(x^k)\big\},\qquad\tilde I_k^{+0}:=\big\{i\in V\,\big|\,H_i(x^k)>\epsilon_k\ge|G_i(x^k)|\big\},\qquad\tilde I_k^{+-}:=\big\{i\in V\,\big|\,H_i(x^k)>\epsilon_k,\ -G_i(x^k)>\epsilon_k\big\}.$$
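The classification of the vanishing-constraint indices by these tolerance tests is straightforward to implement. The following sketch mirrors the definitions above (the dictionary keys and the extra "other" bucket are illustrative, not part of the paper):

```python
def estimate_index_sets(H, G, eps):
    """Classify the indices i in V at the current iterate using the
    tolerance eps = eps_k, mirroring the sets defined above.
    H, G: sequences of values H_i(x^k), G_i(x^k)."""
    sets = {"0+": [], "00": [], "0-": [], "+0": [], "+-": [], "other": []}
    for i, (Hi, Gi) in enumerate(zip(H, G)):
        if abs(Hi) <= eps and Gi > eps:
            sets["0+"].append(i)
        elif abs(Hi) <= eps and abs(Gi) <= eps:
            sets["00"].append(i)
        elif abs(Hi) <= eps and -Gi > eps:
            sets["0-"].append(i)
        elif Hi > eps and abs(Gi) <= eps:
            sets["+0"].append(i)
        elif Hi > eps and -Gi > eps:
            sets["+-"].append(i)
        else:
            sets["other"].append(i)  # empty near a feasible limit point
    return sets
```

The partitions of Proposition 5.3 are then $W_{1,k}^1=$ `sets["0+"]` and $W_{1,k}^2=$ `sets["0+"]` $\cup$ `sets["00"]`.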

Proposition 5.3

For $W_{1,k}^1$ and $W_{1,k}^2$ defined by $W_{1,k}^1:=\tilde I_k^{0+}$ and $W_{1,k}^2:=\tilde I_k^{0+}\cup\tilde I_k^{00}$, Assumption 2 (2.) is fulfilled.

Proof

Let $\bar x$ be a limit point of the sequence $x^k$ such that $\lim_{k\in K}x^k=\bar x$. Recall that $F$ is given by (8) and let us set $L:=\max_{\|x-\bar x\|\le1}\|\nabla F(x)\|$, where $\|\nabla F(x)\|$ is the operator norm given by (72). Further, taking into account (89), consider $\hat k$ such that for all $k\ge\hat k$ it holds that $\|x^k-\bar x\|\le\min\{\epsilon_k/L,1\}$. Hence, for all $k\in K$ with $k\ge\hat k$ we conclude

$$\|F(x^k)-F(\bar x)\|\le\int_0^1\big\|\nabla F\big(\bar x+t(x^k-\bar x)\big)\big\|\,\|x^k-\bar x\|\,dt\le\epsilon_k. \quad (90)$$

Now consider $i\in I^{0+}(\bar x)$, i.e. $H_i(\bar x)=0<G_i(\bar x)$. By choosing $\hat k$ larger if necessary we can assume that for all $k\ge\hat k$ it holds that $\epsilon_k<G_i(\bar x)/2$ and consequently, taking into account (90), for all $k\in\{k\in K\,|\,k\ge\hat k\}$ we have

$$|H_i(x^k)|=|H_i(x^k)-H_i(\bar x)|\le\epsilon_k<G_i(\bar x)-\epsilon_k\le G_i(x^k),$$

showing $i\in\tilde I_k^{0+}$.

showing iI~k0+. By similar argumentation and by increasing k^ if necessary we obtain that for all k{kK|kk^}=:K(x¯) it holds that

I0+(x¯)I~k0+,I00(x¯)I~k00,I0-(x¯)I~k0-,I+0(x¯)I~k+0,I+-(x¯)I~k+-. 91

However, feasibility of x¯ for (1) yields

V=I0+(x¯)I00(x¯)I0-(x¯)I+0(x¯)I+-(x¯)

and the index sets I~k0+,I~k00,I~k0-,I~k+0,I~k+- are pairwise disjoint subsets of V by definition. Hence we claim that (91) must in fact hold with equalities. Indeed, e.g.

I~k0+V\(I~k00I~k0-I~k+0I~k+-)V\(I00(x¯)I0-(x¯)I+0(x¯)I+-(x¯))=I0+(x¯).

This finishes the proof.

Note that if we assume that there exist a constant $L>0$, a number $N\in\mathbb N$ and a limit point $\bar x$ such that for all $k\ge N$ it holds that

$$\|x^{k+1}-\bar x\|\le L\|x^{k+1}-x^k\|,$$

then by setting $\epsilon_k:=\sqrt{\|x^k-x^{k-1}\|}$ we obtain (89), since

$$\frac{\sqrt{\|x^k-x^{k-1}\|}}{\|x^k-\bar x\|}\ge\frac{\sqrt{\|x^k-\bar x\|/L}}{\|x^k-\bar x\|}=\frac{1}{\sqrt{L\|x^k-\bar x\|}}\to\infty.$$

Numerical results

Algorithm 4.1 was implemented in MATLAB. For the numerical tests we used a subset of the test problems considered in the thesis of Hoheisel [7].

First we considered the so-called academic example

$$\min_{x\in\mathbb R^2}\ 4x_1+2x_2\quad\text{subject to}\quad x_1\ge0,\quad x_2\ge0,\quad\big(5\sqrt2-x_1-x_2\big)x_1\le0,\quad(5-x_1-x_2)x_2\le0. \quad (92)$$

As in [7], we tested 289 different starting points $x^0$ with $x_1^0,x_2^0\in\{-5,-4,\ldots,10,20\}$. For 84 starting points our algorithm found the global minimizer (0, 0) with objective value 0, while for the remaining 205 starting points a local minimizer (0, 5) with objective value 10 was found. Hence, convergence to the perfidious candidate $(0,5\sqrt2)$, which is not a local minimizer, did not occur (see [7]).
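For reference, feasibility and objective values for (92) can be checked directly; the following small helper is illustrative only and uses the constraint signs as stated above:

```python
def feasible(x1, x2, tol=1e-12):
    """Feasibility test for the academic example (92):
    x1 >= 0, x2 >= 0, (5*sqrt(2) - x1 - x2)*x1 <= 0, (5 - x1 - x2)*x2 <= 0."""
    g1 = 5 * 2 ** 0.5 - x1 - x2   # vanishing constraint paired with H1 = x1
    g2 = 5.0 - x1 - x2            # vanishing constraint paired with H2 = x2
    return (x1 >= -tol and x2 >= -tol
            and g1 * x1 <= tol and g2 * x2 <= tol)

def objective(x1, x2):
    """Objective of (92)."""
    return 4.0 * x1 + 2.0 * x2
```

Both points found by the algorithm are feasible, with objective values 0 at (0, 0) and 10 at (0, 5).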

As expected, after adding the constraint $3-x_1-x_2\le0$ to the model (92), to artificially exclude the point (0, 0), which is unsuitable for the practical application, we reached the point (0, 5), now a global minimizer. For more detailed information about the problem we refer the reader to [7] and [2].

Next we solved two examples in truss topology optimization, the so-called Ten-bar Truss and the Cantilever Arm. The underlying model for both of them is as follows:

$$\min_{(a,u)\in\mathbb R^N\times\mathbb R^d}\ V:=\sum_{i=1}^N\ell_ia_i\quad\text{subject to}\quad K(a)u=f,\quad f^Tu\le c,\quad a_i\le\bar a_i,\ i\in\{1,\ldots,N\},\quad a_i\ge0,\ i\in\{1,\ldots,N\},\quad\big(\sigma_i(a,u)^2-\bar\sigma^2\big)a_i\le0,\ i\in\{1,\ldots,N\}. \quad (93)$$

Here the matrix $K(a)$ denotes the global stiffness matrix of the structure $a$, and the vector $f\in\mathbb R^d$ contains the external forces applied at the nodal points. Further, for each $i$ the function $\sigma_i(a,u)$ denotes the stress of the $i$-th potential bar, $\ell_i$ its length, and $c,\bar a_i,\bar\sigma$ are positive constants. Again, for more background on the model and the following truss topology optimization problems we refer to [7].

In the Ten-bar Truss example we consider the ground structure depicted in Fig. 1a consisting of N=10 potential bars and 6 nodal points. We consider a load which applies at the bottom right hand node pulling vertically to the ground with force f=1. The two left hand nodes are fixed, and hence the structure has d=8 degrees of freedom for displacements.

Fig. 1 Ten-bar Truss example

We set $c:=10$, $\bar a:=100$ and $\bar\sigma:=1$ as in [7]. The resulting structure, consisting of 5 bars, is shown in Fig. 1b and is the same as the one in [7]. For comparison, the following table shows the full data, including the stress values.

i    a_i                  σ_i(a,u)             u_i
1    0                    1.029700000000000    −1.000000000000000
2    1.000000000000000    1.000000000000000    1.000000000000000
3    0                    1.119550000000000    −2.000000000000000
4    1.000000000000000    1.000000000000000    1.302400000000000
5    0                    0.485150000000000    −1.970300000000000
6    1.414213562373095    1.000000000000000    −3.000000000000000
7    0                    0.302400000000000    −8.000000000000000
8    1.414213562373095    1.000000000000000    −6.511800000000000
9    2.000000000000000    1.000000000000000
10   0                    1.488200000000000

$f^Tu=8$, $V=8.000000000000002$

We can see that although our final structure and optimal volume are the same as the final structure and the optimal volume in [7], the solution $(a,u)$ is different. For instance, since $f^Tu=8<10=c$, our solution does not attain the maximal compliance. Similarly as in [7], we observe the effect of vanishing constraints, since the stress values in the table show that

$$\sigma_{\max}:=\max_{1\le i\le N}|\sigma_i(a,u)|=1.4882>\hat\sigma:=\max_{1\le i\le N:\,a_i>0}|\sigma_i(a,u)|=1=\bar\sigma.$$

In the Cantilever Arm example we consider the ground structure depicted in Fig. 2a consisting of N=224 potential bars and 27 nodal points. Again, we consider a load acting at the bottom right hand node pulling vertically to the ground with force f=1. Now the three left hand nodes are fixed, and hence d=48.

Fig. 2 Cantilever Arm example

We proceed as in [7] and first set $c:=100$, $\bar a:=1$ and $\bar\sigma:=100$. The resulting structure, consisting of only 24 bars (compared to 38 bars in [7]), is shown in Fig. 2b. Similarly as in [7], we have $\max_{1\le i\le N}a_i^1=\bar a$ and $f^Tu^1=c$. On the other hand, our optimal volume $V^1=23.4407$ is a bit larger than the optimal volume 23.1399 in [7]. Also, an analysis of our stress values shows that

$$\hat\sigma^1:=\max_{1\le i\le N:\,a_i^1>0}|\sigma_i(a^1,u^1)|<\sigma_{\max}^1:=\max_{1\le i\le N}|\sigma_i(a^1,u^1)|$$

and hence, although it holds true that both the absolute stresses as well as the absolute "fictitious stresses" (i.e., those of the zero bars) are small compared to $\bar\sigma$, as in [7], the difference is that in our case they are not the same.

The situation becomes more interesting when we change the stress bound to $\bar\sigma=2.2$. The obtained structure, now consisting of 25 bars (compared to 37 or 31 bars in [7]), is shown in Fig. 2c. As before, we have $\max_{1\le i\le N}a_i^2=\bar a$ and $f^Tu^2=c$. Our optimal volume $V^2=23.6982$ is now much closer to the optimal volumes 23.6608 and 23.6633 in [7]. Similarly as in [7], we clearly observe the effect of vanishing constraints, since our stress values show

$$\sigma_{\max}^2:=\max_{1\le i\le N}|\sigma_i(a^2,u^2)|=24.1669>\hat\sigma^2:=\max_{1\le i\le N:\,a_i^2>0}|\sigma_i(a^2,u^2)|=2.2=\bar\sigma.$$

Finally, we obtained 32 bars (in contrast to 24 bars in [7]) satisfying both

$$a_i^2<0.005=0.005\,\bar a\quad\text{and}\quad|\sigma_i(a^2,u^2)|>2.2=\bar\sigma.$$

To better demonstrate the performance of our algorithm, we conclude this section with a table giving more detailed information about the solution of the Ten-bar Truss problem and the two Cantilever Arm problems (CA1 with $\bar\sigma:=100$ and CA2 with $\bar\sigma:=2.2$). We use the following notation.

Problem              Name of the test problem
(n, q)               Number of variables, number of all constraints
k                    Total number of outer iterations of the SQP method
(N_0,…,N_{k−1})      Numbers of inner iterations corresponding to each outer iteration
Σ j(k)               Overall sum of steps made during the line search
#f                   Total number of function evaluations, #f = k + Σ j(k)
#∇f                  Total number of gradient evaluations, #∇f = k + 1

Problem          (n, q)       k      (N_0,…,N_{k−1})       Σ j(k)   #f     #∇f
Ten-bar Truss    (18, 39)     14     (1,…,1,2,2,2,2,1,1)   67       81     15
CA1              (272, 721)   401    (1,…,1)               401      802    402
CA2              (272, 721)   1850   (1,…,1)               1850     3700   1851

Acknowledgements

Open access funding provided by Austrian Science Fund (FWF). This work was supported by the Austrian Science Fund (FWF) under Grant P 26132-N25.

Contributor Information

Matúš Benko, Email: benko@numa.uni-linz.ac.at.

Helmut Gfrerer, Email: helmut.gfrerer@jku.at.

References

  • 1. Achtziger, W., Kanzow, C.: Mathematical programs with vanishing constraints: optimality conditions and constraint qualifications. Math. Program. 114, 69–99 (2008). doi:10.1007/s10107-006-0083-3
  • 2. Achtziger, W., Hoheisel, T., Kanzow, C.: A smoothing-regularization approach to mathematical programs with vanishing constraints. Comput. Optim. Appl. 55, 733–767 (2013). doi:10.1007/s10589-013-9539-6
  • 3. Achtziger, W., Kanzow, C., Hoheisel, T.: On a relaxation method for mathematical programs with vanishing constraints. GAMM-Mitt. 35, 110–130 (2012). doi:10.1002/gamm.201210009
  • 4. Benko, M., Gfrerer, H.: An SQP method for mathematical programs with complementarity constraints with strong convergence properties. Kybernetika 52, 169–208 (2016)
  • 5. Benko, M., Gfrerer, H.: On estimating the regular normal cone to constraint systems and stationarity conditions. Optimization 66(1), 61–92 (2017). doi:10.1080/02331934.2016.1252915
  • 6. Fukushima, M., Pang, J.S.: Convergence of a smoothing continuation method for mathematical programs with complementarity constraints. In: Théra, M., Tichatschke, R. (eds.) Ill-Posed Variational Problems and Regularization Techniques. Lecture Notes in Economics and Mathematical Systems. Springer, Berlin (1999)
  • 7. Hoheisel, T.: Mathematical programs with vanishing constraints. Dissertation, Department of Mathematics, University of Würzburg (2009)
  • 8. Izmailov, A.F., Solodov, M.V.: Mathematical programs with vanishing constraints: optimality conditions, sensitivity, and a relaxation method. J. Optim. Theory Appl. 142, 501–532 (2009). doi:10.1007/s10957-009-9517-4
  • 9. Robinson, S.M.: Stability theory for systems of inequalities, part II: differentiable nonlinear systems. SIAM J. Numer. Anal. 13, 497–513 (1976). doi:10.1137/0713043
  • 10. Scholtes, S.: Convergence properties of a regularization scheme for mathematical programs with complementarity constraints. SIAM J. Optim. 11, 918–936 (2001). doi:10.1137/S1052623499361233

Articles from Computational Optimization and Applications are provided here courtesy of Springer
