New verifiable stationarity concepts for a class of mathematical programs with disjunctive constraints

Matúš Benko; Helmut Gfrerer

doi:10.1080/02331934.2017.1387547

2017 Oct 12;67(1):1–23. doi: 10.1080/02331934.2017.1387547

PMCID: PMC5761710 PMID: 29375237

Abstract

In this paper, we consider a sufficiently broad class of non-linear mathematical programs with disjunctive constraints, which includes, e.g., mathematical programs with complementarity/vanishing constraints. We present an extension of the concept of $Q$ -stationarity which can be easily combined with the well-known notion of M-stationarity to obtain the stronger property of so-called $Q_{M}$ -stationarity. We show how the property of $Q_{M}$ -stationarity (and thus also of M-stationarity) can be efficiently verified for the considered problem class by computing $Q$ -stationary solutions of a certain quadratic program. We further consider the situation in which the point to be tested for $Q_{M}$ -stationarity is not known exactly but is approximated by some convergent sequence, as is usually the case when applying a numerical method.

Keywords: Mathematical programs with disjunctive constraints, B-stationarity, M-stationarity, $Q_{M}$ -stationarity

1. Introduction

In this paper, we consider the following mathematical program with disjunctive constraints (MPDC)

\begin{matrix} min_{x \in R^{n}} & f (x) \\ subject to & F_{i} (x) \in D_{i} : = ⋃_{j = 1}^{K_{i}} D_{i}^{j}, i = 1, \dots, m_{D}, \end{matrix}

(1)

where the mappings $f : R^{n} \to R$ and $F_{i} : R^{n} \to R^{l_{i}}$ , $i = 1, \dots, m_{D}$ are assumed to be continuously differentiable and $D_{i}^{j} \subset R^{l_{i}}$ , $j = 1, \dots, K_{i}$ , $i = 1, \dots, m_{D}$ are convex polyhedral sets.

Denoting $m : = \sum_{i = 1}^{m_{D}} l_{i}$ ,

\begin{matrix} F : = (F_{1}, \dots, F_{m_{D}}) : R^{n} \to R^{m}, D : = \prod_{i = 1}^{m_{D}} D_{i} \end{matrix}

(2)

we can rewrite the MPDC (1) in the form

\begin{matrix} min_{x \in R^{n}} f (x) subject to F (x) \in D . \end{matrix}

(3)

It is easy to see that D can also be written as the union of $\prod_{i = 1}^{m_{D}} K_{i}$ convex polyhedral sets by

\begin{matrix} D = ⋃_{ν \in J} D (ν) \quad \text{with} \quad J : = \prod_{i = 1}^{m_{D}} \{1, \dots, K_{i}\}, \quad D (ν) : = \prod_{i = 1}^{m_{D}} D_{i}^{ν_{i}} . \end{matrix}

(4)
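As a small illustration of the decomposition (4), the following Python sketch enumerates the index set $J$ for given branch counts $K_{i}$ . The function name `enumerate_components` and the sample counts are illustrative assumptions and do not appear in the paper.

```python
from itertools import product
from math import prod


def enumerate_components(branch_counts):
    """Return all index tuples nu in J = prod_i {1, ..., K_i}.

    branch_counts[i] plays the role of K_{i+1}. Each tuple nu selects one
    convex polyhedral piece D_i^{nu_i} per constraint, so that
    D(nu) = prod_i D_i^{nu_i}.
    """
    ranges = [range(1, k + 1) for k in branch_counts]
    return list(product(*ranges))


# Illustrative data (assumed): an MPCC with two complementarity constraints
# has K_1 = 1 for the ordinary constraints and K_2 = K_3 = 2.
branch_counts = [1, 2, 2]
components = enumerate_components(branch_counts)

# The number of pieces D(nu) equals the product of the K_i.
assert len(components) == prod(branch_counts)
```

The number of pieces grows like the product of the $K_{i}$ , so explicit enumeration of all $D (ν)$ is only feasible for very small instances.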

As an example for MPDC consider a mathematical program with complementarity constraints (MPCC) given by

\begin{matrix} min_{x \in R^{n}} & f (x) \\ subject to & g_{i} (x) \leq 0, i = 1, \dots, m_{I}, \\ & h_{i} (x) = 0, i = 1, \dots, m_{E}, \\ & G_{i} (x) \geq 0, H_{i} (x) \geq 0, G_{i} (x) H_{i} (x) = 0, i = 1, \dots, m_{C} \end{matrix}

(5)

with $f : R^{n} \to R$ , $g_{i} : R^{n} \to R$ , $i = 1, \dots, m_{I}$ , $h_{i} : R^{n} \to R$ , $i = 1, \dots, m_{E}$ , $G_{i}, H_{i} : R^{n} \to R$ , $i = 1, \dots, m_{C}$ . This problem fits into our setting (1) with $m_{D} = m_{C} + 1$ ,

\begin{matrix} F_{1} & = {(g_{1}, \dots, g_{m_{I}}, h_{1}, \dots, h_{m_{E}})}^{T}, D_{1}^{1} = R_{-}^{m_{I}} \times \{0\}^{m_{E}}, l_{1} = m_{I} + m_{E}, K_{1} = 1, \\ F_{i + 1} & = {(- G_{i}, - H_{i})}^{T}, D_{i + 1}^{1} = \{0\} \times R_{-}, D_{i + 1}^{2} = R_{-} \times \{0\}, l_{i + 1} = K_{i + 1} = 2, i = 1, \dots, m_{C} . \end{matrix}
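The equivalence between a complementarity constraint and its disjunctive reformulation can be checked numerically on a few values. The sketch below is only an illustration: the function names and the sample pairs are assumptions, and exact comparisons are used, so it is meant for exactly representable test values rather than for floating-point iterates.

```python
def in_mpcc_branch_set(G, H):
    """Check (-G, -H) in D^1 union D^2 with D^1 = {0} x R_- and D^2 = R_- x {0}."""
    y1, y2 = -G, -H
    in_d1 = (y1 == 0) and (y2 <= 0)  # first component vanishes
    in_d2 = (y1 <= 0) and (y2 == 0)  # second component vanishes
    return in_d1 or in_d2


def is_complementary(G, H):
    """Check the original constraint G >= 0, H >= 0, G * H = 0."""
    return G >= 0 and H >= 0 and G * H == 0


# Assumed sample values of (G_i(x), H_i(x)) covering feasible and infeasible cases.
samples = [(0, 0), (0, 2), (3, 0), (1, 1), (-1, 0), (0, -2)]
agreement = [in_mpcc_branch_set(G, H) == is_complementary(G, H) for G, H in samples]
```

On these samples both tests agree, which reflects that $F_{i + 1} (x) \in D_{i + 1}^{1} \cup D_{i + 1}^{2}$ is just a restatement of the $i$ -th complementarity condition.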

MPCC is known to be a difficult optimization problem, because, due to the complementarity constraints $G_{i} (x) \geq 0$ , $H_{i} (x) \geq 0$ , $G_{i} (x) H_{i} (x) = 0$ , many of the standard constraint qualifications of nonlinear programming are violated at any feasible point. Hence, it is likely that the usual Karush–Kuhn–Tucker conditions fail to hold at a local minimizer and various first-order optimality conditions such as Abadie (A-), Bouligand (B-), Clarke (C-), Mordukhovich (M-) and Strong (S-) stationarity conditions have been studied in the literature [1–9].

Another prominent example is the mathematical program with vanishing constraints (MPVC)

\begin{matrix} min_{x \in R^{n}} & f (x) \\ subject to & g_{i} (x) \leq 0, i = 1, \dots, m_{I}, \\ & h_{i} (x) = 0, i = 1, \dots, m_{E}, \\ & H_{i} (x) \geq 0, G_{i} (x) H_{i} (x) \leq 0, i = 1, \dots, m_{V} \end{matrix}

(6)

with $f : R^{n} \to R$ , $g_{i} : R^{n} \to R$ , $i = 1, \dots, m_{I}$ , $h_{i} : R^{n} \to R$ , $i = 1, \dots, m_{E}$ , $G_{i}, H_{i} : R^{n} \to R$ , $i = 1, \dots, m_{V}$ . Again, the problem MPVC can be written in the form (1) with $m_{D} = m_{V} + 1$ , $F_{1}$ , $D_{1}^{1}$ as in the case of MPCC and

\begin{matrix} F_{i + 1} = {(- H_{i}, G_{i})}^{T}, D_{i + 1}^{1} = \{0\} \times R, D_{i + 1}^{2} = R_{-}^{2}, l_{i + 1} = K_{i + 1} = 2, i = 1, \dots, m_{V} . \end{matrix}
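The same kind of sanity check applies to the vanishing constraint. In the sketch below, the function names and the sample pairs are illustrative assumptions. The test compares membership of $(- H_{i}, G_{i})$ in $D_{i + 1}^{1} \cup D_{i + 1}^{2}$ with the original condition $H_{i} \geq 0$ , $G_{i} H_{i} \leq 0$ .

```python
def in_mpvc_branch_set(G, H):
    """Check (-H, G) in D^1 union D^2 with D^1 = {0} x R and D^2 = R_-^2."""
    y1, y2 = -H, G
    in_d1 = (y1 == 0)                # H = 0, G arbitrary
    in_d2 = (y1 <= 0) and (y2 <= 0)  # H >= 0 and G <= 0
    return in_d1 or in_d2


def satisfies_vanishing_constraint(G, H):
    """Check the original constraint H >= 0, G * H <= 0."""
    return H >= 0 and G * H <= 0


# Assumed sample values of (G_i(x), H_i(x)).
samples = [(5, 0), (-1, 2), (1, 2), (0, 3), (1, -1), (-1, -1)]
agreement = [
    in_mpvc_branch_set(G, H) == satisfies_vanishing_constraint(G, H)
    for G, H in samples
]
```

Note that the branch $D_{i + 1}^{1} = \{0\} \times R$ leaves $G_{i}$ unrestricted once $H_{i}$ vanishes, which is exactly the behaviour that gives the vanishing constraint its name.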

As in the case of MPCC, many of the standard constraint qualifications of non-linear programming can be violated at a local solution of (6), and many stationarity concepts have been introduced. For a comprehensive overview of MPVC we refer to [10] and the references therein.

However, when we formulate MPCC or MPVC not as a non-linear program but as a disjunctive program MPDC, first-order optimality conditions can be formulated which are valid under weak constraint qualifications. We know that a local minimizer is always B-stationary, which geometrically means that no feasible descent direction exists, or, in a dual formulation, that the negative gradient of the objective belongs to the regular normal cone of the feasible region, cf. [11, Theorem 6.12]. The difficult task is now to estimate this regular normal cone. A lower estimate (inner inclusion) for the regular normal cone is always available, which yields the so-called S-stationarity conditions. For an upper estimate, one can use the limiting normal cone, which results in the so-called M-stationarity conditions. The notions of S-stationarity and M-stationarity have been introduced in [12] for general programs (3). S-stationarity always implies B-stationarity, but it requires a strong qualification condition on the constraints which is often too restrictive. On the other hand, M-stationarity requires only a weak constraint qualification, but it does not preclude the existence of feasible descent directions. Further, it is not known in general how to efficiently verify the M-stationarity conditions, since the description of the limiting normal cone involves a combinatorial structure which is not known to be resolvable without enumeration techniques. These difficulties in verifying M-stationarity also have some impact on numerical solution procedures. E.g. for many algorithms for MPCC it cannot be guaranteed that a limit point is M-stationary, cf. [13].

In the recent paper [14], we derived another upper estimate for the regular normal cone yielding so-called $Q$ -stationarity conditions. $Q$ -stationarity has the advantage over S-stationarity that it does not require such unnecessarily strong constraint qualification conditions. $Q$ -stationarity can be easily combined with M-stationarity to obtain so-called $Q_{M}$ -stationarity, which is stronger than M-stationarity. This is one of the advantages of $Q_{M}$ -stationarity: there are several stationarity notions, in particular in the MPCC literature, like M-, C-, A- and weak stationarity, which are valid under weak constraint qualification conditions. Among these, M-stationarity is known to be the strongest concept, and $Q_{M}$ -stationarity improves even on M-stationarity.

For the disjunctive formulations of the problems MPCC and MPVC the $Q$ - and $Q_{M}$ -stationarity conditions have been worked out in detail in [14]. In this paper, we extend this approach to the general problem MPDC. We show that under a qualification condition which ensures S-stationarity of local minimizers, $Q$ -stationarity and S-stationarity are equivalent. Further, we prove that under some weak constraint qualification every local minimizer of MPDC is a $Q_{M}$ -stationary solution and we provide an efficient algorithm for verifying $Q_{M}$ -stationarity of some feasible point. More exactly, this algorithm either proves the existence of some feasible descent direction, i.e. the point is not B-stationary, or it computes multipliers fulfilling the $Q_{M}$ -stationarity condition. To this end, we consider quadratic programs with disjunctive constraints (QPDC), i.e. the objective function f in MPDC is a convex quadratic function and the mappings $F_{i}$ , $i = 1, \dots, m_{D}$ are linear. We propose a basic algorithm for QPDC, which either returns a $Q$ -stationary point or proves that the problem is unbounded. Further, we show that M-stationarity for MPDC is related with $Q$ -stationarity of some QPDC and the combination of the two parts yields the algorithm for verifying $Q_{M}$ -stationarity. This algorithm does not rely on enumeration techniques and this is another big advantage of the concepts of $Q$ - and $Q_{M}$ -stationarity.

Our approach is well suited to the MPDC (1) when all the numbers $K_{i}$ , $i = 1, \dots, m_{D}$ are small or of moderate size. Our disjunctive structure is not induced by integral variables like, e.g. in [15]. It is also not related to the approach of considering the convex hull of a family of convex sets like in [16,17].

The outline of the paper is as follows. In Section 2, we recall some basic definitions from variational analysis and discuss various stationarity concepts. In Section 3, we introduce the concepts of $Q$ - and $Q_{M}$ -stationarity for general optimization problems. These concepts are worked out in more detail for MPDC in Section 4. In Section 5, we consider quadratic programs with disjunctive linear constraints. We present a basic algorithm for solving such problems, which either returns a $Q$ -stationary solution or proves that the problem is not bounded below. In Section 6, we demonstrate how this basic algorithm can be applied to a certain quadratic program with disjunctive linear constraints in order to verify M-stationarity or $Q_{M}$ -stationarity of a point or to compute a descent direction. Finally, in Section 7, we present some results for numerical methods for solving MPDC which prevent convergence to non-M-stationary and non- $Q_{M}$ -stationary points.

Our notation is fairly standard. In Euclidean space $R^{n}$ we denote by $‖ \cdot ‖$ and $⟨ \cdot, \cdot ⟩$ the Euclidean norm and scalar product, respectively, whereas we denote by ${‖ u ‖}_{\infty} : = max \{| u_{i} | \mid i = 1, \dots, n\}$ the maximum norm. The closed ball around some point x with radius r is denoted by $B (x, r)$ . Given some cone $Q \subset R^{n}$ , we denote by $Q^{\circ} : = \{q^{*} \in R^{n} \mid ⟨ q^{*}, q ⟩ \leq 0 \ \forall q \in Q\}$ its polar cone. By $d (x, A) : = inf \{‖ x - y ‖ \mid y \in A\}$ we refer to the usual distance of some point x to a set A. We denote by $0^{+} C$ the recession cone of a convex set C.

2. Preliminaries

For the reader’s convenience, we start with several notions from variational analysis. Given a set $Ω \subset R^{n}$ and a point $\bar{z} \in Ω$ , the cone

\begin{matrix} T_{Ω} (\bar{z}) = \{w \mid \exists w_{k} \to w, t_{k} ↓ 0 \text{ with } \bar{z} + t_{k} w_{k} \in Ω \ \forall k\} \end{matrix}

is called the (Bouligand/Severi) tangent/contingent cone to $Ω$ at $\bar{z}$ . The (Fréchet) regular normal cone to $Ω$ at $\bar{z} \in Ω$ can be equivalently defined either by

\begin{matrix} {\hat{N}}_{Ω} (\bar{z}) : = \{v^{*} \in R^{n} \mid \underset{z \overset{Ω}{\to} \bar{z}}{lim sup} \frac{⟨ v^{*}, z - \bar{z} ⟩}{‖ z - \bar{z} ‖} \leq 0\}, \end{matrix}

where $z \overset{Ω}{\to} \bar{z}$ means that $z \to \bar{z}$ with $z \in Ω$ , or as the dual/polar to the contingent cone, i.e. by

\begin{matrix} {\hat{N}}_{Ω} (\bar{z}) : = T_{Ω} {(\bar{z})}^{\circ} . \end{matrix}

For convenience, we put ${\hat{N}}_{Ω} (\bar{z}) : = \emptyset$ for $\bar{z} \notin Ω$ . Further, the (Mordukhovich) limiting/basic normal cone to $Ω$ at $\bar{z} \in Ω$ is given by

\begin{matrix} N_{Ω} (\bar{z}) : = \{w^{*} \in R^{n} \mid \exists z_{k} \to \bar{z}, w_{k}^{*} \to w^{*} \text{ with } w_{k}^{*} \in {\hat{N}}_{Ω} (z_{k}) \text{ for all } k\} . \end{matrix}

If $Ω$ is convex, then both the regular and the limiting normal cones coincide with the normal cone in the sense of convex analysis. Therefore, we will use in this case the notation $N_{Ω}$ .

Consider now the general mathematical program

\begin{matrix} min_{x \in R^{n}} f (x) subject to F (x) \in D \end{matrix}

(7)

where $f : R^{n} \to R$ , $F : R^{n} \to R^{m}$ are continuously differentiable and $D \subset R^{m}$ is a closed set. Let

\begin{matrix} Ω : = \{x \in R^{n} \mid F (x) \in D\} \end{matrix}

(8)

denote the feasible region of the program (7). Then a necessary condition for a point $\bar{x} \in Ω$ being locally optimal is

\begin{matrix} ⟨ \nabla f (\bar{x}), u ⟩ \geq 0 \forall u \in T_{Ω} (\bar{x}), \end{matrix}

(9)

which is the same as

\begin{matrix} - \nabla f (\bar{x}) \in {\hat{N}}_{Ω} (\bar{x}), \end{matrix}

(10)

cf. [11, Theorem 6.12]. The main task of applying this first-order optimality condition now is the computation of the regular normal cone ${\hat{N}}_{Ω} (\bar{x})$ which is very difficult for nonconvex D.

We always have the inclusion

\begin{matrix} \nabla F {(\bar{x})}^{T} {\hat{N}}_{D} (F (\bar{x})) \subset {\hat{N}}_{Ω} (\bar{x}), \end{matrix}

(11)

but equality will hold in (11) for nonconvex sets D only under comparatively strong conditions, e.g. when $\nabla F (\bar{x})$ is surjective, see [11, Exercise 6.7]. The following weaker sufficient condition for equality in (11) uses the notion of metric subregularity.

Definition 1:

A multifunction $Ψ : R^{n} ⇉ R^{m}$ is called metrically subregular at a point $(\bar{x}, \bar{y})$ of its graph $gph Ψ$ with modulus $κ > 0$ , if there is a neighborhood U of $\bar{x}$ such that

$\begin{matrix} d (x, Ψ^{- 1} (\bar{y})) \leq κ d (\bar{y}, Ψ (x)) \forall x \in U . \end{matrix}$

Theorem 1:

[18, Theorem 4] Let $Ω$ be given by (8) and $\bar{x} \in Ω$ . If the multifunction $x ⇉ F (x) - D$ is metrically subregular at $(\bar{x}, 0)$ and if there exists a subspace $L \subset R^{m}$ such that

$\begin{matrix} T_{D} (F (\bar{x})) + L \subset T_{D} (F (\bar{x})) \end{matrix}$ (12)

and

$\begin{matrix} \nabla F (\bar{x}) R^{n} + L = R^{m}, \end{matrix}$ (13)

then

$\begin{matrix} {\hat{N}}_{Ω} (\bar{x}) = \nabla F {(\bar{x})}^{T} {\hat{N}}_{D} (F (\bar{x})) . \end{matrix}$

In order to state an upper estimate for the regular normal cone ${\hat{N}}_{Ω} (\bar{x})$ we need some constraint qualification.

Definition 2:

[12, Definition 6] Let $Ω$ be given by (8) and let $\bar{x} \in Ω$ .

(1)
We say that the generalized Abadie constraint qualification (GACQ) holds at $\bar{x}$ if
$\begin{matrix} T_{Ω} (\bar{x}) = T_{Ω}^{lin} (\bar{x}), \end{matrix}$ (14)
where $T_{Ω}^{lin} (\bar{x}) : = {u \in R^{n} | \nabla F (\bar{x}) u \in T_{D} (F (\bar{x}))}$ denotes the linearized cone.

(2)
We say that the generalized Guignard constraint qualification (GGCQ) holds at $\bar{x}$ if
$\begin{matrix} {(T_{Ω} (\bar{x}))}^{\circ} = {(T_{Ω}^{lin} (\bar{x}))}^{\circ} . \end{matrix}$ (15)

Obviously GGCQ is weaker than GACQ, but GACQ is easier to verify by using some advanced tools of variational analysis. E.g. if the mapping $x ⇉ F (x) - D$ is metrically subregular at $(\bar{x}, 0)$ then GACQ is fulfilled at $\bar{x}$ , cf. [19, Proposition 1]. Tools for verifying metric subregularity of constraint systems can be found e.g. in [20].

Proposition 1:

[14, Proposition 3] Let $Ω$ be given by (8), let $\bar{x} \in Ω$ and assume that GGCQ is fulfilled, while the mapping $u ⇉ \nabla F (\bar{x}) u - T_{D} (F (\bar{x}))$ is metrically subregular at (0, 0). Then

$\begin{matrix} {\hat{N}}_{Ω} (\bar{x}) \subset \nabla F {(\bar{x})}^{T} N_{T_{D} (F (\bar{x}))} (0) \subset \nabla F {(\bar{x})}^{T} N_{D} (F (\bar{x})) . \end{matrix}$ (16)

Note that we always have $N_{T_{D} (F (\bar{x}))} (0) \subset N_{D} (F (\bar{x}))$ , see [11, Proposition 6.27]. However, if D is the union of finitely many convex polyhedral sets, then equality

\begin{matrix} N_{T_{D} (F (\bar{x}))} (0) = N_{D} (F (\bar{x})) \end{matrix}

(17)

holds. This is due to the fact that by the assumption on D there is some neighborhood V of 0 such that $(D - F (\bar{x})) \cap V = T_{D} (F (\bar{x})) \cap V$ .

Let us mention that metric subregularity of the constraint mapping $x ⇉ F (x) - D$ at $(\bar{x}, 0)$ does not only imply GACQ and consequently GGCQ, but also metric subregularity of the mapping $u ⇉ \nabla F (\bar{x}) u - T_{D} (F (\bar{x}))$ at (0, 0) with the same modulus, see [21, Proposition 2.1].

The concept of metric subregularity has the drawback that, in general, it is not stable under small perturbations. It is well known that the stronger property of metric regularity is robust.

Definition 3:

A multifunction $Ψ : R^{n} ⇉ R^{m}$ is called metrically regular near a point $(\bar{x}, \bar{y})$ of its graph $gph Ψ$ with modulus $κ > 0$ , if there are neighborhoods U of $\bar{x}$ and V of $\bar{y}$ such that

$\begin{matrix} d (x, Ψ^{- 1} (y)) \leq κ d (y, Ψ (x)) \forall (x, y) \in U \times V . \end{matrix}$

The infimum of the moduli $κ$ for which the property of metric regularity holds is denoted by

$\begin{matrix} reg Ψ (\bar{x}, \bar{y}) . \end{matrix}$

In the following proposition, we gather some well-known properties of metric regularity:

Proposition 2:

Let $\bar{x} \in F^{- 1} (D)$ where $F : R^{n} \to R^{m}$ is continuously differentiable and D is the union of finitely many convex polyhedral sets and consider the multifunctions $x ⇉ Ψ (x) : = F (x) - D$ and $u ⇉ D Ψ (\bar{x}) (u) : = \nabla F (\bar{x}) u - T_{D} (F (\bar{x}))$ . Then

$\begin{matrix} reg Ψ (\bar{x}, 0) = reg D Ψ (\bar{x}) (0, 0) = max \{\frac{1}{‖ \nabla F {(\bar{x})}^{T} λ ‖} \mid λ \in N_{D} (F (\bar{x})) = N_{T_{D} (F (\bar{x}))} (0), ‖ λ ‖ = 1\} . \end{matrix}$

Moreover for every $κ > reg Ψ (\bar{x}, 0)$ there is a neighborhood W of $\bar{x}$ such that for all $x \in W$ the mapping $u ⇉ \nabla F (x) u - T_{D} (F (\bar{x}))$ is metrically regular near (0, 0) with modulus $κ$ ,

$\begin{matrix} ‖ λ ‖ \leq κ ‖ \nabla F {(x)}^{T} λ ‖ \forall λ \in N_{D} (F (\bar{x})) = N_{T_{D} (F (\bar{x}))} (0) \end{matrix}$ (18)

and

$\begin{matrix} d (u, \nabla F {(x)}^{- 1} T_{D} (F (\bar{x}))) \leq κ d (\nabla F (x) u, T_{D} (F (\bar{x}))) \forall u \in R^{n} . \end{matrix}$

Proof The statement follows from [11, Exercise 9.44] together with the facts that by our assumption on D condition (17) holds and that $T_{D} (F (\bar{x}))$ is a cone.

We now recall some well known stationarity concepts based on the considerations above.

Definition 4:

Let $\bar{x}$ be feasible for the program (7).

(i)
We say that $\bar{x}$ is B-stationary, if (9) or, equivalently, (10) holds.

(ii)
We say that $\bar{x}$ is S-stationary, if
$\begin{matrix} - \nabla f (\bar{x}) \in \nabla F {(\bar{x})}^{T} {\hat{N}}_{D} (F (\bar{x})) . \end{matrix}$

(iii)
We say that $\bar{x}$ is M-stationary, if
$\begin{matrix} - \nabla f (\bar{x}) \in \nabla F {(\bar{x})}^{T} N_{D} (F (\bar{x})) . \end{matrix}$

Every local minimizer of (7) is B-stationary and this stationarity concept is considered to be the most preferable one. S- and M-stationarity have been introduced in [12] as a generalization of these notions for MPCC. From the inclusion (11) it immediately follows that S-stationarity implies B-stationarity. However, the reverse implication holds only under some additional condition on the constraints, e.g. under the assumptions of Theorem 1. Note that the following inclusions always hold:

\begin{matrix} \nabla F {(\bar{x})}^{T} {\hat{N}}_{D} (F (\bar{x})) \subset (T_{Ω}^{lin} (\bar{x}))^{\circ} \subset (T_{Ω} (\bar{x}))^{\circ} = {\hat{N}}_{Ω} (\bar{x}) . \end{matrix}

In order that a B-stationary point is also S-stationary, both inclusions must be fulfilled with equality, i.e. besides the GGCQ $(T_{Ω}^{lin} (\bar{x}))^{\circ} = (T_{Ω} (\bar{x}))^{\circ}$ , which allows us to replace the tangent cone by the linearized tangent cone, we need another constraint qualification condition ensuring $\nabla F {(\bar{x})}^{T} {\hat{N}}_{D} (F (\bar{x})) = (T_{Ω}^{lin} (\bar{x}))^{\circ}$ , like the conditions (12) and (13). It is well known that this additional condition is much more restrictive than the usual constraint qualifications allowing the linearization of the problem, like metric (sub)regularity of the constraint mapping $F (\cdot) - D$ . Thus in the general case one cannot expect that a local minimizer is also S-stationary. This is the reason why other stationarity concepts like M-stationarity also have to be considered.

A B-stationary point is M-stationary under the very weak assumptions of Proposition 1. However, the inclusion ${\hat{N}}_{Ω} (\bar{x}) \subset \nabla F {(\bar{x})}^{T} N_{D} (F (\bar{x}))$ can be strict, implying that an M-stationary point $\bar{x}$ need not be B-stationary. Hence, M-stationarity does not necessarily preclude the existence of feasible descent directions, i.e. directions $u \in T_{Ω} (\bar{x})$ with $⟨ \nabla f (\bar{x}), u ⟩ < 0$ .
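To make the gap between M- and B-stationarity concrete, consider the MPCC reformulation with a single complementarity constraint at a point with $G (\bar{x}) = H (\bar{x}) = 0$ , so that $F (\bar{x}) = 0$ and $D = (\{0\} \times R_{-}) \cup (R_{-} \times \{0\})$ . For this set a direct computation gives ${\hat{N}}_{D} (0) = R_{+}^{2}$ and $N_{D} (0) = R_{+}^{2} \cup (R \times \{0\}) \cup (\{0\} \times R)$ . The following sketch classifies a multiplier accordingly. The function names and the toy problem (minimize $- x_{2}$ with $G = x_{1}$ , $H = x_{2}$ ) are illustrative assumptions and are not taken from the paper.

```python
def in_regular_normal_cone(lam):
    """Membership in the regular normal cone R_+^2 of D at 0."""
    return lam[0] >= 0 and lam[1] >= 0


def in_limiting_normal_cone(lam):
    """Membership in the limiting normal cone R_+^2 u (R x {0}) u ({0} x R)."""
    return in_regular_normal_cone(lam) or lam[0] == 0 or lam[1] == 0


# Toy problem (assumed): min -x2  s.t.  x1 >= 0, x2 >= 0, x1 * x2 = 0, xbar = 0.
# Here F(x) = (-x1, -x2), hence grad F(xbar) = -I and grad F(xbar)^T lam = -lam.
grad_f = (0.0, -1.0)

# -grad_f = grad F(xbar)^T lam = -lam has the unique solution lam = grad_f.
lam = grad_f

is_s_stationary = in_regular_normal_cone(lam)
is_m_stationary = in_limiting_normal_cone(lam)

# u = (0, 1) is tangent to the feasible set at xbar; a negative directional
# derivative means u is a feasible descent direction, so xbar is not B-stationary.
u = (0.0, 1.0)
directional_derivative = grad_f[0] * u[0] + grad_f[1] * u[1]
```

In this toy instance the multiplier lies in $N_{D} (0)$ but not in ${\hat{N}}_{D} (0)$ , so the point is M-stationary although $u = (0, 1)$ is a feasible descent direction.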

3. On $Q$ - and $Q_{M}$ -stationarity

In this section, we consider an extension of the concept of $Q$ -stationarity as introduced in the recent paper [14]. $Q$ -stationarity is based on the following simple observation.

Consider the general program (7), assume that GGCQ holds at the point $\bar{x} \in Ω$ and assume that we are given K convex cones $Q_{i} \subset T_{D} (F (\bar{x}))$ , $i = 1, \dots, K$ . Then for each $i = 1, \dots, K$ we obviously have $T_{Ω}^{lin} (\bar{x}) = \nabla F {(\bar{x})}^{- 1} T_{D} (F (\bar{x})) \supset \nabla F {(\bar{x})}^{- 1} Q_{i}$ implying

\begin{matrix} {\hat{N}}_{Ω} (\bar{x}) = {(T_{Ω}^{lin} (\bar{x}))}^{\circ} \subset {(\nabla F {(\bar{x})}^{- 1} Q_{i})}^{\circ} . \end{matrix}

If we further assume that ${(\nabla F {(\bar{x})}^{- 1} Q_{i})}^{\circ} = \nabla F {(\bar{x})}^{T} Q_{i}^{\circ}$ and take into account that, by [14, Lemma 1], we have

\begin{matrix} (\nabla F {(\bar{x})}^{T} S_{1}) \cap (\nabla F {(\bar{x})}^{T} S_{2}) = \nabla F {(\bar{x})}^{T} (S_{1} \cap (ker \nabla F {(\bar{x})}^{T} + S_{2})) \end{matrix}

for arbitrary sets $S_{1}, S_{2} \subset R^{m}$ , we obtain

\begin{matrix} {\hat{N}}_{Ω} (\bar{x}) & \subset ⋂_{i = 1}^{K} \nabla F {(\bar{x})}^{T} Q_{i}^{\circ} = \nabla F {(\bar{x})}^{T} (Q_{1}^{\circ} \cap (ker \nabla F {(\bar{x})}^{T} + Q_{2}^{\circ})) \cap ⋂_{i = 3}^{K} \nabla F {(\bar{x})}^{T} Q_{i}^{\circ} \\ & = \nabla F {(\bar{x})}^{T} (Q_{1}^{\circ} \cap (ker \nabla F {(\bar{x})}^{T} + Q_{2}^{\circ}) \cap (ker \nabla F {(\bar{x})}^{T} + Q_{3}^{\circ})) \cap ⋂_{i = 4}^{K} \nabla F {(\bar{x})}^{T} Q_{i}^{\circ} = \dots \\ & = \nabla F {(\bar{x})}^{T} (Q_{1}^{\circ} \cap ⋂_{i = 2}^{K} (ker \nabla F {(\bar{x})}^{T} + Q_{i}^{\circ})) . \end{matrix}

Here, we use the convention that for sets $S_{1}, \dots, S_{K} \subset R^{m}$ we set $⋂_{i = l}^{K} S_{i} = R^{m}$ for $l > K$ . It is an easy consequence of (11) that equality holds in this inclusion, provided $\nabla F {(\bar{x})}^{T} (Q_{1}^{\circ} \cap ⋂_{i = 2}^{K} (ker \nabla F {(\bar{x})}^{T} + Q_{i}^{\circ})) \subset \nabla F {(\bar{x})}^{T} {\hat{N}}_{D} (F (\bar{x}))$ . Hence, we have shown the following theorem.

Theorem 2:

Assume that GGCQ holds at $\bar{x} \in Ω$ and assume that $Q_{1}, \dots, Q_{K}$ are convex cones contained in $T_{D} (F (\bar{x}))$ . If

$\begin{matrix} {(\nabla F {(\bar{x})}^{- 1} Q_{i})}^{\circ} = \nabla F {(\bar{x})}^{T} Q_{i}^{\circ}, i = 1, \dots, K, \end{matrix}$ (19)

then

$\begin{matrix} {\hat{N}}_{Ω} (\bar{x}) \subset \nabla F {(\bar{x})}^{T} (Q_{1}^{\circ} \cap ⋂_{i = 2}^{K} (ker \nabla F {(\bar{x})}^{T} + Q_{i}^{\circ})) = ⋂_{i = 1}^{K} \nabla F {(\bar{x})}^{T} Q_{i}^{\circ} . \end{matrix}$ (20)

Further, if

$\begin{matrix} \nabla F {(\bar{x})}^{T} (Q_{1}^{\circ} \cap ⋂_{i = 2}^{K} (ker \nabla F {(\bar{x})}^{T} + Q_{i}^{\circ})) \subset \nabla F {(\bar{x})}^{T} {\hat{N}}_{D} (F (\bar{x})), \end{matrix}$ (21)

then equality holds in (20) and ${\hat{N}}_{Ω} (\bar{x}) = \nabla F {(\bar{x})}^{T} {\hat{N}}_{D} (F (\bar{x}))$ .

Remark 1:

Condition (19) is e.g. fulfilled, if for each $i = 1, \dots, K$ either there is a direction $u_{i}$ with $\nabla F (\bar{x}) u_{i} \in ri Q_{i}$ or $Q_{i}$ is a convex polyhedral set, cf. [14, Proposition 1].

The proper choice of $Q_{1}, \dots, Q_{K}$ is crucial in order that (20) provides a good estimate for the regular normal cone. It is obvious that we want to choose the cones $Q_{i}$ , $i = 1, \dots, K$ as large as possible in order that the inclusion (20) is tight. Further, it is reasonable that a good choice of $Q_{1}, \dots, Q_{K}$ fulfills

\begin{matrix} ⋂_{i = 1}^{K} Q_{i}^{\circ} = {\hat{N}}_{D} (F (\bar{x})) \end{matrix}

(22)

because then the inclusion (21) holds whenever $\nabla F (\bar{x})$ has full row rank, i.e. whenever $ker \nabla F {(\bar{x})}^{T} = \{0\}$ . We now show that (21) holds not only under this full rank condition but also under some weaker assumption.

Theorem 3:

Assume that GGCQ holds at $\bar{x} \in Ω$ and assume that we are given convex cones $Q_{1}, \dots, Q_{K} \subset T_{D} (F (\bar{x}))$ fulfilling (19), (22) and

$\begin{matrix} ker \nabla F {(\bar{x})}^{T} \cap (Q_{1}^{\circ} - Q_{i}^{\circ}) = {0}, i = 2, \dots, K . \end{matrix}$ (23)

Then

${\hat{N}}_{Ω} (\bar{x}) = \nabla F {(\bar{x})}^{T} {\hat{N}}_{D} (F (\bar{x})) = \nabla F {(\bar{x})}^{T} (Q_{1}^{\circ} \cap ⋂_{i = 2}^{K} (ker \nabla F {(\bar{x})}^{T} + Q_{i}^{\circ})) .$

In particular, (23) holds if there is a subspace

$\begin{matrix} L \subset ⋂_{i = 1}^{K} (Q_{i} \cap (- Q_{i})) \end{matrix}$ (24)

such that (13) holds.

Proof The statement follows from Theorem 2 if we can show that (21) holds. Consider $x^{*} \in \nabla F {(\bar{x})}^{T} (Q_{1}^{\circ} \cap ⋂_{i = 2}^{K} (ker \nabla F {(\bar{x})}^{T} + Q_{i}^{\circ}))$ . Then there are elements $λ^{i} \in Q_{i}^{\circ}$ , $i = 1, \dots, K$ and $μ^{i} \in ker \nabla F {(\bar{x})}^{T}$ such that $λ^{1} = μ^{i} + λ^{i}$ , $i = 2, \dots, K$ and $x^{*} = \nabla F {(\bar{x})}^{T} λ^{1}$ . We conclude $μ^{i} = λ^{1} - λ^{i} \in Q_{1}^{\circ} - Q_{i}^{\circ}$ , implying $μ^{i} \in ker \nabla F {(\bar{x})}^{T} \cap (Q_{1}^{\circ} - Q_{i}^{\circ}) = {0}$ and thus

\begin{matrix} λ^{1} = λ^{2} = \dots = λ^{K} \in ⋂_{i = 1}^{K} Q_{i}^{\circ} = {\hat{N}}_{D} (F (\bar{x})) . \end{matrix}

Hence, $x^{*} \in \nabla F {(\bar{x})}^{T} {\hat{N}}_{D} (F (\bar{x}))$ and (21) is verified. In order to show the last assertion note that from (24), we conclude $L \subset Q_{i}$ and consequently $Q_{i}^{\circ} \subset L^{\circ} = L^{⊥}$ . Thus $Q_{1}^{\circ} - Q_{i}^{\circ} \subset L^{⊥} - L^{⊥} = L^{⊥}$ , $i = 2, \dots, K$ . Since

\begin{matrix} ker \nabla F {(\bar{x})}^{T} \cap L^{⊥} = ({(ker \nabla F {(\bar{x})}^{T})}^{⊥} + L)^{⊥} = {(\nabla F (\bar{x}) R^{n} + L)}^{⊥} = {0}, \end{matrix}

it follows that (23) holds.

Corollary 1:

Assume that GGCQ holds at $\bar{x} \in Ω$ and assume that we are given convex cones $Q_{1}, \dots, Q_{K} \subset T_{D} (F (\bar{x}))$ fulfilling (19) and (22). Further assume that there is some subspace L fulfilling (12) and (13). Then, the sets

$\begin{matrix} {\tilde{Q}}_{i} : = Q_{i} + L, i = 1, \dots, K \end{matrix}$

are convex cones contained in $T_{D} (F (\bar{x}))$ ,

$\begin{matrix} {(\nabla F {(\bar{x})}^{- 1} {\tilde{Q}}_{i})}^{\circ} = \nabla F {(\bar{x})}^{T} {\tilde{Q}}_{i}^{\circ}, i = 1, \dots, K \end{matrix}$ (25)

and

${\hat{N}}_{Ω} (\bar{x}) = \nabla F {(\bar{x})}^{T} {\hat{N}}_{D} (F (\bar{x})) = \nabla F {(\bar{x})}^{T} ({\tilde{Q}}_{1}^{\circ} \cap ⋂_{i = 2}^{K} (ker \nabla F {(\bar{x})}^{T} + {\tilde{Q}}_{i}^{\circ})) .$

Proof Firstly observe that ${\tilde{Q}}_{i} = Q_{i} + L \subset T_{D} (F (\bar{x})) + L \subset T_{D} (F (\bar{x}))$ by (12). Next, consider $z \in ri {\tilde{Q}}_{i}$ . By (13) there exists $u \in R^{n}$ and $l \in L$ such that $\nabla F (\bar{x}) u + l = z$ . Because of $- l \in L \subset {\tilde{Q}}_{i}$ we have $z - 2 l \in {\tilde{Q}}_{i}$ and thus $\nabla F (\bar{x}) u = z - l = \frac{1}{2} z + \frac{1}{2} (z - 2 l) \in ri {\tilde{Q}}_{i}$ by [22, Theorem 6.1] implying (25) by taking into account Remark 1. Further, from $Q_{i} \subset {\tilde{Q}}_{i} \subset T_{D} (F (\bar{x}))$ it follows that

\begin{matrix} {\hat{N}}_{D} (F (\bar{x})) = {(T_{D} (F (\bar{x})))}^{\circ} \subset ⋂_{i = 1}^{K} {\tilde{Q}}_{i}^{\circ} \subset ⋂_{i = 1}^{K} Q_{i}^{\circ} = {\hat{N}}_{D} (F (\bar{x})) . \end{matrix}

Finally, note that $L \subset {\tilde{Q}}_{i} \cap (- {\tilde{Q}}_{i})$ , $i = 1, \dots, K$ and the assertion follows from Theorem 3.

The following definition is motivated by Theorem 2.

Definition 5:

Let $\bar{x}$ be feasible for the program (7) and let $Q_{1}, \dots, Q_{K}$ be convex cones contained in $T_{D} (F (\bar{x}))$ fulfilling (19).

(i)
We say that $\bar{x}$ is $Q$ -stationary with respect to $Q_{1}, \dots, Q_{K}$ , if
$\begin{matrix} - \nabla f (\bar{x}) \in \nabla F {(\bar{x})}^{T} (Q_{1}^{\circ} \cap ⋂_{i = 2}^{K} (ker \nabla F {(\bar{x})}^{T} + Q_{i}^{\circ})) . \end{matrix}$

(ii)
We say that $\bar{x}$ is $Q_{M}$ -stationary with respect to $Q_{1}, \dots, Q_{K}$ , if
$\begin{matrix} - \nabla f (\bar{x}) \in \nabla F {(\bar{x})}^{T} (N_{D} (F (\bar{x})) \cap Q_{1}^{\circ} \cap ⋂_{i = 2}^{K} (ker \nabla F {(\bar{x})}^{T} + Q_{i}^{\circ})) . \end{matrix}$

Note that this definition is an extension of the definition of $Q$ - and $Q_{M}$ -stationarity in [14], where only the case $K = 2$ was considered.
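For polyhedral cones given by finitely many generators, membership in the polar cone reduces to finitely many inequalities: $λ \in Q^{\circ}$ holds if and only if $⟨ λ, g ⟩ \leq 0$ for every generator $g$ of $Q$ . The sketch below uses this for the MPCC cones $Q_{1} = \{0\} \times R_{-}$ and $Q_{2} = R_{-} \times \{0\}$ at a biactive point. It assumes in addition that $ker \nabla F {(\bar{x})}^{T} = \{0\}$ , in which case the multiplier $λ$ with $- \nabla f (\bar{x}) = \nabla F {(\bar{x})}^{T} λ$ is unique and the condition in Definition 5(i) reduces to $λ \in Q_{1}^{\circ} \cap Q_{2}^{\circ}$ . All names and the tolerance are illustrative assumptions. The general case with a nontrivial kernel requires solving a linear feasibility problem and is not covered here.

```python
def in_polar_cone(lam, generators, tol=1e-12):
    """Return True if <lam, g> <= 0 (up to tol) for every generator g."""
    return all(
        sum(l_j * g_j for l_j, g_j in zip(lam, g)) <= tol for g in generators
    )


# Generators of the MPCC cones at a biactive point (assumed example).
Q1 = [(0.0, -1.0)]   # Q_1 = {0} x R_-
Q2 = [(-1.0, 0.0)]   # Q_2 = R_- x {0}


def is_q_stationary_trivial_kernel(lam, cones):
    """Q-stationarity test, valid only under the assumption ker grad F^T = {0}.

    lam is the unique multiplier with -grad f = grad F^T lam; the test
    checks lam in the intersection of the polar cones Q_i^o.
    """
    return all(in_polar_cone(lam, gens) for gens in cones)


result_inside = is_q_stationary_trivial_kernel((1.0, 2.0), [Q1, Q2])
result_outside = is_q_stationary_trivial_kernel((0.0, -1.0), [Q1, Q2])
```

For this choice of cones, $Q_{1}^{\circ} \cap Q_{2}^{\circ} = R_{+}^{2}$ , so under the stated kernel assumption the test coincides with membership of $λ$ in the regular normal cone of $D$ at 0.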

The following corollary is an immediate consequence of the definitions and Theorem 2.

Corollary 2:

Assume that GGCQ is fulfilled at the point $\bar{x}$ feasible for (7). Further assume that we are given convex cones $Q_{1}, \dots, Q_{K} \subset T_{D} (F (\bar{x}))$ fulfilling (19). If $\bar{x}$ is B-stationary, then $\bar{x}$ is $Q$ -stationary with respect to $Q_{1}, \dots, Q_{K}$ . Conversely, if $\bar{x}$ is $Q$ -stationary with respect to $Q_{1}, \dots, Q_{K}$ and (21) is fulfilled, then $\bar{x}$ is S-stationary and consequently B-stationary.

We know that under the assumptions of Proposition 1 every B-stationary point $\bar{x}$ for the problem (7) is both M-stationary and $Q$ -stationary with respect to every collection of cones $Q_{1}, \dots, Q_{K} \subset T_{D} (F (\bar{x}))$ fulfilling (19), i.e.

\begin{matrix} - \nabla f (\bar{x}) & \in & \nabla F {(\bar{x})}^{T} N_{D} (F (\bar{x})) \cap \nabla F {(\bar{x})}^{T} (Q_{1}^{\circ} \cap ⋂_{i = 2}^{K} (ker \nabla F {(\bar{x})}^{T} + Q_{i}^{\circ})) \\ & = & \nabla F {(\bar{x})}^{T} ((ker \nabla F {(\bar{x})}^{T} + N_{D} (F (\bar{x}))) \cap Q_{1}^{\circ} \cap ⋂_{i = 2}^{K} (ker \nabla F {(\bar{x})}^{T} + Q_{i}^{\circ})) . \end{matrix}

Comparing this relation with the definition of $Q_{M}$ -stationarity we see that $Q_{M}$ -stationarity with respect to $Q_{1}, \dots, Q_{K}$ is stronger than the simultaneous fulfilment of M-stationarity and $Q$ -stationarity with respect to $Q_{1}, \dots, Q_{K}$ . We refer to [14, Example 2] for an example which shows that $Q_{M}$ -stationarity is strictly stronger than M-stationarity.

However, to ensure $Q_{M}$ -stationarity of a B-stationary point $\bar{x}$ , some additional assumption has to be fulfilled.

Lemma 1:

Let $\bar{x}$ be B-stationary for the program (7) and assume that the assumptions of Proposition 1 are fulfilled at $\bar{x}$ . Further assume that for every $λ \in N_{T_{D} (F (\bar{x}))} (0)$ there exists a convex cone $Q_{λ} \subset T_{D} (F (\bar{x}))$ satisfying $λ \in Q_{λ}^{\circ}$ and ${(\nabla F {(\bar{x})}^{- 1} Q_{λ})}^{\circ} = \nabla F {(\bar{x})}^{T} Q_{λ}^{\circ}$ . Then there exists a convex cone $Q_{1} \subset T_{D} (F (\bar{x}))$ fulfilling ${(\nabla F {(\bar{x})}^{- 1} Q_{1})}^{\circ} = \nabla F {(\bar{x})}^{T} Q_{1}^{\circ}$ such that for every collection $Q_{2}, \dots, Q_{K} \subset T_{D} (F (\bar{x}))$ fulfilling (19) the point $\bar{x}$ is $Q_{M}$ -stationary with respect to $Q_{1}, \dots, Q_{K}$ .

Proof From the definition of B-stationarity and (16) we deduce the existence of $λ \in N_{T_{D} (F (\bar{x}))} (0)$ fulfilling $- \nabla f (\bar{x}) = \nabla F {(\bar{x})}^{T} λ$ . By taking $Q_{1} = Q_{λ}$ we obviously have $λ \in N_{T_{D} (F (\bar{x}))} (0) \cap Q_{1}^{\circ} \subset N_{D} (F (\bar{x})) \cap Q_{1}^{\circ}$ implying $- \nabla f (\bar{x}) \in \nabla F {(\bar{x})}^{T} (N_{D} (F (\bar{x})) \cap Q_{1}^{\circ})$ . Now consider cones $Q_{2}, \dots, Q_{K} \subset T_{D} (F (\bar{x}))$ fulfilling (19). Similar to the derivation of Theorem 2 we obtain

\begin{matrix} - \nabla f (\bar{x}) & \in & \nabla F {(\bar{x})}^{T} (N_{D} (F (\bar{x})) \cap Q_{1}^{\circ}) \cap ⋂_{i = 2}^{K} \nabla F {(\bar{x})}^{T} Q_{i}^{\circ} \\ = \nabla F {(\bar{x})}^{T} (N_{D} (F (\bar{x})) \cap Q_{1}^{\circ} \cap ⋂_{i = 2}^{K} (ker \nabla F {(\bar{x})}^{T} + Q_{i}^{\circ})) \end{matrix}

and the lemma is proved.

Lemma 2:

Let $\bar{x}$ be feasible for (7) and assume that $T_{D} (F (\bar{x}))$ is the union of finitely many closed convex cones $C_{1}, \dots, C_{p}$ . Then for every $λ \in N_{T_{D} (F (\bar{x}))} (0)$ there is some $\bar{i} \in \{1, \dots, p\}$ satisfying $λ \in C_{\bar{i}}^{\circ}$ .

Proof Consider $λ \in N_{T_{D} (F (\bar{x}))} (0)$ . By the definition of the limiting normal cone there are sequences $t_{k} \overset{T_{D} (F (\bar{x}))}{⟶} 0$ and $λ_{k} \to λ$ with

λ_{k} \in {\hat{N}}_{T_{D} (F (\bar{x}))} (t_{k}) = {(⋃_{i : t_{k} \in C_{i}} T_{C_{i}} (t_{k}))}^{\circ} = ⋂_{i : t_{k} \in C_{i}} {(T_{C_{i}} (t_{k}))}^{\circ} = ⋂_{i : t_{k} \in C_{i}} N_{C_{i}} (t_{k}) .

By passing to a subsequence if necessary we can assume that there is an index $\bar{i}$ such that $t_{k} \in C_{\bar{i}}$ for all $k$ and we obtain $λ_{k} \in N_{C_{\bar{i}}} (t_{k}) = \{c^{*} \in C_{\bar{i}}^{\circ} | ⟨ c^{*}, t_{k} ⟩ = 0\} \subset C_{\bar{i}}^{\circ}$ . Since the polar cone $C_{\bar{i}}^{\circ}$ is closed, we deduce $λ \in C_{\bar{i}}^{\circ}$ .

If $T_{D} (F (\bar{x}))$ is the union of finitely many convex polyhedral cones $C_{1}, \dots, C_{p}$ , then the mapping $u ⇉ \nabla F (\bar{x}) u - T_{D} (F (\bar{x}))$ is a polyhedral multifunction and thus metrically subregular at (0, 0) by Robinson’s result [23]. Further we know that for any convex polyhedral cone Q we have ${(\nabla F {(\bar{x})}^{- 1} Q)}^{\circ} = \nabla F {(\bar{x})}^{T} Q^{\circ}$ . Hence, we obtain the following corollary.

Corollary 3:

Assume that $\bar{x}$ is B-stationary for the program (7), that GGCQ is fulfilled at $\bar{x}$ and that $T_{D} (F (\bar{x}))$ is the union of finitely many convex polyhedral cones. Then there is a convex polyhedral cone $Q_{1} \subset T_{D} (F (\bar{x}))$ such that for every collection $Q_{2}, \dots, Q_{K}$ of convex polyhedral cones contained in $T_{D} (F (\bar{x}))$ the point $\bar{x}$ is $Q_{M}$ -stationary with respect to $Q_{1}, \dots, Q_{K}$ .

Note that, in contrast to S-, M- and many other types of stationarity, the properties of $Q$ - and $Q_{M}$ -stationarity cannot be characterized by a single multiplier. In fact, $Q$ - and $Q_{M}$ -stationarity with respect to $Q_{1}, \dots, Q_{K}$ each imply the existence of K multipliers $λ_{1}, \dots, λ_{K}$ satisfying

\begin{matrix} λ_{i} \in Q_{i}^{\circ}, \nabla f (\bar{x}) + \nabla F {(\bar{x})}^{T} λ_{i} = 0, i = 1, \dots, K . \end{matrix}

In case of $Q_{M}$ -stationarity the multiplier $λ_{1}$ also fulfills the M-stationarity conditions.

Further, let us note that $Q_{M}$ -stationarity, although it is stronger than M-stationarity, does not imply B-stationarity in general. Thus, in general $Q_{M}$ -stationarity is not a sufficient condition for a local minimizer as well.

4. Application to MPDC

Clearly, $Q$ -stationarity is not a very strong optimality condition for an arbitrary choice of $Q_{1}, \dots, Q_{K} \subset T_{D} (F (\bar{x}))$ . As mentioned above, the fulfillment of (22) is desirable. For the general problem (7), it can be impossible to choose the cones $Q_{1}, \dots, Q_{K}$ such that (22) holds. If $T_{D} (F (\bar{x}))$ is the union of finitely many convex cones $C_{1}, \dots, C_{p}$ then we obviously have

\begin{matrix} {\hat{N}}_{D} (F (\bar{x})) = ⋂_{i = 1}^{p} C_{i}^{\circ} . \end{matrix}

However, to consider $Q$ -stationarity with respect to $C_{1}, \dots, C_{p}$ is in general not a feasible approach because p is often very large. We will now work out that the concepts of $Q$ - and $Q_{M}$ -stationarity are tailored for the MPDC (1). In what follows let D and F be given by (2).

Given a point $y = (y_{1}, \dots, y_{m_{D}}) \in D$ , we denote by

\begin{matrix} A_{i} (y) : = \{j \in \{1, \dots, K_{i}\} | y_{i} \in D_{i}^{j}\}, i = 1, \dots, m_{D} \end{matrix}

the indices of sets $D_{i}^{j}$ which contain $y_{i}$ . Further we choose for each $i = 1, \dots, m_{D}$ some index set $J_{i} (y) \subset A_{i} (y)$ such that

\begin{matrix} T_{D_{i}} (y_{i}) = ⋃_{j \in J_{i} (y)} T_{D_{i}^{j}} (y_{i}) . \end{matrix}

(26)

Obviously the choice $J_{i} (y) = A_{i} (y)$ is feasible but for practical reasons it is better to choose $J_{i} (y)$ smaller if possible. E.g. if $T_{D_{i}^{j_{2}}} (y_{i}) \subset T_{D_{i}^{j_{1}}} (y_{i})$ holds for some indices $j_{1}, j_{2} \in A_{i} (y)$ , then we will not include $j_{2}$ in $J_{i} (y)$ . Such a situation can occur e.g. in case of MPVC when $(- H_{i} (\bar{x}), G_{i} (\bar{x})) = (0, a)$ with $a < 0$ .
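For pieces given in half-space form, the active index sets $A_{i} (y)$ can be computed by a direct membership test. The following Python sketch illustrates this for a single block $i$. The data layout (`Piece`), the helper names and the tolerance `tol` are illustrative assumptions, and the two pieces model a complementarity-type set $(\{0\} \times R_{-}) \cup (R_{-} \times \{0\})$ that is chosen only as an example.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: a piece D_i^j = {y : <a_l, y> <= b_l for all l}
# is stored as (rows a_l, right-hand sides b_l).
Piece = Tuple[Sequence[Sequence[float]], Sequence[float]]

def contains(piece: Piece, y: Sequence[float], tol: float = 1e-12) -> bool:
    """Check y in D_i^j by testing every defining inequality."""
    rows, bounds = piece
    return all(sum(a * c for a, c in zip(row, y)) <= b + tol
               for row, b in zip(rows, bounds))

def active_pieces(pieces: Sequence[Piece], y: Sequence[float]) -> List[int]:
    """Index set A_i(y): 1-based indices j of the pieces that contain y."""
    return [j for j, piece in enumerate(pieces, start=1) if contains(piece, y)]

# Example pieces (an assumption): D^1 = {0} x R_-, D^2 = R_- x {0}.
pieces = [
    ([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]], [0.0, 0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0], [0.0, -1.0]], [0.0, 0.0, 0.0]),
]

print(active_pieces(pieces, (0.0, 0.0)))   # both pieces are active: [1, 2]
print(active_pieces(pieces, (0.0, -1.0)))  # only the first piece: [1]
```

Selecting a smaller $J_{i} (y) \subset A_{i} (y)$ additionally requires a tangent cone inclusion test, which this sketch does not attempt.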

Now consider

\begin{matrix} ν \in J (y) : = \prod_{i = 1}^{m_{D}} J_{i} (y) . \end{matrix}

Since for every $i = 1, \dots, m_{D}$ the set $D_{i}$ is the union of finitely many convex polyhedral sets, for every tangent direction $t \in T_{D_{i}} (y_{i})$ we have $y_{i} + α t \in D_{i}$ for all $α > 0$ sufficiently small. Hence, we can apply [24, Proposition 1] to obtain

\begin{matrix} T_{D (ν)} (y) = \prod_{i = 1}^{m_{D}} T_{D_{i}^{ν_{i}}} (y_{i}), ν \in J (y) \end{matrix}

with $D (ν)$ given by (4), and

\begin{matrix} T_{D} (y) = \prod_{i = 1}^{m_{D}} T_{D_{i}} (y_{i}) = \prod_{i = 1}^{m_{D}} (⋃_{j \in J_{i} (y)} T_{D_{i}^{j}} (y_{i})) = ⋃_{ν \in J (y)} T_{D (ν)} (y) . \end{matrix}

(27)

We will apply this setting in particular to points $y = F (\bar{x})$ with $\bar{x}$ feasible for MPDC.

Lemma 3:

Let $\bar{x}$ be feasible for the MPDC (1) and assume that we are given K elements $ν^{1}, \dots, ν^{K} \in J (F (\bar{x}))$ such that

$\begin{matrix} \{ν_{i}^{1}, \dots, ν_{i}^{K}\} = J_{i} (F (\bar{x})), i = 1, \dots, m_{D} . \end{matrix}$ (28)

Then for each $l = 1, \dots, K$ the cone $Q_{l} : = T_{D (ν^{l})} (F (\bar{x}))$ is a convex polyhedral cone contained in $T_{D} (F (\bar{x}))$ , $(\nabla F {(\bar{x})}^{- 1} Q_{l})^{\circ} = \nabla F {(\bar{x})}^{T} Q_{l}^{\circ}$ , and

$\begin{matrix} ⋂_{l = 1}^{K} Q_{l}^{\circ} = {\hat{N}}_{D} (F (\bar{x})) . \end{matrix}$

Proof Obviously, for every $l = 1, \dots, K$ the cone $Q_{l}$ is convex and polyhedral because it is the product of convex polyhedral cones. This implies $(\nabla F {(\bar{x})}^{- 1} Q_{l})^{\circ} = \nabla F {(\bar{x})}^{T} Q_{l}^{\circ}$ and $Q_{l} \subset T_{D} (F (\bar{x}))$ follows from (27). By taking into account (27) the last assertion follows from

\begin{matrix} {\hat{N}}_{D} (F (\bar{x})) & = (T_{D} (F (\bar{x})))^{\circ} = \prod_{i = 1}^{m_{D}} (⋃_{j \in J_{i} (F (\bar{x}))} T_{D_{i}^{j}} (F_{i} (\bar{x})))^{\circ} = \prod_{i = 1}^{m_{D}} (⋃_{l = 1}^{K} T_{D_{i}^{ν_{i}^{l}}} (F_{i} (\bar{x})))^{\circ} \\ = \prod_{i = 1}^{m_{D}} (⋂_{l = 1}^{K} (T_{D_{i}^{ν_{i}^{l}}} (F_{i} (\bar{x})))^{\circ}) = ⋂_{l = 1}^{K} (\prod_{i = 1}^{m_{D}} (T_{D_{i}^{ν_{i}^{l}}} (F_{i} (\bar{x})))^{\circ}) = ⋂_{l = 1}^{K} (\prod_{i = 1}^{m_{D}} T_{D_{i}^{ν_{i}^{l}}} (F_{i} (\bar{x})))^{\circ} \\ = ⋂_{l = 1}^{K} Q_{l}^{\circ} . \end{matrix}

Definition 6:

Let $\bar{x}$ be feasible for the MPDC (1) and let index sets $J_{i} (F (\bar{x})) \subset A_{i} (F (\bar{x}))$ , $i = 1, \dots, m_{D}$ fulfilling (26) be given. Further we denote by $Q (\bar{x})$ the collection of all elements $(ν^{1}, \dots, ν^{K})$ with $ν^{l} \in J (F (\bar{x})) = \prod_{i = 1}^{m_{D}} J_{i} (F (\bar{x}))$ , $l = 1, \dots, K$ such that (28) holds.

(1)
We say that $\bar{x}$ is $Q$ -stationary ( $Q_{M}$ -stationary) for (1) with respect to $(ν^{1}, \dots, ν^{K}) \in Q (\bar{x})$ , if $\bar{x}$ is $Q$ -stationary ( $Q_{M}$ -stationary) with respect to $Q_{1}, \dots, Q_{K}$ in the sense of Definition 5 with $Q_{l} : = T_{D (ν^{l})} (F (\bar{x}))$ , $l = 1, \dots, K$ .

(2)
We say that $\bar{x}$ is $Q$ -stationary ( $Q_{M}$ -stationary) for (1) if $\bar{x}$ is $Q$ -stationary ( $Q_{M}$ -stationary) for (1) with respect to some $(ν^{1}, \dots, ν^{K}) \in Q (\bar{x})$ .

Definition 6 is an extension of the definition of $Q$ - and $Q_{M}$ -stationarity made for MPCC and MPVC in [14]. Note that the number K appearing in the definition of $Q (\bar{x})$ is not fixed. Denoting by $K_{min} (\bar{x})$ the minimal number K for which some $(ν^{1}, \dots, ν^{K}) \in Q (\bar{x})$ exists, we obviously have

\begin{matrix} K_{min} (\bar{x}) = max_{i = 1, \dots, m_{D}} | J_{i} (F (\bar{x})) | \leq max_{i = 1, \dots, m_{D}} K_{i} . \end{matrix}

(29)

We see from (27) that the tangent cone $T_{D} (F (\bar{x}))$ is the union of the $| J (F (\bar{x})) | = \prod_{i = 1}^{m_{D}} | J_{i} (F (\bar{x})) |$ convex polyhedral cones $T_{D (ν)} (F (\bar{x}))$ , $ν \in J (F (\bar{x}))$ . Hence, the minimal number $K_{min} (\bar{x})$ is much smaller than the number of components of the tangent cone, except when all or nearly all sets $J_{i} (F (\bar{x}))$ have cardinality 1. E.g. when $K_{i} \leq 2$ holds for all $i = 1, \dots, m_{D}$ , as is the case for the MPCC (5), then we have $K_{min} (\bar{x}) \leq 2$ whereas the number of convex cones building the tangent cone $T_{D} (F (\bar{x}))$ grows exponentially with the number of biactive constraints, i.e. complementarity constraints satisfying $G_{i} (\bar{x}) = H_{i} (\bar{x}) = 0$ . The concepts of $Q$ - and $Q_{M}$ -stationarity for MPDC take advantage of the fact that although the tangent cone is the union of a huge number of cones, its polar cone can be written as the intersection of a small number of polars. Further, it is clear that for every $ν^{1} \in J (F (\bar{x}))$ and every $K \geq K_{min} (\bar{x})$ we can find $ν^{2}, \dots, ν^{K} \in J (F (\bar{x}))$ such that $(ν^{1}, \dots, ν^{K}) \in Q (\bar{x})$ .
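The gap between $K_{min} (\bar{x})$ and $| J (F (\bar{x})) |$ , and the completion of a prescribed $ν^{1}$ to an element of $Q (\bar{x})$ , can be sketched in Python as follows. The function names and the cyclic filling rule are assumptions made for illustration; any rule whose components exhaust each $J_{i} (F (\bar{x}))$ as required by (28) would serve equally well.

```python
from math import prod
from typing import List, Optional, Sequence, Tuple

def k_min(J: Sequence[Sequence[int]]) -> int:
    """K_min from (29): the largest cardinality among the index sets J_i."""
    return max(len(Ji) for Ji in J)

def number_of_cones(J: Sequence[Sequence[int]]) -> int:
    """|J| = product of the |J_i|: number of cones T_{D(nu)} in the union (27)."""
    return prod(len(Ji) for Ji in J)

def covering_tuple(J: Sequence[Sequence[int]], first: Sequence[int],
                   K: Optional[int] = None) -> List[Tuple[int, ...]]:
    """Return (nu^1, ..., nu^K) with nu^1 = first such that (28) holds."""
    if K is None:
        K = k_min(J)
    if K < k_min(J):
        raise ValueError("K must be at least K_min")
    # Start each coordinate's cycle at the prescribed first index.
    start = [list(Ji).index(f) for Ji, f in zip(J, first)]
    return [tuple(Ji[(s + l) % len(Ji)] for Ji, s in zip(J, start))
            for l in range(K)]

# Example (an assumption): 10 biactive blocks with J_i = {1, 2}, 5 blocks with J_i = {1}.
J = [[1, 2]] * 10 + [[1]] * 5
first = (2,) * 10 + (1,) * 5
nus = covering_tuple(J, first)
print(k_min(J), number_of_cones(J))  # 2 cones to test instead of 1024
```

Since $K \geq | J_{i} |$ for every $i$, cycling through each $J_{i}$ visits all of its indices, so the returned tuple satisfies (28).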

We allow K to be greater than $K_{min} (\bar{x})$ for numerical reasons primarily. Recall that for testing $Q$ -stationarity with respect to $(ν^{1}, \dots, ν^{K})$ , we have to check for all $l = 1, \dots, K$ whether $- \nabla f (\bar{x}) \in \nabla F {(\bar{x})}^{T} Q_{l}^{\circ}$ , or equivalently, that $u = 0$ is a solution of the linear optimization program

\begin{matrix} min ⟨ \nabla f (\bar{x}), u ⟩ subject to \nabla F (\bar{x}) u \in Q_{l} \end{matrix}

with $Q_{l} = T_{D (ν^{l})} (F (\bar{x}))$ . In theory, the treatment of degenerate linear constraints is not a big problem, but numerical practice tells us the contrary. In [25] we implemented an algorithm for solving MPVC based on $Q$ -stationarity, and degeneracy of the linear constraints was the reason whenever the algorithm crashed. The possibility of choosing $K > K_{min} (\bar{x})$ gives us more flexibility to avoid linear programs with degenerate constraints.

The following theorem follows from Corollaries 2, 3, Theorem 3 and the considerations above.

Theorem 4:

Let $\bar{x}$ be feasible for the MPDC (1) and assume that GGCQ is fulfilled at $\bar{x}$ .

(i)
If $\bar{x}$ is B-stationary then $\bar{x}$ is $Q$ -stationary with respect to every element $(ν^{1}, \dots, ν^{K}) \in Q (\bar{x})$ and there exists some ${\bar{ν}}^{1} \in J (F (\bar{x}))$ such that $\bar{x}$ is $Q_{M}$ -stationary with respect to every $({\bar{ν}}^{1}, ν^{2}, \dots, ν^{K}) \in Q (\bar{x})$ .

(ii)
Conversely, if $\bar{x}$ is $Q$ -stationary with respect to some $(ν^{1}, \dots, ν^{K}) \in Q (\bar{x})$ and
$\begin{matrix} \nabla F {(\bar{x})}^{T} (Q_{1}^{\circ} \cap ⋂_{l = 2}^{K} (ker \nabla F {(\bar{x})}^{T} + Q_{l}^{\circ})) \subset \nabla F {(\bar{x})}^{T} {\hat{N}}_{D} (F (\bar{x})), \end{matrix}$ (30)
where $Q_{l} : = T_{D (ν^{l})} (F (\bar{x}))$ , $l = 1, \dots, K$ , then $\bar{x}$ is S-stationary and consequently B-stationary. In particular, (30) is fulfilled if
$\begin{matrix} ker \nabla F {(\bar{x})}^{T} \cap (Q_{1}^{\circ} - Q_{l}^{\circ}) = \{0\}, l = 2, \dots, K . \end{matrix}$ (31)

5. On quadratic programs with disjunctive constraints

In this section, we consider the special case of quadratic programs with disjunctive constraints (QPDC)

\begin{matrix} min_{x \in R^{n}} & q (x) : = \frac{1}{2} x^{T} B x + d^{T} x \\ subject to & A_{i} x \in D_{i} : = ⋃_{j = 1}^{K_{i}} D_{i}^{j}, i = 1, \dots, m_{D}, \end{matrix}

(32)

where B is a positive semidefinite $n \times n$ matrix, $d \in R^{n}$ , $A_{i}$ , $i = 1, \dots, m_{D}$ are $l_{i} \times n$ matrices and $D_{i}^{j} \subset R^{l_{i}}$ , $i = 1, \dots, m_{D}$ , $j = 1, \dots, K_{i}$ are convex polyhedral sets, i.e. QPDC is a special case of MPDC with $f (x) = q (x)$ and $F_{i} (x) = A_{i} x$ , $i = 1, \dots, m_{D}$ . In what follows, we denote by A the $m \times n$ matrix

\begin{matrix} A = (\begin{matrix} A_{1} \\ ⋮ \\ A_{m_{D}} \end{matrix}), \end{matrix}

where $m : = \sum_{i = 1}^{m_{D}} l_{i}$ .

We start our analysis with the following preparatory lemma.

Lemma 4:

Assume that the convex quadratic program

$\begin{matrix} min_{x \in R^{n}} \frac{1}{2} x^{T} B x + d^{T} x subject to A x \in C \end{matrix}$ (33)

is feasible, where B is some symmetric positive semidefinite $n \times n$ matrix, $d \in R^{n}$ , A is a $m \times n$ matrix and $C \subset R^{m}$ is a convex polyhedral set. Then either there exists a direction w satisfying

$\begin{matrix} B w = 0, A w \in 0^{+} C, d^{T} w < 0, \end{matrix}$ (34)

or the program (33) has a global solution $\bar{x}$ .

Proof Assume that for every w with $B w = 0$ , $A w \in 0^{+} C$ we have $d^{T} w \geq 0$ , i.e. 0 is a global solution of the program

\begin{matrix} min d^{T} w subject to w \in S : = \{w | (\begin{matrix} B \\ A \end{matrix}) w \in {0}^{n} \times 0^{+} C\} . \end{matrix}

Since C is a convex polyhedral set, its recession cone $0^{+} C$ is a convex polyhedral cone and so is ${0}^{n} \times 0^{+} C$ as well. Hence,

\begin{matrix} {\hat{N}}_{S} (0) = S^{\circ} = (B^{T} ⋮ A^{T}) {({0}^{n} \times 0^{+} C)}^{\circ} = B^{T} R^{n} + A^{T} {(0^{+} C)}^{\circ} \end{matrix}

and from the first-order optimality condition $- d \in {\hat{N}}_{S} (0)$ we derive the existence of multipliers $μ_{B} \in R^{n}$ and $μ_{C} \in {(0^{+} C)}^{\circ}$ such that

\begin{matrix} - d = B^{T} μ_{B} + A^{T} μ_{C} . \end{matrix}

The convex polyhedral set C is the sum of the convex hull $Σ$ of its extreme points and its recession cone. Hence, for every x feasible for (33) there is some $c_{1} \in Σ$ and some $c_{2} \in 0^{+} C$ such that $A x = c_{1} + c_{2}$ and, by taking into account $μ_{C}^{T} c_{2} \leq 0$ , we obtain

\begin{matrix} \frac{1}{2} x^{T} B x + d^{T} x & = \frac{1}{2} x^{T} B x - μ_{B}^{T} B x - μ_{C}^{T} A x \\ = \frac{1}{2} {(x - μ_{B})}^{T} B (x - μ_{B}) - \frac{1}{2} μ_{B}^{T} B μ_{B} - μ_{C}^{T} c_{1} - μ_{C}^{T} c_{2} \\ \geq & - \frac{1}{2} μ_{B}^{T} B μ_{B} - μ_{C}^{T} c_{1} . \end{matrix}

(35)

The set $Σ$ is compact and we conclude that the objective of (33) is bounded below on the feasible domain $A^{- 1} C$ by $- \frac{1}{2} μ_{B}^{T} B μ_{B} - {max}_{c_{1} \in Σ} μ_{C}^{T} c_{1}$ . Thus

\begin{matrix} α : = inf \{\frac{1}{2} x^{T} B x + d^{T} x | A x \in C\} \end{matrix}

is finite and it remains to show that the infimum is attained. Consider some sequence $x_{k} \in A^{- 1} C$ with $\lim_{k \to \infty} \frac{1}{2} x_{k}^{T} B x_{k} + d^{T} x_{k} = α$ . We conclude from (35) that ${(x_{k} - μ_{B})}^{T} B (x_{k} - μ_{B})$ is bounded which in turn implies that the sequence $B^{1 / 2} x_{k}$ is bounded. Hence, the sequence $x_{k}^{T} B x_{k} = {‖ B^{1 / 2} x_{k} ‖}^{2}$ is bounded as well and we can also conclude the boundedness of $d^{T} x_{k}$ . By passing to a subsequence we can assume that the sequence $(B^{1 / 2} x_{k}, d^{T} x_{k})$ converges to some $(z, β)$ and it follows that $α = \frac{1}{2} {‖ z ‖}^{2} + β$ . Since C is a convex polyhedral set, it follows by applying [22, Theorem 19.3] twice that the sets $A^{- 1} C$ and $\{(B^{1 / 2} u, d^{T} u) | u \in A^{- 1} C\}$ are convex and polyhedral. Since convex polyhedral sets are closed, it follows that $(z, β) \in \{(B^{1 / 2} u, d^{T} u) | u \in A^{- 1} C\}$ . Thus there is some $\bar{x} \in A^{- 1} C$ with $(z, β) = (B^{1 / 2} \bar{x}, d^{T} \bar{x})$ and $\frac{1}{2} {\bar{x}}^{T} B \bar{x} + d^{T} \bar{x} = α$ follows. This shows that $\bar{x}$ is a global minimizer for (33).

In what follows, we assume that we have at hand an algorithm for solving (33), which either computes a global solution $\bar{x}$ or a descent direction w fulfilling (34). Such an algorithm is, e.g. the active set method as described in [26], where we have to rewrite the constraints equivalently in the form $⟨ A^{T} a_{i}, x ⟩ \leq b_{i}$ , $i = 1, \dots, p$ using the representation of C as the intersection of finitely many half-spaces, $C = \{c | ⟨ a_{i}, c ⟩ \leq b_{i}, i = 1, \dots, p\}$ .
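To make the dichotomy of Lemma 4 concrete, the following Python sketch implements such a solver for a deliberately simple special case. The restrictions are assumptions of this example and not the setting of [26]: B is diagonal with nonnegative entries `b`, A is the identity and C is a box given by `bounds`, so that the problem separates into one-dimensional problems. The function name and the return convention are hypothetical.

```python
from typing import List, Sequence, Tuple

INF = float("inf")

def solve_or_certify(b: Sequence[float], d: Sequence[float],
                     bounds: Sequence[Tuple[float, float]]) -> Tuple[str, List[float]]:
    """Minimize 0.5*sum(b_i x_i^2) + sum(d_i x_i) subject to lo_i <= x_i <= hi_i.

    Assumes b_i >= 0 and lo_i <= hi_i. Returns ("solution", x) with a global
    minimizer x, or ("unbounded", w) with a direction w fulfilling (34).
    """
    n = len(b)
    x: List[float] = []
    for i, (bi, di, (lo, hi)) in enumerate(zip(b, d, bounds)):
        if bi > 0:
            # Strictly convex coordinate: clip the unconstrained minimizer.
            x.append(min(max(-di / bi, lo), hi))
        elif di < 0 and hi == INF:
            # b_i = 0, e_i is a recession direction of the box and d_i < 0.
            w = [0.0] * n
            w[i] = 1.0
            return "unbounded", w
        elif di > 0 and lo == -INF:
            w = [0.0] * n
            w[i] = -1.0
            return "unbounded", w
        elif di < 0:
            x.append(hi)   # linear term decreases towards the finite upper bound
        elif di > 0:
            x.append(lo)   # linear term decreases towards the finite lower bound
        else:
            # Constant coordinate: any point of the interval is optimal.
            x.append(lo if lo > -INF else (hi if hi < INF else 0.0))
    return "solution", x

print(solve_or_certify([1.0, 0.0], [-2.0, -1.0], [(0.0, INF), (0.0, INF)]))
print(solve_or_certify([1.0, 0.0], [-2.0, -1.0], [(0.0, INF), (0.0, 3.0)]))
```

In the first call the second coordinate has $b_{2} = 0$ , $d_{2} < 0$ and no upper bound, so the unit vector $w = (0, 1)$ satisfies $B w = 0$ , $w \in 0^{+} C$ and $d^{T} w < 0$ . In the second call the box is bounded in that coordinate and a global solution exists.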

Consider now the following algorithm.

[Algorithm 1; figure not reproduced]

Algorithm 1 can be considered as a kind of active index set strategy. The set $(ν^{k, 1}, \dots, ν^{k, K}) \in Q (x^{k})$ chosen in step (2) acts as a working set and is a subset of the active pieces of the disjunctive constraints. The number K will also depend on $x^{k}$ and for practical reasons it is desirable to keep K small in order to limit the numerical effort in each iteration. Recall that we can always choose K equal to the number $K_{min} (x^{k})$ given by (29), which is bounded by ${max}_{i = 1, \dots, m_{D}} K_{i}$ . The working set is used for testing for unboundedness of the problem and $Q$ -stationarity, respectively, by investigating the quadratic subproblems $(Q P^{k, l})$ . If one of these subproblems turns out to be unbounded below, we stop the algorithm because the whole program is unbounded below. If $x^{k}$ is a solution of every quadratic subproblem, we stop the algorithm because $x^{k}$ is $Q$ -stationary. On the other hand, if $x^{k}$ is not a solution of one of these subproblems, then we take the point $x^{k + 1}$ as the solution of this subproblem, yielding a smaller objective function value. Then, we repeat the whole procedure by generating a new working set and testing for termination.
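The loop described above can be sketched in Python. This is a reconstruction from the verbal description under simplifying assumptions, not the authors' implementation: B is diagonal with positive entries `b` (so no subproblem is unbounded and the unboundedness test of step (2) is omitted), A is the identity, every piece $D_{i}^{j}$ is a box, $J_{i}$ is taken as the full set of active pieces, and the first non-optimal subproblem is selected in step (4). All names are hypothetical.

```python
from typing import List, Sequence, Tuple

INF = float("inf")
Interval = Tuple[float, float]   # closed interval [lo, hi]; bounds may be infinite
Box = Sequence[Interval]         # one piece D_i^j, assumed to be a product of intervals

def in_box(box: Box, y: Sequence[float], tol: float = 1e-12) -> bool:
    return all(lo - tol <= c <= hi + tol for (lo, hi), c in zip(box, y))

def solve_piece(b, d, boxes: Sequence[Box]) -> List[float]:
    """Unique minimizer of 0.5*sum(b_i x_i^2) + sum(d_i x_i) over a product of boxes."""
    bounds = [interval for box in boxes for interval in box]
    return [min(max(-di / bi, lo), hi) for bi, di, (lo, hi) in zip(b, d, bounds)]

def algorithm1_sketch(b, d, D: Sequence[Sequence[Box]], x: Sequence[float],
                      max_iter: int = 1000, tol: float = 1e-12) -> List[float]:
    """D[i] lists the pieces of D_i; x is a feasible starting point."""
    x = list(x)
    for _ in range(max_iter):
        # Step (2): split x into blocks and collect the active pieces of each D_i.
        blocks, pos = [], 0
        for pieces in D:
            dim = len(pieces[0])
            blocks.append(x[pos:pos + dim])
            pos += dim
        J = [[j for j, box in enumerate(pieces) if in_box(box, y)]
             for pieces, y in zip(D, blocks)]
        if any(not Ji for Ji in J):
            raise ValueError("x is not feasible")
        K = max(len(Ji) for Ji in J)
        working_set = [[Ji[l % len(Ji)] for Ji in J] for l in range(K)]
        # Steps (3) and (4): move to the solution of the first subproblem
        # that x does not solve; stop if x solves all of them.
        for nu in working_set:
            x_new = solve_piece(b, d, [D[i][j] for i, j in enumerate(nu)])
            if max(abs(s - t) for s, t in zip(x_new, x)) > tol:
                x = x_new
                break
        else:
            return x   # x solves every subproblem: Q-stationary
    raise RuntimeError("iteration limit reached")

# One complementarity-type block (an assumption): D^1 = {0} x [0, inf), D^2 = [0, inf) x {0}.
D = [[[(0.0, 0.0), (0.0, INF)], [(0.0, INF), (0.0, 0.0)]]]
# q(x) = 0.5*||x||^2 + x_1 - 2*x_2, started at the biactive point (0, 0).
print(algorithm1_sketch([1.0, 1.0], [1.0, -2.0], D, [0.0, 0.0]))  # [0.0, 2.0]
```

In this example the starting point already solves the subproblem for the second piece but not the one for the first piece, so the iteration moves to $(0, 2)$ , where only one piece is active and the loop stops. Because each subproblem here has a unique minimizer, comparing the iterate with that minimizer is a valid optimality test; for merely positive semidefinite B, objective values would have to be compared instead.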

In the next theorem, we show that Algorithm 1 is finite. However, we do not know any nontrivial bound on the number of iterations needed, as usual for active set strategies.

Theorem 5:

Algorithm 1 terminates after a finite number of iterations either with some feasible point and some descent direction w indicating that QPDC is unbounded below or with some $Q$ -stationary solution.

Proof If Algorithm 1 terminates in step (2) the output is a feasible point together with some descent direction showing that QPDC is unbounded below. If the algorithm does not terminate in step (2), the computed sequence of function values $q (x^{k})$ is strictly decreasing. Moreover, denoting $ν^{k} : = ν^{k - 1, l}$ where l is the index chosen in step (4), we see that for each $k \geq 2$ the point $x^{k}$ is a global minimizer of the problem

\begin{matrix} min q (x) subject to A x \in D (ν^{k}) . \end{matrix}

This shows that all the vectors $ν^{k}$ must be pairwise different and since there is only a finite number of possible choices for $ν^{k}$ , the algorithm must stop in step (3). We will now show that the final iterate $x^{k}$ is $Q$ -stationary with respect to $(ν^{k, 1}, \dots, ν^{k, K})$ . Since for each $l = 1, \dots, K$ the point $x^{k}$ is a global minimizer of the subproblem $(Q P^{k, l})$ , it also satisfies the first-order optimality condition

\begin{matrix} ⟨ \nabla q (x^{k}), u ⟩ \geq 0 for every u \in R^{n} satisfying A u \in T_{D (ν^{k, l})} (A x^{k}) . \end{matrix}

This shows $Q$ -stationarity of $x^{k}$ and the theorem is proved.

6. On verifying $Q_{M}$ -stationarity for MPDC

The following theorem is crucial for the verification of M-stationarity.

Theorem 6:

(i)
Let $\bar{x}$ be feasible for the general program (7). If there exists a B-stationary solution of the program
$\begin{matrix} min_{(u, v) \in R^{n} \times R^{m}} ⟨ \nabla f (\bar{x}), u ⟩ + \frac{1}{2} {‖ v ‖}^{2} subject to \nabla F (\bar{x}) u + v \in T_{D} (F (\bar{x})), \end{matrix}$ (36)
then $\bar{x}$ is M-stationary.

(ii)
Let $\bar{x}$ be B-stationary for the MPDC (1) and assume that GGCQ holds at $\bar{x}$ . Then the program (36) has a global solution.

Proof

(i)
Let $(\bar{u}, \bar{v})$ denote a B-stationary solution, i.e. $- (\nabla f (\bar{x}), \bar{v}) \in {\hat{N}}_{Γ} (\bar{u}, \bar{v})$ , where $Γ = {(\nabla F (\bar{x}) ⋮ I)}^{- 1} T_{D} (F (\bar{x}))$ . Since the matrix $(\nabla F (\bar{x}) ⋮ I)$ obviously has full rank, we have ${\hat{N}}_{Γ} (\bar{u}, \bar{v}) = {(\nabla F (\bar{x}) ⋮ I)}^{T} {\hat{N}}_{T_{D} (F (\bar{x}))} (\nabla F (\bar{x}) \bar{u} + \bar{v})$ by [11, Exercise 6.7]. Thus there exists a multiplier $λ \in {\hat{N}}_{T_{D} (F (\bar{x}))} (\nabla F (\bar{x}) \bar{u} + \bar{v})$ such that $- \nabla f (\bar{x}) = \nabla F {(\bar{x})}^{T} λ$ and $- \bar{v} = λ$ . Using [11, Proposition 6.27] we have ${\hat{N}}_{T_{D} (F (\bar{x}))} (\nabla F (\bar{x}) \bar{u} + \bar{v}) \subset N_{T_{D} (F (\bar{x}))} (\nabla F (\bar{x}) \bar{u} + \bar{v}) \subset N_{T_{D} (F (\bar{x}))} (0) \subset N_{D} (F (\bar{x}))$ establishing M-stationarity of $\bar{x}$ .
(ii)
Consider for arbitrarily fixed $ν \in J (F (\bar{x}))$ the convex quadratic program
$\begin{matrix} min_{(u, v) \in R^{n} \times R^{m}} ⟨ \nabla f (\bar{x}), u ⟩ + \frac{1}{2} {‖ v ‖}^{2} subject to \nabla F (\bar{x}) u + v \in T_{D (ν)} (F (\bar{x})) . \end{matrix}$ (37)
Assuming that this quadratic program does not have a solution, by Lemma 4 we could find a direction $(w_{u}, w_{v})$ satisfying
$\begin{matrix} (\begin{matrix} 0 & 0 \\ 0 & I \end{matrix}) (\begin{matrix} w_{u} \\ w_{v} \end{matrix}) = 0, \nabla F (\bar{x}) w_{u} + w_{v} \in 0^{+} T_{D (ν)} (F (\bar{x})), ⟨ \nabla f (\bar{x}), w_{u} ⟩ + ⟨ 0, w_{v} ⟩ < 0 . \end{matrix}$
This implies $w_{v} = 0$ , $\nabla F (\bar{x}) w_{u} \in 0^{+} T_{D (ν)} (F (\bar{x})) = T_{D (ν)} (F (\bar{x})) \subset T_{D} (F (\bar{x}))$ and $⟨ \nabla f (\bar{x}), w_{u} ⟩ < 0$ and thus, together with GGCQ, $- \nabla f (\bar{x}) \notin {(T_{Ω}^{lin} (\bar{x}))}^{\circ} = {\hat{N}}_{Ω} (\bar{x})$ contradicting our assumption that $\bar{x}$ is B-stationary for (1). Hence, the quadratic program (37) must possess some global solution $(u_{ν}, v_{ν})$ . By choosing $\bar{ν} \in J (F (\bar{x}))$ such that $⟨ \nabla f (\bar{x}), u_{\bar{ν}} ⟩ = min \{⟨ \nabla f (\bar{x}), u_{ν} ⟩ | ν \in J (F (\bar{x}))\}$ it follows from (27) that $(u_{\bar{ν}}, v_{\bar{ν}})$ is a global solution of (36).

We now want to apply Algorithm 1 to the problem (36). Note that the point (0, 0) is feasible for (36) and therefore we can start Algorithm 1 with $(u^{1}, v^{1}) = (0, 0)$ .

Corollary 4:

Let $\bar{x}$ be feasible for the MPDC (1) and apply Algorithm 1 to the QPDC (36). If the algorithm returns an iterate together with some descent direction indicating that (36) is unbounded below and if GGCQ is fulfilled at $\bar{x}$ , then $\bar{x}$ is not B-stationary. On the other hand, if the algorithm returns a $Q$ -stationary solution, then $\bar{x}$ is M-stationary.

Proof Observe that in the case when Algorithm 1 returns a $Q$ -stationary solution, by Theorem 4(ii) this solution is B-stationary because the Jacobian of the constraints $(\nabla F (\bar{x}) ⋮ I)$ obviously has full rank. Now the statement follows from Theorem 6.

We now want to analyse how the output of Algorithm 1 can be further utilized. Recalling that $T_{D} (F (\bar{x}))$ has the disjunctive structure

\begin{matrix} T_{D} (F (\bar{x})) = \prod_{i = 1}^{m_{D}} (⋃_{j \in J_{i} (F (\bar{x}))} T_{D_{i}^{j}} (F_{i} (\bar{x}))), \end{matrix}

we define for $y = (y_{1}, \dots, y_{m_{D}}) \in T_{D} (F (\bar{x}))$ the index sets

\begin{matrix} A_{i}^{T_{D}} (y) : = \{j \in J_{i} (F (\bar{x})) | y_{i} \in T_{D_{i}^{j}} (F_{i} (\bar{x}))\}, i = 1, \dots, m_{D} . \end{matrix}

Further we choose for each $i = 1, \dots, m_{D}$ some index set $J_{i}^{T_{D}} (y) \subset A_{i}^{T_{D}} (y)$ such that

\begin{matrix} T_{T_{D_{i}} (F_{i} (\bar{x}))} (y_{i}) = ⋃_{j \in J_{i}^{T_{D}} (y)} T_{T_{D_{i}^{j}} (F_{i} (\bar{x}))} (y_{i}) \end{matrix}

(38)

and set

\begin{matrix} J^{T_{D}} (y) : = \prod_{i = 1}^{m_{D}} J_{i}^{T_{D}} (y) . \end{matrix}

Note that we always have

\begin{matrix} J^{T_{D}} (y) \subset J (F (\bar{x})) . \end{matrix}

In order to verify $Q$ -stationarity for the problem (36) at some feasible point (u, v), we have to consider the set $Q^{T_{D}} (u, v)$ consisting of all $(ν^{1}, \dots, ν^{K})$ with $ν^{l} \in J^{T_{D}} (\nabla F (\bar{x}) u + v)$ , $l = 1, \dots, K$ such that

\begin{matrix} \{ν_{i}^{1}, \dots, ν_{i}^{K}\} = J_{i}^{T_{D}} (\nabla F (\bar{x}) u + v), i = 1, \dots, m_{D} . \end{matrix}

At the k-th iterate $(u^{k}, v^{k})$ we have to choose $(ν^{k, 1}, \dots, ν^{k, K}) \in Q^{T_{D}} (u^{k}, v^{k})$ and then for each $l = 1, \dots, K$ we must analyse the convex quadratic program

\begin{matrix} (Q P^{k, l}) min_{u, v} ⟨ \nabla f (\bar{x}), u ⟩ + \frac{1}{2} {‖ v ‖}^{2} subject to \nabla F (\bar{x}) u + v \in T_{D (ν^{k, l})} (F (\bar{x})) . \end{matrix}

If for some $\bar{l} \in \{1, \dots, K\}$ this quadratic program is unbounded below then Algorithm 1 returns the index $\bar{ν} : = ν^{k, \bar{l}}$ together with a descent direction $(w_{u}, w_{v})$ fulfilling, as argued in the proof of Theorem 6(ii),

\begin{matrix} w_{v} = 0, \nabla F (\bar{x}) w_{u} \in 0^{+} T_{D (\bar{ν})} (F (\bar{x})) = T_{D (\bar{ν})} (F (\bar{x})), ⟨ \nabla f (\bar{x}), w_{u} ⟩ < 0 . \end{matrix}

Therefore, $w_{u}$ constitutes a feasible descent direction, provided GACQ holds at $\bar{x}$ , i.e. for every $α > 0$ sufficiently small the projection of $\bar{x} + α w_{u}$ on the feasible set $F^{- 1} (D)$ yields a point with a smaller objective function value than $\bar{x}$ . If GACQ also holds for the constraint $F (x) \in D (\bar{ν})$ at $\bar{x}$ , then we can also project the point $\bar{x} + α w_{u}$ on $F^{- 1} (D (\bar{ν}))$ in order to reduce the objective function.

Now, assume that the final iterate $(u^{k}, v^{k})$ of Algorithm 1 is $Q$ -stationary for (36) and consequently $\bar{x}$ is M-stationary for the MPDC (1). Setting $λ : = - v^{k}$ , the first order optimality conditions for the quadratic programs $(Q P^{k, l})$ result in

\begin{matrix} - \nabla f (\bar{x}) & = \nabla F {(\bar{x})}^{T} λ, \\ λ & \in & ⋂_{l = 1}^{K} N_{T_{D (ν^{k, l})} (F (\bar{x}))} (\nabla F (\bar{x}) u^{k} + v^{k}) = {\hat{N}}_{T_{D} (F (\bar{x}))} (\nabla F (\bar{x}) u^{k} + v^{k}) \subset N_{D} (F (\bar{x})) . \end{matrix}

From this we conclude $- \nabla f (\bar{x}) \in \nabla F {(\bar{x})}^{T} (Q_{1}^{\circ} \cap N_{D} (F (\bar{x})))$ with $Q_{1} : = T_{D (\bar{ν})} (F (\bar{x})) \subset T_{T_{D (\bar{ν})} (F (\bar{x}))} (\nabla F (\bar{x}) u^{k} + v^{k})$ where $\bar{ν} = ν^{k, 1}$ is the index vector returned from Algorithm 1. Now choosing $ν^{2}, \dots, ν^{K}$ such that $(\bar{ν}, ν^{2}, \dots, ν^{K}) \in Q (\bar{x})$ we can simply check by testing $- \nabla f (\bar{x}) \in \nabla F {(\bar{x})}^{T} N_{D (ν^{l})} (F (\bar{x}))$ , $l = 2, \dots, K$ , whether $\bar{x}$ is $Q_{M}$ -stationary or $\bar{x}$ is not B-stationary.

Further, we have the following corollary.

Corollary 5:

Let $\bar{x}$ be B-stationary for the MPDC (1) and assume that GGCQ is fulfilled at $\bar{x}$ . Let $\bar{ν}$ be the index vector returned by Algorithm 1 applied to (36). Then $\bar{ν} \in J (F (\bar{x}))$ and for every $ν^{2}, \dots, ν^{K}$ with $(\bar{ν}, ν^{2}, \dots, ν^{K}) \in Q (\bar{x})$ the point $\bar{x}$ is $Q_{M}$ -stationary with respect to $(\bar{ν}, ν^{2}, \dots, ν^{K})$ .

7. Numerical aspects

In practice, the point $\bar{x}$ which should be checked for M-stationarity and $Q_{M}$ -stationarity, respectively, often is not known exactly. E.g. $\bar{x}$ can be the limit point of a sequence generated by some numerical method for solving MPDC. Hence, let us assume that we are given some point $\tilde{x}$ close to $\bar{x}$ and we want to state some rules when we can consider $\tilde{x}$ as approximately M-stationary or $Q_{M}$ -stationary. Let us assume that the convex polyhedral sets $D_{i}^{j}$ have the representation

\begin{matrix} D_{i}^{j} = \{y | ⟨ a_{l}^{i, j}, y ⟩ \leq b_{l}^{i, j}, l = 1, \dots, p^{i, j}\}, i = 1, \dots, m_{D}, j = 1, \dots, K_{i}, \end{matrix}

where without loss of generality we assume $‖ a_{l}^{i, j} ‖ = 1$ .

We use here the following approach.

In the first step of Algorithm 2, we want to estimate the tangent cone $T_{D} (F (\bar{x}))$ . In fact, to calculate $T_{D} (F (\bar{x}))$ we do not need to know the point $F (\bar{x})$ ; we only need the index sets of constraints active at $\bar{x}$ , and these index sets are approximated by $ϵ$ -active constraints. Note that whenever ${\tilde{A}}_{i} (\tilde{x}, ϵ) = {\tilde{A}}_{i} (\bar{x}, 0) = A_{i} (F (\bar{x}))$ and ${\tilde{I}}_{i}^{j} (\tilde{x}, ϵ) = {\tilde{I}}_{i}^{j} (\bar{x}, 0)$ , $i = 1, \dots, m_{D}$ , $j \in A_{i} (F (\bar{x}))$ this approach yields the exact tangent cones $T_{D_{i}^{j}} (F (\bar{x})) = T_{i}^{j} (\tilde{x}, ϵ)$ for all $i = 1, \dots, m_{D}$ , $j \in A_{i} (F (\bar{x}))$ . To be consistent with the notation of Section 4 we make the convention that in this case the index vector $\bar{ν}$ computed in step (2) belongs to $J (\bar{x})$ and also, whenever we determine $ν^{2}, \dots, ν^{K}$ in step (4), we have $(\bar{ν}, ν^{2}, \dots, ν^{K}) \in Q (\bar{x})$ . The regularization term $\frac{σ}{2} {‖ u ‖}^{2}$ in $Q P D C (\tilde{x}, ϵ, σ)$ forces the objective to be strictly convex and therefore Algorithm 1 will always terminate with a $Q$ -stationary solution. Further note that the verification of (39) requires the solution of $K - 1$ linear optimization problems.
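The identification of $ϵ$ -active inequalities of a single piece can be sketched in Python. The precise definition of ${\tilde{I}}_{i}^{j} (\tilde{x}, ϵ)$ belongs to Algorithm 2; the sketch assumes the form suggested by the estimates in the proof of Theorem 7(i), namely that the index $l$ is $ϵ$ -active when $⟨ a_{l}^{i, j}, F_{i} (\tilde{x}) ⟩ \geq b_{l}^{i, j} - ϵ$ with normalized $a_{l}^{i, j}$ . The function name and the data are hypothetical.

```python
from typing import List, Sequence

def eps_active_inequalities(rows: Sequence[Sequence[float]],
                            bounds: Sequence[float],
                            y: Sequence[float],
                            eps: float) -> List[int]:
    """1-based indices l with <a_l, y> >= b_l - eps (assumed definition).

    rows[l-1] is the unit normal a_l and bounds[l-1] the right-hand side b_l
    of the inequality <a_l, y> <= b_l describing one piece.
    """
    return [l for l, (a, b) in enumerate(zip(rows, bounds), start=1)
            if sum(ai * yi for ai, yi in zip(a, y)) >= b - eps]

# Piece {y : y_1 <= 0, y_2 <= 0}; y plays the role of F_i at the approximate point.
rows = [[1.0, 0.0], [0.0, 1.0]]
bounds = [0.0, 0.0]
y = [-1e-4, -0.5]

print(eps_active_inequalities(rows, bounds, y, eps=1e-3))  # [1]
print(eps_active_inequalities(rows, bounds, y, eps=0.0))   # []
```

With $ϵ = 0$ no inequality is detected as active at the perturbed point, whereas a tolerance larger than the perturbation recovers the inequality that would be active at the limit point.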

The following theorem justifies Algorithm 2. In the sequel, we denote by $M (\bar{x})$ $(M_{s u b} (\bar{x}))$ the set of all $ν \in J (\bar{x})$ such that the mapping $F (\cdot) - D (ν)$ is metrically regular near $(\bar{x}, 0)$ (metrically subregular at $(\bar{x}, 0)$ ).

Theorem 7:

Let $\bar{x}$ be feasible for the MPDC (1) and assume that $\nabla f$ and $\nabla F$ are Lipschitz near $\bar{x}$ . Consider sequences $x_{t} \to \bar{x}$ , $ϵ_{t} ↓ 0$ , $σ_{t} ↓ 0$ and $η_{t} ↓ 0$ with

$\begin{matrix} lim_{t \to \infty} \frac{‖ x_{t} - \bar{x} ‖}{ϵ_{t}} = lim_{t \to \infty} \frac{\frac{σ_{t}}{η_{t}} + ‖ x_{t} - \bar{x} ‖}{η_{t}} = 0 \end{matrix}$

and let $({\tilde{u}}_{t}, {\tilde{v}}_{t})$ , ${\bar{ν}}^{t}$ and, where applicable, $ν^{t, 2}, \dots, ν^{t, K_{t}}$ and ${\bar{l}}_{t}$ denote the output of Algorithm 2 with input data $(x_{t}, ϵ_{t}, σ_{t}, η_{t})$ .

(i)
For all t sufficiently large and for all $i \in \{1, \dots, m_{D}\}$ we have
$\begin{matrix} {\tilde{A}}_{i} (x_{t}, ϵ_{t}) = A_{i} (F (\bar{x})), {\tilde{I}}_{i}^{j} (x_{t}, ϵ_{t}) = {\tilde{I}}_{i}^{j} (\bar{x}, 0), j \in A_{i} (F (\bar{x})) . \end{matrix}$ (40)

(ii)
Assume that the mapping $x ⇉ F (x) - D$ is metrically regular near $(\bar{x}, 0)$ .

(a)
If $\bar{x}$ is B-stationary then for all t sufficiently large the point $x_{t}$ is accepted as approximately M-stationary and approximately $Q_{M}$ -stationary.

(b)
If for infinitely many t the point $x_{t}$ is accepted as approximately M-stationary then $\bar{x}$ is M-stationary.

(c)
If for infinitely many t the point $x_{t}$ is accepted as approximately $Q_{M}$ -stationary and $\{{\bar{ν}}^{t}, ν^{t, 2}, \dots, ν^{t, K_{t}}\} \subset M (\bar{x})$ then $\bar{x}$ is $Q_{M}$ -stationary.

(d)
For every t sufficiently large such that the point $x_{t}$ is not accepted as approximately M-stationary and ${\bar{ν}}^{t} \in M_{s u b} (\bar{x})$ we have $min \{f (x) | F (x) \in D ({\bar{ν}}^{t})\} < f (\bar{x})$ .

(e)
For every t sufficiently large such that the point $x_{t}$ is not accepted as approximately $Q_{M}$ -stationary and $ν^{t, {\bar{l}}_{t}} \in M_{s u b} (\bar{x})$ we have $min \{f (x) | F (x) \in D (ν^{t, {\bar{l}}_{t}})\} < f (\bar{x})$ .
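The coupling of the sequences required in Theorem 7 can be checked numerically for a concrete choice. As a purely illustrative assumption (not a recommendation taken from the text), write $δ_{t} : = ‖ x_{t} - \bar{x} ‖$ and set $ϵ_{t} = δ_{t}^{1 / 2}$ , $σ_{t} = δ_{t}$ , $η_{t} = δ_{t}^{1 / 3}$ . Both quotients in the limit condition then equal $δ_{t}^{1 / 2}$ and $δ_{t}^{1 / 3} + δ_{t}^{2 / 3}$ , respectively, and tend to zero. The function name below is hypothetical.

```python
from typing import Tuple

def coupling_ratios(delta: float) -> Tuple[float, float]:
    """Return the two quotients from the limit condition of Theorem 7.

    delta stands for ||x_t - x_bar||; the parameter rules below are an
    illustrative assumption.
    """
    eps = delta ** 0.5           # eps_t
    sigma = delta                # sigma_t
    eta = delta ** (1.0 / 3.0)   # eta_t
    first = delta / eps
    second = (sigma / eta + delta) / eta
    return first, second

for delta in (1e-2, 1e-4, 1e-8):
    print(delta, coupling_ratios(delta))
```

In practice $δ_{t}$ is unknown, so such rules can only be tied to an estimate of the distance to the limit point, for instance one derived from the convergence rate of the underlying method.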

Proof (i) Let $R > 0$ be chosen such that f, F and their derivatives are Lipschitz on $B (\bar{x}, R)$ with constant L. It is easy to see that we can choose $ϵ > 0$ such that for all $i \in \{1, \dots, m_{D}\}$ we have ${\tilde{A}}_{i} (\bar{x}, ϵ) = {\tilde{A}}_{i} (\bar{x}, 0) = A_{i} (F (\bar{x}))$ and such that for every $j \in A_{i} (F (\bar{x}))$ we have ${\tilde{I}}_{i}^{j} (\bar{x}, ϵ) = {\tilde{I}}_{i}^{j} (\bar{x}, 0)$ . Consider t with $‖ x_{t} - \bar{x} ‖ < R$ , $L ‖ x_{t} - \bar{x} ‖ < ϵ_{t} < ϵ / 2$ and fix $i \in \{1, \dots, m_{D}\}$ . For every $j \in A_{i} (F (\bar{x}))$ we have

\begin{matrix} d (F_{i} (x_{t}), D_{i}^{j}) \leq ‖ F_{i} (x_{t}) - F_{i} (\bar{x}) ‖ \leq L ‖ x_{t} - \bar{x} ‖ < ϵ_{t}, \end{matrix}

whereas for $j \notin A_{i} (F (\bar{x}))$ we have

\begin{matrix} d (F_{i} (x_{t}), D_{i}^{j}) \geq d (F_{i} (\bar{x}), D_{i}^{j}) - ‖ F_{i} (x_{t}) - F_{i} (\bar{x}) ‖ \geq ϵ - L ‖ x_{t} - \bar{x} ‖ > ϵ_{t} \end{matrix}

showing ${\tilde{A}}_{i} (x_{t}, ϵ_{t}) = A_{i} (F (\bar{x}))$ . Now fix $j \in A_{i} (F (\bar{x}))$ and let $l \in {\tilde{I}}_{i}^{j} (\bar{x}, 0)$ , i.e. $⟨ a_{l}^{i, j}, F_{i} (\bar{x}) ⟩ = b_{l}^{i, j}$ . By taking into account $‖ a_{l}^{i, j} ‖ = 1$ we obtain

\begin{matrix} ⟨ a_{l}^{i, j}, F_{i} (x_{t}) ⟩ \geq b_{l}^{i, j} - ‖ F_{i} (x_{t}) - F_{i} (\bar{x}) ‖ > b_{l}^{i, j} - ϵ_{t} \end{matrix}

implying $l \in {\tilde{I}}_{i}^{j} (x_{t}, ϵ_{t})$ , whereas for $l \notin {\tilde{I}}_{i}^{j} (\bar{x}, 0) = {\tilde{I}}_{i}^{j} (\bar{x}, ϵ)$ we have

\begin{matrix} ⟨ a_{l}^{i, j}, F_{i} (x_{t}) ⟩ \leq ⟨ a_{l}^{i, j}, F_{i} (\bar{x}) ⟩ + ‖ F_{i} (x_{t}) - F_{i} (\bar{x}) ‖ < b_{l}^{i, j} - ϵ + ϵ_{t} < b_{l}^{i, j} - ϵ_{t} \end{matrix}

showing $l \notin {\tilde{I}}_{i}^{j} (x_{t}, ϵ_{t})$ . Hence, ${\tilde{I}}_{i}^{j} (x_{t}, ϵ_{t}) = {\tilde{I}}_{i}^{j} (\bar{x}, 0)$ . Because of our assumptions we have $‖ x_{t} - \bar{x} ‖ < R$ and $L ‖ x_{t} - \bar{x} ‖ < ϵ_{t} < ϵ / 2$ for all t sufficiently large and this proves (39).
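The index-set identification in part (i) can be illustrated numerically. The following is a minimal sketch, assuming that the approximate active set collects the rows l with $⟨ a_{l}, y ⟩ \geq b_{l} - ϵ$ (as suggested by the inequalities above); the helper `approx_active_rows` and the sample polyhedron are hypothetical and not taken from the paper.

```python
import math

def approx_active_rows(A, b, y, eps):
    """Return {l : <a_l, y> >= b_l - eps} for the polyhedron {y : <a_l, y> <= b_l}."""
    active = set()
    for l, (a_l, b_l) in enumerate(zip(A, b)):
        if sum(ai * yi for ai, yi in zip(a_l, y)) >= b_l - eps:
            active.add(l)
    return active

# Hypothetical data: three unit-norm rows describing a polyhedron in R^2.
s = 1.0 / math.sqrt(2.0)
A = [(1.0, 0.0), (0.0, 1.0), (s, s)]
b = [0.0, 0.0, s]

y_bar = (0.0, -0.5)                            # stands for F_i(x_bar)
exact = approx_active_rows(A, b, y_bar, 0.0)   # rows that are exactly active at y_bar

# Every inactive row has slack of at least 0.5 at y_bar, so any eps < 0.5 is admissible.
eps = 0.45
results = []
for t in range(1, 8):
    eps_t = 0.2 / t                            # eps_t < eps / 2 and eps_t -> 0
    y_t = (y_bar[0] - 0.5 * eps_t, y_bar[1] + 0.5 * eps_t)
    # The perturbation stays below eps_t, mirroring L * ||x_t - x_bar|| < eps_t.
    assert math.dist(y_t, y_bar) < eps_t < eps / 2
    results.append(approx_active_rows(A, b, y_t, eps_t))

print(exact, all(r == exact for r in results))
```

In this toy instance the approximate index set at each perturbed point coincides with the exact active set at the limit point, which is the behaviour asserted in (i).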

(ii) In view of Proposition 2, we can choose $κ$ large enough such that the mappings $F (\cdot) - D$ , $u ⇉ \nabla F (\bar{x}) u - T_{D} (F (\bar{x}))$ and $F (\cdot) - D (ν)$ , $u ⇉ \nabla F (\bar{x}) u - T_{D (ν)} (F (\bar{x}))$ , $ν \in M (\bar{x})$ are metrically regular near $(\bar{x}, 0)$ with modulus $κ$ . By shrinking R if necessary, we can assume that for every $x \in B (\bar{x}, R)$ the mappings $u ⇉ \nabla F (x) u - T_{D} (F (\bar{x}))$ , $u ⇉ \nabla F (x) u - T_{D (ν)} (F (\bar{x}))$ , $ν \in M (\bar{x})$ are metrically regular near (0, 0) with modulus $κ + 1$ .

Without loss of generality we can assume that $x_{t} \in B (\bar{x}, R)$ and (39) holds for all t, implying that $T_{D_{i}^{j}} (F (\bar{x})) = T_{i}^{j} (x_{t}, ϵ_{t})$ holds for all $i = 1, \dots, m_{D}$ , $j \in A_{i} (F (\bar{x}))$ . In fact, then the problem $Q P D C (x_{t}, ϵ_{t}, σ_{t})$ is the same as

\begin{matrix} min_{u, v} ⟨ \nabla f (x_{t}), u ⟩ + \frac{σ_{t}}{2} {‖ u ‖}^{2} + \frac{1}{2} {‖ v ‖}^{2} subject to \nabla F (x_{t}) u + v \in T_{D} (F (\bar{x})) . \end{matrix}

The point $({\tilde{u}}_{t}, {\tilde{v}}_{t})$ is $Q$ -stationary for this program and thus also S-stationary by Theorem 4(ii) and the full rank property of the matrix $(\nabla F (x_{t}) ⋮ I)$ . Hence, there is a multiplier $λ_{t} \in {\hat{N}}_{T_{D} (F (\bar{x}))} (\nabla F (x_{t}) {\tilde{u}}_{t} + {\tilde{v}}_{t}) \subset N_{T_{D} (F (\bar{x}))} (0)$ fulfilling ${\tilde{v}}_{t} + λ_{t} = 0$ , $\nabla f (x_{t}) + σ_{t} {\tilde{u}}_{t} + \nabla F {(x_{t})}^{T} λ_{t} = 0$ and we conclude

\begin{matrix} ‖ {\tilde{v}}_{t} ‖ = ‖ λ_{t} ‖ \leq (κ + 1) ‖ \nabla f (x_{t}) + σ_{t} {\tilde{u}}_{t} ‖ \end{matrix}

(40)

from (18).

By $Q$ -stationarity of $({\tilde{u}}_{t}, {\tilde{v}}_{t})$ we know that $({\tilde{u}}_{t}, {\tilde{v}}_{t})$ is the unique solution of the strictly convex quadratic program

\begin{matrix} min ⟨ \nabla f (x_{t}), u ⟩ + \frac{σ_{t}}{2} {‖ u ‖}^{2} + \frac{1}{2} {‖ v ‖}^{2} subject to \nabla F (x_{t}) u + v \in T_{D ({\bar{ν}}^{t})} (F (\bar{x})) . \end{matrix}

(41)

For every $α \geq 0$ , the point $α ({\tilde{u}}_{t}, {\tilde{v}}_{t})$ is feasible for this quadratic program and thus $α = 1$ is a solution of

\begin{matrix} min_{α \geq 0} α ⟨ \nabla f (x_{t}), {\tilde{u}}_{t} ⟩ + α^{2} (\frac{σ_{t}}{2} ‖ {\tilde{u}}_{t} ‖^{2} + \frac{1}{2} {‖ {\tilde{v}}_{t} ‖}^{2}) \end{matrix}

implying

\begin{matrix} - ⟨ \nabla f (x_{t}), {\tilde{u}}_{t} ⟩ = σ_{t} ‖ {\tilde{u}}_{t} ‖^{2} + {‖ {\tilde{v}}_{t} ‖}^{2} . \end{matrix}
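The step from the optimality of $α = 1$ to this identity can be spelled out as follows; the auxiliary function $φ_{t}$ is introduced here only for illustration and does not appear elsewhere in the paper.

\begin{matrix} φ_{t} (α) : = α ⟨ \nabla f (x_{t}), {\tilde{u}}_{t} ⟩ + α^{2} (\frac{σ_{t}}{2} ‖ {\tilde{u}}_{t} ‖^{2} + \frac{1}{2} ‖ {\tilde{v}}_{t} ‖^{2}), \\ 0 = φ_{t}^{\prime} (1) = ⟨ \nabla f (x_{t}), {\tilde{u}}_{t} ⟩ + σ_{t} ‖ {\tilde{u}}_{t} ‖^{2} + ‖ {\tilde{v}}_{t} ‖^{2} . \end{matrix}

Since $φ_{t}$ is convex and differentiable and its minimizer $α = 1$ over $α \geq 0$ lies in the interior of the half-line, the derivative must vanish there, which is the stated identity.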

Hence,

\begin{matrix} σ_{t} ‖ {\tilde{u}}_{t} ‖ \leq - ⟨ \nabla f (x_{t}), \frac{{\tilde{u}}_{t}}{‖ {\tilde{u}}_{t} ‖} ⟩ \leq ‖ \nabla f (x_{t}) ‖ \end{matrix}

(42)

and from (40), the triangle inequality and (42) we obtain

\begin{matrix} ‖ {\tilde{v}}_{t} ‖ = ‖ λ_{t} ‖ \leq 2 (κ + 1) ‖ \nabla f (x_{t}) ‖ . \end{matrix}

(43)
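As a sanity check of the relations above, the following sketch solves a hypothetical one-dimensional instance (n = m = 1, tangent cone $R_{-}$ , scalars g and a standing in for $\nabla f (x_{t})$ and $\nabla F (x_{t})$ ) in closed form via its KKT system. The function `solve_toy_qp` and the sample data are illustrative assumptions and not part of Algorithm 2.

```python
def solve_toy_qp(g, a, sigma):
    """Solve min_{u,v} g*u + sigma/2*u**2 + 1/2*v**2  s.t.  a*u + v <= 0.

    KKT system: g + sigma*u + a*lam = 0, v + lam = 0, lam >= 0, lam*(a*u + v) = 0.
    """
    u = -g / sigma                     # candidate with inactive constraint (v = 0, lam = 0)
    if a * u <= 0.0:
        return u, 0.0, 0.0
    lam = -a * g / (sigma + a * a)     # multiplier when the constraint is active
    u = -(g + a * lam) / sigma
    return u, -lam, lam

# Hypothetical sample data (g, a, sigma); the first and third activate the constraint.
cases = [(1.0, -2.0, 0.5), (1.0, 2.0, 0.5), (-3.0, 1.0, 2.0)]

checks = []
for g, a, sigma in cases:
    u, v, lam = solve_toy_qp(g, a, sigma)
    identity_gap = abs(-g * u - (sigma * u * u + v * v))   # -<grad f, u> = sigma*|u|^2 + |v|^2
    bound_42 = sigma * abs(u) <= abs(g) + 1e-12            # sigma*|u| <= |grad f|
    checks.append((identity_gap, bound_42, abs(v) == abs(lam)))

print(checks)
```

The feasible set here is a cone, so scaling the solution by $α \geq 0$ keeps it feasible, which is exactly the property the derivation uses.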

(a) Assume on the contrary that $\bar{x}$ is B-stationary but for infinitely many t the point $x_{t}$ is not accepted as approximately M-stationary and hence $‖ {\tilde{u}}_{t} ‖ \geq η_{t} / σ_{t}$ . This implies

\begin{matrix} d (\nabla F (\bar{x}) \frac{{\tilde{u}}_{t}}{‖ {\tilde{u}}_{t} ‖}, T_{D} (F (\bar{x}))) & \leq d (\nabla F (x_{t}) \frac{{\tilde{u}}_{t}}{‖ {\tilde{u}}_{t} ‖}, T_{D} (F (\bar{x}))) + L ‖ x_{t} - \bar{x} ‖ \\ & \leq \frac{‖ {\tilde{v}}_{t} ‖}{‖ {\tilde{u}}_{t} ‖} + L ‖ x_{t} - \bar{x} ‖ \\ & \leq 2 (κ + 1) ‖ \nabla f (x_{t}) ‖ \frac{σ_{t}}{η_{t}} + L ‖ x_{t} - \bar{x} ‖ \end{matrix}

and by the metric regularity of $u ⇉ \nabla F (\bar{x}) u - T_{D} (F (\bar{x}))$ near (0, 0) we can find ${\hat{u}}_{t} \in \nabla F {(\bar{x})}^{- 1} T_{D} (F (\bar{x}))$ with

\begin{matrix} ‖ {\hat{u}}_{t} - \frac{{\tilde{u}}_{t}}{‖ {\tilde{u}}_{t} ‖} ‖ \leq κ (2 (κ + 1) ‖ \nabla f (x_{t}) ‖ \frac{σ_{t}}{η_{t}} + L ‖ x_{t} - \bar{x} ‖) . \end{matrix}

Our choice of the parameters $σ_{t}$ , $η_{t}$ together with (42) ensures that for t sufficiently large we have

\begin{matrix} ⟨ \nabla f (\bar{x}), {\hat{u}}_{t} ⟩ & \leq ⟨ \nabla f (\bar{x}), \frac{{\tilde{u}}_{t}}{‖ {\tilde{u}}_{t} ‖} ⟩ + ‖ \nabla f (\bar{x}) ‖ ‖ {\hat{u}}_{t} - \frac{{\tilde{u}}_{t}}{‖ {\tilde{u}}_{t} ‖} ‖ \\ & \leq ⟨ \nabla f (x_{t}), \frac{{\tilde{u}}_{t}}{‖ {\tilde{u}}_{t} ‖} ⟩ + L ‖ x_{t} - \bar{x} ‖ + ‖ \nabla f (\bar{x}) ‖ ‖ {\hat{u}}_{t} - \frac{{\tilde{u}}_{t}}{‖ {\tilde{u}}_{t} ‖} ‖ \\ & \leq - σ_{t} ‖ {\tilde{u}}_{t} ‖ + L ‖ x_{t} - \bar{x} ‖ + ‖ \nabla f (\bar{x}) ‖ ‖ {\hat{u}}_{t} - \frac{{\tilde{u}}_{t}}{‖ {\tilde{u}}_{t} ‖} ‖ \\ & \leq - η_{t} + L ‖ x_{t} - \bar{x} ‖ + ‖ \nabla f (\bar{x}) ‖ κ (2 (κ + 1) ‖ \nabla f (x_{t}) ‖ \frac{σ_{t}}{η_{t}} + L ‖ x_{t} - \bar{x} ‖) < 0 \end{matrix}

which contradicts B-stationarity of $\bar{x}$ . Hence, for all t sufficiently large the point $x_{t}$ must be accepted as approximately M-stationary.

To prove the statement that $x_{t}$ is also accepted as approximately $Q_{M}$ -stationary for all t sufficiently large we can proceed in a similar way. Assume on the contrary that $\bar{x}$ is B-stationary but for infinitely many t the point $x_{t}$ is not accepted as approximately $Q_{M}$ -stationary. For those t, let $w_{t}$ denote some element fulfilling $\nabla F (x_{t}) w_{t} \in T_{D (ν^{t, {\bar{l}}_{t}})} (F (\bar{x})) \subset T_{D} (F (\bar{x}))$ , $‖ w_{t} ‖_{\infty} \leq 1$ and $⟨ \nabla f (x_{t}), w_{t} ⟩ \leq - η_{t}$ . Then, similarly as before, we can find ${\hat{w}}_{t} \in \nabla F {(\bar{x})}^{- 1} T_{D} (F (\bar{x}))$ such that

\begin{matrix} ‖ {\hat{w}}_{t} - w_{t} ‖ \leq κ ‖ \nabla F (\bar{x}) - \nabla F (x_{t}) ‖ ‖ w_{t} ‖ \leq κ L \sqrt{n} ‖ x_{t} - \bar{x} ‖ \end{matrix}

and for large t we obtain

\begin{matrix} ⟨ \nabla f (\bar{x}), {\hat{w}}_{t} ⟩ & \leq ⟨ \nabla f (x_{t}), w_{t} ⟩ + ‖ \nabla f (\bar{x}) - \nabla f (x_{t}) ‖ ‖ w_{t} ‖ + ‖ \nabla f (\bar{x}) ‖ ‖ {\hat{w}}_{t} - w_{t} ‖ \\ & \leq - η_{t} + L \sqrt{n} (1 + κ ‖ \nabla f (\bar{x}) ‖) ‖ x_{t} - \bar{x} ‖ < 0 \end{matrix}

contradicting B-stationarity of $\bar{x}$ .

(b) By passing to a subsequence we can assume that for all t the point $x_{t}$ is accepted as approximately M-stationary and hence $σ_{t} ‖ {\tilde{u}}_{t} ‖ \leq η_{t} \to 0$ . By (43) we have that the sequence $λ_{t} \in N_{T_{D} (F (\bar{x}))} (0)$ is uniformly bounded and by passing to a subsequence once more we can assume that it converges to some $\bar{λ} \in N_{T_{D} (F (\bar{x}))} (0)$ . By [11, Proposition 6.27] we have $\bar{λ} \in N_{D} (F (\bar{x}))$ and together with

\begin{matrix} 0 = lim_{t \to \infty} (\nabla f (x_{t}) + \nabla F {(x_{t})}^{T} λ_{t}) = \nabla f (\bar{x}) + \nabla F {(\bar{x})}^{T} \bar{λ} \end{matrix}

M-stationarity of $\bar{x}$ is established.

(c) By passing to a subsequence we can assume that for all t the point $x_{t}$ is accepted as approximately $Q_{M}$ -stationary and $\{{\bar{ν}}^{t}, ν^{t, 2}, \dots, ν^{t, K_{t}}\} \subset M (\bar{x})$ . Hence, for all t the point $x_{t}$ is also accepted as approximately M-stationary and by passing to a subsequence and arguing as in (b) we can assume that $λ_{t}$ converges to some $\bar{λ} \in N_{D} (F (\bar{x}))$ fulfilling $\nabla f (\bar{x}) + \nabla F {(\bar{x})}^{T} \bar{λ} = 0$ . Since the set $M (\bar{x})$ is finite, by passing to a subsequence once more we can assume that there is a number K and elements $\bar{ν}, ν^{2}, \dots, ν^{K}$ such that $K_{t} = K$ , ${\bar{ν}}^{t} = \bar{ν}$ and $ν^{t, l} = ν^{l}$ , $l = 2, \dots, K$ hold for all t. Since we assume that (39) holds we have $(\bar{ν}, ν^{2}, \dots, ν^{K}) \in Q (\bar{x})$ and we will now show that $\bar{x}$ is $Q_{M}$ -stationary with respect to $(\bar{ν}, ν^{2}, \dots, ν^{K})$ . Since $({\tilde{u}}_{t}, {\tilde{v}}_{t})$ also solves (41), it follows that $λ_{t} = - {\tilde{v}}_{t} \in N_{T_{D (\bar{ν})} (F (\bar{x}))} (\nabla F (x_{t}) {\tilde{u}}_{t} + {\tilde{v}}_{t}) \subset N_{D (\bar{ν})} (F (\bar{x}))$ and thus $\bar{λ} \in N_{D} (F (\bar{x})) \cap N_{D (\bar{ν})} (F (\bar{x}))$ implying $- \nabla f (\bar{x}) \in \nabla F {(\bar{x})}^{T} (N_{D} (F (\bar{x})) \cap (T_{D (\bar{ν})} (F (\bar{x})))^{\circ})$ . It remains to show $- \nabla f (\bar{x}) \in (T_{D (ν^{l})} (F (\bar{x})))^{\circ} = N_{D (ν^{l})} (F (\bar{x}))$ , $l = 2, \dots, K$ . Assume on the contrary that $- \nabla f (\bar{x}) \notin (T_{D (ν^{\bar{l}})} (F (\bar{x})))^{\circ}$ for some index $\bar{l} \in \{2, \dots, K\}$ . Then there is some $u \in \nabla F {(\bar{x})}^{- 1} T_{D (ν^{\bar{l}})} (F (\bar{x}))$ , ${‖ u ‖}_{\infty} = \frac{1}{2}$ such that $⟨ \nabla f (\bar{x}), u ⟩ = : - γ < 0$ and since $ν^{\bar{l}} \in M (\bar{x})$ , for each t there is some ${\hat{u}}_{t} \in \nabla F {(x_{t})}^{- 1} T_{D (ν^{\bar{l}})} (F (\bar{x}))$ with

\begin{matrix} ‖ u - {\hat{u}}_{t} ‖ \leq (κ + 1) ‖ \nabla F (\bar{x}) - \nabla F (x_{t}) ‖ ‖ u ‖ \leq \frac{\sqrt{n}}{2} (κ + 1) L ‖ x_{t} - \bar{x} ‖ . \end{matrix}

It follows that for all t sufficiently large we have $‖ {\hat{u}}_{t} ‖_{\infty} \leq 1$ and

\begin{matrix} ⟨ \nabla f (x_{t}), {\hat{u}}_{t} ⟩ & \leq ⟨ \nabla f (\bar{x}), u ⟩ + ‖ \nabla f (x_{t}) - \nabla f (\bar{x}) ‖ ‖ {\hat{u}}_{t} ‖ + ‖ \nabla f (\bar{x}) ‖ ‖ u - {\hat{u}}_{t} ‖ \\ & \leq - γ + L \sqrt{n} (1 + \frac{κ + 1}{2} ‖ \nabla f (\bar{x}) ‖) ‖ x_{t} - \bar{x} ‖ < - η_{t} \end{matrix}

contradicting our assumption that $x_{t}$ is accepted as approximately $Q_{M}$ -stationary.

(d), (e) We assume that $κ$ is chosen large enough such that the mappings $F (\cdot) - D (ν)$ , $ν \in M_{s u b} (\bar{x})$ are metrically subregular at $(\bar{x}, 0)$ with modulus $κ$ . Then by [21, Proposition 2.1] the mappings $u ⇉ \nabla F (\bar{x}) u - T_{D (ν)} (F (\bar{x}))$ , $ν \in M_{s u b} (\bar{x})$ are metrically subregular at (0, 0) with modulus $κ$ as well. Taking into account that $({\tilde{u}}_{t}, {\tilde{v}}_{t})$ solves (41), we can copy the arguments from part (a) with $T_{D} (F (\bar{x}))$ replaced by $T_{D ({\bar{ν}}^{t})} (F (\bar{x}))$ to show the existence of ${\hat{u}}_{t} \in \nabla F {(\bar{x})}^{- 1} T_{D ({\bar{ν}}^{t})} (F (\bar{x}))$ with $⟨ \nabla f (\bar{x}), {\hat{u}}_{t} ⟩ < 0$ whenever $x_{t}$ is not accepted as approximately M-stationary and t is sufficiently large. In doing so, we also have to recognize that metric regularity of $u ⇉ \nabla F (\bar{x}) u - T_{D ({\bar{ν}}^{t})} (F (\bar{x}))$ can be replaced by the weaker property of metric subregularity. Since ${\bar{ν}}^{t} \in M_{s u b} (\bar{x})$ , ${\hat{u}}_{t}$ is a feasible descent direction and for sufficiently small $α > 0$ the projection of $\bar{x} + α {\hat{u}}_{t}$ on $F^{- 1} (D ({\bar{ν}}^{t}))$ yields a point with a smaller objective function value than $\bar{x}$ . This proves (d). In order to show (e), we can proceed in a similar way. Using the same arguments as in part (a), we can prove the existence of a feasible direction ${\hat{w}}_{t} \in \nabla F {(\bar{x})}^{- 1} T_{D (ν^{t, {\bar{l}}_{t}})} (F (\bar{x}))$ with $⟨ \nabla f (\bar{x}), {\hat{w}}_{t} ⟩ < 0$ , whenever t is sufficiently large and $x_{t}$ is not accepted as approximately $Q_{M}$ -stationary. Together with $ν^{t, {\bar{l}}_{t}} \in M_{s u b} (\bar{x})$ the assertion follows.

Funding Statement

The research was supported by the Austrian Science Fund (FWF) [grant number P26132-N25], [grant number P29190-N32].

Disclosure statement

No potential conflict of interest was reported by the authors.

References

[1] Flegel ML, Kanzow C. A Fritz John approach to first order optimality conditions for mathematical programs with equilibrium constraints. Optimization. 2003;52:277–286.
[2] Fukushima M, Pang JS. Complementarity constraint qualifications and simplified B-stationary conditions for mathematical programs with equilibrium constraints. Comput Optim Appl. 1999;13:111–136.
[3] Kanzow C, Schwartz A. Mathematical programs with equilibrium constraints: enhanced Fritz John conditions, new constraint qualifications and improved exact penalty results. SIAM J Optim. 2010;20:2730–2753.
[4] Outrata JV. Optimality conditions for a class of mathematical programs with equilibrium constraints. Math Oper Res. 1999;24:627–644.
[5] Outrata JV. A generalized mathematical program with equilibrium constraints. SIAM J Control Optim. 2000;38:1623–1638.
[6] Scheel H, Scholtes S. Mathematical programs with complementarity constraints: stationarity, optimality, and sensitivity. Math Oper Res. 2000;25:1–22.
[7] Ye JJ. Constraint qualifications and necessary optimality conditions for optimization problems with variational inequality constraints. SIAM J Optim. 2000;10:943–962.
[8] Ye JJ. Necessary and sufficient optimality conditions for mathematical programs with equilibrium constraints. J Math Anal Appl. 2005;307:350–369.
[9] Ye JJ, Ye XY. Necessary optimality conditions for optimization problems with variational inequality constraints. Math Oper Res. 1997;22:977–997.
[10] Hoheisel T. Mathematical programs with vanishing constraints [PhD thesis], Julius-Maximilians-Universität Würzburg; 2009.
[11] Rockafellar RT, Wets RJ-B. Variational analysis. Berlin: Springer; 1998.
[12] Flegel ML, Kanzow C, Outrata JV. Optimality conditions for disjunctive programs with application to mathematical programs with equilibrium constraints. Set-Valued Anal. 2007;15:139–162.
[13] Kanzow C, Schwartz A. The price of inexactness: convergence properties of relaxation methods for mathematical programs with equilibrium constraints revisited. Math Oper Res. 2015;40:253–275.
[14] Benko M, Gfrerer H. On estimating the regular normal cone to constraint systems and stationary conditions. Optimization. 2017;66:61–92.
[15] Jeroslow R. Representability in mixed integer programming, I: characterization results. Discrete Appl Math. 1977;17:223–243.
[16] Balas E. Disjunctive programming and a hierarchy of relaxations for discrete optimization problems. SIAM J Algebraic Discrete Meth. 1985;6:466–486.
[17] Ceria S, Soares J. Convex programming for disjunctive convex optimization. Math Program Ser A. 1999;86:595–614.
[18] Gfrerer H, Outrata JV. On computation of generalized derivatives of the normal-cone mapping and their applications. Math Oper Res. 2016;41:1535–1556.
[19] Henrion R, Outrata JV. Calmness of constraint systems with applications. Math Program Ser B. 2005;104:437–464.
[20] Gfrerer H, Klatte D. Lipschitz and Hölder stability of optimization problems and generalized equations. Math Program Ser A. 2016;158:35–75.
[21] Gfrerer H. First order and second order characterizations of metric subregularity and calmness of constraint set mappings. SIAM J Optim. 2011;21:1439–1474.
[22] Rockafellar RT. Convex analysis. Princeton (NJ): Princeton University Press; 1970.
[23] Robinson SM. Some continuity properties of polyhedral multifunctions. Math Program Stud. 1981;14:206–214.
[24] Gfrerer H, Ye JJ. New constraint qualifications for mathematical programs with equilibrium constraints via variational analysis. SIAM J Optim. 2017;27:842–865.
[25] Benko M, Gfrerer H. An SQP method for mathematical programs with vanishing constraints with strong convergence properties. Comput Optim Appl. 2017;67:361–399.
[26] Fletcher R. Practical methods of optimization. Vol. 2, Constrained optimization. Chichester: Wiley; 1981.