On the Finite Complexity of Solutions in a Degenerate System of Quadratic Equations: Exact Formula

Olga Brezhneva; Agnieszka Prusińska; Alexey A Tret’yakov

doi:10.3390/e25081112

2023 Jul 25;25(8):1112.

Editor: Ravi P Agarwal

PMCID: PMC10453035 PMID: 37628142

Abstract

The paper describes an application of the p-regularity theory to Quadratic Programming (QP) and nonlinear equations with quadratic mappings. In the first part of the paper, a special structure of the nonlinear equation and a construction of the 2-factor operator are used to obtain an exact formula for a solution to the nonlinear equation. In the second part of the paper, the QP problem is reduced to a system of linear equations using the 2-factor operator. The solution to this system represents a local minimizer of the QP problem along with its corresponding Lagrange multiplier. An explicit formula for the solution of the linear system is provided. Additionally, the paper outlines a procedure for identifying active constraints, which plays a crucial role in constructing the linear system.

Keywords: quadratic programming, singular problems, p-regularity, 2-factor-operator

MSC: 65H10, 90C20, 65K05, 90C30

1. Introduction

Consider the nonlinear equation $F (x) = 0$ with the mapping F defined by:

F (x) = B {[x]}^{2} + M x + N,

(1)

where $M \in R^{n \times n}$ is a matrix, $N \in R^{n}$ is a vector, and $B : R^{n} \to R^{n}$ is the map defined for $x \in R^{n}$ by:

B {[x]}^{2} = B (x, x) = {(x^{T} B_{1} x, \dots, x^{T} B_{n} x)}^{T},

(2)

where $B_{i}$ is an $n \times n$ symmetric matrix for $i = 1, \dots, n$ .
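As an illustration of (1) and (2), the following minimal sketch evaluates a quadratic mapping numerically. It assumes that the matrices $B_{1}, \dots, B_{n}$ are stored as a NumPy array of shape (n, n, n); the function name `quadratic_map` and this storage layout are illustrative choices and are not part of the paper. The data are those of Example 1 in Section 3.

```python
import numpy as np

def quadratic_map(B, M, N, x):
    """Evaluate F(x) = B[x]^2 + M x + N, with B[x]^2 = (x^T B_1 x, ..., x^T B_n x)^T.

    B is an (n, n, n) array whose slice B[i] is the symmetric matrix B_{i+1}.
    """
    quad = np.einsum("j,ijk,k->i", x, B, x)  # i-th entry is x^T B_i x
    return quad + M @ x + N

# Data of Example 1: F(x) = (x1^2 - x2^2 - 2 x1 + 1, x1 x2 - x2)^T
B = np.array([[[1.0, 0.0], [0.0, -1.0]],
              [[0.0, 0.5], [0.5, 0.0]]])
M = np.array([[-2.0, 0.0], [0.0, -1.0]])
N = np.array([1.0, 0.0])

F_at_solution = quadratic_map(B, M, N, np.array([1.0, 0.0]))  # x* = (1, 0)^T
F_at_other = quadratic_map(B, M, N, np.array([2.0, 3.0]))     # a generic point
```

Storing B as a three-dimensional array keeps the evaluation of all n quadratic forms in a single contraction.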

We also consider the quadratic programming (QP) problem with inequality constraints:

\begin{matrix} \underset{x}{\text{minimize}} & f (x) = \frac{1}{2} x^{T} Q x + c^{T} x \\ \text{subject to} & A x \leq b \end{matrix} \quad \text{(QP)},

(3)

where Q is an $n \times n$ symmetric matrix, A is an $m \times n$ matrix, $c, x \in R^{n}$ , and $b \in R^{m}$ .

The paper describes an application of the p-regularity theory to nonlinear equations with the mapping F introduced in (1) and to the quadratic programming problem (3).

In recent years, there has been growing interest in nonlinear problems, including quadratic and polynomial equations, as well as nonlinear optimization problems, attracting specialists from various disciplines (see, for example, refs. [1,2,3,4] and references therein). Furthermore, it was observed that nonlinear problems are closely related to singular problems, as demonstrated in [5]. In fact, it has been discovered that essentially nonlinear problems and singular problems are locally equivalent. In this work, we aim to provide a theoretical foundation for this claim by introducing several auxiliary concepts as proposed in [5].

Definition 1.

Let V be a neighborhood of $x^{*}$ in $R^{n}$ , and let $U \subset R^{n}$ be a neighborhood of 0. A mapping $F : V \to R^{n}$ , where $F \in C^{2} (V)$ , is considered essentially nonlinear at $x^{*}$ if there exists a perturbation of the form:

$\tilde{F} (x^{*} + x) = F (x^{*} + x) + ω (x), \quad \text{where } ∥ ω (x) ∥ = o (∥ x ∥),$

such that no nondegenerate transformation of coordinates $φ (x) : U \to V$ , where $φ \in C^{1} (U)$ , satisfies $φ (0) = x^{*}$ , $φ^{'} (0) = I_{n}$ , where $I_{n}$ is the identity map in $R^{n}$ , and:

$\tilde{F} (φ (x)) = \tilde{F} (x^{*}) + {\tilde{F}}^{'} (x^{*}) x \quad \text{for all } x \in U .$

Definition 2.

We say that the mapping F is singular (or degenerate) at $x^{*}$ if it fails to be regular, meaning its derivative is not onto:

$Im F^{'} (x^{*}) \neq R^{n} .$

The relationship between the notions of essential nonlinearity and singularity is established in Theorem 1, which was derived in [5].

Theorem 1.

Let V be a neighborhood of $x^{*}$ in $R^{n}$ . Suppose $F : V \to R^{n}$ is $C^{2}$ and that $x^{*}$ is a solution of $F (x) = 0$ . Then F is essentially nonlinear at the point $x^{*}$ if and only if F is singular at the point $x^{*}$ .

The work presented in [5] primarily focuses on the construction of p-regularity and its applications in various areas of mathematics. However, it does not specifically cover quadratic nonlinear equations and quadratic programming problems. The current paper builds upon the foundation of the p-regularity theory established in [5] but introduces novel results. The main objective of this paper is to explore the key aspects of nonlinear problems, with a particular emphasis on systems of quadratic equations and quadratic programming problems that may involve singular solutions.

Specifically, we begin by considering the nonlinear equation $F (x) = 0$ . One of the main goals of the paper is to derive the exact formula for a solution $x^{*}$ of the nonlinear equation $F (x) = 0$ using the special form of the quadratic mapping F defined in (1). We demonstrate how to use a construction of a special 2-factor-operator to transform the original problem into a system of linear equations. The construction of the 2-factor-operator combines the mapping F with its first derivative $F^{'} (x)$ .

In the second part of the paper, we apply a similar approach to the quadratic programming problem (3) in order to derive explicit formulas for the solution $(x^{*}, λ^{*})$ , where $x^{*}$ represents a local minimizer of the QP problem and $λ^{*}$ is the corresponding Lagrange multiplier. Namely, using the special form of the QP problem and the 2-factor-operator, we reduce the system of optimality conditions for the QP problem to a system of linear equations, with the point $(x^{*}, λ^{*})$ as its solution. The paper also describes a procedure for identifying the active constraints, which plays a vital role in constructing the linear system.

Although there is literature on solutions of degenerate systems of quadratic equations, the approach presented in this paper is novel and distinct from the methods proposed by other authors. This approach can be applied to various problems and areas of mathematics where the problem involves solving a degenerate equation $F (x) = 0$ with a quadratic mapping F. Such nonlinear problems can arise in the numerical solution and analysis of ordinary differential equations, partial differential equations, optimal control problems, algebraic geometry, and other fields. In the second part of the paper, we focus on using the methods developed in the first sections to solve the QP problem (3). Quadratic programming problems have attracted the attention of many researchers, and there is an extensive body of literature on the topic; see, for example, [6,7,8,9,10,11,12].

The outline of the paper. The main contribution and novelty of the paper are in the exact formulas for a solution of a nonlinear equation and of the quadratic programming problems, presented in Section 3 and Section 5, respectively.

In Section 2, we recall the main definitions of the p-regularity theory, as presented in [5], including the special case of $p = 2$ . Additionally, we introduce the p-factor method for solving singular nonlinear equations of the form $F (x) = 0$ and describe various versions of the 2-factor method.

Section 3 presents some of the key results of the paper, focusing on the application of a modified 2-factor method to solve the nonlinear equation $F (x) = 0$ with the mapping F defined as $F (x) = B {[x]}^{2} + M x + N,$ where $M \in R^{n \times n}$ is a matrix, $N \in R^{n}$ is a vector, and $B : R^{n} \to R^{n}$ is defined by (2). In this section, we introduce multiple approaches to obtain exact formulas for a solution to the nonlinear equation $F (x) = 0$ , demonstrating that the proposed methods converge to a solution $x^{*}$ of the nonlinear equation in just one iteration.

Section 4 focuses on an auxiliary result used in other parts of the paper. We present a theorem that describes the properties of a special mapping $μ (x)$ , which enables us to propose a procedure for determining r linearly independent vectors $f_{i_{k}}^{'} (x^{*})$ , $k = 1, \dots, r$ , at the solution $x^{*}$ of $F (x) = 0$ , without needing to know the exact value of $x^{*}$ . This procedure relies on information about the system of vectors $\{f_{1}^{'} (x), \dots, f_{m}^{'} (x)\}$ at some point x within a small neighborhood of $x^{*}$ .

Section 5 presents other novel results, focusing on deriving exact formulas for a solution of quadratic programming problems. The section is divided into three parts. In Section 5.1, we consider regular quadratic programming problems and propose three approaches to solving the QP problem and obtaining a formula for its solution. These approaches are based on the construction of the 2-factor-operator. Section 5.2 addresses the issue of identifying the active constraints and proposes strategies for numerically determining the set of active constraints $I (x^{*})$ . These techniques are then applied in the final part, Section 5.3, to address degenerate QPs. The paper concludes with some closing remarks in Section 6.

Notation. Let $a_{i}$ denote the rows of the $m \times n$ matrix A in problem (3), and let $b = {(b_{1}, \dots, b_{m})}^{T}$ , so that $a_{i}^{T} \in R^{n}$ and $b_{i} \in R$ for $i = 1, \dots, m$ .

The active set $I (x^{*})$ at any feasible point $x^{*}$ of problem (3) is the set of indices of the active constraints at $x^{*}$ , i.e., $I (x^{*}) = \{i = 1, \dots, m ∣ a_{i} x^{*} - b_{i} = 0\}$ .

Furthermore, $Ker S = \{x \in R^{n} ∣ S x = 0\}$ denotes the null-space (kernel) of a given linear operator $S : R^{n} \to R^{m}$ , and $Im S = \{y \in R^{m} ∣ y = S x \text{ for some } x \in R^{n}\}$ is its image space.

Let $B : R^{n} \times R^{n} \to R^{n}$ be a bilinear symmetric mapping. The 2-form associated with B is the map $B {[\cdot]}^{2} : R^{n} \to R^{n}$ defined by $B {[x]}^{2} = B (x, x)$ for $x \in R^{n}$ . We also use the following notation: ${Im}^{2} B = \{y \in R^{n} ∣ B {[x]}^{2} = y \text{ for some } x \in R^{n}\}$ and ${Ker}^{2} B = \{x \in R^{n} ∣ B {[x]}^{2} = 0\}$ . We denote by $N (x^{*})$ and $N_{ε} (x^{*})$ neighborhoods of a point $x^{*}$ , where $N_{ε} (x^{*})$ is an $ε$ -neighborhood of $x^{*}$ , i.e., an open ball of radius $ε$ centered at $x^{*}$ .

The notation for the scalar (dot) product of vectors x and y in $R^{n}$ , used in the paper, is $x \cdot y = x^{T} y$ .

We denote by $span (a^{1}, \dots, a^{m})$ the linear span of the given vectors $a^{1}, \dots, a^{m}$ . We also denote by $d (x, S)$ the distance between a point x and a set S.

2. Elements of the $p$ -Regularity Theory

We begin this section with the main definitions of the p-regularity theory, which are given in [5]. The primary focus is on the sufficiently smooth mapping F from $R^{n}$ to $R^{n}$ , defined as:

F (x) = {(f_{1} (x), \dots, f_{n} (x))}^{T},

(4)

where $f_{i} (x) : R^{n} \to R$ for $i = 1, \dots, n$ . After presenting the general case, we focus on the specific case of $p = 2$ . We introduce the definitions of the 2-regular mapping and the 2-factor-operator, which play a central role in the subsequent sections.

2.1. The Main Definitions and Constructions of the p-Regularity Theory

Throughout this section, we consider the nonlinear equation:

F (x) = 0,

(5)

where F is defined in Equation (4). Let $x^{*} \in R^{n}$ represent a solution to the nonlinear Equation (5).

The mapping F is called regular at $x^{*}$ if:

Im F^{'} (x^{*}) = R^{n},

(6)

or in other words, if:

rank F^{'} (x^{*}) = n,

where $F^{'} (x^{*})$ is the Jacobian matrix of the mapping F at $x^{*}$ . Conversely, the mapping F is called nonregular (irregular, degenerate) if the regularity condition (6) is not satisfied.
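The rank condition above can be checked numerically. The sketch below is only an illustration: the helper name `is_regular` and the tolerance are assumptions, and the test mapping $F (x) = {(x_{1} + x_{2}, x_{1} x_{2})}^{T}$ , whose only zero is the origin, is the one used as an example in Section 3.

```python
import numpy as np

def is_regular(jacobian, tol=1e-10):
    """Return True if the n x n Jacobian has full rank n, i.e. condition (6) holds."""
    jacobian = np.asarray(jacobian, dtype=float)
    return bool(np.linalg.matrix_rank(jacobian, tol=tol) == jacobian.shape[0])

def jacobian_F(x):
    """Jacobian of F(x) = (x1 + x2, x1 * x2)^T."""
    return np.array([[1.0, 1.0], [x[1], x[0]]])

# At the solution x* = (0, 0)^T the Jacobian is [[1, 1], [0, 0]] and has rank 1.
regular_at_solution = is_regular(jacobian_F(np.array([0.0, 0.0])))

# The same rank test at a generic point, shown only for contrast (det = -1).
full_rank_elsewhere = is_regular(jacobian_F(np.array([1.0, 2.0])))
```

In floating-point arithmetic the outcome depends on the chosen tolerance, so a rank test of this kind should be read as an indication rather than a proof of degeneracy.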

Let the space $R^{n}$ be decomposed into the direct sum:

R^{n} = Y_{1} \oplus \dots \oplus Y_{p},

(7)

where $Y_{1} = Im F^{'} (x^{*})$ is the image of the first derivative of F evaluated at $x^{*}$ (in the general theory one takes the closure of this image, which changes nothing in $R^{n}$ ), and p is chosen as the minimum number for which Equation (7) holds.

The remaining spaces are defined as follows. Let $Z_{1} = R^{n}$ , and let $Z_{2}$ be a closed complementary subspace to $Y_{1}$ . Let $P_{Z_{2}} : R^{n} \to Z_{2}$ be the projection operator onto $Z_{2}$ along $Y_{1}$ . Define $Y_{2}$ as the closed linear span of the image of the quadratic map $P_{Z_{2}} F^{(2)} (x^{*}) {[\cdot]}^{2}$ . More generally, we define $Y_{i}$ inductively for $i = 2, \dots, p - 1$ as:

Y_{i} = span Im P_{Z_{i}} F^{(i)} (x^{*}) {[\cdot]}^{i} \subseteq Z_{i},

where $Z_{i}$ is a choice of a complementary subspace for $(Y_{1} \oplus \dots \oplus Y_{i - 1})$ in $R^{n}$ , $i = 2, \dots, p$ , and $P_{Z_{i}} : R^{n} \to Z_{i}$ is the projection operator onto $Z_{i}$ along $(Y_{1} \oplus \dots \oplus Y_{i - 1})$ , $i = 2, \dots, p$ . Finally, we let $Y_{p} = Z_{p} .$

Define the following mappings:

F_{i} (x) : R^{n} \to Y_{i}, F_{i} (x) = P_{Y_{i}} F (x), i = 1, \dots, p,

(8)

where $P_{Y_{i}} : R^{n} \to Y_{i}$ is the projection operator onto $Y_{i}$ along $(Y_{1} \oplus \dots \oplus Y_{i - 1} \oplus Y_{i + 1} \oplus \dots \oplus Y_{p})$ with respect to $R^{n}$ for $i = 1, \dots, p$ . Then F can be represented as $F (x) = F_{1} (x) + \dots + F_{p} (x)$ or equivalently as $F (x) = (F_{1} (x), \dots, F_{p} (x)) .$

Definition 3.

The linear operator $Ψ_{p} (h) \in L (R^{n}, Y_{1} \oplus \dots \oplus Y_{p})$ , where $h \in R^{n}$ , $h \neq 0$ , is defined by:

$Ψ_{p} (h) = F_{1}^{'} (x^{*}) + F_{2}^{″} (x^{*}) [h] + \dots + F_{p}^{(p)} (x^{*}) {[h]}^{p - 1},$

and is called the p-factor operator.

Consider the nonlinear operator $Ψ_{p} {[\cdot]}^{p}$ defined by:

Ψ_{p} {[x]}^{p} = F_{1}^{'} (x^{*}) [x] + F_{2}^{″} (x^{*}) {[x]}^{2} + \dots + F_{p}^{(p)} (x^{*}) {[x]}^{p} .

Notice that $Ψ_{p} {[x]}^{p} = Ψ_{p} (x) [x]$ .

Definition 4.

The p-kernel of the operator $Ψ_{p}$ at the point $x^{*}$ is the set $H_{p} (x^{*})$ defined by:

$H_{p} (x^{*}) = {Ker}^{p} Ψ_{p},$

where:

${Ker}^{p} Ψ_{p} = \{h \in R^{n} ∣ F_{1}^{'} (x^{*}) [h] + F_{2}^{″} (x^{*}) {[h]}^{2} + \dots + F_{p}^{(p)} (x^{*}) {[h]}^{p} = 0\} .$

Note that ${Ker}^{p} Ψ_{p} = ⋂_{k = 1}^{p} {Ker}^{k} F_{k}^{(k)} (x^{*})$ , where:

{Ker}^{k} F_{k}^{(k)} (x^{*}) = \{ξ \in R^{n} ∣ F_{k}^{(k)} (x^{*}) {[ξ]}^{k} = 0\}

is the k-kernel of the k-form $F_{k}^{(k)} (x^{*}) {[\cdot]}^{k}$ .

Definition 5.

A mapping F is called p-regular at $x^{*}$ along h if $Im Ψ_{p} (h) = R^{n}$ .

Definition 6.

A mapping F is called p-regular at $x^{*}$ if it is p-regular along all $h \in H_{p} (x^{*}) ∖ \{0\}$ or $H_{p} (x^{*}) = \{0\}$ .

We now focus on the special case $p = 2$ , which is the case used throughout this paper. We denote the image of the Jacobian matrix $F^{'} (x^{*})$ by $R_{1}$ , that is, $R_{1} = Im F^{'} (x^{*})$ , and the orthogonal complementary subspace of $R_{1}$ in $R^{n}$ by $R_{2}$ . Then:

R^{n} = R_{1} \oplus R_{2} .

We also denote an $n \times n$ matrix of the orthogonal projection onto $R_{i}$ in $R^{n}$ by $P_{R_{i}}$ , $i = 1, 2 .$

Similarly to Equation (8), we introduce the mappings:

F_{i} (x) = P_{R_{i}} F (x), i = 1, 2 .

The p-factor operator plays the central role in the p-regularity theory. We give the following definition of the p-factor-operator for $p = 2$ .

Definition 7.

We define a 2-factor-operator of the mapping F at $x^{*}$ with respect to some vector $h \in R^{n}$ , $h \neq 0$ , as a linear operator from $R^{n}$ to $R^{n}$ , defined by one of the following equations:

$Ψ_{2} (h) = F_{1}^{'} (x^{*}) + F_{2}^{″} (x^{*}) [h],$ (9)

${\bar{Ψ}}_{2} (h) = F^{'} (x^{*}) + P_{R_{2}} F^{″} (x^{*}) [h],$ (10)

${\tilde{Ψ}}_{2} (h) = F^{'} (x^{*}) + F^{″} (x^{*}) [h] .$ (11)

Now we are ready to introduce another very important definition of the 2-regularity theory.

Definition 8.

The mapping F is called 2-regular at the point $x^{*}$ with respect to the element h if the image of a 2-factor-operator, defined by one of the Equations (9)–(11), is equal to $R^{n}$ .

Definition 9.

The mapping F is called 2-regular at $x^{*}$ if it is 2-regular at $x^{*}$ with respect to all the elements h from a set that is defined as:

1.
$H_{2} (x^{*}) = Ker F_{1}^{'} (x^{*}) ⋂ {Ker}^{2} F_{2}^{″} (x^{*})$ for $Ψ_{2} (h)$ defined by (9);

2.
${\bar{H}}_{2} (x^{*}) = Ker F^{'} (x^{*}) ⋂ {Ker}^{2} P_{R_{2}} F^{″} (x^{*})$ for ${\bar{Ψ}}_{2} (h)$ defined by (10);

3.
${\tilde{H}}_{2} (x^{*}) = Ker F^{'} (x^{*}) ⋂ {Ker}^{2} F^{″} (x^{*})$ for ${\tilde{Ψ}}_{2} (h)$ defined by (11).
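For the quadratic mapping (1), the 2-factor-operator (11) can be assembled directly, since $F^{'} (x) = 2 B [x] + M$ and $F^{″} (x) [h] = 2 B [h]$ (see Section 3). The following sketch uses the data of Example 1; the helper names and the array layout of B are assumptions made for illustration. It checks 2-regularity along one particular h only, not along every element of ${\tilde{H}}_{2} (x^{*})$ .

```python
import numpy as np

def bilinear_matrix(B, v):
    """Matrix B[v] whose i-th row is (B_i v)^T, so that B[v] w = B(v, w)."""
    return np.einsum("ijk,k->ij", B, v)

def two_factor_operator(B, M, x_star, h):
    """Operator (11): F'(x*) + F''(x*)[h] for F(x) = B[x]^2 + M x + N."""
    first_derivative = 2.0 * bilinear_matrix(B, x_star) + M  # F'(x*) = 2 B[x*] + M
    second_derivative_h = 2.0 * bilinear_matrix(B, h)         # F''(x*)[h] = 2 B[h]
    return first_derivative + second_derivative_h

# Data of Example 1, with solution x* = (1, 0)^T.
B = np.array([[[1.0, 0.0], [0.0, -1.0]],
              [[0.0, 0.5], [0.5, 0.0]]])
M = np.array([[-2.0, 0.0], [0.0, -1.0]])
x_star = np.array([1.0, 0.0])
h = np.array([1.0, 0.0])

jacobian = 2.0 * bilinear_matrix(B, x_star) + M
psi = two_factor_operator(B, M, x_star, h)
rank_jacobian = int(np.linalg.matrix_rank(jacobian))  # F is singular at x*
rank_psi = int(np.linalg.matrix_rank(psi))            # full rank: 2-regular along h
```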

2.2. The p-Factor-Method for Solving Singular Nonlinear Equations

In this section, we introduce the p-factor-method for solving the singular nonlinear equation $F (x) = 0$ . Then we consider the special case of $p = 2$ and describe several versions of the 2-factor-method.

Consider Equation (5) in the case when the mapping F is singular at $x^{*}$ . In this case, the p-factor method is an iterative procedure defined by:

x^{k + 1} = x^{k} - {(F^{'} (x^{k}) + P_{2} F^{''} (x^{k}) [h] + \dots + P_{p} F^{(p)} (x^{k}) {[h]}^{p - 1})}^{- 1} \cdot (F (x^{k}) + P_{2} F^{'} (x^{k}) [h] + \dots + P_{p} F^{(p - 1)} (x^{k}) {[h]}^{p - 1}),

(12)

where $k = 0, 1, \dots$ , $P_{i} = P_{Y_{i}}$ for $i = 1, 2, \dots, p$ , and vector h, $∥ h ∥ = 1$ , is chosen in such a way that the p-factor operator $Ψ_{p}$ is nonsingular, which implies that the mapping F is p-regular at $x^{*}$ along h. The following theorem is valid for the p-factor-method (12).

Theorem 2.

Assume that the mapping $F \in C^{p + 1} (R^{n})$ and that there exists a vector h, $∥ h ∥ = 1$ , such that the p-factor operator $Ψ_{p} (h)$ is nonsingular. Given a point $x^{0} \in N_{ε} (x^{*})$ , where $ε > 0$ is sufficiently small and $N_{ε} (x^{*})$ is a neighborhood of $x^{*}$ , the sequence $\{x^{k}\}$ defined by Equation (12) converges quadratically to the solution $x^{*}$ of (5):

$∥ x^{k + 1} - x^{*} ∥ \leq C ∥ x^{k} - x^{*} ∥^{2}, k = 0, 1, \dots,$ (13)

where $C > 0$ is a constant independent of k.

Now, we are ready to describe several versions of the 2-factor-method.

For solving singular nonlinear Equation (5), the following iterative method, called the 2-factor-method, was proposed in [13]:

x^{k + 1} = x^{k} - {(F^{'} (x^{k}) + P_{R_{2}} F^{″} (x^{k}) [h])}^{- 1} (F (x^{k}) + P_{R_{2}} F^{'} (x^{k}) [h]), k = 0, 1, 2, \dots,

(14)

where the vector h, $∥ h ∥ = 1$ , is chosen in such a way that matrix $(F^{'} (x^{*}) + P_{R_{2}} F^{″} (x^{*}) [h])$ is invertible.

The following theorem states the convergence properties of the 2-factor-method (14).

Theorem 3.

Given a mapping $F \in C^{3} (R^{n})$ , let $x^{*}$ be a solution of Equation (5). Assume that there exists a vector $h \in R^{n}$ such that $∥ h ∥ = 1$ and F is 2-regular at the point $x^{*}$ with respect to the vector h with the 2-factor-operator ${\bar{Ψ}}_{2} (h)$ defined by (10).

Then there is a neighborhood $N (x^{*})$ of $x^{*}$ in $R^{n}$ such that for any $x^{0} \in N (x^{*})$ , the sequence $\{x^{k}\}$ generated by the 2-factor-method (14) converges to $x^{*}$ and:

$∥ x^{k + 1} - x^{*} ∥ \leq C ∥ x^{k} - x^{*} ∥^{2},$ (15)

where $C > 0$ is some constant.

Proof.

Since $P_{R_{2}}$ is the orthogonal projector onto the subspace $R_{2} = R_{1}^{⊥}$ and $F^{'} (x^{*}) h \in R_{1}$ , we have $P_{R_{2}} F^{'} (x^{*}) [h] = 0$ . Hence, for the mapping:

$Φ (x) = F (x) + P_{R_{2}} F^{'} (x) [h],$

we have $Φ (x^{*}) = 0$ . Moreover, $Φ^{'} (x^{*}) = {\bar{Ψ}}_{2} (h)$ , and since the mapping F is 2-regular with respect to the vector h, Definition 8 with ${\bar{Ψ}}_{2} (h)$ defined by (10) gives $Im {\bar{Ψ}}_{2} (h) = R^{n}$ . Hence, the matrix $Φ^{'} (x^{*})$ is invertible.

Therefore, the 2-factor-method given in (14) is an application of Newton’s method to system $Φ (x) = 0$ in a sufficiently small neighborhood of $x^{*}$ . Then the statement of the theorem follows from the properties of Newton’s method [14] (Proposition 1.4.1). □
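The iteration (14) can be sketched on a small test problem. The mapping $F (x) = {(x_{1} + x_{1} x_{2}, x_{1}^{2} + x_{2}^{2})}^{T}$ below is a hypothetical example chosen for illustration and does not come from the paper. It has the singular solution $x^{*} = 0$ with $Im F^{'} (0) = span (e_{1})$ , so $P_{R_{2}}$ is the projector onto $span (e_{2})$ , and $h = {(0, 1)}^{T}$ makes the operator (10) invertible.

```python
import numpy as np

def F(x):
    return np.array([x[0] + x[0] * x[1], x[0] ** 2 + x[1] ** 2])

def dF(x):
    """Jacobian F'(x)."""
    return np.array([[1.0 + x[1], x[0]], [2.0 * x[0], 2.0 * x[1]]])

# Hessians of f_1 and f_2 (constant because F is quadratic).
HESSIANS = np.array([[[0.0, 1.0], [1.0, 0.0]],
                     [[2.0, 0.0], [0.0, 2.0]]])

def d2F_h(h):
    """Matrix F''(x)[h]; its i-th row is (Hessian of f_i times h)^T."""
    return np.einsum("ijk,k->ij", HESSIANS, h)

def two_factor_method(x0, h, P2, steps=6):
    """Iterate (14): x <- x - (F'(x) + P2 F''(x)[h])^{-1} (F(x) + P2 F'(x) h)."""
    x = np.asarray(x0, dtype=float)
    errors = []
    for _ in range(steps):
        matrix = dF(x) + P2 @ d2F_h(h)
        residual = F(x) + P2 @ (dF(x) @ h)
        x = x - np.linalg.solve(matrix, residual)
        errors.append(np.linalg.norm(x))  # distance to the solution x* = 0
    return x, errors

P2 = np.diag([0.0, 1.0])   # orthogonal projector onto R_2 = (Im F'(0))^perp
h = np.array([0.0, 1.0])   # unit vector with F'(0) h = 0
x_final, errors = two_factor_method([0.1, 0.1], h, P2)
```

In this sketch the projector is built from knowledge of $x^{*}$ ; in practice it has to be estimated, which is the subject of later sections of the paper.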

Now, we will introduce a modified version of the 2-factor-method (14). Assume that there exists a vector $h \neq 0$ such that $F^{'} (x^{*}) h = 0$ and the matrix $(F^{'} (x^{*}) + F^{″} (x^{*}) [h])$ is invertible. Then for solving Equation (5), we can use the following modified 2-factor-method:

x^{k + 1} = x^{k} - {(F^{'} (x^{k}) + F^{″} (x^{k}) [h])}^{- 1} (F (x^{k}) + F^{'} (x^{k}) [h]), k = 0, 1, 2, \dots .

(16)

The following theorem states the convergence properties of method (16).

Theorem 4.

Given a mapping $F \in C^{3} (R^{n})$ , let $x^{*}$ be a solution of Equation (5). Assume that there exists a vector $h \in R^{n}$ such that $∥ h ∥ = 1$ , $F^{'} (x^{*}) h = 0$ , and F is 2-regular at the point $x^{*}$ with respect to the vector h, where the 2-factor-operator ${\tilde{Ψ}}_{2} (h)$ is defined by (11).

Then there is a neighborhood $N (x^{*})$ of $x^{*}$ in $R^{n}$ such that for any $x^{0} \in N (x^{*})$ , the sequence $\{x^{k}\}$ generated by the 2-factor-method (16) converges to $x^{*}$ , and relation (15) holds with some $C > 0$ .

The proof is similar to that of Theorem 3.

Now we introduce another version of the 2-factor-method.

Assume that the following conditions hold:

\begin{matrix} \text{1. There exists a vector } h \neq 0 \text{ such that } F^{'} (x^{*}) h = 0; \\ \text{2. The matrix } (F^{″} (x^{*}) [h]) \text{ is invertible.} \end{matrix}

(17)

Note that if conditions (17) are satisfied, then to solve Equation (5), we can use the following modified 2-factor-method:

x^{k + 1} = x^{k} - {(F^{″} (x^{k}) [h])}^{- 1} (F^{'} (x^{k}) [h]), k = 0, 1, 2, \dots .

(18)

For numerical realization of the 2-factor-method in the form (18), we only have to construct vector h satisfying conditions (17). Specifics of some problems allow us to choose vector h without any knowledge of the solution $x^{*}$ . We discuss the choice of the vector h in the following sections of the paper.

The following theorem states the convergence properties of method (18).

Theorem 5.

Given a mapping $F \in C^{3} (R^{n})$ , let $x^{*}$ be a solution of Equation (5). Assume that there exists a vector $h \in R^{n}$ such that $∥ h ∥ = 1$ and conditions (17) are satisfied.

Then, there is a neighborhood $N (x^{*})$ of $x^{*}$ in $R^{n}$ such that for any $x^{0} \in N (x^{*})$ , the sequence $\{x^{k}\}$ generated by the 2-factor-method (18) converges to $x^{*}$ and relation (15) holds with some $C > 0$ .

The proof is similar to that of Theorem 3.

3. Nonlinear Equations with Quadratic Mappings: the Exact Solution Formula

In this section, we consider the mapping F defined by Equation (1) as follows:

F (x) = B {[x]}^{2} + M x + N,

where $M \in R^{n \times n}$ is a matrix, $N \in R^{n}$ is a vector, and $B : R^{n} \to R^{n}$ is the map defined by (2). The mapping $B {[x]}^{2}$ is twice continuously differentiable [15], and its derivatives are given by ${(B {[x]}^{2})}^{'} = 2 B [x]$ and ${(B {[x]}^{2})}^{''} = 2 B$ for any $x \in R^{n}$ . Let $x^{*}$ denote a solution of the equation $F (x) = 0$ .

We will now illustrate the application of the 2-factor method (18) for solving the nonlinear equation $F (x) = 0$ with the mapping F defined by (1). We will present multiple approaches to obtain an exact formula for $x^{*}$ , with the first approach being a specific case of the second approach. Additionally, we will show that for the mapping F, the method (18) converges to $x^{*}$ in just one iteration.

First approach to obtain an exact formula for the solution $x^{*}$ .

For the mapping F defined by (1), the assumptions (17) of Theorem 5 can be simplified to the existence of a vector h that satisfies the following conditions:

\begin{matrix} \text{(1) } (2 B [x^{*}] + M) h = 0; \\ \text{(2) the matrix } (2 B [h]) \text{ is invertible.} \end{matrix}

(19)

Under these assumptions (19), for the mapping F defined by (1) and a given point $x^{0}$ , the first iteration of the 2-factor-method (18) can be written as:

x^{1} = x^{0} - {(2 B [h])}^{- 1} ((2 B [x^{0}] + M) [h]),

which is equivalent to:

(2 B [h]) (x^{1} - x^{0}) = - ((2 B [x^{0}] + M) [h]) .

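Using the symmetry $2 B (h, x^{0}) = 2 B (x^{0}, h)$ , the terms containing $x^{0}$ cancel, so that $(2 B [h]) x^{1} = - M h$ . By the same symmetry, condition (1) in (19) reads $(2 B [h]) x^{*} = - M h$ , and since the matrix $2 B [h]$ is invertible, $x^{1} = x^{*}$ . The method therefore finds the solution $x^{*}$ in one step: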

x^{*} = - \frac{1}{2} {(B [h])}^{- 1} (M h),

(20)

where the vector h satisfies conditions (19).

The numerical determination of the vector h depends on the specific characteristics of the problem. Alternatively, it can be obtained using the same method as described in the third approach below, which involves transforming the initial system into a system that is completely degenerate at the point $x^{*}$ .

Second approach to obtain an exact formula for the solution $x^{*}$ .

Now we present an alternative approach for obtaining a formula for the solution $x^{*}$ of the equation $F (x) = 0$ using the same mapping F defined by (1). This second approach is applicable to a broader variety of problems compared to the first approach.

Let $P_{1}$ denote the projector onto $Y_{1} = span ({Im}^{2} B)$ , and let $P_{2}$ denote the projector onto $Y_{1}^{⊥}$ , which is the orthogonal complementary subspace of $Y_{1}$ in $R^{n}$ . We note that $P_{2} (B {[x^{*}]}^{2}) = 0$ and:

4 P_{2} B (x^{*}, x) = P_{2} (B {[x + x^{*}]}^{2} - B {[x - x^{*}]}^{2}) = 0 .

Then for the mapping F defined in (1),

P_{1} F^{'} (x^{*}) = P_{1} (2 B [x^{*}] + M) \quad \text{and} \quad P_{2} F^{'} (x^{*}) = P_{2} M .

Assume that there exists a vector $h \in R^{n}$ satisfying the conditions:

\begin{matrix} \text{(1) } P_{1} (2 B [x^{*}] + M) h = 0; \\ \text{(2) the matrix } (P_{2} M + 2 P_{1} B [h]) \text{ is invertible.} \end{matrix}

(21)

By the definition of $P_{2}$ , we have $P_{2} B {[x^{*}]}^{2} = 0$ . Applying $P_{2}$ to the equation $F (x^{*}) = 0$ with F given by (1), we obtain $P_{2} (M x^{*} + N) = 0$ . Hence, the point $x^{*}$ satisfies the following identities:

\begin{matrix} P_{2} (M x^{*} + N) = 0, \\ P_{1} (2 B [x^{*}] + M) h = 0 . \end{matrix}

Adding these equations, using the symmetry $B [x^{*}] h = B [h] x^{*}$ , and assuming (21), we obtain the exact formula for the solution $x^{*}$ :

x^{*} = - {(P_{2} M + 2 P_{1} B [h])}^{- 1} (P_{2} N + P_{1} M h) .

(22)
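Formula (22) can be tried out numerically. The sketch below is an illustration under stated assumptions: the projector $P_{1}$ is computed from the observation that, by polarization, $span ({Im}^{2} B)$ is spanned by the vectors $B (e_{j}, e_{k})$ ; the function names are hypothetical; and the test mapping $F (x) = {({(x_{1} - 1)}^{2} + x_{2}^{2}, x_{2})}^{T}$ with solution $x^{*} = {(1, 0)}^{T}$ is not taken from the paper.

```python
import numpy as np

def bilinear_matrix(B, v):
    """Matrix B[v] whose i-th row is (B_i v)^T."""
    return np.einsum("ijk,k->ij", B, v)

def projector_onto_span_im2(B, tol=1e-12):
    """Orthogonal projector P_1 onto Y_1 = span(Im^2 B).

    The columns of B reshaped to an n x n^2 matrix are the vectors B(e_j, e_k),
    which span Y_1; an orthonormal basis is read off from the SVD.
    """
    n = B.shape[0]
    U, s, _ = np.linalg.svd(B.reshape(n, -1))
    basis = U[:, : int(np.sum(s > tol))]
    return basis @ basis.T

def exact_solution_22(B, M, N, h):
    """x* = -(P2 M + 2 P1 B[h])^{-1} (P2 N + P1 M h), formula (22)."""
    n = B.shape[0]
    P1 = projector_onto_span_im2(B)
    P2 = np.eye(n) - P1
    matrix = P2 @ M + 2.0 * P1 @ bilinear_matrix(B, h)
    rhs = P2 @ N + P1 @ (M @ h)
    return -np.linalg.solve(matrix, rhs)

# Hypothetical test problem: F(x) = ((x1 - 1)^2 + x2^2, x2)^T, so B_1 = I, B_2 = 0.
B = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 0.0], [0.0, 0.0]]])
M = np.array([[-2.0, 0.0], [0.0, 1.0]])
N = np.array([1.0, 0.0])
h = np.array([1.0, 0.0])   # satisfies conditions (21) for this problem

P1 = projector_onto_span_im2(B)
x_star = exact_solution_22(B, M, N, h)
residual = np.einsum("j,ijk,k->i", x_star, B, x_star) + M @ x_star + N
```

Here $P_{1}$ projects onto $span (e_{1})$ , so neither projector is trivial and the example is not covered by formula (20).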

Remark 1.

In the case when $P_{1} = I_{n}$ and, hence, $P_{2} = O_{n \times n}$ , assumptions (21) become (19), and Equation (22) reduces to (20).

Example 1.

Consider mapping $F : R^{2} \to R^{2}$ given by:

$F (x) = (\begin{matrix} x_{1}^{2} - x_{2}^{2} - 2 x_{1} + 1 \\ x_{1} x_{2} - x_{2} \end{matrix}) .$ (23)

We can represent the mapping F in the form (1) with:

$B = (\begin{matrix} (\begin{matrix} 1 & 0 \\ 0 & - 1 \end{matrix}) \\ (\begin{matrix} 0 & 1 / 2 \\ 1 / 2 & 0 \end{matrix}) \end{matrix}), \quad M = (\begin{matrix} - 2 & 0 \\ 0 & - 1 \end{matrix}), \quad \text{and} \quad N = (\begin{matrix} 1 \\ 0 \end{matrix}) .$

The equation $F (x) = 0$ has a locally unique solution $x^{*} = {(1, 0)}^{T}$ . In this example, $P_{1} = I_{2}$ and $P_{2} = O_{2 \times 2}$ . Hence, by Remark 1, we apply Equation (20) with $h = {(1, 0)}^{T}$ to obtain:

$x^{*} = - \frac{1}{2} {(B [h])}^{- 1} (M h) = - \frac{1}{2} (\begin{matrix} 1 & 0 \\ 0 & 2 \end{matrix}) (\begin{matrix} - 2 \\ 0 \end{matrix}) = (\begin{matrix} 1 \\ 0 \end{matrix}),$

as claimed.
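The computation in Example 1 can be reproduced with a few lines of NumPy. This is a minimal sketch; the function name `exact_solution_20` and the storage of B as an array of shape (2, 2, 2) are assumptions made for illustration. Because $F^{'} (x^{*}) = 0$ in this example, condition (1) of (19) holds for every h, so a second admissible choice, $h = {(1, 1)}^{T}$ , is included for comparison.

```python
import numpy as np

def exact_solution_20(B, M, h):
    """x* = -(1/2) (B[h])^{-1} (M h), formula (20)."""
    B_h = np.einsum("ijk,k->ij", B, h)  # i-th row is (B_i h)^T
    return -0.5 * np.linalg.solve(B_h, M @ h)

# Data of Example 1.
B = np.array([[[1.0, 0.0], [0.0, -1.0]],
              [[0.0, 0.5], [0.5, 0.0]]])
M = np.array([[-2.0, 0.0], [0.0, -1.0]])

x_star = exact_solution_20(B, M, np.array([1.0, 0.0]))
x_star_other_h = exact_solution_20(B, M, np.array([1.0, 1.0]))
```

Both choices of h return the same point, which is consistent with the fact that formula (20) does not depend on the starting point $x^{0}$ .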

In a numerical implementation, an additional procedure is required to construct the vector h. Since the exact point $x^{*}$ is not known in advance, we only assume that a sufficiently small neighborhood of $x^{*}$ is provided to apply the procedure.

Third approach to obtain an exact formula for the solution $x^{*}$ .

While the first two approaches rely on knowledge of the element h, which is determined by $x^{*}$ , the third approach does not require such knowledge. Instead, all we need is for the starting point to belong to a sufficiently small neighborhood $N_{ε} (x^{*})$ of $x^{*}$ . Specifically, we have $x^{0} \in N_{ε} (x^{*})$ , where $ε > 0$ is sufficiently small.

Suppose that at the point $x^{*}$ , the first r vectors $\{f_{1}^{'} (x^{*}), \dots, f_{r}^{'} (x^{*})\}$ are linearly independent, where $f_{i}$ is defined in (4) for $i = 1, \dots, r$ . Assume also that the other vectors $\{f_{r + 1}^{'} (x^{*}), \dots, f_{n}^{'} (x^{*})\}$ are linear combinations of the first r vectors. Therefore, there exist coefficients $α_{i}^{j}$ such that:

f_{j}^{'} (x^{*}) = \sum_{i = 1}^{r} α_{i}^{j} f_{i}^{'} (x^{*}), j = r + 1, \dots, n .

Let us introduce the subspace $L (x)$ defined by:

L (x) = span (f_{1}^{'} (x), \dots, f_{r}^{'} (x)) .

We denote the orthogonal projection on the subspace $L (x)$ as $P_{L (x)}$ . Then, there exist coefficients $α_{i}^{j} (x)$ such that:

P_{L (x)} f_{j}^{'} (x) = \sum_{i = 1}^{r} α_{i}^{j} (x) f_{i}^{'} (x), j = r + 1, \dots, n .

In addition, introduce the notation:

{\hat{f}}_{j}^{'} (x) = f_{j}^{'} (x) - P_{L (x)} f_{j}^{'} (x), j = r + 1, \dots, n .

Then:

{\hat{f}}_{j} (x) = f_{j} (x) - \sum_{i = 1}^{r} α_{i}^{j} (x) f_{i} (x), j = r + 1, \dots, n .

(24)

Notice that $x^{*}$ is also a solution of the equation $\hat{F} (x) = 0$ , where $\hat{F} (x)$ is defined as:

\hat{F} (x) = {(f_{1} (x), \dots, f_{r} (x), {\hat{f}}_{r + 1} (x), \dots, {\hat{f}}_{n} (x))}^{T} .

The definition of $\hat{F} (x)$ implies that $\hat{F}$ is 2-regular at the point $x^{*}$ . In the case that some of the vectors $f_{k^{'}}^{'} (x^{*})$ , $k^{'} \in \{r + 1, \dots, n\},$ are not zero vectors, transformation (24) can be used to reduce those vectors to zero vectors. This ensures that ${\hat{f}}_{k^{'}}^{'} (x^{*}) = 0$ for all $k^{'} \in \{r + 1, \dots, n\}$ . Therefore, without loss of generality, we can assume that the mapping $F (x)$ satisfies $f_{j}^{'} (x^{*}) = 0$ for $j = r + 1, \dots, n$ . An example of a mapping that satisfies these conditions is:

F (x) = (\begin{matrix} x_{1} + x_{2} \\ x_{1} x_{2} \end{matrix}),

where $n = 2$ , $r = 1$ , $j = 2$ , $f_{1} (x) = x_{1} + x_{2}$ , $f_{2} (x) = x_{1} x_{2}$ , and $x^{*} = {(0, 0)}^{T} .$

Suppose there exist vectors $ξ_{1} \neq 0, \dots, ξ_{r} \neq 0$ and $h \neq 0$ , and indices $k_{i} \in \{r + 1, \dots, n\}$ , $i = 1, \dots, r,$ such that the system:

\{f_{k_{1}}^{''} (x^{*}) ξ_{1}, \dots, f_{k_{r}}^{''} (x^{*}) ξ_{r}, f_{r + 1}^{''} (x^{*}) h, \dots, f_{n}^{''} (x^{*}) h\}

is linearly independent.

Then the mapping $\bar{F} (x)$ defined by:

\bar{F} (x) = {(f_{k_{1}}^{'} (x) \cdot ξ_{1}, \dots, f_{k_{r}}^{'} (x) \cdot ξ_{r}, f_{r + 1}^{'} (x) \cdot h, \dots, f_{n}^{'} (x) \cdot h)}^{T}

(25)

has $x^{*}$ as its zero, that is, $\bar{F} (x^{*}) = 0$ . Moreover, unlike the Jacobian matrix $F^{'} (x^{*})$ , the matrix:

{\bar{F}}^{'} (x^{*}) = [\begin{matrix} {(f_{k_{1}}^{''} (x^{*}) ξ_{1})}^{T} \\ \dots \\ {(f_{k_{r}}^{''} (x^{*}) ξ_{r})}^{T} \\ {(f_{r + 1}^{''} (x^{*}) h)}^{T} \\ \dots \\ {(f_{n}^{''} (x^{*}) h)}^{T} \end{matrix}]

is nonsingular. We can, therefore, consider the method:

x^{k + 1} = x^{k} - {({\bar{F}}^{'} (x^{k}))}^{- 1} \bar{F} (x^{k}), k = 0, 1, 2, \dots .

(26)

Theorem 6.

Given a mapping $F \in C^{2} (R^{n})$ , let $x^{*}$ be a solution of Equation (5). Assume that there exist vectors $ξ_{1} \neq 0, \dots, ξ_{r} \neq 0$ and $h \neq 0$ such that the mapping $\bar{F} (x)$ defined in (25) is nonsingular at $x^{*}$ . Let $x^{0} \in N_{ε} (x^{*})$ , where $N_{ε} (x^{*})$ is a neighborhood of $x^{*}$ and $ε > 0$ is sufficiently small.

Then the sequence $\{x^{k}\}$ , $k = 1, 2, \dots$ , defined by (26) converges to the point $x^{*}$ at a quadratic rate, that is:

$∥ x^{k + 1} - x^{*} ∥ \leq C ∥ x^{k} - x^{*} ∥^{2},$

where $C > 0$ is a constant independent of k.

By the definition of the mapping F in Equation (1), the components $f_{i}$ introduced in (4) have the following form:

f_{i} (x) = x^{T} B_{i} x + M_{i}^{T} x + N_{i}, i = 1, 2, \dots, n,

where $B_{i}$ is an $n \times n$ symmetric matrix, $M_{i} \in R^{n}$ , and $N_{i} \in R$ , $i = 1, 2, \dots, n .$

Given an initial point $x^{0}$ , we use the iterative method (26) to obtain:

x^{1} = x^{0} - {[\begin{matrix} {(2 B_{k_{1}} ξ_{1})}^{T} \\ ⋮ \\ {(2 B_{k_{r}} ξ_{r})}^{T} \\ {(2 B_{r + 1} h)}^{T} \\ ⋮ \\ {(2 B_{n} h)}^{T} \end{matrix}]}^{- 1} (\begin{matrix} {(2 B_{k_{1}} x^{0} + M_{k_{1}})}^{T} ξ_{1} \\ ⋮ \\ {(2 B_{k_{r}} x^{0} + M_{k_{r}})}^{T} ξ_{r} \\ {(2 B_{r + 1} x^{0} + M_{r + 1})}^{T} h \\ ⋮ \\ {(2 B_{n} x^{0} + M_{n})}^{T} h \end{matrix}) =

= {[\begin{matrix} {(2 B_{k_{1}} ξ_{1})}^{T} \\ ⋮ \\ {(2 B_{k_{r}} ξ_{r})}^{T} \\ {(2 B_{r + 1} h)}^{T} \\ ⋮ \\ {(2 B_{n} h)}^{T} \end{matrix}]}^{- 1} ((\begin{matrix} {(2 B_{k_{1}} ξ_{1})}^{T} x^{0} \\ ⋮ \\ {(2 B_{k_{r}} ξ_{r})}^{T} x^{0} \\ {(2 B_{r + 1} h)}^{T} x^{0} \\ ⋮ \\ {(2 B_{n} h)}^{T} x^{0} \end{matrix}) - (\begin{matrix} {(2 B_{k_{1}} x^{0})}^{T} ξ_{1} \\ ⋮ \\ {(2 B_{k_{r}} x^{0})}^{T} ξ_{r} \\ {(2 B_{r + 1} x^{0})}^{T} h \\ ⋮ \\ {(2 B_{n} x^{0})}^{T} h \end{matrix}) - (\begin{matrix} M_{k_{1}}^{T} ξ_{1} \\ ⋮ \\ M_{k_{r}}^{T} ξ_{r} \\ M_{r + 1}^{T} h \\ ⋮ \\ M_{n}^{T} h \end{matrix})) .

Because the matrix $B_{i}$ is symmetric for every index i, for any index j we have:

{(2 B_{i} ξ_{j})}^{T} x^{0} = (2 B_{i} ξ_{j}) \cdot x^{0} = ξ_{j} \cdot (2 B_{i}^{T} x^{0}) = ξ_{j} \cdot (2 B_{i} x^{0}) = {(2 B_{i} x^{0})}^{T} ξ_{j} .

Therefore,

x^{1} = - \frac{1}{2} {[\begin{matrix} {(B_{k_{1}} ξ_{1})}^{T} \\ ⋮ \\ {(B_{k_{r}} ξ_{r})}^{T} \\ {(B_{r + 1} h)}^{T} \\ ⋮ \\ {(B_{n} h)}^{T} \end{matrix}]}^{- 1} (\begin{matrix} M_{k_{1}} \cdot ξ_{1} \\ ⋮ \\ M_{k_{r}} \cdot ξ_{r} \\ M_{r + 1} \cdot h \\ ⋮ \\ M_{n} \cdot h \end{matrix}) = x^{*}

(27)

Example 1

(Continuation). Consider mapping $F : R^{2} \to R^{2}$ defined in (23):

$F (x) = (\begin{matrix} x_{1}^{2} - x_{2}^{2} - 2 x_{1} + 1 \\ x_{1} x_{2} - x_{2} \end{matrix}),$

where

$B_{1} = [\begin{matrix} 1 & 0 \\ 0 & - 1 \end{matrix}], B_{2} = [\begin{matrix} 0 & 1 / 2 \\ 1 / 2 & 0 \end{matrix}],$

$M_{1} = {(- 2, 0)}^{T}, M_{2} = {(0, - 1)}^{T}, N_{1} = 1, N_{2} = 0 .$

In this example, $x^{*} = {(1, 0)}^{T}$ is a solution of $F (x) = 0$ and:

$f_{1}^{'} (x^{*}) = {(2 x_{1}^{*} - 2, - 2 x_{2}^{*})}^{T} = {(0, 0)}^{T}, f_{2}^{'} (x^{*}) = {(x_{2}^{*}, x_{1}^{*} - 1)}^{T} = {(0, 0)}^{T} .$

Therefore, mapping $\bar{F}$ defined in (25) takes the form:

$\bar{F} (x) = (\begin{matrix} f_{1}^{'} (x) \cdot h \\ f_{2}^{'} (x) \cdot h \end{matrix}),$

where h is chosen in such a way that the matrix ${\bar{F}}^{'} (x^{*})$ is nonsingular, and vectors $ξ_{i}$ are not used. For example, we can take $h = {(1, 0)}^{T}$ . Then Equation (27) has the form:

$\begin{matrix} x^{1} & = & - \frac{1}{2} {[\begin{matrix} {(B_{1} h)}^{T} \\ {(B_{2} h)}^{T} \end{matrix}]}^{- 1} (\begin{matrix} M_{1} \cdot h \\ M_{2} \cdot h \end{matrix}) = - \frac{1}{2} {[\begin{matrix} 1 & 0 \\ 0 & 1 / 2 \end{matrix}]}^{- 1} (\begin{matrix} - 2 \\ 0 \end{matrix}) \\ & = & - [\begin{matrix} 1 / 2 & 0 \\ 0 & 1 \end{matrix}] (\begin{matrix} - 2 \\ 0 \end{matrix}) = (\begin{matrix} 1 \\ 0 \end{matrix}), \end{matrix}$ (28)

which is a solution of $F (x) = 0$ in this example.
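The computation in (28) can be reproduced numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `F` and `exact_solution` are illustrative and do not come from the paper. It assembles the matrix with rows ${(B_{i} h)}^{T}$ and the right-hand side $M_{i} \cdot h$ from formula (27) and checks that the resulting point is a zero of F.

```python
import numpy as np

# Data of Example 1: f_i(x) = x^T B_i x + M_i^T x + N_i.
B = [np.array([[1.0, 0.0], [0.0, -1.0]]),
     np.array([[0.0, 0.5], [0.5, 0.0]])]
M = [np.array([-2.0, 0.0]), np.array([0.0, -1.0])]
N = [1.0, 0.0]


def F(x):
    """Evaluate the quadratic mapping F componentwise."""
    return np.array([x @ Bi @ x + Mi @ x + Ni for Bi, Mi, Ni in zip(B, M, N)])


def exact_solution(directions):
    """Formula (27): rows (B_i d_i)^T and right-hand side M_i . d_i,
    where d_i is the vector (xi or h) paired with the i-th component."""
    rows = np.array([Bi @ d for Bi, d in zip(B, directions)])
    rhs = np.array([Mi @ d for Mi, d in zip(M, directions)])
    return -0.5 * np.linalg.solve(rows, rhs)


h = np.array([1.0, 0.0])
# Both gradients vanish at x*, so the same vector h is used for every row.
x1 = exact_solution([h, h])
```

With $h = {(1, 0)}^{T}$ the linear system is diagonal, so `x1` agrees with the point ${(1, 0)}^{T}$ obtained in (28) and `F(x1)` vanishes up to rounding.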

The approaches described above can be modified to derive other methods for solving the equation $F (x) = 0$ . For example, using the equation $F^{'} {(x)}^{T} h = 0$ , where $h \in Ker F^{'} {(x^{*})}^{T}$ , we obtain the following method:

x^{k + 1} = x^{k} - {({(F^{'} {(x^{k})}^{T})}^{'} h)}^{- 1} ({(F^{'} (x^{k}))}^{T} h), k = 0, 1, \dots .

The sequence $\{x^{k}\}$ converges to $x^{*}$ under the assumption that ${(F^{'} {(x^{*})}^{T})}^{'} h$ is nonsingular. In this modification, unlike the second approach, we can construct an element h without the knowledge of the point $x^{*}$ , based on the information at an initial point $x^{0}$ .

Applying the modified method to Example 1, we obtain the same formulas and results as shown in Equation (28) above. To implement this approach, it is necessary to determine the vectors $f_{i}^{'} (x)$ , $i = 1, 2, \dots, n$ , which correspond to linearly independent vectors $f_{i}^{'} (x^{*}),$ $i \in {1, 2, \dots, n} .$ This can be achieved using information at a point $x^{0} \in N_{ε} (x^{*})$ , where $ε > 0$ is sufficiently small. If the assumption of p-regularity is satisfied, the identification of linearly independent vectors is performed using the method described in the next section.
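One step of the modified method can be sketched for Example 1 as follows. This is an illustrative sketch only: it assumes that ${(F^{'} {(x)}^{T})}^{'} h$ is read as the Jacobian of the auxiliary mapping $x \mapsto F^{'} {(x)}^{T} h$ , the names `jacobian_F`, `G`, and `jacobian_G` are hypothetical, and the starting point is arbitrary. Since $F^{'} (x^{*}) = 0$ in Example 1, any $h$ belongs to $Ker F^{'} {(x^{*})}^{T}$ , and $h = {(1, 0)}^{T}$ is used here.

```python
import numpy as np


def jacobian_F(x):
    """Jacobian of F(x) = (x1^2 - x2^2 - 2 x1 + 1, x1 x2 - x2) from Example 1."""
    return np.array([[2.0 * x[0] - 2.0, -2.0 * x[1]],
                     [x[1], x[0] - 1.0]])


def G(x, h):
    """Auxiliary mapping G(x) = F'(x)^T h."""
    return jacobian_F(x).T @ h


def jacobian_G(x, h):
    """Jacobian of G. Because F is quadratic, G is affine in x, so a forward
    difference with unit step reproduces each column up to rounding."""
    n = x.size
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = 1.0
        J[:, j] = G(x + e, h) - G(x, h)
    return J


h = np.array([1.0, 0.0])
x0 = np.array([0.3, -0.7])   # arbitrary starting point (assumption)
x_next = x0 - np.linalg.solve(jacobian_G(x0, h), G(x0, h))
```

Under these assumptions a single step from `x0` lands on ${(1, 0)}^{T}$ , the same point as in (28), because the auxiliary mapping is affine for a quadratic F.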

4. Procedure for Identifying Zero Elements

The procedure for identifying zero elements could be used to implement the methods described in the previous sections numerically. Let $F (x) : R^{n} \to R^{m}$ be defined as:

F (x) = {(f_{1} (x), \dots, f_{m} (x))}^{T} .

(29)

In this section, we present a theorem that describes the properties of a special mapping $μ (x)$ , which allows us to propose a method for determining r linearly independent vectors $f_{i_{k}}^{'} (x^{*})$ , $k = 1, \dots, r,$ at the solution $x^{*}$ of $F (x) = 0$ . This procedure is based on the information about the system of vectors $\{f_{1}^{'} (x), \dots, f_{m}^{'} (x)\}$ at some point x in a small neighborhood of $x^{*}$ . As a result, we can define the mapping $\hat{F} (x)$ with the first r components $f_{i_{k}} (x)$ , $k = 1, \dots, r,$ corresponding to the linearly independent vectors $f_{i_{k}}^{'} (x^{*})$ , $k = 1, \dots, r .$

Let $F \in C^{3} (R^{n})$ be 2-regular at the point $x^{*}$ . For some $x \in N_{ε} (x^{*})$ , where $ε$ is sufficiently small, we define the following mappings:

ρ (x) = min_{i = 1, \dots, m} d (f_{i}^{'} (x), span (f_{1}^{'} (x), \dots, f_{i - 1}^{'} (x), f_{i + 1}^{'} (x), \dots, f_{m}^{'} (x)))

and:

μ (x) = max \{{∥ F (x) ∥}^{\frac{1}{2}}, ρ (x)\},

(30)

where $d (x, S)$ denotes the distance between an element x and the set S. Note that if $F (x) = f_{1} (x)$ , then $ρ (x) = ∥ f_{1}^{'} (x) ∥$ .

The mapping $μ (x)$ is used to determine the maximum number r of linearly independent vectors in the system ${f_{1}^{'} (x^{*}), \dots, f_{m}^{'} (x^{*})}$ using a special procedure that relies on the information about the mapping $F (x)$ at the point $x \in N_{ε} (x^{*})$ . The properties of the mapping $μ (x)$ are stated in the following theorem, and the proof can be found in [16].

Theorem 7

(Minorant theorem). Let $F \in C^{3} (R^{n})$ be 2-regular at the point $x^{*}$ , and $F (x^{*}) = 0$ . Then there exist constants $ε > 0,$ $C^{'} > 0$ , and $C^{″} > 0$ such that the following inequality holds for any $x \in N_{ε} (x^{*})$ :

$C^{'} ∥ x - x^{*} ∥ \leq μ (x) \leq C^{″} {∥ x - x^{*} ∥}^{\frac{1}{2}},$

where function $μ (x)$ is defined in (30).

In addition to the properties of the mapping $μ (x)$ given in Theorem 7, we also need the following lemma (for the proof, see [16]).

Lemma 1.

For the non-negative mappings $g (x)$ and $μ (x)$ , let the following inequalities hold:

$| g (x_{1}) - g (x_{2}) | \leq L ∥ x_{1} - x_{2} ∥ for all x_{1}, x_{2} \in N_{σ} (x^{*}),$

$C^{'} ∥ x - x^{*} ∥ \leq μ (x) \leq C^{″} {∥ x - x^{*} ∥}^{\frac{1}{p}} for all x \in N_{σ} (x^{*}),$

where $L,$ $C^{'}$ , $C^{″}$ , and σ are positive constants, with $C^{″} \geq C^{'}$ and $p \geq 2$ .

Then, there exists a sufficiently small $ε > 0$ such that one of the following conditions holds:

1.
If $g (x) \leq \sqrt{μ (x)}$ for all $x \in N_{ε} (x^{*})$ , then $g (x^{*}) = 0 .$

2.
If $g (x) > \sqrt{μ (x)}$ for all $x \in N_{ε} (x^{*})$ , then $g (x^{*}) \neq 0 .$

Remark 2.

Based on the assumptions of Lemma 1, there exists a sufficiently small $ε > 0$ such that if the inequality $g (\bar{x}) \leq \sqrt{μ (\bar{x})}$ is satisfied for some $\bar{x} \in N_{ε} (x^{*})$ , then the inequality $g (x) \leq \sqrt{μ (x)}$ is satisfied for all $x \in N_{ε} (x^{*})$ , and hence $g (x^{*}) = 0$ .

Similarly, if the inequality $g (\bar{x}) > \sqrt{μ (\bar{x})}$ is satisfied for some $\bar{x} \in N_{ε} (x^{*})$ , then the inequality $g (x) > \sqrt{μ (x)}$ is satisfied for all $x \in N_{ε} (x^{*})$ , and hence $g (x^{*}) \neq 0$ .

Now we are ready to introduce an iterative method that determines indices $1, \dots, r$ corresponding to the linearly independent vectors $f_{i}^{'} (x^{*})$ , $i = 1, \dots, r .$

Method for determining linearly independent gradients at $x^{*}$ (identifying zero elements).

Using Lemma 1 and Remark 2, for a sufficiently small $ε > 0$ , $x \in N_{ε} (x^{*})$ , and $i = 1, \dots, m$ , consider two possible cases:

Case 1. $g (x) = ∥ f_{i}^{'} (x) ∥ \leq \sqrt{μ (x)}$ .

Case 2. $g (x) = ∥ f_{i}^{'} (x) ∥ > \sqrt{μ (x)}$ .

In Case 1, according to Remark 2, it follows that $f_{i}^{'} (x^{*}) = 0$ , whereas in Case 2, we have $f_{i}^{'} (x^{*}) \neq 0$ .

Let $F (x) : R^{n} \to R^{m}$ be defined by (29) and $x^{*}$ be a solution of $F (x) = 0$ . Let x be in $N_{ε} (x^{*})$ , where $ε$ is sufficiently small. Define function $μ (x)$ using Equation (30).

Step 1. Identify the smallest index $i_{1}$ in the set $S_{1} = \{1, \dots, m\}$ such that $∥ f_{i_{1}}^{'} (x) ∥ > \sqrt{μ (x)}$ . According to Case 2 above, this implies that $f_{i_{1}}^{'} (x^{*}) \neq 0$ .

Step 2. Use Step 1 to identify whether the set $S_{2} = \{1, \dots, m\} ∖ \{i_{1}\}$ has at least one index j such that $f_{j}^{'} (x^{*}) \neq 0$ . If it does not, the method is finished. Otherwise, identify the smallest index $i_{2}$ in the set $S_{2}$ such that the following condition is satisfied:

$d (f_{i_{2}}^{'} (x), span (f_{i_{1}}^{'} (x))) > \sqrt{μ (x)} .$

According to Case 2 above, it means that the vectors $f_{i_{1}}^{'} (x^{*})$ and $f_{i_{2}}^{'} (x^{*})$ are linearly independent.

Step k. By this step, we have identified $k - 1$ linearly independent vectors $f_{i_{1}}^{'} (x^{*})$ , …, $f_{i_{k - 1}}^{'} (x^{*})$ , where $k = 3, 4, \dots, m$ . Use Step 1 to identify whether the set $S_{k} = \{1, \dots, m\} ∖ \{i_{1}, \dots, i_{k - 1}\}$ has at least one index j such that $f_{j}^{'} (x^{*}) \neq 0$ . If it does not, the method is finished. Otherwise, identify the smallest index $i_{k} \in S_{k}$ such that the following condition is satisfied:

d (f_{i_{k}}^{'} (x), span (f_{i_{1}}^{'} (x), \dots, f_{i_{k - 1}}^{'} (x))) > \sqrt{μ (x)} .

The inequality implies that the vectors $f_{i_{1}}^{'} (x^{*})$ , $\dots,$ $f_{i_{k}}^{'} (x^{*})$ are linearly independent.

Repeat Step k until the method is finished.

Without loss of generality, assume that the first r vectors ${f_{1}^{'} (x^{*}), \dots, f_{r}^{'} (x^{*})}$ are linearly independent and define mapping $\hat{F} (x)$ as:

\hat{F} (x) = (\begin{matrix} f_{1} (x) \\ ⋮ \\ f_{r} (x) \\ {\hat{f}}_{r + 1} (x) \\ ⋮ \\ {\hat{f}}_{m} (x) \end{matrix}),

(31)

where vectors ${\hat{f}}_{k} (x)$ are defined in such a way that:

{\hat{f}}_{k}^{'} (x^{*}) = 0, k = r + 1, \dots, m .

Namely, let:

f_{k}^{'} (x^{*}) = \sum_{i = 1}^{r} α_{i}^{k} (x^{*}) f_{i}^{'} (x^{*}), k = r + 1, \dots, m,

be a linear combination of the vectors $f_{1}^{'} (x^{*})$ , …, $f_{r}^{'} (x^{*}) .$ Coefficients $α^{k} (x)$ are determined by solving the following system of equations:

(f_{k}^{'} (x) - \sum_{i = 1}^{r} α_{i}^{k} (x) f_{i}^{'} (x)) \cdot f_{j}^{'} (x) = 0, j = 1, 2, \dots, r, k = r + 1, \dots, m .

In addition, define $B (x)$ to be a nonsingular matrix of the form:

B (x) = (\begin{matrix} 1 & \dots & 0 & 0 & \dots & 0 \\ ⋮ & ⋱ & ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & \dots & 1 & 0 & \dots & 0 \\ - α_{1}^{r + 1} (x) & \dots & - α_{r}^{r + 1} (x) & 1 & \dots & 0 \\ ⋮ & ⋱ & ⋮ & ⋮ & ⋱ & ⋮ \\ - α_{1}^{m} (x) & \dots & - α_{r}^{m} (x) & 0 & \dots & 1 \end{matrix}) .

Let:

α^{k} (x) = (α_{1}^{k} (x), \dots, α_{r}^{k} (x)), k = r + 1, \dots, m .

Define the following vectors:

{\hat{f}}_{k} (x) = f_{k} (x), k = 1, \dots, r,

{\hat{f}}_{k} (x) = f_{k} (x) - \sum_{i = 1}^{r} α_{i}^{k} (x) f_{i} (x), k = r + 1, \dots, m .

These vectors allow us to transform the mapping $F (x)$ to $\hat{F} (x) = B (x) \cdot F (x)$ , where ${\hat{f}}_{1}^{'} (x^{*}) \neq 0,$ $\dots,$ ${\hat{f}}_{r}^{'} (x^{*}) \neq 0,$ and ${\hat{f}}_{r + 1}^{'} (x^{*}) = 0,$ $\dots,$ ${\hat{f}}_{m}^{'} (x^{*}) = 0 .$ The purpose of this transformation is to simplify the structure of the projection operators.
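The construction of the coefficients $α^{k} (x)$ and of the matrix $B (x)$ can be sketched numerically. The snippet below is a minimal illustration, assuming NumPy; the function name `transformation_matrix` and the test mapping $f_{1} (x) = x_{1} + x_{2}$ , $f_{2} (x) = 2 x_{1} + 2 x_{2} + x_{1} x_{2}$ with $x^{*} = {(0, 0)}^{T}$ are hypothetical and were chosen only so that $f_{2}^{'} (x^{*}) = 2 f_{1}^{'} (x^{*})$ . The orthogonality conditions above are equivalent to a linear system with the Gram matrix of the first r gradients.

```python
import numpy as np


def transformation_matrix(J, r):
    """Build B(x) from the Jacobian J (m x n) of F at x, assuming that the
    first r rows f_1'(x), ..., f_r'(x) are linearly independent."""
    m = J.shape[0]
    Jr = J[:r]                       # gradients f_1', ..., f_r' as rows
    gram = Jr @ Jr.T                 # Gram matrix with entries f_i' . f_j'
    # For each k > r, solve gram @ alpha^k = (f_1' . f_k', ..., f_r' . f_k').
    alpha = np.linalg.solve(gram, Jr @ J[r:].T).T    # shape (m - r, r)
    Bx = np.eye(m)
    Bx[r:, :r] = -alpha
    return Bx, alpha


# Hypothetical test data: Jacobian of (x1 + x2, 2 x1 + 2 x2 + x1 x2) at x* = 0.
J_star = np.array([[1.0, 1.0],
                   [2.0, 2.0]])
B_star, alpha_star = transformation_matrix(J_star, r=1)
J_hat = B_star @ J_star              # rows are the gradients of F_hat at x*
```

Because $F (x^{*}) = 0$ , the product rule gives ${\hat{F}}^{'} (x^{*}) = B (x^{*}) F^{'} (x^{*})$ , so the second row of `J_hat` is the zero vector in this illustration, as required of ${\hat{f}}_{2}^{'} (x^{*})$ .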

We present a simple example to illustrate an application of the proposed method.

Example 2.

Let $F : R^{2} \to R^{2}$ , $F (x) = {(f_{1} (x), f_{2} (x))}^{T}$ , where:

$f_{1} (x) = x_{1} + x_{2}, f_{2} (x) = x_{1} x_{2} .$

Then $x^{*} = {(0, 0)}^{T}$ is a solution of $F (x) = 0$ . Take $ε = \frac{1}{\sqrt{2}}$ and consider $\bar{x} = {(\frac{1}{2}, \frac{1}{2})}^{T} \in N_{ε} (x^{*})$ . The Jacobian matrix of F is:

$F^{'} (x) = (\begin{matrix} 1 & 1 \\ x_{2} & x_{1} \end{matrix}),$

and, at the solution, $f_{1}^{'} (x^{*}) = (\begin{matrix} 1 \\ 1 \end{matrix})$ and $f_{2}^{'} (x^{*}) = (\begin{matrix} 0 \\ 0 \end{matrix}) .$

It is easy to see that vectors $f_{1}^{'} (x^{*})$ and $f_{2}^{'} (x^{*})$ are linearly dependent. We can check this by applying the method introduced above.

By using Equation (30), we define function $μ (x) = max \{{∥ F (x) ∥}^{\frac{1}{2}}, ρ (x)\}$ , where:

$ρ (x) = sin α ∥ f_{2}^{'} (x) ∥,$

and α is the angle between vectors $f_{1}^{'} (x)$ and $f_{2}^{'} (x)$ . Note that:

$cos α = \frac{f_{1}^{'} (x) \cdot f_{2}^{'} (x)}{∥ f_{1}^{'} (x) ∥ ∥ f_{2}^{'} (x) ∥},$

and hence:

$ρ (x) = \sqrt{\frac{{∥ f_{1}^{'} (x) ∥}^{2} {∥ f_{2}^{'} (x) ∥}^{2} - {(f_{1}^{'} (x) \cdot f_{2}^{'} (x))}^{2}}{{∥ f_{1}^{'} (x) ∥}^{2} {∥ f_{2}^{'} (x) ∥}^{2}}} ∥ f_{2}^{'} (x) ∥ = \frac{| x_{1} - x_{2} |}{\sqrt{2}} .$

Using $\bar{x} = {(\frac{1}{2}, \frac{1}{2})}^{T}$ , we obtain:

$ρ (\bar{x}) = 0$ , ${∥ F (\bar{x}) ∥}^{\frac{1}{2}} = \sqrt[4]{1^{2} + {(\frac{1}{4})}^{2}} = \sqrt[4]{\frac{17}{16}}$ , and so $μ (\bar{x}) = \sqrt[4]{\frac{17}{16}} .$

We are ready to apply the method described above.

In Step 1, we obtain $∥ f_{1}^{'} (\bar{x}) ∥ > \sqrt{μ (\bar{x})}$ because:

$\sqrt{2} > \sqrt[8]{\frac{17}{16}} .$

Hence, vector $f_{1}^{'} (x^{*}) \neq 0$ and $i_{1} = 1$ .

Then in Step 1 of the method with vector $f_{2}^{'} (x)$ , we also verify whether the following inequality holds:

$∥ f_{2}^{'} (\bar{x}) ∥ > μ {(\bar{x})}^{\frac{1}{2}}$ , or $\sqrt{{\bar{x}}_{2}^{2} + {\bar{x}}_{1}^{2}} > μ {(\bar{x})}^{\frac{1}{2}} .$

Using point $\bar{x} = {(\frac{1}{2}, \frac{1}{2})}^{T}$ , we obtain:

$\sqrt{\frac{1}{2}} < \sqrt[8]{\frac{17}{16}} .$

Therefore, we conclude that $f_{2}^{'} (x^{*}) = 0$ .

Thus, in this example, the mapping $\hat{F} (x)$ defined in (31) has the form $\hat{F} (x) = {(f_{1}, {\hat{f}}_{2})}^{T}$ , where $f_{1} (x) = x_{1} + x_{2}$ and ${\hat{f}}_{2} = x_{1} x_{2}$ .
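The steps of the identification method can be sketched in code and checked against Example 2. The snippet below is a hedged illustration, assuming NumPy; the helper names `dist_to_span`, `mu`, and `independent_indices` are hypothetical. The steps are implemented as a single pass over the indices in increasing order, which selects the same indices as Steps 1, 2, …, k because the distance to a span cannot increase when the span grows. Indices in the code are zero-based.

```python
import numpy as np


def dist_to_span(v, vectors):
    """Euclidean distance from v to span(vectors); ||v|| for an empty list."""
    if len(vectors) == 0:
        return float(np.linalg.norm(v))
    A = np.column_stack(vectors)
    coeffs, *_ = np.linalg.lstsq(A, v, rcond=None)
    return float(np.linalg.norm(v - A @ coeffs))


def mu(F_val, grads):
    """mu(x) = max{ ||F(x)||^(1/2), rho(x) } as in (30)."""
    m = len(grads)
    rho = min(dist_to_span(grads[i], grads[:i] + grads[i + 1:])
              for i in range(m))
    return max(float(np.sqrt(np.linalg.norm(F_val))), rho)


def independent_indices(F_val, grads):
    """Select indices whose gradients pass the sqrt(mu(x)) distance test."""
    threshold = np.sqrt(mu(F_val, grads))
    chosen = []
    for i, g in enumerate(grads):
        if dist_to_span(g, [grads[j] for j in chosen]) > threshold:
            chosen.append(i)
    return chosen


# Example 2 evaluated at x_bar = (1/2, 1/2).
x_bar = np.array([0.5, 0.5])
F_val = np.array([x_bar[0] + x_bar[1], x_bar[0] * x_bar[1]])
grads = [np.array([1.0, 1.0]), np.array([x_bar[1], x_bar[0]])]
chosen = independent_indices(F_val, grads)
```

For the data of Example 2 the list `chosen` contains only the zero-based index 0, which corresponds to $i_{1} = 1$ , and the computed value of $μ (\bar{x})$ agrees with $\sqrt[4]{\frac{17}{16}}$ .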

5. Quadratic Programming Problems

In this section, we consider the quadratic programming (QP) problem (3):

\begin{matrix} \underset{x}{minimize} & f (x) = \frac{1}{2} x^{T} Q x + c^{T} x \\ subject to & A x \leq b \end{matrix} (QP),

where Q is an $n \times n$ symmetric matrix, A is an $m \times n$ matrix, $c, x \in R^{n}$ , and $b \in R^{m}$ . The Lagrangian for problem (3) is defined by:

L (x, λ) = \frac{1}{2} x^{T} Q x + c^{T} x + \sum_{i = 1}^{m} λ_{i} (a_{i} x - b_{i}),

(32)

where $λ = (λ_{1}, \dots, λ_{m})$ is the vector of Lagrange multipliers and $a_{i}$ is the ith row of the matrix A. The Karush-Kuhn-Tucker (KKT) conditions [17] are satisfied at $x^{*}$ with some $λ^{*} \in R^{m}$ if:

\begin{matrix} Q x^{*} + c + \sum_{i = 1}^{m} λ_{i}^{*} a_{i}^{T} = 0, \\ λ_{i}^{*} (a_{i} x^{*} - b_{i}) = 0, λ_{i}^{*} \geq 0, a_{i} x^{*} \leq b_{i}, & for all i = 1, \dots, m . \end{matrix}

(33)

The point $x^{*}$ at which relations (33) are satisfied is called a stationary point or a KKT point. Observe that $(x^{*}, λ^{*})$ is a solution of the following system:

Φ (x, λ) = (\begin{matrix} Q x + c + A^{T} λ \\ Λ (A x - b) \end{matrix}) = 0, Λ = diag {(λ_{i})}_{i = 1, \dots, m}, λ_{i}^{*} \geq 0, A x^{*} \leq b .

(34)
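The conditions (33) can be verified numerically for a candidate pair $(x, λ)$ . The following is a minimal sketch, assuming NumPy; the function name `kkt_residuals` and the small test problem (minimize $\frac{1}{2} x_{2}^{2} - x_{1}$ subject to $x_{1} \leq 0$ , $x_{2} \leq 1$ ) are chosen here purely for illustration.

```python
import numpy as np


def kkt_residuals(Q, c, A, b, x, lam):
    """Return the residuals of the KKT conditions (33) at (x, lam):
    stationarity, complementarity, primal and dual infeasibility."""
    stationarity = float(np.linalg.norm(Q @ x + c + A.T @ lam))
    complementarity = float(np.linalg.norm(lam * (A @ x - b)))
    primal_violation = float(np.max(np.maximum(A @ x - b, 0.0)))
    dual_violation = float(np.max(np.maximum(-lam, 0.0)))
    return stationarity, complementarity, primal_violation, dual_violation


# Illustrative data (assumption): minimize x2^2 / 2 - x1, x1 <= 0, x2 <= 1.
Q = np.array([[0.0, 0.0], [0.0, 1.0]])
c = np.array([-1.0, 0.0])
A = np.eye(2)
b = np.array([0.0, 1.0])
x_cand = np.array([0.0, 0.0])
lam_cand = np.array([1.0, 0.0])
residuals = kkt_residuals(Q, c, A, b, x_cand, lam_cand)
```

All four residuals vanish for this candidate pair, so it satisfies (33); a nonzero entry indicates which of the conditions fails.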

We denote by $I (x^{*})$ the set of indices of the active constraints at $x^{*}$ :

I (x^{*}) = \{i = 1, \dots, m ∣ a_{i} x^{*} = b_{i}\} .

The following constraint qualification is used in the paper.

Definition 10

(Linear independence constraint qualification). The linear independence constraint qualification (LICQ) holds at a feasible point $x^{*}$ if the row-vectors $a_{j}$ , $j \in I (x^{*})$ , corresponding to the constraints active at $x^{*}$ , are linearly independent.

The modified second-order sufficient conditions (MSOSC) state that there exist a Lagrange multiplier vector $λ^{*}$ and a scalar $ν > 0$ such that:

ω^{T} \nabla^{2} L_{x x} (x^{*}, λ^{*}) ω \geq ν {∥ ω ∥}^{2}

(35)

for all $ω$ satisfying:

a_{i} ω = 0 \forall i \in I (x^{*}) .

We divide the presentation in this section into three parts. We start by considering regular QP problems in Section 5.1. Then, in Section 5.2, we discuss the issue of identifying the active constraints and propose numerical strategies for determining the set $I (x^{*})$ . We apply these techniques to degenerate QP problems in Section 5.3.

5.1. Regular Quadratic Programming

In this section, we consider the regular quadratic programming (QP) problem (3). In other words, we assume that the linear independence constraint qualification (LICQ) and the modified second-order sufficient conditions (MSOSC) (35) hold. Recall that A is an $m \times n$ matrix of coefficients representing the constraints $A x \leq b$ in problem (3). Without loss of generality, assume that the first p constraints are active at $x^{*}$ , so that:

I (x^{*}) = \{1, \dots, p\} .

Then we can rewrite the matrix A in the following form:

A = [\begin{matrix} A_{A} \\ A_{N} \end{matrix}],

(36)

where $A_{A}$ is a $p \times n$ matrix of coefficients corresponding to the active constraints at $x^{*}$ , and $A_{N}$ is an $(m - p) \times n$ matrix of coefficients corresponding to the nonactive constraints at $x^{*}$ . It is important to note that we do not have prior knowledge of the set $I (x^{*})$ . We will discuss possible numerical realizations to approximate the set of active constraints in Section 5.2. Additionally, we introduce the following notation associated with the active constraints at the point $x^{*}$ :

b_{A} = {(b_{1}, \dots, b_{p})}^{T}, λ_{A} = {(λ_{1}, \dots, λ_{p})}^{T} .

Similarly,

b_{N} = {(b_{p + 1}, \dots, b_{m})}^{T}, λ_{N} = {(λ_{p + 1}, \dots, λ_{m})}^{T} .

In the following subsections, we will introduce three approaches to solving the QP problem (3) and provide formulas for the solution.

5.1.1. First Approach to Solving the QP Problem

In this subsection, we present an approach to solving the QP problem and obtaining a formula for its solution. This approach is based on the construction of the 2-factor-operator. For our consideration below, we need the following lemma.

Lemma 2.

Let V be an $n \times n$ matrix, G be a $p \times n$ matrix, such that the columns of $G^{T}$ are linearly independent, L be an $n \times l$ matrix, $G_{N} = diag {(g_{i})}_{i = 1}^{l}$ be a diagonal full rank matrix, and:

$(V x, x) > 0$ for all $x \in Ker G ∖ \{0\} .$ (37)

Then matrix Γ defined by:

$Γ = (\begin{matrix} V & G^{T} & L \\ G & O_{p \times p} & O_{p \times l} \\ O_{l \times n} & O_{l \times p} & G_{N} \end{matrix})$ (38)

is nonsingular.

Proof.

To prove the lemma, we must prove that the matrix $Γ$ defined by (38) has zero nullspace. Consider the following system that defines the nullspace of $Γ$ in the form of a vector $v = (x, y, z)$ , where $x \in R^{n}$ , $y \in R^{p}$ , and $z \in R^{l}$ :

$\begin{matrix} V x + G^{T} y + L z = 0 \\ G x = 0 \\ G_{N} z = 0 . \end{matrix}$ (39)

Since $G_{N}$ is a full-rank diagonal matrix, the third equation in the system (39) implies that $z = 0$ . Then, taking the inner product of the first equation with x and using the second equation $G x = 0$ , we obtain:

$\begin{matrix} 0 & = (V x) \cdot x + (G^{T} y) \cdot x \\ & = (V x) \cdot x + y \cdot (G x) \\ & = (V x) \cdot x . \end{matrix}$

Consequently, $x = 0$ ; otherwise, x would be a nonzero element of $Ker G$ with $(V x) \cdot x = 0$ , which contradicts the assumption (37) of the lemma. Therefore, the first equation in (39) reduces to $G^{T} y = 0$ , and since the columns of $G^{T}$ are linearly independent, we obtain $y = 0$ . Thus, the matrix $Γ$ defined in (38) has a zero nullspace, $(x, y, z) = (0, 0, 0)$ , and therefore, $Γ$ is nonsingular. This concludes the proof of the lemma. □

Let $x \in R^{n}$ , $λ \in R^{m},$ and mapping $Φ$ be defined in (34), so that $Φ (x, λ) = (\begin{matrix} Q x + c + A^{T} λ \\ Λ (A x - b) \end{matrix})$ . Introduce mappings $P_{1}$ and $P_{2}$ as:

P_{1} = [\begin{matrix} I_{n \times n} & O_{n \times m} \\ O_{m \times n} & O_{m \times m} \end{matrix}], P_{2} = [\begin{matrix} O_{n \times n} & O_{n \times m} \\ O_{m \times n} & I_{m \times m} \end{matrix}] .

Recall that matrix $A_{A}$ is defined in (36), and introduce vector $\bar{h} \in R^{n + m}$ such that:

\bar{h} = {(h^{1}, h^{2})}^{T}, h^{1} \in R^{n}, h^{2} \in R^{m},

where $A_{A} h^{1} = 0$ , $h^{1} \neq 0$ , $h^{2} = (\underset{p}{\underset{︸}{1 \dots 1}}, \underset{m - p}{\underset{︸}{0 \dots 0}}) .$

Define mapping $Ψ$ as:

Ψ (x, λ) = P_{1} Φ (x, λ) + P_{2} Φ^{'} (x, λ) \bar{h} .

(40)

Recall that $a_{i}$ is the ith row of the matrix A and $b = {(b_{1}, \dots, b_{m})}^{T}$ . Then:

Φ^{'} (x, λ) = (\begin{matrix} Q & A^{T} \\ Λ A & S \end{matrix}), S = (\begin{matrix} a_{1} x - b_{1} & \dots & 0 \\ ⋮ & ⋱ & ⋮ \\ 0 & \dots & a_{m} x - b_{m} \end{matrix}),

and mapping $Ψ$ defined in (40) can be rewritten as:

Ψ (x, λ) = (\begin{matrix} Q x + c + A^{T} λ \\ Λ A h^{1} + S h^{2} \end{matrix}) .

Introduce matrix $Λ_{N} = diag {(λ_{i})}_{i = p + 1, \dots, m}$ . Then, taking into account the definition of $h^{1}$ and $h^{2}$ , we obtain:

Ψ (x, λ) = (\begin{matrix} Q x + c + A^{T} λ \\ A_{A} x - b_{A} \\ Λ_{N} A_{N} h^{1} \end{matrix}) .

Observe that if $(x^{*}, λ^{*})$ is a solution of (34), it is also a solution of $Ψ (x, λ) = 0,$ or, equivalently,

Ψ (x, λ) = (\begin{matrix} Q x + c + A^{T} λ \\ A_{A} x - b_{A} \\ Λ_{N} A_{N} h^{1} \end{matrix}) = 0 .

(41)

To obtain the formula for the solution $(x^{*}, λ^{*})$ , we rewrite the system (41) as:

(\begin{matrix} Q & A_{A}^{T} & A_{N}^{T} \\ A_{A} & O & O \\ O_{(m - p) \times n} & O & K \end{matrix}) (\begin{matrix} x \\ λ_{A} \\ λ_{N} \end{matrix}) = (\begin{matrix} - c \\ b_{A} \\ O_{m - p} \end{matrix}), K = (\begin{matrix} a_{p + 1} h^{1} & \dots & 0 \\ ⋮ & ⋱ & ⋮ \\ 0 & \dots & a_{m} h^{1} \end{matrix}) .

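Assuming that LICQ and MSOSC hold, and that $h^{1}$ is chosen so that $a_{i} h^{1} \neq 0$ for $i = p + 1, \dots, m$ (so that the diagonal matrix K has full rank, as Lemma 2 requires), we apply Lemma 2 and conclude that the matrix: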

(\begin{matrix} Q & A_{A}^{T} & A_{N}^{T} \\ A_{A} & O & O \\ O_{(m - p) \times n} & O & K \end{matrix})

is invertible and obtain the formula for $(x^{*}, λ^{*})$ :

(\begin{matrix} x^{*} \\ λ_{A}^{*} \\ λ_{N}^{*} \end{matrix}) = {(\begin{matrix} Q & A_{A}^{T} & A_{N}^{T} \\ A_{A} & O & O \\ O_{(m - p) \times n} & O & K \end{matrix})}^{- 1} (\begin{matrix} - c \\ b_{A} \\ O_{m - p} \end{matrix}) .

(42)

5.1.2. Second Approach to Solving the QP Problem

Assume that we can estimate the set $I (x^{*})$ , which is in our notation $I (x^{*}) = {1, 2, \dots, p}$ . Taking into account that $λ_{p + 1}^{*} = 0, \dots, λ_{m}^{*} = 0$ and that $A_{A} x^{*} = b_{A}$ , system (34) can be reduced to the following one:

Φ (x, λ) = (\begin{matrix} Q x + c + A_{A}^{T} λ_{A} \\ A_{A} x - b_{A} \end{matrix}) = 0,

(43)

which can be written as:

(\begin{matrix} Q & A_{A}^{T} \\ A_{A} & O_{p \times p} \end{matrix}) (\begin{matrix} x \\ λ_{A} \end{matrix}) = (\begin{matrix} - c \\ b_{A} \end{matrix}) .

(44)

Under the assumptions LICQ and MSOSC, the following matrix is invertible,

(\begin{matrix} Q & A_{A}^{T} \\ A_{A} & O_{p \times p} \end{matrix})

and system (44) yields the formula for the solution $(x^{*}, λ^{*})$ :

(\begin{matrix} x^{*} \\ λ_{A}^{*} \end{matrix}) = {[\begin{matrix} Q & A_{A}^{T} \\ A_{A} & O_{p \times p} \end{matrix}]}^{- 1} (\begin{matrix} - c \\ b_{A} \end{matrix}) .

(45)

Remark 3.

System (41) reduces to system (43) by removing the equations $Λ_{N} A_{N} h^{1} = 0$ , which correspond to the nonactive constraints. Similarly, Equation (42) reduces to (45).

Remark 4.

We note that solutions of QP problems have the following specific property: if $x^{*}$ is a solution of the QP problem and $h^{T} L_{x x}^{″} (x, λ) h = 0$ for the vector $h \in Ker A$ , then the points $x = x^{*} + t h$ are also solutions of the QP problem.

5.1.3. Examples

In this section, we illustrate the two described approaches with examples. Namely, we consider the construction of system (41) required for the first approach. Then we illustrate the use of the exact formula (45) derived in the second approach.

Example 2.

Consider the problem:

$\begin{matrix} \underset{x}{minimize} & \frac{1}{2} x_{2}^{2} - x_{1} \\ subject to & x_{1} \leq 0, \\ & x_{2} \leq 1 . \end{matrix}$ (46)

The matrix A in this example is $A = (\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix})$ and $b = (\begin{matrix} 0 \\ 1 \end{matrix}) .$ The solution to this problem is the point $(x_{1}^{*}, x_{2}^{*}) = (0, 0)$ with $λ_{1}^{*} = 1$ and $λ_{2}^{*} = 0$ . Hence, $I (x^{*}) = {1}$ , $A_{A} = (1, 0)$ , and $b_{A} = 0$ . Moreover,

$Q = (\begin{matrix} 0 & 0 \\ 0 & 1 \end{matrix})$ and $c = (\begin{matrix} - 1 \\ 0 \end{matrix}) .$

By choosing $h^{1} = {(0, 1)}^{T}$ and $h^{2} = {(1, 0)}^{T}$ , the system (41) reduces to the linear system:

$Ψ (x, λ) = (\begin{matrix} Q x + c + A^{T} λ \\ A_{A} x - b_{A} \\ Λ_{N} A_{N} h^{1} \end{matrix}) = (\begin{matrix} λ_{1} - 1 \\ x_{2} + λ_{2} \\ x_{1} \\ λ_{2} \end{matrix}) .$

Solving the system $Ψ (x, λ) = 0$ yields $(x_{1}^{*}, x_{2}^{*}, λ_{1}^{*}, λ_{2}^{*}) = (0, 0, 1, 0)$ , as claimed.

Now, let us illustrate the second approach. Specifically, using the formula (45) for the solution of problem (46) with $λ_{A} = λ_{1}$ , we obtain:

(\begin{matrix} x_{1}^{*} \\ x_{2}^{*} \\ λ_{1}^{*} \end{matrix}) = {[\begin{matrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{matrix}]}^{- 1} (\begin{matrix} 1 \\ 0 \\ 0 \end{matrix}) = (\begin{matrix} 0 \\ 0 \\ 1 \end{matrix}),

as claimed.
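Both computations for problem (46) can be reproduced with a few lines of linear algebra. The snippet below is a minimal sketch, assuming NumPy and assuming that the active set is already known; the function names `solve_first_approach` and `solve_second_approach` are hypothetical. The first function assembles the block matrix of formula (42) and the second the matrix of formula (45). Indices are zero-based, so the active set $\{1\}$ becomes `[0]`.

```python
import numpy as np


def solve_first_approach(Q, c, A, b, active, h1):
    """Solve the linear system behind (42) for (x, lambda_A, lambda_N)."""
    m, n = A.shape
    inactive = [i for i in range(m) if i not in active]
    A_A, A_N = A[active], A[inactive]
    p, q = len(active), len(inactive)
    K = np.diag(A_N @ h1)            # K = diag(a_i h^1) over nonactive rows
    lhs = np.block([
        [Q, A_A.T, A_N.T],
        [A_A, np.zeros((p, p)), np.zeros((p, q))],
        [np.zeros((q, n)), np.zeros((q, p)), K],
    ])
    rhs = np.concatenate([-c, b[active], np.zeros(q)])
    return np.linalg.solve(lhs, rhs)


def solve_second_approach(Q, c, A, b, active):
    """Solve the reduced system (44), i.e. formula (45), for (x, lambda_A)."""
    A_A = A[active]
    p = len(active)
    lhs = np.block([[Q, A_A.T], [A_A, np.zeros((p, p))]])
    rhs = np.concatenate([-c, b[active]])
    return np.linalg.solve(lhs, rhs)


# Data of problem (46); h1 satisfies A_A h1 = 0 and a_2 h1 != 0.
Q = np.array([[0.0, 0.0], [0.0, 1.0]])
c = np.array([-1.0, 0.0])
A = np.eye(2)
b = np.array([0.0, 1.0])
active = [0]
h1 = np.array([0.0, 1.0])
sol_first = solve_first_approach(Q, c, A, b, active, h1)    # (x1, x2, l1, l2)
sol_second = solve_second_approach(Q, c, A, b, active)      # (x1, x2, l1)
```

For these data `sol_first` equals $(0, 0, 1, 0)$ and `sol_second` equals $(0, 0, 1)$ up to rounding, in agreement with the values obtained above.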

Example 3.

Consider the problem:

$\begin{matrix} \underset{x}{minimize} & x_{1}^{2} + \frac{1}{2} x_{2}^{2} + x_{1} x_{2} \\ subject to & - x_{1} \leq 1, \\ & x_{2} \leq 1 . \end{matrix}$ (47)

The solution to this problem is the point $(x_{1}^{*}, x_{2}^{*}) = (0, 0)$ with $λ_{1}^{*} = 0$ and $λ_{2}^{*} = 0$ . Hence, $I (x^{*}) = \emptyset$ . Moreover,

$Q = (\begin{matrix} 2 & 1 \\ 1 & 1 \end{matrix})$ and $c = (\begin{matrix} 0 \\ 0 \end{matrix}) .$

By choosing $h^{1} = {(1, 1)}^{T}$ and $h^{2} = {(0, 0)}^{T}$ , the system (41) reduces to the following linear system for problem (47):

$Ψ (x, λ) = (\begin{matrix} 2 x_{1} + x_{2} - λ_{1} \\ x_{1} + x_{2} + λ_{2} \\ - λ_{1} \\ λ_{2} \end{matrix}) .$

Solving $Ψ (x, λ) = 0$ yields $(x_{1}^{*}, x_{2}^{*}, λ_{1}^{*}, λ_{2}^{*}) = (0, 0, 0, 0)$ , as claimed.

To illustrate the second approach, we rewrite the exact formula (45) for the solution of problem (47) in the form:

(\begin{matrix} x_{1}^{*} \\ x_{2}^{*} \end{matrix}) = {[\begin{matrix} 2 & 1 \\ 1 & 1 \end{matrix}]}^{- 1} (\begin{matrix} 0 \\ 0 \end{matrix}) = (\begin{matrix} 0 \\ 0 \end{matrix}),

(48)

as claimed.

5.1.4. Third Approach to Solving the QP Problem

In this subsection, we present another approach to solving the QP problem. A formula that we obtain for the solution of the QP problem is also based on the construction of the 2-factor-operator.

First, we replace the inequality constraints in the QP problem with equality constraints of the form:

A x - b + y^{2} = 0,

where $y^{2} = {(y_{1}^{2}, \dots, y_{m}^{2})}^{T}$ . We then define the Lagrangian as follows:

\tilde{L} (x, y, λ) = \frac{1}{2} x^{T} Q x + c^{T} x + \sum_{i = 1}^{m} λ_{i} (a_{i} x - b_{i} + y_{i}^{2}) .

(49)

Introduce the notation:

Λ = diag {(λ_{j})}_{j = 1}^{m}, Y = diag {(y_{j})}_{j = 1}^{m}, e = {(1, 1, \dots, 1)}^{T} .

Then the point $(x^{*}, y^{*}, λ^{*})$ is a solution of the following system:

\tilde{Φ} (x, y, λ) = (\begin{matrix} Q x + c + A^{T} λ \\ 2 Λ Y e \\ A x - b + y^{2} \end{matrix}) = 0 .

(50)

The Jacobian matrix of the system (50) is given by:

{\tilde{Φ}}^{'} (x, y, λ) = (\begin{matrix} Q & O_{n \times m} & A^{T} \\ O_{m \times n} & 2 Λ & 2 Y \\ A & 2 Y & O \end{matrix}) .

Then with $h = {(h_{1}, h_{2}, h_{3})}^{T}, h_{1} \in R^{n}, h_{2}, h_{3} \in R^{m}$ , we obtain

{\tilde{Φ}}^{'} (x, y, λ) h = (\begin{matrix} Q h_{1} + A^{T} h_{3} \\ 2 Λ h_{2} + 2 Y h_{3} \\ A h_{1} + 2 Y h_{2} \end{matrix}) .

Assuming that LICQ and MSOSC hold, matrix ${\tilde{Φ}}^{'} (x^{*}, y^{*}, λ^{*})$ is singular if and only if the strict complementarity condition does not hold. In other words, the set of indices of the weakly active constraints,

I_{0} (x^{*}) = \{j = 1, \dots, m ∣ λ_{j}^{*} = 0, y_{j}^{*} = 0\},

is not empty.

Let $P_{1}$ be the matrix of the orthoprojector onto $Im {\tilde{Φ}}^{'} (x^{*}, y^{*}, λ^{*})$ , and $P_{2}$ be the matrix of the orthoprojector onto ${(Im {\tilde{Φ}}^{'} (x^{*}, y^{*}, λ^{*}))}^{⊥}$ . Note that $P_{1}$ will be a projector onto the linear part of the mapping $\tilde{Φ}$ , while $P_{2}$ will be a projector onto the quadratic part of $\tilde{Φ}$ .

Introduce vector $\hat{h} = ({\hat{h}}_{1}, {\hat{h}}_{2}, {\hat{h}}_{3})$ such that $P_{2} {\tilde{Φ}}^{'} (x^{*}, y^{*}, λ^{*}) \hat{h} = 0$ . Then,

\{\begin{matrix} 2 Λ {\hat{h}}_{2} + 2 Y {\hat{h}}_{3} & = 0 \\ A {\hat{h}}_{1} + 2 Y {\hat{h}}_{2} & = 0 \end{matrix}

{({\hat{h}}_{2})}_{i} = \{\begin{matrix} k, & k \in R ∖ {0}, when y_{i}^{*} = 0, and λ_{i}^{*} = 0, \\ 0, & when y_{i}^{*} = 0, λ_{i}^{*} \neq 0, \\ w, & w \in R ∖ {0}, when y_{i}^{*} \neq 0, λ_{i}^{*} = 0, \end{matrix}

{({\hat{h}}_{3})}_{i} = \{\begin{matrix} t, & t \in R ∖ {0}, when y_{i}^{*} = 0, and λ_{i}^{*} = 0, \\ 0, & when y_{i}^{*} \neq 0, λ_{i}^{*} = 0, \\ r, & r \in R ∖ {0}, when y_{i}^{*} = 0, λ_{i}^{*} \neq 0, \end{matrix}

and ${\hat{h}}_{1}$ is defined by:

A {\hat{h}}_{1} + 2 Y {\hat{h}}_{2} = 0 .

(51)

Observe that $P_{2} {\tilde{Φ}}^{'} (x^{*}, y^{*}, λ^{*}) \hat{h} = 0$ , i.e., $\hat{h} \in Ker P_{2} {\tilde{Φ}}^{'} (x^{*}, y^{*}, λ^{*})$ .

Define H as a diagonal matrix with elements in the rows corresponding to the components of the vector ${\hat{h}}_{2}$ , and K as a diagonal matrix with elements of the vector ${\hat{h}}_{3}$ , so that:

H = diag ({({\hat{h}}_{2})}_{i}), K = diag ({({\hat{h}}_{3})}_{i}), i = 1, \dots, m .

Then:

{\tilde{Φ}}^{″} (x, y, λ) \hat{h} = (\begin{matrix} O & O_{n \times m} & O \\ O_{m \times n} & 2 K & 2 H \\ O_{m \times n} & 2 H & O \end{matrix}) .

The 2-factor-operator for the mapping $\tilde{Φ}$ is defined as:

\tilde{Ψ} (x, y, λ) = P_{1} \tilde{Φ} (x, y, λ) + P_{2} {\tilde{Φ}}^{'} (x, y, λ) \hat{h}

\tilde{Ψ} (x, y, λ) = (\begin{matrix} Q x + c + A^{T} λ \\ 2 Λ {\hat{h}}_{2} + 2 Y {\hat{h}}_{3} \\ A {\hat{h}}_{1} + 2 Y {\hat{h}}_{2} \end{matrix}) .

We choose a vector ${\hat{h}}_{1}$ according to (51) so that the matrix:

{\tilde{Ψ}}^{'} (x^{*}, y^{*}, λ^{*}) = (\begin{matrix} Q & O_{n \times m} & A^{T} \\ O_{m \times n} & 2 K & 2 H \\ O_{m \times n} & 2 H & O_{m \times m} \end{matrix})

is nonsingular. Then $(x^{*}, y^{*}, λ^{*})$ can be determined using the following formula:

(\begin{matrix} x^{*} \\ y^{*} \\ λ^{*} \end{matrix}) = {[\begin{matrix} Q & O_{n \times m} & A^{T} \\ O_{m \times n} & 2 K & 2 H \\ O_{m \times n} & 2 H & O_{m \times m} \end{matrix}]}^{- 1} (\begin{matrix} - c \\ O_{m} \\ - A {\hat{h}}_{1} \end{matrix}) .

5.2. Identification of the Active Constraints

In this section, we address the issue of identifying the active constraints and propose strategies for numerically identifying the set of active constraints $I (x^{*})$ .

We begin by considering the mapping $h (z) : R^{n} \to R^{n}$ , where $h \in C^{2} (R^{n})$ . We can also represent h as an n-vector of functions $h_{1}$ , …, $h_{n}$ , such that $h (z) = {(h_{1} (z), \dots, h_{n} (z))}^{T}$ .

Theorem 8.

Let $h \in C^{2} (R^{n} \to R^{n})$ be 2-regular at the point $z^{*}$ , and let $N_{ε} (z^{*})$ be a sufficiently small neighborhood of $z^{*}$ in $R^{n}$ . Assume that there exists a function $η (z) : N_{ε} (z^{*}) \to R$ such that $η (z^{*}) = 0$ and for all $z \in N_{ε} (z^{*})$ , we have:

$c_{1} ∥ z - z^{*} ∥^{2} \leq η (z) \leq c_{2} ∥ z - z^{*} ∥,$ (52)

where $c_{1}, c_{2} > 0$ are independent constants.

Then there exists a sufficiently small δ such that $0 < δ < ε$ , and for any $1 \leq i \leq n$ and any point $z \in N_{δ} (z^{*})$ , the following holds:

Either $| h_{i} (z) | > {η (z)}^{1 / 3}$ , which implies that $h_{i} (z^{*}) \neq 0,$

Or $| h_{i} (z) | < {η (z)}^{1 / 3}$ , which implies that $h_{i} (z^{*}) = 0 .$

Proof.

The proof is similar to the one in [5]. □

Let:

ϱ (x) = min_{i = 1, \dots, m} d (g_{i}^{'} (x), span (g_{1}^{'} (x), \dots, g_{i - 1}^{'} (x), g_{i + 1}^{'} (x), \dots, g_{m}^{'} (x))),

where $d (a, S)$ denotes the distance between a vector a and a set S. It turns out that if we take $η (x) = max \{{∥ g (x) ∥}^{1 / 2}, ϱ (x)\}$ , and g is 2-regular at $x^{*}$ , then inequality (52) holds with $z = x$ .

Theorem 8 can be used for the numerical determination of the set of active constraints $I (x^{*})$ in the QP problem. To apply Theorem 8, we need to define a function $η (\cdot)$ that satisfies the conditions of the theorem. Recall that for QP problem (3), we denote the Lagrange function defined in (32) by $L (x, λ)$ .

Under the assumptions of LICQ and MSOSC, the following holds for $x \in N_{ε} (x^{*})$ and $λ \in N_{ε} (λ^{*})$ :

c_{1} ∥ x - x^{*} ∥ \leq ∥ L_{x}^{'} (x, λ) ∥ + \sum_{i = 1}^{m} | min \{λ_{i}, b_{i} - a_{i} x\} | \leq c_{2} (∥ x - x^{*} ∥ + ∥ λ - λ^{*} ∥),

where $ε > 0$ is sufficiently small (see, for example, [18]). Hence, the required function $η (x, λ)$ can be defined by:

η (x, λ) = ∥ L_{x}^{'} (x, λ) ∥ + \sum_{i = 1}^{m} | min \{λ_{i}, b_{i} - a_{i} x\} | .

Then, according to Theorem 8, for every $i = 1, \dots, m$ , if:

| b_{i} - a_{i} x | \leq η {(x, λ)}^{1 / 2},

then it follows that $i \in I (x^{*})$ .
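The test above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `eta` and `estimate_active_set`, the small two-variable problem, and the approximate primal-dual point are assumptions introduced here, and the gradient of the Lagrange function is taken as $Q x + c + A^{T} λ$.

```python
import math

def eta(Q, c, A, b, x, lam):
    """Return (eta(x, lam), slacks), where slacks[i] = b_i - a_i x."""
    n, m = len(x), len(b)
    # Gradient of the Lagrange function with respect to x: Q x + c + A^T lam.
    grad = [
        sum(Q[r][k] * x[k] for k in range(n)) + c[r]
        + sum(A[i][r] * lam[i] for i in range(m))
        for r in range(n)
    ]
    slacks = [b[i] - sum(A[i][k] * x[k] for k in range(n)) for i in range(m)]
    value = math.sqrt(sum(g * g for g in grad))
    value += sum(abs(min(lam[i], slacks[i])) for i in range(m))
    return value, slacks

def estimate_active_set(Q, c, A, b, x, lam):
    """Indices i (0-based) with |b_i - a_i x| <= eta(x, lam)^(1/2)."""
    value, slacks = eta(Q, c, A, b, x, lam)
    threshold = math.sqrt(value)
    return [i for i, s in enumerate(slacks) if abs(s) <= threshold]

# Hypothetical problem: minimize 0.5*(x1^2 + x2^2) - 2*x1
# subject to x1 <= 1 and x2 <= 5; its solution is x* = (1, 0), lam* = (1, 0).
Q = [[1.0, 0.0], [0.0, 1.0]]
c = [-2.0, 0.0]
A = [[1.0, 0.0], [0.0, 1.0]]
b = [1.0, 5.0]

# An approximate primal-dual point near (x*, lam*).
active = estimate_active_set(Q, c, A, b, x=[0.99, 0.01], lam=[1.02, 0.0])
print(active)
```

At this point only the first constraint (index 0 in the 0-based numbering of the sketch) passes the test, which agrees with the active set of the hypothetical problem.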

Moreover, if we introduce the function:

\tilde{η} (x, λ) = ∥ {(L_{A})}_{x}^{'} (x, λ) ∥ + \sum_{i = 1}^{m} | a_{i} x - b_{i} |,

where $L_{A} (x, λ) = \frac{1}{2} x^{T} Q x + c^{T} x + \sum_{i = 1}^{m} λ_{i} (a_{i} x - b_{i})$ , then $\tilde{η} (x, λ)$ satisfies the estimate:

c_{1} ∥ (x, λ) - (x^{*}, λ^{*}) ∥ \leq \tilde{η} (x, λ) \leq c_{2} ∥ (x, λ) - (x^{*}, λ^{*}) ∥

for $(x, λ) \in N_{ε} (x^{*}, λ^{*})$ , where $ε > 0$ is a sufficiently small number.

Then, for any $i \in I (x^{*})$ , if:

| λ_{i} | \leq \tilde{η} {(x, λ)}^{1 / 2},

then $i \in I_{0} (x^{*}) = I (x^{*}) ∖ I_{+} (x^{*})$ . Here, $I_{0} (x^{*})$ represents the set of constraints that are weakly active, i.e., for which the associated multipliers are equal to zero, while $I_{+} (x^{*})$ denotes the set of constraints that are strongly active at the point $x^{*}$ , i.e., for which the associated Lagrange multipliers are positive.
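A minimal sketch of this second test follows. It assumes that the sum in $\tilde{η}$ runs over constraints already identified as active (as the notation $L_{A}$ suggests); the names `eta_tilde` and `split_active_set`, the test problem, and the approximate point are hypothetical. Indices that fail the test are reported only as candidates for $I_{+} (x^{*})$ .

```python
import math

def eta_tilde(Q, c, A_act, b_act, x, lam_act):
    """||Q x + c + A_act^T lam_act|| + sum_i |a_i x - b_i| over active rows."""
    n, k = len(x), len(b_act)
    grad = [
        sum(Q[r][j] * x[j] for j in range(n)) + c[r]
        + sum(A_act[i][r] * lam_act[i] for i in range(k))
        for r in range(n)
    ]
    residual = sum(
        abs(sum(A_act[i][j] * x[j] for j in range(n)) - b_act[i])
        for i in range(k)
    )
    return math.sqrt(sum(g * g for g in grad)) + residual

def split_active_set(Q, c, A_act, b_act, x, lam_act):
    """Split active indices (0-based) by the test |lam_i| <= eta_tilde^(1/2)."""
    threshold = math.sqrt(eta_tilde(Q, c, A_act, b_act, x, lam_act))
    weak = [i for i, l in enumerate(lam_act) if abs(l) <= threshold]
    strong = [i for i, l in enumerate(lam_act) if abs(l) > threshold]
    return weak, strong

# Hypothetical problem: minimize 0.5*(x1^2 + x2^2) - 2*x1
# subject to x1 <= 1 and x2 <= 0; here x* = (1, 0) and lam* = (1, 0),
# so both constraints are active but only the first multiplier is positive.
Q = [[1.0, 0.0], [0.0, 1.0]]
c = [-2.0, 0.0]
A_act = [[1.0, 0.0], [0.0, 1.0]]
b_act = [1.0, 0.0]

weak, strong = split_active_set(
    Q, c, A_act, b_act, x=[0.99, -0.01], lam_act=[1.02, 0.001]
)
print(weak, strong)
```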

5.3. General Case

Consider the Lagrange function in the form:

L (x, λ_{0}, λ) = λ_{0} (\frac{1}{2} x^{T} Q x + c^{T} x) + \sum_{i = 1}^{m} λ_{i} (a_{i} x - b_{i}) .

In this case, if $x^{*}$ is a solution of problem (3), then there exist multipliers $λ_{0}^{*}$ and $λ^{*}$ , not all zero, such that $λ_{i}^{*} \geq 0$ , $A x^{*} \leq b$ , and the point $(x^{*}, λ_{0}^{*}, λ^{*})$ is a solution of the following system:

Φ_{0} (x, λ_{0}, λ) = (\begin{matrix} λ_{0} (Q x + c) + A^{T} λ \\ Λ (A x - b) \\ λ_{0} + \sum_{i = 1}^{m} λ_{i} - 1 \end{matrix}) = 0, Λ = diag {(λ_{i})}_{i = 1, \dots, m} .

(53)

Introduce the notation:

ξ (x, λ_{0}, λ) = ∥ λ_{0} (Q x + c) + \sum_{i = 1}^{m} λ_{i} a_{i}^{T} ∥ + \sum_{i = 1}^{m} | λ_{i} (a_{i} x - b_{i}) | + λ_{0} + \sum_{i = 1}^{m} λ_{i} - 1 .

(54)

We make the following assumption for the rest of the section.

Assumption A1.

Assume that there exist $C_{1} > 0$ and a sufficiently small $ε > 0$ such that for any $(x, λ_{0}, λ) \in N_{ε} (x^{*}, λ_{0}^{*}, λ^{*})$ , the following holds:

$ξ (x, λ_{0}, λ) \geq C_{1} {∥ x - x^{*} ∥}^{2} .$

Remark 5.

It is easy to see that for any $(x, λ_{0}, λ) \in N_{ε} (x^{*}, λ_{0}^{*}, λ^{*})$ ,

$ξ (x, λ_{0}, λ) \leq C_{2} ∥ (x, λ_{0}, λ) - (x^{*}, λ_{0}^{*}, λ^{*}) ∥,$

where $C_{2}$ is an independent constant.

As follows from Assumption 1 and Theorem 8, for those indices $i = 1, \dots, m$ that satisfy the inequality:

| a_{i} x - b_{i} | < ξ {(x, λ_{0}, λ)}^{1 / 4}

we conclude that $i \in I (x^{*})$ .

The following examples illustrate situations in which Assumption 1 holds.

Example 4.

This example illustrates a choice of the function ξ in a more general setting.

Consider the mapping F defined by either:

F (x, λ) = (\begin{matrix} x^{2} - λ^{2} \\ x λ \end{matrix})

or:

F (x, λ) = (\begin{matrix} x - λ \\ x λ \end{matrix}) .

In each of the two cases, $(x^{*}, λ^{*}) = (0, 0)$ .

Introduce function $ξ$ defined as:

ξ (x, λ) = ∥ F (x, λ) ∥ .

It follows that for any $(x, λ) \in N_{ε} (x^{*}, λ^{*})$ , the inequality:

ξ (x, λ) \geq C {∥ x - 0 ∥}^{2}

holds, where C is an independent constant.
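The inequality can be checked numerically for both mappings. The sketch below is only an illustration under stated assumptions: the Euclidean norm is used for $ξ$ , the grid, the radius 0.5, and the helper name `smallest_ratio` are choices made here, and a grid check is evidence rather than a proof.

```python
import math

def xi_first(x, lam):
    """||F(x, lam)|| for F = (x^2 - lam^2, x*lam)."""
    return math.hypot(x * x - lam * lam, x * lam)

def xi_second(x, lam):
    """||F(x, lam)|| for F = (x - lam, x*lam)."""
    return math.hypot(x - lam, x * lam)

def smallest_ratio(xi, radius=0.5, steps=40):
    """Minimum of xi(x, lam) / x^2 over a grid in [-radius, radius]^2, x != 0."""
    grid = [radius * (2 * k - steps) / steps for k in range(steps + 1)]
    return min(xi(x, lam) / (x * x) for x in grid for lam in grid if x != 0.0)

ratio_first = smallest_ratio(xi_first)
ratio_second = smallest_ratio(xi_second)
print(ratio_first, ratio_second)
```

On this grid both ratios stay bounded away from zero, which is consistent with the existence of a constant $C > 0$ in the inequality.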

Example 5.

Consider the problem:

$\begin{matrix} \underset{x}{minimize} & x_{2}^{2} - x_{1} \\ subject to & 2 x_{1} \leq 0, \\ & x_{1} \leq 0 . \end{matrix}$ (55)

The solution to this problem is the point $(x_{1}^{*}, x_{2}^{*}) = (0, 0)$ , so $I (x^{*}) = \{1, 2\}$ . Moreover, the system (53) in this example is given by:

$Φ_{0} (x, λ_{0}, λ) = (\begin{matrix} - λ_{0} + λ_{1} + λ_{2} \\ 2 λ_{0} x_{2} \\ 2 λ_{1} x_{1} \\ λ_{2} x_{1} \\ λ_{0} + λ_{1} + λ_{2} - 1 \end{matrix}) = 0 .$ (56)

We also introduce the function $ξ (x, λ_{0}, λ)$ , which can be defined using Equation (54), but in this case, we define it as $ξ (x, λ_{0}, λ) = | Φ_{0} (x, λ_{0}, λ) |$ .

Under Assumption 1, we use the function $ξ$ to determine the set $I (x^{*})$ . We also take into account the fact that the constraints in the problem are linear and the rank of the matrix

(\begin{matrix} 2 & 0 \\ 1 & 0 \end{matrix})

is 1. This implies that the constraints are linearly dependent. Consequently, we can eliminate, for instance, the second constraint from problem (55) and simplify system (56) to the following one:

Φ_{0} (x, λ_{0}, λ) = (\begin{matrix} - λ_{0} + λ_{1} \\ 2 λ_{0} x_{2} \\ 2 λ_{1} x_{1} \\ λ_{0} + λ_{1} - 1 \end{matrix}) = 0 .

Now, by introducing:

P_{1} = (\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{matrix}), P_{2} = (\begin{matrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{matrix}), and h = (\begin{matrix} 0 \\ 0 \\ 1 \\ 1 \end{matrix}),

we construct the modified 2-factor-system:

{\tilde{Φ}}_{0} (x, λ_{0}, λ) = P_{1} Φ_{0} (x, λ_{0}, λ) + P_{2} Φ_{0}^{'} (x, λ_{0}, λ) h = (\begin{matrix} - λ_{0} + λ_{1} \\ 2 x_{2} \\ 2 x_{1} \\ λ_{0} + λ_{1} - 1 \end{matrix}) = 0 .

This system implies that the solution is $x^{*} = 0$ .
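The construction of the modified 2-factor-system can be reproduced mechanically. The following sketch transcribes the reduced system, $P_{1}$ , $P_{2}$ , and $h$ as printed above, with the variables ordered as $(x_{1}, x_{2}, λ_{0}, λ_{1})$ ; the function names and the integer test point are assumptions made for illustration.

```python
def phi0(z):
    """Reduced system of Example 5 in the variables z = (x1, x2, lam0, lam1)."""
    x1, x2, lam0, lam1 = z
    return [-lam0 + lam1, 2 * lam0 * x2, 2 * lam1 * x1, lam0 + lam1 - 1]

def jacobian_phi0(z):
    """Jacobian of phi0 with respect to (x1, x2, lam0, lam1)."""
    x1, x2, lam0, lam1 = z
    return [
        [0, 0, -1, 1],
        [0, 2 * lam0, 2 * x2, 0],
        [2 * lam1, 0, 0, 2 * x1],
        [0, 0, 1, 1],
    ]

def matvec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]

P1 = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]]
P2 = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0]]
h = [0, 0, 1, 1]

def modified_two_factor_system(z):
    """P1 * phi0(z) + P2 * phi0'(z) * h."""
    regular_part = matvec(P1, phi0(z))
    singular_part = matvec(P2, matvec(jacobian_phi0(z), h))
    return [r + s for r, s in zip(regular_part, singular_part)]

def closed_form(z):
    """Right-hand side displayed in the text."""
    x1, x2, lam0, lam1 = z
    return [-lam0 + lam1, 2 * x2, 2 * x1, lam0 + lam1 - 1]

z = [3, -7, 2, 9]  # arbitrary integer test point
print(modified_two_factor_system(z), closed_form(z))
```

The two vectors coincide, and the modified system is linear in all four variables, which is why it can be solved directly for $x^{*} = 0$ .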

Now we will demonstrate the application of the approach described in Section 5.1.2 to problem (55). By removing the first constraint, we obtain a regular QP problem with $A_{A} = (1, 0)$ . Additionally, in this example,

Q = (\begin{matrix} 0 & 0 \\ 0 & 2 \end{matrix}), c = (\begin{matrix} - 1 \\ 0 \end{matrix}) .

Then application of Equation (45) derived in Section 5.1.2 yields:

(\begin{matrix} x_{1}^{*} \\ x_{2}^{*} \\ λ^{*} \end{matrix}) = {[\begin{matrix} 0 & 0 & 1 \\ 0 & 2 & 0 \\ 1 & 0 & 0 \end{matrix}]}^{- 1} (\begin{matrix} 1 \\ 0 \\ 0 \end{matrix}) = (\begin{matrix} 0 \\ 0 \\ 1 \end{matrix}),

as claimed.
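The linear solve can be reproduced exactly with rational arithmetic. The sketch below assumes that Equation (45) has the block form suggested by the displayed instance, namely the matrix with blocks $Q$ , $A_{A}^{T}$ , $A_{A}$ , and a zero block, with right-hand side $(- c, b_{A})$ ; the helper names are hypothetical. Here $Q$ is taken as the Hessian of the objective $x_{2}^{2} - x_{1}$ , so that $f (x) = \frac{1}{2} x^{T} Q x + c^{T} x$ ; its (2, 2) entry only scales a homogeneous equation and does not affect the result.

```python
from fractions import Fraction

def solve_linear_system(M, rhs):
    """Gauss-Jordan elimination in exact rational arithmetic.

    Raises StopIteration if no nonzero pivot is found (singular matrix).
    """
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def kkt_system(Q, c, A_act, b_act):
    """Assemble [[Q, A_act^T], [A_act, 0]] and the right-hand side (-c, b_act)."""
    n, k = len(Q), len(A_act)
    top = [list(Q[r]) + [A_act[i][r] for i in range(k)] for r in range(n)]
    bottom = [list(A_act[i]) + [0] * k for i in range(k)]
    return top + bottom, [-ci for ci in c] + list(b_act)

# Data of problem (55) after dropping the redundant first constraint.
Q = [[0, 0], [0, 2]]   # Hessian of x2^2 - x1
c = [-1, 0]
A_act = [[1, 0]]       # remaining constraint x1 <= 0
b_act = [0]

M, rhs = kkt_system(Q, c, A_act, b_act)
x1, x2, lam = solve_linear_system(M, rhs)
print(x1, x2, lam)  # prints "0 0 1"
```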

There are various directions in which the approach proposed in this paper can be extended. The next example illustrates a degenerate QP problem in which MSOSC does not hold at the solution. However, the approach proposed in this paper can still be applied to find a solution to this problem. It is worth noting that the solution set in this case is locally not unique.

Example 6.

Consider the problem:

$\begin{matrix} \underset{x}{minimize} & x_{1} x_{2} - x_{1} \\ subject to & x_{1} \leq 0 . \end{matrix}$ (57)

The solution to this problem is the set of points $X^{*} = \{(0, x_{2}^{*}) ∣ x_{2}^{*} \in R\}$ . We observe that the inequality constraint in this example is satisfied as an equality for any $x^{*} \in X^{*}$ . Additionally, the system (53) for this example consists of one linear equation and three quadratic equations:

$Φ_{0} (x, λ_{0}, λ) = (\begin{matrix} λ_{0} x_{2} - λ_{0} + λ_{1} \\ λ_{0} x_{1} \\ λ_{1} x_{1} \\ λ_{0} + λ_{1} - 1 \end{matrix}) = 0 .$

Denote the projection of the point x onto the set $X^{*}$ by $P_{X^{*}} (x)$ . Also, define the notation $ξ (x, λ_{0}, λ) = ∥ Φ_{0} (x, λ_{0}, λ) ∥$ . For any point $(x, λ_{0}, λ) \in N_{ε} (x^{*}, λ_{0}^{*}, λ^{*})$ , we have the inequality:

ξ (x, λ_{0}, λ) \geq α ∥ x - P_{X^{*}} (x) ∥,

where $α > 0$ is a constant and $ε > 0$ is sufficiently small.

Consider, for example, the point $x^{*} = {(0, 1)}^{T}$ .

In problem (57), we replace the inequality $x_{1} \leq 0$ with the equation $x_{1} + y^{2} = 0$ , where $y \in R$ . We then introduce the Lagrange function in the form of (49) as follows:

L (x, y, λ) = (x_{1} x_{2} - x_{1}) + λ (x_{1} + y^{2}) .

If $λ^{*}$ is a Lagrange multiplier corresponding to the solution $(x_{1}^{*}, x_{2}^{*}, y^{*}) = (0, x_{2}^{*}, 0)$ , then the point $(x_{1}^{*}, x_{2}^{*}, y^{*}, λ^{*})$ is a solution of the following system:

Φ (x_{1}, x_{2}, y, λ) = (\begin{matrix} x_{2} - 1 + λ \\ x_{1} \\ 2 λ y \\ x_{1} + y^{2} \end{matrix}) = 0 .

(58)

The Jacobian matrix of this system is given by:

Φ^{'} (x_{1}, x_{2}, y, λ) = (\begin{matrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 2 λ & 2 y \\ 1 & 0 & 2 y & 0 \end{matrix}) .

This Jacobian matrix becomes singular at $(x_{1}, x_{2}, y, λ) = (0, x_{2}, 0, λ)$ . To overcome this singularity, we can apply the approach described in the paper. Specifically, we notice that $Φ^{'} (x_{1}^{*}, x_{2}^{*}, y^{*}, λ^{*}) h = 0$ for $h = (0, 1, 1, - 1)$ . Moreover, the point $(x_{1}^{*}, x_{2}^{*}, y^{*}, λ^{*}) = (0, 1, 0, 0)$ is one of the solutions of the system defined in (58), corresponding to a solution of the QP problem (57).

Additionally,

Φ^{'} (x_{1}, x_{2}, y, λ) (\begin{matrix} 0 \\ 1 \\ 1 \\ - 1 \end{matrix}) = (\begin{matrix} 0 \\ 0 \\ 2 λ - 2 y \\ 2 y \end{matrix}), Φ^{″} (x_{1}, x_{2}, y, λ) (\begin{matrix} 0 \\ 1 \\ 1 \\ - 1 \end{matrix}) = (\begin{matrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & - 2 & 2 \\ 0 & 0 & 2 & 0 \end{matrix}) .

The 2-factor-operator of $Φ (x_{1}^{*}, x_{2}^{*}, y^{*}, λ^{*})$ with respect to the vector $h = (0, 1, 1, - 1)$ , which is defined similarly to the operator in Equation (11), is given by:

Φ^{'} (x_{1}^{*}, x_{2}^{*}, y^{*}, λ^{*}) + Φ^{″} (x_{1}^{*}, x_{2}^{*}, y^{*}, λ^{*}) h = (\begin{matrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & - 2 & 2 \\ 1 & 0 & 2 & 0 \end{matrix}) .

Note that the 2-factor-operator is nonsingular and the system:

Φ (x_{1}, x_{2}, y, λ) + Φ^{'} (x_{1}, x_{2}, y, λ) h = (\begin{matrix} x_{2} - 1 + λ \\ x_{1} \\ 2 λ y + 2 λ - 2 y \\ x_{1} + y^{2} + 2 y \end{matrix}) = 0

has the point $(x_{1}^{*}, x_{2}^{*}, y^{*}, λ^{*}) = (0, 1, 0, 0)$ as its regular solution.
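The claims of this example can be verified with a short script. The sketch below transcribes $Φ$ , its Jacobian, the vector $h$ , and the constant matrix $Φ^{″} h$ from the text, with the variables ordered as $(x_{1}, x_{2}, y, λ)$ ; the function names and the recursive determinant helper are assumptions made for illustration.

```python
def phi(z):
    """System (58) in the variables z = (x1, x2, y, lam)."""
    x1, x2, y, lam = z
    return [x2 - 1 + lam, x1, 2 * lam * y, x1 + y * y]

def jacobian(z):
    """Jacobian of phi with respect to (x1, x2, y, lam)."""
    x1, x2, y, lam = z
    return [
        [0, 1, 0, 1],
        [1, 0, 0, 0],
        [0, 0, 2 * lam, 2 * y],
        [1, 0, 2 * y, 0],
    ]

H = [0, 1, 1, -1]

# Phi''(z) h does not depend on z because Phi is quadratic.
SECOND_DERIVATIVE_H = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, -2, 2],
    [0, 0, 2, 0],
]

def matvec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        if a != 0:
            minor = [row[:j] + row[j + 1:] for row in M[1:]]
            total += (-1) ** j * a * det(minor)
    return total

def two_factor_operator(z):
    """Phi'(z) + Phi''(z) h."""
    return [
        [a + b for a, b in zip(row_j, row_s)]
        for row_j, row_s in zip(jacobian(z), SECOND_DERIVATIVE_H)
    ]

def modified_system(z):
    """Phi(z) + Phi'(z) h."""
    return [f + g for f, g in zip(phi(z), matvec(jacobian(z), H))]

z_star = [0, 1, 0, 0]
print(det(jacobian(z_star)))             # 0: the Jacobian is singular
print(matvec(jacobian(z_star), H))       # [0, 0, 0, 0]: h lies in the kernel
print(det(two_factor_operator(z_star)))  # 4: the 2-factor-operator is nonsingular
print(modified_system(z_star))           # [0, 0, 0, 0]
```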

6. Conclusions

The paper focused on applying the p-regularity theory to nonlinear equations with quadratic mappings and quadratic programming (QP) problems. The first part of the paper used the special structure of the nonlinear equation and the construction of a 2-factor operator to derive a formula for the solution of the equation. In the second part, the QP problem was reduced to a system of linear equations using a 2-factor-operator. The solution of the system is a local minimizer of the QP problem with a corresponding Lagrange multiplier. The formula for the solution of the linear system was given. The paper also described a procedure for identifying the active constraints, which was used in constructing the linear system.

The paper primarily focuses on the case where the matrix $F^{'} (x^{*})$ is degenerate at the solution of the nonlinear equation $F (x) = 0$ . However, the matrix $F^{'} (x^{*})$ does not need to be degenerate. While we do not explicitly address the identification of degeneracy at a solution point, it is possible to determine the degeneracy of the matrix $F^{'} (x^{*})$ by examining the behavior of the mapping F in a small neighborhood of the solution $x^{*}$ . Specifically, a function $ν_{p} (x)$ can be defined, such that:

c_{1} ∥ x - x^{*} ∥^{p} \leq ν_{p} (x) \leq c_{2} ∥ x - x^{*} ∥

for some natural number p and constants $c_{1}$ and $c_{2}$ . Based on the conclusion about the degeneracy of the matrix $F^{'} (x^{*})$ , an appropriate method can be chosen to solve the system of equations $F (x) = 0$ , as stated in the following theorem.

Theorem 9.

Let $F \in C^{p} (R^{n}, R^{n})$ be such that $F (x^{*}) = 0$ , and let $x \in N_{ε} (x^{*})$ , where $ε > 0$ is sufficiently small. Then we have the following two cases:

In the first case, for all $i \in \{1, 2, \dots, n\}$ , we have:
$d (f_{i}^{'} (x), span (f_{1}^{'} (x), \dots, f_{i - 1}^{'} (x), f_{i + 1}^{'} (x), \dots, f_{n}^{'} (x))) > ν_{p} {(x)}^{1 / (p + 1)} .$

In this case, $\det F^{'} (x^{*}) \neq 0$ , indicating that F is not degenerate at $x^{*}$ .

In the second case, there exists an index $i \in \{1, 2, \dots, n\}$ such that:
$d (f_{i}^{'} (x), span (f_{1}^{'} (x), \dots, f_{i - 1}^{'} (x), f_{i + 1}^{'} (x), \dots, f_{n}^{'} (x))) < ν_{p} {(x)}^{1 / (p + 1)} .$

In this case, $\det F^{'} (x^{*}) = 0$ , indicating that F is degenerate at $x^{*}$ .

Certainly, the construction of the function $ν_{p} (x)$ is an important consideration. One approach to constructing such a function is provided in the following lemma, specifically for the case of $p = 2$ .

Lemma 3.

Let $F \in C^{2} (R^{n}, R^{n})$ and assume that either ${(F^{'} (x^{*}))}^{- 1}$ exists, or ${(F^{″} (x^{*}) h)}^{- 1}$ exists for any $h \in Ker F^{'} (x^{*})$ with $∥ h ∥ = 1$ . Then, there exists a sufficiently small $ε > 0$ such that the following inequality holds for all $x \in N_{ε} (x^{*})$ :

$∥ F (x) ∥ \geq C ∥ x - x^{*} ∥^{2},$

where C is a positive constant.

Based on this lemma, one can choose the function $ν_{2} (x) = ∥ F (x) ∥$ .
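The degeneracy test of Theorem 9 with the choice $ν_{2} (x) = ∥ F (x) ∥$ can be sketched for $n = 2$ as follows. Several points are assumptions of this illustration rather than statements of the paper: the exponent is read as $1 / (p + 1)$ (by analogy with the exponent $1 / 3$ in Theorem 8), the distance to the span of a single row is computed from the 2-by-2 determinant, and the two test mappings (the first one borrowed from Example 4, the second one linear and regular) as well as all function names are hypothetical.

```python
import math

def min_row_distance(J):
    """min_i dist(row_i, span(other row)) for a 2x2 Jacobian J."""
    (a, b), (c, d) = J
    area = abs(a * d - b * c)
    n1, n2 = math.hypot(a, b), math.hypot(c, d)
    # dist(row 1, span(row 2)) = |det J| / ||row 2||; if row 2 vanishes,
    # its span is {0} and the distance is ||row 1|| (and symmetrically).
    d1 = area / n2 if n2 > 0 else n1
    d2 = area / n1 if n1 > 0 else n2
    return min(d1, d2)

def looks_degenerate(F, J, z, p=2):
    """Second case of the test: some row distance is below nu_p(z)^(1/(p+1))."""
    nu = math.hypot(*F(z))  # nu_2(z) = ||F(z)||
    return min_row_distance(J(z)) < nu ** (1.0 / (p + 1))

# Degenerate mapping with solution (0, 0): F'(0, 0) is the zero matrix.
def F_sing(z):
    x, lam = z
    return (x * x - lam * lam, x * lam)

def J_sing(z):
    x, lam = z
    return ((2 * x, -2 * lam), (lam, x))

# Regular linear mapping with solution (0, 0): det F' = 2 everywhere.
def F_reg(z):
    x, lam = z
    return (x - lam, x + lam)

def J_reg(z):
    return ((1.0, -1.0), (1.0, 1.0))

z = (0.01, 0.02)  # a point close to the solution
print(looks_degenerate(F_sing, J_sing, z), looks_degenerate(F_reg, J_reg, z))
```

Near the solution the test flags the first mapping as degenerate and the second as nondegenerate, which matches the determinants of the Jacobians at the origin.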

It is worth noting that the proposed approach also covers the case where the system of equations consists of both linear and quadratic equations. Moreover, the approach can be extended to solve multilinear equations with polynomials of degree p, given by the equation:

F (x) = Q_{p} {[x]}^{p} + Q_{p - 1} {[x]}^{p - 1} + \dots + Q_{1} [x] + Q_{0} = 0,

where $Q_{k} {[x]}^{k}$ is a k-multilinear mapping for $k = 1, \dots, p$ . Additionally, polynomial programming problems can be formulated as follows:

\begin{matrix} minimize & f (x) \\ subject to & f_{i} (x) \leq 0, i = 1, \dots, m, \end{matrix}

where $f_{i} (x)$ are polynomial mappings.

There are several possible directions for future research based on the results obtained in this paper. While the focus of the current work was on obtaining exact formulas for the solutions of nonlinear equations with quadratic mappings and quadratic programming problems, it would be interesting to generalize the proposed approaches to other classes of problems, including systems of equations with both linear and quadratic mappings. Another direction is the extension of the presented methods to polynomial equations and polynomial programming problems. A further direction is the numerical study and implementation of the methods described in the paper.

Acknowledgments

The authors thank the anonymous reviewers for their careful reading of our manuscript and for their insightful comments and suggestions that helped us improve the quality of the paper.

Author Contributions

Conceptualization, A.A.T.; methodology, O.B., A.P. and A.A.T.; validation, O.B., A.P. and A.A.T.; formal analysis, O.B., A.P. and A.A.T.; investigation, O.B., A.P. and A.A.T.; resources, O.B. and A.A.T.; writing—original draft preparation, O.B. and A.P.; supervision, O.B. and A.A.T.; project administration, A.P.; funding acquisition, A.P. and A.A.T. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript.

Funding Statement

This research was funded by the Ministry of Education and Science, grant number 144/23/B.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

References

1. Barvinok A., Rudelson M. When a system of real quadratic equations has a solution. Adv. Math. 2022;403:108391. doi: 10.1016/j.aim.2022.108391.
2. Bereznev V.A. Theoretical and Applied Problems of Nonlinear Analysis. Russian Academy of Sciences, Computing Center; Moscow, Russia: 2007.
3. Li N., Zhi L. Improved two-step Newton’s method for computing simple multiple zeros of polynomial systems. Numer. Algorithm. 2022;9:19–50. doi: 10.1007/s11075-022-01253-7.
4. Poirier A., Torres J. Approximating roots by quadratic iteration. Proyecciones J. Math. 2023;42:407–431. doi: 10.22199/issn.0717-6279-5447.
5. Tret’yakov A.A., Marsden J.E. Factor–analysis of nonlinear mappings: p–regularity theory. Commun. Pure Appl. Anal. 2003;2:425–445.
6. Anitescu M. A superlinearly convergent sequential quadratically constrained quadratic programming algorithm for degenerate nonlinear programming. SIAM J. Optim. 2002;12:949–978. doi: 10.1137/S1052623499365309.
7. Fletcher R. Resolving degeneracy in quadratic programming. Degeneracy in optimization problems. Ann. Oper. Res. 1993;46/47:307–334. doi: 10.1007/BF02023102.
8. De Marchi A. On a primal-dual Newton proximal method for convex quadratic programs. Comput. Optim. Appl. 2022;81:369–395. doi: 10.1007/s10589-021-00342-y.
9. Permenter F. Log-domain interior-point methods for convex quadratic programming. Optim. Lett. 2023;17:1613–1631.
10. Gould N.I.M., Orban D., Robinson D.P. Trajectory-following methods for large-scale degenerate convex quadratic programming. Math. Program. Comput. 2013;5:113–142. doi: 10.1007/s12532-012-0050-3.
11. Yamakawa Y., Takayuki O. A stabilized sequential quadratic semidefinite programming method for degenerate nonlinear semidefinite programs. Comput. Optim. Appl. 2022;83:1027–1064. doi: 10.1007/s10589-022-00402-x.
12. Dostal Z., Brzobohaty T., Horak D., Kozubek T., Vodstrcil T. On R-linear convergence of semi-monotonic inexact augmented Lagrangians for bound and equality constrained quadratic programming problems with application. Comput. Math. Appl. 2014;67:515–526. doi: 10.1016/j.camwa.2013.11.009.
13. Belash K.N., Tret’yakov A.A. Methods for solving degenerate problems. USSR Comput. Math. Math. Phys. 1988;28:90–94. doi: 10.1016/0041-5553(88)90116-4.
14. Bertsekas D.P. Nonlinear Programming. Athena Scientific; Belmont, MA, USA: 1999.
15. Alekseev V.M., Tikhomirov V.M., Fomin S.V. Optimal Control. Consultants Bureau; New York, NY, USA: London, UK: 1987.
16. Szczepanik E., Tret’yakov A.A. p-Regularity Theory and Methods of Solving Nonlinear Optimization Problems. Uniwersytet Przyrodniczo-Humanistyczny w Siedlcach; Siedlce, Poland: 2020. (In Polish)
17. Nocedal J., Wright S.J. Numerical Optimization. Springer; New York, NY, USA: 1999.
18. Facchinei F., Fisher A., Kanzow C. On the accurate identification of active constraints. SIAM J. Optim. 1998;9:14–32.
