Entropy. 2023 Feb 10;25(2):330. doi: 10.3390/e25020330

An Inexact Feasible Quantum Interior Point Method for Linearly Constrained Quadratic Optimization

Zeguan Wu 1, Mohammadhossein Mohammadisiahroudi 1, Brandon Augustino 1, Xiu Yang 1, Tamás Terlaky 1,*
Editor: Andreas Wichert
PMCID: PMC9956007  PMID: 36832696

Abstract

Quantum linear system algorithms (QLSAs) have the potential to speed up algorithms that rely on solving linear systems. Interior point methods (IPMs) yield a fundamental family of polynomial-time algorithms for solving optimization problems. IPMs solve a Newton linear system at each iteration to compute the search direction; thus, QLSAs can potentially speed up IPMs. Due to the noise in contemporary quantum computers, quantum-assisted IPMs (QIPMs) only admit an inexact solution to the Newton linear system. Typically, an inexact search direction leads to an infeasible solution, so, to overcome this, we propose an inexact-feasible QIPM (IF-QIPM) for solving linearly constrained quadratic optimization problems. We also apply the algorithm to 1-norm soft margin support vector machine (SVM) problems, and demonstrate that our algorithm enjoys a speedup in the dimension over existing approaches. This complexity bound is better than any existing classical or quantum algorithm that produces a classical solution.

Keywords: quantum computing, interior point method, quadratic optimization

MSC: 90C20, 90C51, 81P68

1. Introduction

Linearly constrained quadratic optimization (LCQO) is defined as optimizing a convex quadratic objective function over a set of linear constraints. Linear optimization is a special case of LCQO that corresponds to the case where the objective function is linear. LCQO has rich theory, algorithms, and applications. Many problems in machine learning can be formulated as LCQO problems, including variants of least square problems and variants of support vector machine training [1,2]. Some important optimization algorithms also have LCQO subproblems, e.g., sequential quadratic programming [1].

The modern age of IPMs was launched by Karmarkar’s projective method for linear optimization (LO). Since then, many variants of IPMs have also been applied to nonlinear optimization problems, including LCQO problems [3,4]. Contemporary IPMs progress towards the set of optimal solutions by moving within a neighbourhood of an analytic curve known as the central path. IPMs can be categorized according to whether or not the sequence of iterates produced by the algorithm satisfies feasibility. Feasible IPMs are initialized with a strictly feasible solution and maintain feasibility in each iteration, whereas infeasible IPMs start from an infeasible interior solution and do not require feasibility to be exactly satisfied at any point of the algorithm. For LCQO problems with $n$ variables, feasible IPMs can produce an $\epsilon$-approximate solution using $O(\sqrt{n}\log(1/\epsilon))$ iterations, whereas infeasible IPMs require $O(n^2\log(1/\epsilon))$ IPM iterations to converge to an $\epsilon$-approximate solution [5,6].

At each IPM iteration, a linear system needs to be solved to obtain the search direction, called the Newton direction. This so-called Newton linear system is traditionally in the form of the augmented system or the normal equation system. Classically, these linear systems can be solved exactly using Bunch–Parlett factorization if the matrices in the systems are symmetric indefinite [7], or Cholesky factorization if the matrices are symmetric positive definite. Solving the Newton linear systems using direct factorization approaches requires the use of $O(n^3)$ arithmetic operations, which suggests that feasible IPMs based on factoring methods cannot exhibit complexity better than $O(n^{3.5}\log(1/\epsilon))$, whereas, with the partial update, they achieve $O(n^3\log(1/\epsilon))$ arithmetic operation complexity. The linear systems can also be solved inexactly using inexact methods, e.g., Krylov subspace methods, which may require fewer iterations if the desired accuracy of the solutions to the linear systems is not high. However, inaccurately solving the Newton linear systems (i.e., the inaccuracy of the search directions) may result in the infeasibility of the sequence of solutions generated by IPMs; therefore, they have only been used in infeasible IPMs.

The advent of quantum technology has led to the development of many quantum-assisted algorithms for optimization and machine learning applications, such as linear regression [8] and the support vector machine training problem [9]. Following the seminal work on quantum algorithms for solving linear systems of equations [10], researchers have been studying whether QLSAs could yield quantum speedups in classical optimization algorithms. In particular, quantum IPMs (QIPMs) that utilize QLSAs to solve the Newton linear system arising in each iteration have been proposed for LO problems [11,12] and semidefinite optimization problems [13]. To maintain the feasibility of the iterates using quantum subroutines, the authors of [13,14] introduce the so-called orthogonal subspace system (OSS) for SDO and LO problems, and, in particular, demonstrate that a feasible solution to the original Newton system can be recovered from an inexact solution to the OSS. However, linearly constrained quadratic optimization problems, which are fundamental to both optimization and machine learning, have yet to be formally studied in the quantum literature.

In this work, we generalize the OSS for LO problems in [14] to LCQO problems and provide an efficient method for constructing the OSS using a quantum computer. Using the OSS, we can obtain an inexact feasible IPM, solving for the search directions inexactly but maintaining the feasibility of the iterates throughout the process of our IPM. The feasibility of the iterates gives better IPM iteration complexity and the bottleneck becomes solving the linear system, OSS. In particular, we show that a quantum implementation of our algorithm with access to quantum RAM (QRAM) obtains an ϵ-approximate solution to a given LCQO problem with worst-case complexity

$$\tilde{O}_{n,\bar{\omega},\frac{1}{\epsilon}}\left(\sqrt{n}\left(n\left(\frac{\bar{\omega}^2}{\epsilon}+\sigma_{\max}(Q)\right)\kappa_{VAQ}+n^2\right)\right),$$

where $\bar{\omega}=\max_k\omega^k$, $\sigma_{\max}(Q)$ is the maximum singular value of the Hessian of the objective function, and $\kappa_{VAQ}$ is the condition number of a matrix determined by initial data; see Lemma 3. We also consider the application of 1-norm soft margin SVM problems, in which case, an $\epsilon$-approximate solution is obtained with complexity

$$\tilde{O}_{m,n,\bar{\omega},\frac{1}{\epsilon}}\left((m+n)^{1.5}\left(\frac{\bar{\omega}^2}{\epsilon}+\sigma_{\max}(Q)\right)\kappa_{VAQ}+(m+n)^{2.5}\right).$$

Here, $m$ is the number of features and $n$ is the number of data points. $\bar{\omega}$, $Q$, and $\kappa_{VAQ}$ are defined similarly from the LCQO formulation of the SVM problem; see Section 4. The dependence on dimension is better than any existing quantum or classical algorithm.

The rest of this paper is organized as follows: in Section 2, we introduce IPMs for LCQO and the OSS system; in Section 3, we discuss how to use quantum algorithms to find the Newton directions and analyze the complexity of our IF-QIPM; in Section 4, we apply our IF-QIPM to the support vector machine problem. Discussions are provided in Section 5, and some technical proofs are deferred to Appendix A and Appendix B.

2. Preliminaries

In this section, we introduce notations before reviewing the theory of IPMs applied to LCQO, and derive the OSS system for the class of problems.

2.1. Notation

Vectors are typically represented by lower-case letters. We write $0_n$ when referring to the $n$-dimensional all-zeros vector, and the $n$-dimensional all-ones vector is denoted by $e_n$. When the dimension is obvious from the context, we may write $0$ or $e$, respectively. Matrices are typically represented with upper-case letters. The identity matrix of dimension $n$ is denoted by $I_{n\times n}$, and $0_{n\times m}$ represents the $n\times m$-dimensional all-zero matrix; again, we drop these subscripts when the dimension is obvious from the context. For a general $n\times m$-dimensional matrix $H$, we write $H_{i\cdot}$ to refer to its $i$th row, and, similarly, denote the $j$th column by $H_{\cdot j}$. For the $(i,j)$th element of $H$, we write $H_{ij}$ or $H_{i,j}$.

For real-valued functions $f_1$, $f_2$, and $f_3$, we write

$$f_1=O(f_2)$$

if there exists a positive number $k_4$ such that $f_1\le k_4f_2$. We write

$$f_1=\tilde{O}_{f_3}(f_2)$$

if there exists a positive number $k_5$ such that $f_1\le k_5f_2\times\mathrm{polylog}(f_3)$.

2.2. IPMs for LCQO

In this work, LCQO is defined as follows.

 Definition 1

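(LCQO Problem). For vectors $b\in\mathbb{R}^m$, $c\in\mathbb{R}^n$, and matrices $A\in\mathbb{R}^{m\times n}$ and $Q\in\mathbb{R}^{n\times n}$ with $\operatorname{rank}(A)=m\le n$ and $Q$ symmetric positive semidefinite, we define the primal and dual LCQO problems as:

$$(P)\quad\min\ c^Tx+\tfrac{1}{2}x^TQx,\quad\text{s.t.}\ Ax=b,\ x\ge0,\qquad(D)\quad\max\ b^Ty-\tfrac{1}{2}x^TQx,\quad\text{s.t.}\ A^Ty+s-Qx=c,\ s\ge0, \tag{1}$$

where $x\in\mathbb{R}^n$ is the vector of primal variables, and $y\in\mathbb{R}^m$, $s\in\mathbb{R}^n$ are vectors of the dual variables. Problem (P) is called the primal problem and (D) is called the dual problem.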

Since A is of full row-rank, A does not contain any null rows, and we further make the following assumption on matrix A.

Assumption 1. 

Matrix A has no all-zero columns.

Remark 1. 

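Suppose that $A$ has all-zero columns. Without a loss of generality, assume that the $n$th column is all-zero. Introducing a new variable $x_{n+1}$, we can rewrite the problem as

$$\min\ \begin{pmatrix}c\\0\end{pmatrix}^T\begin{pmatrix}x\\x_{n+1}\end{pmatrix}+\frac{1}{2}\begin{pmatrix}x\\x_{n+1}\end{pmatrix}^T\begin{pmatrix}Q&0_{n\times1}\\0_{1\times n}&0\end{pmatrix}\begin{pmatrix}x\\x_{n+1}\end{pmatrix},\quad\text{s.t.}\ \begin{pmatrix}A_{\cdot1}&\cdots&A_{\cdot(n-1)}&0_{m\times1}&0_{m\times1}\\0&\cdots&0&1&-1\end{pmatrix}\begin{pmatrix}x\\x_{n+1}\end{pmatrix}=\begin{pmatrix}b\\0\end{pmatrix},\ x\ge0,\ x_{n+1}\ge0.$$

The new LCQO problem is equivalent to the original one, and contains fewer all-zero columns. Iterating this procedure to eliminate each of the all-zero columns, we obtain a new LCQO problem satisfying Assumption 1 with no more than $2n-m$ variables and $n$ constraints in the worst case.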

Assumption 2. 

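There exists a solution $(x,y,s)\in\mathbb{R}^n\times\mathbb{R}^m\times\mathbb{R}^n$ such that

$$Ax=b,\quad x>0,\quad A^Ty+s-Qx=c,\quad\text{and}\quad s>0.$$

The set of primal–dual feasible solutions is defined as

$$\mathcal{PD}:=\{(x,y,s)\in\mathbb{R}^n\times\mathbb{R}^m\times\mathbb{R}^n: Ax=b,\ A^Ty+s-Qx=c,\ (x,s)\ge0\}$$

and, similarly, the set of interior feasible primal–dual solutions is given by

$$\mathcal{PD}^0:=\{(x,y,s)\in\mathbb{R}^n\times\mathbb{R}^m\times\mathbb{R}^n: Ax=b,\ A^Ty+s-Qx=c,\ (x,s)>0\}.$$

By strong duality, the set of optimal solutions can be characterized as

$$\mathcal{PD}^*:=\{(x,y,s)\in\mathcal{PD}: x\circ s=0\},$$

where $x\circ s$ denotes the Hadamard, i.e., component-wise product of $x$ and $s$. Let $\epsilon>0$; then, the set of $\epsilon$-approximate solutions to Problem (1) can be defined as

$$\mathcal{PD}_\epsilon:=\left\{(x,y,s)\in\mathcal{PD}: \frac{x^Ts}{n}\le\epsilon\right\}. \tag{2}$$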

Let $X$ and $S$ be the diagonal matrices of $x$ and $s$, respectively. Under Assumption 2, for all $\mu>0$, the perturbed system of optimality conditions

$$Ax=b,\quad A^Ty+s-Qx=c,\quad XSe=\mu e,\quad(x,s)\ge0 \tag{3}$$

has a unique solution $(x(\mu),y(\mu),s(\mu))$, and this set of solutions gives rise to the primal–dual central path

$$\mathcal{CP}:=\{(x,y,s)\in\mathcal{PD}^0\mid x_is_i=\mu\ \text{for}\ i\in\{1,\dots,n\};\ \text{for}\ \mu>0\}.$$

IPMs apply Newton’s method to solve system (3). At each iteration of infeasible IPMs, a candidate solution to the primal–dual LCQO pair in (1) is updated by solving the following linear system to find the Newton direction:

$$\begin{pmatrix}A&0&0\\-Q&A^T&I\\S&0&X\end{pmatrix}\begin{pmatrix}\Delta x\\\Delta y\\\Delta s\end{pmatrix}=\begin{pmatrix}r_p\\r_d\\r_c\end{pmatrix}, \tag{4}$$

where

$$r_p=b-Ax,\quad r_d=c-A^Ty-s+Qx,\quad r_c=\sigma\mu e-XSe,$$

are residuals, and $\sigma\in(0,1)$ is the barrier reduction parameter. If $r_p=0$ and $r_d=0$, then the solution $(x,y,s)$ exactly satisfies primal–dual feasibility. We can also define residuals in different ways as we will show later. Once the Newton direction is found, one can move along the direction but has to stay in a neighbourhood of the central path, which is defined as

$$\mathcal{N}_2(\theta):=\{(x,y,s)\in\mathcal{PD}^0\mid\|XSe-\mu e\|_2\le\theta\mu\}, \tag{5}$$

where $\theta\in(0,1)$.

Until relatively recently, inexact solution approaches to solve the Newton linear system (4) had only been utilized in inexact infeasible IPMs (II-IPMs). For LCQO problems, ref. [6] proposes an II-IPM using an iterative method to solve the Newton systems and obtains a worst-case iteration complexity $O(n^2\log(1/\epsilon))$. On the other hand, feasible IPMs for LCQO problems enjoy $O(\sqrt{n}\log(1/\epsilon))$ iteration complexity [15,16,17]. In [5], the author provides a general inexact feasible IPM for LCQO problems but does not discuss how the sequence of iterates could be guaranteed to maintain primal–dual feasibility exactly when using inexact linear system solvers. This is a vital consideration, as the feasible neighborhood of the central path as outlined in (5) is a subset of the primal–dual feasible set; if primal and dual feasibility are not satisfied exactly at any point in the algorithm, the iterates leave this neighborhood and the method fails. Our work fills this gap by using a method inspired by the QIPMs of [13,14].

2.3. Orthogonal Subspaces System

Assume that $(x,y,s)\in\mathcal{PD}^0$. To maintain the feasibility of the primal and dual variables, the first two linear equations in system (4) need to be solved with $r_p=0$ and $r_d=0$ exactly, which can be guaranteed if $\Delta x$ lies in the null space of $A$, denoted as $\mathrm{Null}(A)$, and $\Delta s=Q\Delta x-A^T\Delta y$. Accordingly, we can rewrite system (4) by representing $\Delta x$ as a linear combination of basis elements of $\mathrm{Null}(A)$. To achieve this, we partition $A$ as $A=\begin{pmatrix}A_B&A_N\end{pmatrix}$, where $A_B$ is a basis of $A$. Then, we construct the following matrix:

$$V=\begin{pmatrix}A_B^{-1}A_N\\-I\end{pmatrix}.$$

Matrix $V$ has a full column rank and satisfies $AV=0$, i.e., the columns of $V$ span the null space of $A$. Let $\Delta x=V\lambda$, where $\lambda\in\mathbb{R}^{n-m}$ is the unknown coefficient vector used to determine $\Delta x$. Subsequently, we can rewrite system (4) by substituting $\Delta x$ and $\Delta s$ in the third equation as

$$SV\lambda+X(QV\lambda-A^T\Delta y)=r_c\quad\Longleftrightarrow\quad\begin{pmatrix}SV+XQV&-XA^T\end{pmatrix}\cdot\begin{pmatrix}\lambda\\\Delta y\end{pmatrix}=r_c. \tag{6}$$

A similar system was proposed and called “Orthogonal Subspaces System” (OSS) in [13,14], and we use the same name in this work. The matrix in the OSS system (6) is of size $n\times n$, and it is nonsingular. Even if the OSS system is solved inexactly, primal and dual feasibility are preserved by computing $\Delta x=V\lambda$ and $\Delta s=QV\lambda-A^T\Delta y$. Thus, any inexactness only impacts the third equation of (4), while the first two equations continue to hold with $r_p=0$ and $r_d=0$. This property of the OSS system is very convenient when analyzing the proposed inexact IPM, and allows us to obtain the best known iteration complexity for IPMs.
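The feasibility-preserving property can be checked numerically. The following NumPy sketch is illustrative only: the problem data are random and hypothetical, $A_B=I$ is assumed for convenience, and a classical solve plus artificial noise stands in for an inexact solver. It builds $V$ and the OSS matrix, perturbs the solution, and confirms that the error shows up only in the third equation of the Newton system.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 7

# Hypothetical data with A = [A_B  A_N] and A_B = I (an assumption of this sketch).
A_N = rng.integers(1, 4, size=(m, n - m)).astype(float)
A = np.hstack([np.eye(m), A_N])
G = rng.standard_normal((n, n))
Q = G.T @ G                                  # symmetric positive semidefinite
x = rng.uniform(0.5, 2.0, n)                 # any positive x, s suffice for this check
s = rng.uniform(0.5, 2.0, n)
X, S = np.diag(x), np.diag(s)

# Null-space basis V = [A_B^{-1} A_N; -I], so that A @ V = 0.
A_B = A[:, :m]
V = np.vstack([np.linalg.solve(A_B, A_N), -np.eye(n - m)])

# OSS matrix [SV + XQV, -XA^T] and right-hand side r_c = sigma*mu*e - XSe.
M = np.hstack([S @ V + X @ Q @ V, -X @ A.T])
mu = x @ s / n
r_c = 0.9 * mu * np.ones(n) - x * s

# Deliberately inexact OSS solution: exact solve plus noise.
z = np.linalg.solve(M, r_c) + 1e-2 * rng.standard_normal(n)
lam, dy = z[: n - m], z[n - m:]

# Recover the Newton step from the inexact OSS solution.
dx = V @ lam
ds = Q @ dx - A.T @ dy

primal_res = A @ dx                          # first equation of (4) with r_p = 0
dual_res = -Q @ dx + A.T @ dy + ds           # second equation of (4) with r_d = 0
r = S @ dx + X @ ds - r_c                    # error confined to the third equation
```

Here `primal_res` and `dual_res` vanish up to rounding, while `r` equals the OSS residual `M @ z - r_c`.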

3. Inexact Feasible IPM with QLSAs

In this section, we propose our IF-QIPM for LCQO problems. We begin with the IF-IPM structure introduced by [5] and describe how to quantize it into an IF-QIPM. Then, we analyze the construction of the OSS system and conclude by analyzing the overall complexity of our IF-QIPM.

3.1. IF-IPM for LCQO

In [5], the author studies a general conceptual form of IF-IPM for LCQO problems by assuming the feasibility of the primal and dual iterates, which induces the following system:

$$\begin{pmatrix}A&0&0\\-Q&A^T&I\\S&0&X\end{pmatrix}\begin{pmatrix}\Delta x\\\Delta y\\\Delta s\end{pmatrix}=\begin{pmatrix}0\\0\\r_c\end{pmatrix}, \tag{7}$$

where $r_c=\sigma\mu e-XSe$, with $\sigma\in(0,1)$ being the reduction factor of the central path parameter $\mu$, i.e., $\mu_{\mathrm{new}}=\sigma\mu$. When system (7) is solved with $r_c=\sigma\mu e-XSe$ inexactly, yielding an error $r$, if $\|r\|_2\le\delta\|r_c\|_2$ for some $\delta\in(0,1)$, the inexact IPM converges to an $\epsilon$-approximate solution to Problem (1) in at most $O(\sqrt{n}\log(1/\epsilon))$ iterations. As we mentioned earlier, it is not specified in [5] how to preserve primal and dual feasibility when system (7) is solved inexactly. Thus, it is presently not clear whether one could recover the convergence conditions described in [5] using inexact approaches, which are reliant on the assumption of primal–dual feasibility (see, e.g., system (7)).

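Now, we present a general procedure for solving system (7) inexactly such that the inexactness error occurs only in the third equation of system (7). Let $(\lambda,\Delta y)$ be an inexact solution for system (6) and $r$ be the error at this solution, i.e.,

$$\begin{pmatrix}SV+XQV&-XA^T\end{pmatrix}\cdot\begin{pmatrix}\lambda\\\Delta y\end{pmatrix}=r_c+r.$$

The corresponding Newton step

$$\Delta x=V\lambda,\qquad\Delta s=Q\Delta x-A^T\Delta y$$

satisfies

$$\begin{pmatrix}A&0&0\\-Q&A^T&I\\S&0&X\end{pmatrix}\cdot\begin{pmatrix}\Delta x\\\Delta y\\\Delta s\end{pmatrix}=\begin{pmatrix}0\\0\\r_c+r\end{pmatrix}.$$

Recall that once $(\lambda,\Delta y)$ is determined, then $(\Delta x,\Delta s)$ is also (uniquely) determined. An interesting property is that, if $(\lambda,\Delta y)$ and $(\Delta x,\Delta y,\Delta s)$ can be deduced from each other, then the OSS system and system (7) yield the same error term $r$. Hence, the convergence conditions built upon system (7) can be directly examined using the residual $r_c$ and error $r$ of the OSS system. Let $\epsilon_{OSS}$ be the target accuracy of the OSS system (6), i.e.,

$$\left\|(\lambda-\lambda^*,\Delta y-\Delta y^*)\right\|_2\le\epsilon_{OSS},$$

where $(\lambda^*,\Delta y^*)$ is the accurate solution. According to [5], in order to guarantee that the IF-IPM converges, we must have

$$\|r\|_2=\left\|\begin{pmatrix}SV+XQV&-XA^T\end{pmatrix}\cdot\begin{pmatrix}\lambda\\\Delta y\end{pmatrix}-r_c\right\|_2\le\left\|\begin{pmatrix}SV+XQV&-XA^T\end{pmatrix}\right\|_2\epsilon_{OSS}\le\delta\|r_c\|_2,$$

where $\delta\in(0,1)$ is a constant parameter. Therefore, to ensure the convergence of the IF-IPM, it suffices to set

$$\epsilon_{OSS}\le\frac{\delta\|r_c\|_2}{\left\|\begin{pmatrix}SV+XQV&-XA^T\end{pmatrix}\right\|_2}.$$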

The IF-IPM is presented in full detail in Algorithm 1. In each iteration, we build and solve system (6) classically. We solve system (6) to the accuracy just introduced above and then compute the feasible Newton step from the inexact solution and take a full Newton step.

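Algorithm 1: Short-step IF-IPM
  1: Choose $\epsilon>0$, $\delta\in(0,1)$, $\theta\in(0,1)$, $\beta\in(0,1)$ and $\sigma=\left(1-\frac{\beta}{\sqrt{n}}\right)$.
  2: $k\leftarrow0$
  3: Choose initial feasible interior solution $(x^0,y^0,s^0)\in\mathcal{N}(\theta)$
  4: while $(x^k,y^k,s^k)\notin\mathcal{PD}_\epsilon$ do
  5:    $\mu^k\leftarrow\frac{(x^k)^Ts^k}{n}$
  6:    $\epsilon_{OSS}^k\leftarrow\delta\|r_c^k\|_2/\left\|\begin{pmatrix}S^kV+X^kQV&-X^kA^T\end{pmatrix}\right\|_2$
  7:    $(\lambda^k,\Delta y^k)\leftarrow$ solve system (6) with accuracy $\epsilon_{OSS}^k$
  8:    $\Delta x^k=V\lambda^k$ and $\Delta s^k=Q\Delta x^k-A^T\Delta y^k$
  9:    $(x^{k+1},y^{k+1},s^{k+1})\leftarrow(x^k,y^k,s^k)+(\Delta x^k,\Delta y^k,\Delta s^k)$
  10:   $k\leftarrow k+1$
  11: end while
  12: return $(x^k,y^k,s^k)$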
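A classical toy version of the short-step loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: $A=\begin{pmatrix}I&A_N\end{pmatrix}$ is assumed so that $V$ is available in closed form, the inexact solver is simulated by perturbing an exact solve so that the residual norm is $0.9\,\delta\|r_c\|_2$, the parameter values are arbitrary, and the function name `short_step_if_ipm` is hypothetical. The instance is built so that the starting point lies exactly on the central path.

```python
import numpy as np

def short_step_if_ipm(A, Q, x, y, s, eps=1e-3, delta=0.1, beta=0.2,
                      max_iter=5000, seed=0):
    """Toy short-step IF-IPM; assumes A = [I  A_N] and a centered feasible start."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    sigma = 1.0 - beta / np.sqrt(n)
    V = np.vstack([A[:, m:], -np.eye(n - m)])       # null-space basis, A @ V = 0
    iters = 0
    while x @ s / n > eps and iters < max_iter:
        mu = x @ s / n
        X, S = np.diag(x), np.diag(s)
        M = np.hstack([S @ V + X @ Q @ V, -X @ A.T])    # OSS matrix of system (6)
        r_c = sigma * mu * np.ones(n) - x * s
        z = np.linalg.solve(M, r_c)
        # Simulated inexactness: residual norm becomes 0.9 * delta * ||r_c||_2.
        u = rng.standard_normal(n)
        z = z + u * (0.9 * delta * np.linalg.norm(r_c) / np.linalg.norm(M @ u))
        lam, dy = z[: n - m], z[n - m:]
        dx = V @ lam                                 # stays in Null(A)
        ds = Q @ dx - A.T @ dy                       # keeps dual feasibility
        x, y, s = x + dx, y + dy, s + ds             # full Newton step
        iters += 1
    return x, y, s, iters

# Hypothetical instance: pick a centered point first, then define b and c from it.
rng = np.random.default_rng(42)
m, n = 3, 6
A = np.hstack([np.eye(m), rng.integers(1, 4, size=(m, n - m)).astype(float)])
G = rng.standard_normal((n, n))
Q = G.T @ G / n
x0, y0, s0 = np.ones(n), np.zeros(m), np.ones(n)
b = A @ x0
c = A.T @ y0 + s0 - Q @ x0

x, y, s, iters = short_step_if_ipm(A, Q, x0, y0, s0)
```

Even though every direction is inexact, the returned point satisfies $Ax=b$ and $A^Ty+s-Qx=c$ up to rounding, which is the point of routing the computation through the OSS system.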

In the quantum-assisted IF-IPM, or IF-QIPM, we propose accelerating Step 7 using quantum subroutines. In the next sections, we investigate how to use quantum algorithms to build and solve the OSS system and obtain the Newton direction.

3.2. IF-QIPM for LCQO

The pseudocode of our IF-QIPM is presented in Algorithm 2. At each iteration of the IF-QIPM, we construct and solve system (6) and compute the Newton direction using quantum algorithms. To obtain an $\epsilon_{OSS}$-approximate solution of system (6), we first block encode system (8); see Appendix A. Then, we use quantum algorithms to solve for an $\epsilon_{QLSA}$-approximate solution of system (8). This solution is normalized but we can rescale it to obtain an $\epsilon_{OSS}$-approximate solution of system (6). Details are discussed later in this section.

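Algorithm 2: Short-step IF-QIPM
  1: Choose $\epsilon>0$, $\delta\in(0,1)$, $\theta\in(0,\theta_0)$, $\beta\in(0,1)$ and $\sigma=\left(1-\frac{\beta}{\sqrt{n}}\right)$.
  2: $k\leftarrow0$
  3: Choose initial feasible interior solution $(x^0,y^0,s^0)\in\mathcal{N}(\theta)$
  4: while $(x^k,y^k,s^k)\notin\mathcal{PD}_\epsilon$ do
  5:    $\mu^k\leftarrow\frac{(x^k)^Ts^k}{n}$
  6:    $\epsilon_{OSS}^k\leftarrow\delta\|r_c^k\|_2/\left(2\left\|\begin{pmatrix}S^kV+X^kQV&-X^kA^T\end{pmatrix}\right\|_2\right)$
  7:    $(\lambda^k,\Delta y^k)\leftarrow$ solve system (6) with accuracy $\epsilon_{OSS}^k$ quantumly
  8:    $\Delta x^k=V\lambda^k$ and $\Delta s^k=Q\Delta x^k-A^T\Delta y^k$
  9:    $(x^{k+1},y^{k+1},s^{k+1})\leftarrow(x^k,y^k,s^k)+(\Delta x^k,\Delta y^k,\Delta s^k)$
  10:   $k\leftarrow k+1$
  11: end while
  12: return $(x^k,y^k,s^k)$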

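Here, $\theta_0<1$ and its value will be discussed later. First, we introduce some notations to simplify the OSS system. In the $k$th iteration of Algorithm 2, let

$$M^k=\begin{pmatrix}S^kV+X^kQV&-X^kA^T\end{pmatrix},\qquad z^k=\begin{pmatrix}\lambda^k\\\Delta y^k\end{pmatrix}.$$

Then, the OSS system can be rewritten as

$$M^kz^k=r_c^k.$$

As discussed in [14], to solve the OSS system (6) using quantum algorithms, we can first rewrite it as the normalized Hermitian OSS system

$$\frac{1}{\sqrt{2}\|M^k\|_F}\begin{pmatrix}0&M^k\\(M^k)^T&0\end{pmatrix}\cdot\begin{pmatrix}0\\z^k\end{pmatrix}=\frac{1}{\sqrt{2}\|M^k\|_F}\begin{pmatrix}r_c^k\\0\end{pmatrix}. \tag{8}$$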
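The Hermitian embedding can be illustrated classically. In the sketch below, a random nonsingular matrix and vector stand in for $M^k$ and $r_c^k$ (an assumption for illustration); the code assembles the normalized system and checks that its solution is $(0, z^k)$ and that the scaled coefficient matrix has unit Frobenius norm.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
M = rng.standard_normal((n, n))      # stand-in for M^k (assumed nonsingular)
r_c = rng.standard_normal(n)         # stand-in for r_c^k

# Normalization 1 / (sqrt(2) * ||M||_F) applied to both sides.
scale = 1.0 / (np.sqrt(2.0) * np.linalg.norm(M, "fro"))
H = scale * np.block([[np.zeros((n, n)), M],
                      [M.T, np.zeros((n, n))]])
rhs = scale * np.concatenate([r_c, np.zeros(n)])

sol = np.linalg.solve(H, rhs)        # solution of the embedded system
z = np.linalg.solve(M, r_c)          # solution of the original system M z = r_c

fro_norm = np.linalg.norm(H, "fro")
spec_norm = np.linalg.norm(H, 2)
```

The first half of `sol` is zero and the second half equals `z`; since the Frobenius norm of `H` is one, its spectral norm is at most one, which is the normalization block-encoding-based solvers typically expect.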

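To use the QLSAs mentioned earlier, we need to turn the linear system (8) into a quantum linear system using the block encoding introduced in [18]. To this end, we first decompose the coefficient matrix in linear system (8) as

$$\frac{1}{\sqrt{2}\|M^k\|_F}\begin{pmatrix}0&M^k\\(M^k)^T&0\end{pmatrix}=\frac{1}{\sqrt{2}\|M^k\|_F}\begin{pmatrix}0&0\\(M^k)^T&0\end{pmatrix}+\frac{1}{\sqrt{2}\|M^k\|_F}\begin{pmatrix}0&M^k\\0&0\end{pmatrix}, \tag{9}$$

where

$$\begin{pmatrix}0&0\\(M^k)^T&0\end{pmatrix}=\begin{pmatrix}0_{n\times n}&0_{n\times n}&0_{n\times n}\\0_{(n-m)\times n}&V^T&0_{(n-m)\times n}\\0_{m\times n}&0_{m\times n}&A\end{pmatrix}\times\left[\begin{pmatrix}0_{n\times n}&0_{n\times n}\\S^k&0_{n\times n}\\0_{n\times n}&0_{n\times n}\end{pmatrix}+\begin{pmatrix}0_{n\times n}&0_{n\times n}&0_{n\times n}\\0_{n\times n}&Q&0_{n\times n}\\0_{n\times n}&0_{n\times n}&I_{n\times n}\end{pmatrix}\begin{pmatrix}0_{n\times n}&0_{n\times n}\\X^k&0_{n\times n}\\-X^k&0_{n\times n}\end{pmatrix}\right]. \tag{10}$$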

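To compute matrix $V$, we need to find a basis matrix $A_B$ of matrix $A$ and we need to compute the inverse matrix $A_B^{-1}$. Both steps are nontrivial and can be expensive. However, we can reformulate the LCQO problem as follows:

$$\min\ c^Tx+\frac{1}{2}x^TQx\quad\text{s.t.}\ \begin{pmatrix}I&0&A\\0&I&-A\end{pmatrix}\begin{pmatrix}x'\\x''\\x\end{pmatrix}=\begin{pmatrix}b\\-b\end{pmatrix},\quad x'\ge0,\ x''\ge0,\ x\ge0.$$

In this case, we have an obvious basis

$$A_B=\begin{pmatrix}I&0\\0&I\end{pmatrix}$$

and matrix $V$ can be constructed efficiently

$$V=\begin{pmatrix}A_B^{-1}A_N\\-I\end{pmatrix}=\begin{pmatrix}\begin{pmatrix}I&0\\0&I\end{pmatrix}\begin{pmatrix}A\\-A\end{pmatrix}\\-I\end{pmatrix}=\begin{pmatrix}A\\-A\\-I\end{pmatrix}.$$

Since matrix $A$ has no all-zero rows, matrix $V$ has no all-zero rows either. This property of the reformulation is useful in the analysis of the proposed IF-QIPM but we do not want to build the complexity analysis on the reformulated problem. Thus, without a loss of generality we may make the following assumption.

Assumption 3. 

Matrix $A$ is of the form $A=\begin{pmatrix}I&A_N\end{pmatrix}$.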

To simplify the analysis, we further assume that the input data are integers.

Assumption 4. 

The input data of Problem (1) are integers.

Based on the two assumptions above, we have the following lemma.

Lemma 1. 

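Matrix $V$ equals

$$V=\begin{pmatrix}A_N\\-I\end{pmatrix}$$

and

$$\min_{i=1,\dots,n}\{\|V_{i\cdot}\|_2^2\}=\min\left\{1,\min_{i=1,\dots,m}\|(A_N)_{i\cdot}\|_2^2\right\}=1,$$

where $V_{i\cdot}$ and $(A_N)_{i\cdot}$ are the $i$th row of $V$ and $A_N$, respectively.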

Now, we are ready to give $\theta_0$ in our definition of the central path neighborhood; see (5). We set

$$\theta_0=\min\left\{\frac{1}{3n},\frac{1}{4\|QVV^T\|_F+1}\right\}. \tag{11}$$

We also define $\omega^k$ as the maximum of the values of primal variables and dual slack variables in the $k$th iteration.

Definition 2. 

Let $(x^k,y^k,s^k)$ be a candidate solution for Problem (1); then,

$$\omega^k=\max_{i\in\{1,\dots,n\}}\{x_i^k,s_i^k\}.$$

As is standard in the literature on quantum algorithms, in this work, we assume access to quantum random access memory (QRAM). Then, Step 7 of Algorithm 2 consists of three parts: (1) use block encoding to build system (8); (2) use QLSAs to solve system (8); (3) use quantum tomography algorithms (QTAs) to extract the classical solution. We use the block-encoding methods introduced in [18] to block-encode linear system (8).

Proposition 1. 

In the $k$th iteration of Algorithm 2, using the block-encoding methods introduced in [18] and the decomposition described in Equations (9) and (10), a

$$\left(\sqrt{\|V\|_F^2+\|A\|_F^2}\,\frac{\sqrt{2}\,\omega^k}{\|M^k\|_F}\left(\sqrt{2}\|Q\|_F+\sqrt{2}+1\right),\ O(\mathrm{polylog}(n)),\ \frac{\epsilon_{QLSA}}{\kappa_{M^k}^3}\right)$$

-block-encoding of the matrix in system (8) can be implemented efficiently and the complexity will be dominated by the complexity of the QLSA step. Here, $\epsilon_{QLSA}$ is the accuracy required for the QLSA step and $\kappa_{M^k}$ is the condition number of matrix $M^k$.

Proof. 

See Appendix A for proof. □

Provided access to QRAM, the complexity associated with block encoding the OSS system coefficient matrix and preparing a quantum state encoding the right hand side amounts to polylogarithmic overhead. The cost of these steps is therefore negligible when compared with the complexity contributed by QLSAs and QTAs, so we ignore it here. To bound the total complexity contributed by QLSAs and QTAs, we first need to analyze the accuracy of QLSA characterized by $\epsilon_{QLSA}$, the accuracy of QTA characterized by $\epsilon_{QTA}$, and their relationship.

In each iteration, we use a QLSA to solve the block-encoded version of system (8) and obtain an $\epsilon_{QLSA}$-approximate solution. Then, we use a QTA to extract an $\epsilon_{QTA}$-approximate solution from the quantum machine. In the context of QLSAs and QTAs, if $\tilde{z}$ is an $\epsilon$-approximate solution of $z$, then $\tilde{z}$ satisfies

$$\left\|\frac{\tilde{z}}{\|\tilde{z}\|_2}-\frac{z}{\|z\|_2}\right\|_2\le\epsilon.$$

Observe that this definition of accuracy differs from the concept of $\epsilon$-approximate solutions defined in (2).

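Similar to [12,13], the QLSA we use was proposed in [19] and the QTA we use was proposed in [20]. Following the argument in Section 2 in [12], we can establish the relationship among $\epsilon_{QLSA}$, $\epsilon_{QTA}$, and $\epsilon_{OSS}^k$ as

$$\epsilon_{QLSA}=\epsilon_{QTA}=\frac{1}{2}\cdot\frac{\sqrt{2}\|M^k\|_F}{\|r_c^k\|_2}\epsilon_{OSS}^k, \tag{12}$$

where $\epsilon_{OSS}^k$ is defined as the $\ell_2$ norm of the residual when solving system (8) inexactly in the $k$th iteration. This coefficient is also used to rescale the solution. According to [12], we rescale the normalized solution obtained from QLSA and QTA by

$$\frac{\|r_c^k\|_2}{\sqrt{2}\|M^k\|_F}$$

to obtain the $\epsilon_{OSS}^k$-approximate solution for system (6). Here, we did not add a superscript to $\epsilon_{QLSA}$ and $\epsilon_{QTA}$, and the reason shall be revealed later. Let

$$\begin{pmatrix}\tilde{0}^k\\\tilde{z}^k\end{pmatrix}$$

be an inexact solution for system (8) in the $k$th iteration. Then, the norm of the residual of system (8), which is $\epsilon_{OSS}^k$, and the norm of the residual of system (6), which is $\|M^k\tilde{z}^k-r_c^k\|_2$, satisfy

$$\epsilon_{OSS}^k=\left\|\frac{1}{\sqrt{2}\|M^k\|_F}\begin{pmatrix}0&M^k\\(M^k)^T&0\end{pmatrix}\begin{pmatrix}\tilde{0}^k\\\tilde{z}^k\end{pmatrix}-\frac{1}{\sqrt{2}\|M^k\|_F}\begin{pmatrix}r_c^k\\0\end{pmatrix}\right\|_2=\left\|\frac{1}{\sqrt{2}\|M^k\|_F}\begin{pmatrix}M^k\tilde{z}^k\\(M^k)^T\tilde{0}^k\end{pmatrix}-\frac{1}{\sqrt{2}\|M^k\|_F}\begin{pmatrix}r_c^k\\0\end{pmatrix}\right\|_2\ge\left\|\frac{1}{\sqrt{2}\|M^k\|_F}M^k\tilde{z}^k-\frac{1}{\sqrt{2}\|M^k\|_F}r_c^k\right\|_2=\frac{1}{\sqrt{2}\|M^k\|_F}\left\|M^k\tilde{z}^k-r_c^k\right\|_2.$$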

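Recall that the error arising from the OSS system (6) is the same as the error in the full Newton system (7); then, we can directly use the convergence condition in [5], i.e.,

$$\|M^k\tilde{z}^k-r_c^k\|_2\le\delta\|r_c^k\|_2.$$

We can require

$$\|M^k\tilde{z}^k-r_c^k\|_2\le\sqrt{2}\|M^k\|_F\,\epsilon_{OSS}^k\le\delta\|r_c^k\|_2$$

and it follows that

$$\epsilon_{OSS}^k\le\frac{\delta\|r_c^k\|_2}{\sqrt{2}\|M^k\|_F}.$$

Then, choosing

$$\epsilon_{OSS}^k=\frac{\delta\|r_c^k\|_2}{\sqrt{2}\|M^k\|_F}\quad\text{and}\quad\epsilon_{QLSA}=\epsilon_{QTA}=\frac{\|M^k\|_F\,\epsilon_{OSS}^k}{\sqrt{2}\|r_c^k\|_2}=\frac{\delta}{2}$$

ensures the convergence of the IF-QIPM. The complexities for each step are also available now. Using the QLSA from [19] and QTA from [20], we have the complexity for QLSA and QTA:

$$T_{QLSA}=\tilde{O}_{n,\bar{\omega},\frac{1}{\epsilon}}\left(\kappa_{M^k}\frac{\omega^k}{\|M^k\|_F}\right),\qquad T_{QTA}=\tilde{O}_n\left(\frac{n}{\epsilon_{QTA}}T_{QLSA}\right)=\tilde{O}_{n,\bar{\omega},\frac{1}{\epsilon}}\left(n\,\kappa_{M^k}\frac{\omega^k}{\|M^k\|_F}\right).$$

Since we have $\epsilon_{QTA}=\frac{\delta}{2}$ and $\delta\in(0,1)$ is a constant parameter, we omit $\epsilon_{QTA}$ in the Big-O notation. Note that the complexity of the block-encoding procedure is dominated by that of QLSA and QTA and thus we ignore the complexity contributed by block encoding. In Step 8, the complexity contributed by computing the Newton step from the OSS solution is $O(n^2)$. The total complexity for the $k$th iteration of IF-QIPM will be

$$O\left(T_{QTA}+n^2\right)=\tilde{O}_{n,\bar{\omega},\frac{1}{\epsilon}}\left(n\frac{\omega^k\kappa_{M^k}}{\|M^k\|_F}+n^2\right). \tag{13}$$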
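The accuracy bookkeeping of this subsection amounts to a few lines of arithmetic. The helper below is a hypothetical illustration (the name `qipm_accuracies` and the random stand-ins for $M^k$ and $r_c^k$ are assumptions of this sketch); it evaluates the chosen $\epsilon_{OSS}^k$ and confirms that the implied QLSA and QTA accuracy is the constant $\delta/2$, independent of the iteration.

```python
import numpy as np

def qipm_accuracies(M, r_c, delta):
    """Return (eps_oss, eps_qlsa) for the accuracy choices described above."""
    fro = np.linalg.norm(M, "fro")
    rc_norm = np.linalg.norm(r_c)
    eps_oss = delta * rc_norm / (np.sqrt(2.0) * fro)      # target for system (8)
    eps_qlsa = fro * eps_oss / (np.sqrt(2.0) * rc_norm)   # relation (12)
    return eps_oss, eps_qlsa

rng = np.random.default_rng(5)
M = rng.standard_normal((6, 6))     # stand-in for M^k
r_c = rng.standard_normal(6)        # stand-in for r_c^k
delta = 0.3
eps_oss, eps_qlsa = qipm_accuracies(M, r_c, delta)

# Residual bound on system (6) implied by eps_oss.
residual_bound = np.sqrt(2.0) * np.linalg.norm(M, "fro") * eps_oss
```

Because `eps_qlsa` does not depend on $k$, no superscript is needed on $\epsilon_{QLSA}$ and $\epsilon_{QTA}$.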

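3.2.1. Bound for $\omega^k/\|M^k\|_F$

In this section, all of the quantities that we consider are from the $k$th iteration. For simplicity, we omit the superscript $k$ in this section unless we need it. Using the property of trace, we have

$$\|M\|_F^2=\operatorname{tr}(M^TM)=\operatorname{tr}\left((SV+XQV)(SV+XQV)^T+XA^TAX\right)=\operatorname{tr}\left((SV+XQV)(SV+XQV)^T\right)+\operatorname{tr}\left(XA^TAX\right)=\operatorname{tr}\left(SVV^TS\right)+\operatorname{tr}\left(XQVV^TS\right)+\operatorname{tr}\left(SVV^TQX\right)+\operatorname{tr}\left(XQVV^TQX\right)+\operatorname{tr}\left(XA^TAX\right).$$

For the non-symmetric term, due to the cyclic invariant property of trace, we have

$$\operatorname{tr}\left(XQVV^TS\right)=\operatorname{tr}\left(SXQVV^T\right).$$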

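Recalling the central path neighborhood that we defined in (5), we define a matrix $E$ such that

$$E=\frac{1}{\mu\theta}(XS-\mu I). \tag{14}$$

It is obvious that $E$ is a diagonal matrix and satisfies

$$\|Ee\|_2<1,$$

which leads to

$$|\operatorname{tr}(E)|\le\|Ee\|_1\le\sqrt{n}\|E\|_F=\sqrt{n}\|Ee\|_2<\sqrt{n}\quad\text{and}\quad I-E\succeq0\quad\text{and}\quad I+E\succeq0.$$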

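With this, we have

$$\operatorname{tr}\left(XQVV^TS\right)=\operatorname{tr}\left(SXQVV^T\right)=\operatorname{tr}\left((\theta\mu E+\mu I)QVV^T\right)=\operatorname{tr}\left(\theta\mu EQVV^T\right)+\operatorname{tr}\left(\mu QVV^T\right).$$

For the second term, we know that $Q$ and $V^TQV$ are both positive semidefinite. Thus, we have

$$\operatorname{tr}\left(QVV^T\right)=\operatorname{tr}\left(V^TQV\right)\ge0$$

because of the cyclic invariant property of trace. According to the Cauchy–Schwarz inequality, we have

$$\left(\operatorname{tr}\left(EQVV^T\right)\right)^2\le\|E\|_F^2\,\|QVV^T\|_F^2.$$

Thus, we have

$$\operatorname{tr}\left(EQVV^T\right)\ge-\|QVV^T\|_F.$$

Consequently,

$$\operatorname{tr}\left(XQVV^TS\right)=\operatorname{tr}\left(\theta\mu EQVV^T\right)+\operatorname{tr}\left(\mu QVV^T\right)\ge\mu\left(\operatorname{tr}\left(QVV^T\right)-\theta\|QVV^T\|_F\right)\ge-\theta\mu\|QVV^T\|_F\ge-\frac{\mu}{4},$$

where the last inequality holds due to condition (11). Thus, we can bound $\|M\|_F$ by

$$\|M\|_F^2=\operatorname{tr}\left(SVV^TS\right)+\operatorname{tr}\left(XQVV^TS\right)+\operatorname{tr}\left(SVV^TQX\right)+\operatorname{tr}\left(XQVV^TQX\right)+\operatorname{tr}\left(XA^TAX\right)\ge\operatorname{tr}\left(SVV^TS\right)+\operatorname{tr}\left(XQVV^TQX\right)+\operatorname{tr}\left(XA^TAX\right)-\frac{\mu}{2}.$$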

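Since $XQVV^TQX\succeq0$, we have

$$\|M\|_F^2\ge\operatorname{tr}\left(SVV^TS\right)+\operatorname{tr}\left(XA^TAX\right)-\frac{\mu}{2}.$$

Since $X$ and $S$ are both positive diagonal matrices, we have

$$\|M\|_F^2\ge\operatorname{tr}\left(SVV^TS\right)+\operatorname{tr}\left(XA^TAX\right)-\frac{\mu}{2}=\sum_is_i^2(VV^T)_{ii}+\sum_ix_i^2(A^TA)_{ii}-\frac{\mu}{2}\ge\omega^2-\frac{\mu}{2}.$$

As we mentioned in the very beginning of this section, at each iteration, $\omega$ is indeed $\omega^k$, but the superscript is ignored here. Now, we aim to find a bound for $\mu$ so we can further bound $\|M\|_F^2$. Since $\omega$ is the upper bound for the magnitude of the primal and dual slack variables, we have

$$\omega^2\ge x_is_i.$$

Recall the definition of matrix $E$; see (14). Thus, we have

$$\omega^2\ge x_is_i=\mu+\theta\mu E_{ii}\ge\mu-\theta\mu=(1-\theta)\mu.$$

Thus,

$$\|M\|_F^2\ge\omega^2-\frac{\mu}{2}\ge\omega^2-\frac{1}{2}\cdot\frac{\omega^2}{1-\theta}\ge\omega^2-\frac{1}{2}\cdot\frac{\omega^2}{1-1/3}=\frac{\omega^2}{4},$$

where the last inequality follows from the bound for $\theta$; see (11). Thus, we have

$$\frac{\omega}{\|M\|_F}\le2=O(1).$$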
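The bound can be sanity-checked numerically. The sketch below uses hypothetical integer data with $A=\begin{pmatrix}I&A_N\end{pmatrix}$ and all entries of $A_N$ nonzero, and evaluates the ratio at a perfectly centered pair $(x,s)$ with $x_is_i=\mu$, i.e., $E=0$. The pair is not required to satisfy $Ax=b$ here, since the inequality only involves $X$, $S$, $V$, $A$, and $Q$.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 7

# Integer data (Assumption 4) with A = [I  A_N] (Assumption 3); entries of A_N nonzero.
A_N = rng.integers(1, 5, size=(m, n - m)).astype(float)
A = np.hstack([np.eye(m), A_N])
V = np.vstack([A_N, -np.eye(n - m)])
G = rng.integers(-2, 3, size=(n, n)).astype(float)
Q = G.T @ G                                   # integer, symmetric PSD

# Perfectly centered pair: x_i * s_i = mu for every i, so E = 0 in (14).
x = rng.uniform(0.5, 3.0, n)
mu = 0.7
s = mu / x
X, S = np.diag(x), np.diag(s)

M = np.hstack([S @ V + X @ Q @ V, -X @ A.T])
omega = max(x.max(), s.max())
ratio = omega / np.linalg.norm(M, "fro")
```

For such data, every row of $V$ and every column of $A$ has norm at least one, which is what drives `ratio` below the constant 2.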

3.2.2. Bound for $\kappa_{M^k}$

Similar to the previous section, we ignore the superscript $k$ unless we need it. We will start with a general result and then work on the matrix $M^k$. The following lemma is a well-known result regarding condition numbers of matrices and can be proven using the Courant–Fischer–Weyl min-max principle [21].

Lemma 2. 

For any full row rank matrix $P\in\mathbb{R}^{m\times n}$ and symmetric positive definite matrix $D\in\mathbb{R}^{n\times n}$, their condition numbers satisfy

$$\kappa(PDP^T)\le\kappa(D)\,\kappa(PP^T).$$
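A quick numerical illustration of Lemma 2, using random matrices as stand-ins (a Gaussian $P$ has full row rank with probability one; the shift `0.1 * I` is an arbitrary choice that makes $D$ positive definite):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 4, 9

P = rng.standard_normal((m, n))           # full row rank with probability one
G = rng.standard_normal((n, n))
D = G @ G.T + 0.1 * np.eye(n)             # symmetric positive definite

# 2-norm condition numbers of both sides of the inequality.
lhs = np.linalg.cond(P @ D @ P.T)
rhs = np.linalg.cond(D) * np.linalg.cond(P @ P.T)
```

The inequality follows from $\lambda_{\min}(D)\,PP^T\preceq PDP^T\preceq\lambda_{\max}(D)\,PP^T$, so `lhs` never exceeds `rhs` beyond rounding error.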

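Next, we analyze the matrix in the OSS system (8). Specifically, we focus on $M^TM$ since we are interested in the spectral property of the OSS system (8). Using the matrix $E$ defined in (14), we have the following decomposition:

$$M^TM=\begin{pmatrix}V^T(S+XQ)^T(S+XQ)V&-V^T(S+XQ)^TXA^T\\-AX(S+XQ)V&AX^2A^T\end{pmatrix}=\begin{pmatrix}V^T(S+XQ)^T(S+XQ)V&-V^T\mu\theta EA^T-V^TQ^TX^2A^T\\-A\mu\theta EV-AX^2QV&AX^2A^T\end{pmatrix}=\begin{pmatrix}V^T&0\\0&A\end{pmatrix}\begin{pmatrix}(S+XQ)^T(S+XQ)&-\mu\theta E-QX^2\\-\mu\theta E-X^2Q&X^2\end{pmatrix}\begin{pmatrix}V^T&0\\0&A\end{pmatrix}^T.$$

The second equality holds because

$$-V^TSXA^T-V^TQ^TX^2A^T=-V^T\mu(I+\theta E)A^T-V^TQ^TX^2A^T=-V^T\mu\theta EA^T-V^TQX^2A^T,$$

as $AV=0$ and $Q$ is symmetric. Then, plugging (14) into the first diagonal block of the decomposition we obtained earlier, we have

$$M^TM=\begin{pmatrix}V^T&0\\0&A\end{pmatrix}\begin{pmatrix}S^2+2\mu Q+\mu\theta(EQ+QE)+QX^2Q&-\mu\theta E-QX^2\\-\mu\theta E-X^2Q&X^2\end{pmatrix}\begin{pmatrix}V^T&0\\0&A\end{pmatrix}^T$$

$$=\begin{pmatrix}V^T&0\\0&A\end{pmatrix}\left[\begin{pmatrix}S^2+2\mu Q+\mu\theta(EQ+QE)&-\mu\theta E\\-\mu\theta E&0\end{pmatrix}+\begin{pmatrix}QX^2Q&-QX^2\\-X^2Q&X^2\end{pmatrix}\right]\begin{pmatrix}V^T&0\\0&A\end{pmatrix}^T$$

$$=\begin{pmatrix}V^T&0\\0&A\end{pmatrix}\begin{pmatrix}I&-Q\\0&I\end{pmatrix}\begin{pmatrix}S^2+2\mu Q&-\mu\theta E\\-\mu\theta E&0\end{pmatrix}\begin{pmatrix}I&0\\-Q&I\end{pmatrix}\begin{pmatrix}V^T&0\\0&A\end{pmatrix}^T+\begin{pmatrix}V^T&0\\0&A\end{pmatrix}\begin{pmatrix}I&-Q\\0&I\end{pmatrix}\begin{pmatrix}0&0\\0&X^2\end{pmatrix}\begin{pmatrix}I&0\\-Q&I\end{pmatrix}\begin{pmatrix}V^T&0\\0&A\end{pmatrix}^T$$

$$=\begin{pmatrix}V^T&0\\0&A\end{pmatrix}\begin{pmatrix}I&-Q\\0&I\end{pmatrix}\begin{pmatrix}S^2+2\mu Q&-\mu\theta E\\-\mu\theta E&X^2\end{pmatrix}\begin{pmatrix}I&0\\-Q&I\end{pmatrix}\begin{pmatrix}V^T&0\\0&A\end{pmatrix}^T.$$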
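The final factorization can be verified numerically. In the sketch below, the data are random and hypothetical, $A=\begin{pmatrix}I&A_N\end{pmatrix}$ is assumed, and the array `T` plays the role of $\mu\theta E=XS-\mu I$; the identity only relies on $AV=0$ and the symmetry of $Q$.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 3, 7

A_N = rng.integers(1, 4, size=(m, n - m)).astype(float)
A = np.hstack([np.eye(m), A_N])                 # A = [I  A_N]
V = np.vstack([A_N, -np.eye(n - m)])            # A @ V = 0
G = rng.standard_normal((n, n))
Q = G.T @ G                                     # symmetric PSD

x = rng.uniform(0.5, 2.0, n)
s = rng.uniform(0.5, 2.0, n)
X, S = np.diag(x), np.diag(s)
mu = x @ s / n
T = X @ S - mu * np.eye(n)                      # equals mu * theta * E by (14)

M = np.hstack([S @ V + X @ Q @ V, -X @ A.T])

# Outer factor diag(V^T, A), the unit triangular factor, and the middle matrix.
B = np.block([[V.T, np.zeros((n - m, n))],
              [np.zeros((m, n)), A]])
K = np.block([[np.eye(n), -Q],
              [np.zeros((n, n)), np.eye(n)]])
Psi = np.block([[S @ S + 2 * mu * Q, -T],
                [-T, X @ X]])

lhs = M.T @ M
rhs = B @ K @ Psi @ K.T @ B.T
```

The arrays `lhs` and `rhs` agree up to rounding, so the spectrum of $M^TM$ is controlled by the middle matrix and the two outer factors.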

The first two matrices are nonsingular, so we can apply Lemma 2, and thus we only need to study the middle matrix. Denote the middle matrix by $\Psi$. Observe that $\Psi$ is almost the same as its counterpart in [14]. Subsequently, we have the following result regarding the spectral property of $M^k$.

Lemma 3. 

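When $(x,y,s)\in\mathcal{N}(\theta)$ and $\theta\in\left(0,\min\left\{\frac{1}{3n},\frac{1}{4\|QVV^T\|_F+1}\right\}\right)$, the condition number of matrix $M^k$ satisfies

$$\kappa_{M^k}=O\left(\frac{(\omega^k)^2+\mu^k\sigma_{\max}(Q)}{\mu^k}\kappa_{VAQ}\right),$$

where $\kappa_{VAQ}$ is the condition number of the matrix $\begin{pmatrix}V^T&0\\0&A\end{pmatrix}\begin{pmatrix}I&-Q\\0&I\end{pmatrix}$.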

Proof. 

The proof is in Appendix B. □

Putting all of these together, we have the complexity for our IF-QIPM for LCQO problems.

Theorem 1. 

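The IF-QIPM for LCQO problems stops with the final duality gap less than $\epsilon$ in at most $O(\sqrt{n}\log(1/\epsilon))$ IPM iterations and, in each IPM iteration, the Newton direction can be obtained with complexity $\tilde{O}_{n,\bar{\omega},\frac{1}{\epsilon}}\left(n\left(\frac{\bar{\omega}^2}{\epsilon}+\sigma_{\max}(Q)\right)\kappa_{VAQ}+n^2\right)$, where $\bar{\omega}=\max_k\omega^k$.

Proof. 

The complexity bound for the IPM iterations comes from the result in [5]. According to (13), the complexity for obtaining the Newton direction is

$$\tilde{O}_{n,\bar{\omega},\frac{1}{\epsilon}}\left(n\frac{\omega^k\kappa_{M^k}}{\|M^k\|_F}+n^2\right).$$

Combining this with the result in Section 3.2.1, the bound in Lemma 3, and $\mu^k\ge\epsilon$, we have

$$\tilde{O}_{n,\bar{\omega},\frac{1}{\epsilon}}\left(n\frac{\omega^k\kappa_{M^k}}{\|M^k\|_F}+n^2\right)=\tilde{O}_{n,\bar{\omega},\frac{1}{\epsilon}}\left(n\left(\frac{\bar{\omega}^2}{\epsilon}+\sigma_{\max}(Q)\right)\kappa_{VAQ}+n^2\right).$$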

4. Application in Support Vector Machine Problems

In this section, we discuss how to use our IF-QIPM to solve SVM problems. We show that our algorithm can solve 1-norm soft margin SVM problems faster than any existing classical or quantum algorithms with respect to dimension.

The ordinary SVM problem works on a linearly separable dataset, in which the data points have binary labels. The ordinary SVM aims to find a hyperplane correctly separating the data points with a maximum margin. However, in practice, the data points are not necessarily linearly separable. To allow for mislabelling, the concept of a soft margin SVM was introduced in [22]. Let $\{(\phi_i,\zeta_i)\in\mathbb{R}^m\times\{-1,+1\}\mid i=1,\dots,n\}$ be the set of data points, $\Phi$ be a matrix with the $i$th column being $\phi_i$, and $Z$ be a diagonal matrix with the $i$th diagonal element being $\zeta_i$. The SVM problem with an $\ell_1$-norm soft margin can be formulated as below.

$$\min_{(\xi,w,t)\in\mathbb{R}^n\times\mathbb{R}^m\times\mathbb{R}}\ \frac{1}{2}\|w\|_2^2+C\|\xi\|_1\quad\text{s.t.}\ \zeta_i\left(\langle w,\phi_i\rangle+t\right)\ge1-\xi_i,\ i=1,\dots,n,\quad\xi_i\ge0,\ i=1,\dots,n. \tag{15}$$

Here, $(w,t)$ determines a hyperplane and $C$ is a penalty parameter. In [9], the authors rewrote the SVM problem as a second-order conic optimization (SOCO) problem and used the quantum algorithm that they proposed to solve the resulting SOCO problem. They claim the complexity of their algorithm has $O(n^2)$ dependence on the dimension, which is better than any classical algorithm. However, the algorithm in [9] is invalid. Their algorithm is an inexact infeasible-QIPM (II-QIPM), while they used the IPM complexity for the feasible-QIPM, which ignores at least $O(n^{1.5})$ dependence on $n$. They also missed the symmetrization of the Newton step, which is necessary for SOCO problems and makes their Newton step invalid.

Aside from [9], some pure quantum algorithms for SVM problems have also been proposed. In [23], the authors propose a pure quantum algorithm for SVM problems. They claim the complexity is $O(\kappa_{\mathrm{eff}}^3\epsilon^{-3}\log(mn))$, where $\kappa_{\mathrm{eff}}$ is the condition number of a matrix involving the kernel matrix and $\epsilon$ is the accuracy. In the worst case, $\kappa_{\mathrm{eff}}=O(m)$. Their complexity is worse than ours regarding the dependence on dimension and accuracy. In addition, their algorithm does not provide classical solutions. Namely, the solution is in the quantum machine and we cannot read or use it in a classical computer. However, our algorithm produces a classical solution.

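To convert the problem into standard-form LCQO, we introduce $(w^+,w^-)\in\mathbb{R}_+^m\times\mathbb{R}_+^m$, $(t^+,t^-)\in\mathbb{R}_+\times\mathbb{R}_+$, and a slack variable $\rho\in\mathbb{R}_+^n$. Then, we can obtain the following formulation:

$$\min_{w^+,w^-,t^+,t^-,\xi,\rho}\ \frac{1}{2}\|w^+-w^-\|_2^2+C\|\xi\|_1\quad\text{s.t.}\ \zeta_i\left(\langle w^+-w^-,\phi_i\rangle+t^+-t^-\right)+\xi_i-\rho_i=1,\ i=1,\dots,n,\quad(\xi,w^+,w^-,t^+,t^-,\rho)\ge0.$$

This is a standard-form LCQO problem with non-negative variables $(w^+,w^-,t^+,t^-,\xi,\rho)\in\mathbb{R}^m\times\mathbb{R}^m\times\mathbb{R}\times\mathbb{R}\times\mathbb{R}^n\times\mathbb{R}^n$ and parameters

$$c=\begin{pmatrix}0_{2m+2}\\Ce_n\\0_n\end{pmatrix},\quad Q=\begin{pmatrix}I_{m\times m}&-I_{m\times m}&0_{m\times(2+2n)}\\-I_{m\times m}&I_{m\times m}&0_{m\times(2+2n)}\\0_{(2+2n)\times m}&0_{(2+2n)\times m}&0_{(2+2n)\times(2+2n)}\end{pmatrix},\quad A=\begin{pmatrix}Z\Phi^T&-Z\Phi^T&Ze_n&-Ze_n&I_{n\times n}&-I_{n\times n}\end{pmatrix},\quad b=e.$$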
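The conversion can be made concrete with a small script. The sketch below is illustrative: the function name `svm_to_lcqo` and the toy data are hypothetical, and the columns associated with $t^+$ and $t^-$ are taken to be $\pm Ze_n$ (the label vector), which is what the equality constraints of the formulation imply for scalar $t^\pm$. It builds $(c,Q,A,b)$ and checks them against a hand-constructed feasible point.

```python
import numpy as np

def svm_to_lcqo(Phi, zeta, C):
    """Build (c, Q, A, b) for variables ordered (w+, w-, t+, t-, xi, rho)."""
    m, n = Phi.shape                       # m features, n points; columns are phi_i
    Z = np.diag(zeta)
    N = 2 * m + 2 + 2 * n
    c = np.concatenate([np.zeros(2 * m + 2), C * np.ones(n), np.zeros(n)])
    Q = np.zeros((N, N))
    I_m = np.eye(m)
    Q[: 2 * m, : 2 * m] = np.block([[I_m, -I_m], [-I_m, I_m]])
    z_col = zeta.reshape(-1, 1).astype(float)      # Z e_n, columns for t+ and t-
    A = np.hstack([Z @ Phi.T, -Z @ Phi.T, z_col, -z_col, np.eye(n), -np.eye(n)])
    b = np.ones(n)
    return c, Q, A, b

# Toy data: 2 features, 4 points.
Phi = np.array([[2.0, 1.5, -1.0, -2.0],
                [1.0, 2.0, -1.5, -0.5]])
zeta = np.array([1.0, 1.0, -1.0, -1.0])
C = 10.0
c, Q, A, b = svm_to_lcqo(Phi, zeta, C)

# A hand-picked hyperplane (w, t) mapped into the non-negative variables.
w, t = np.array([0.3, -0.2]), 0.1
margin = zeta * (Phi.T @ w + t)
xi = np.maximum(0.0, 1.0 - margin)         # hinge slack
rho = margin + xi - 1.0                    # surplus, non-negative by construction
v = np.concatenate([np.maximum(w, 0), np.maximum(-w, 0),
                    [max(t, 0.0)], [max(-t, 0.0)], xi, rho])

objective = c @ v + 0.5 * v @ Q @ v
```

For this point, `A @ v` equals `b` and `objective` equals $\frac{1}{2}\|w\|_2^2+C\|\xi\|_1$, so the LCQO data reproduce problem (15).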

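Thus, we can use the proposed IF-QIPM for LCQO problems to solve the 1-norm soft margin SVM problems and obtain an $\epsilon$-approximate solution with complexity

$$\tilde{O}_{m,n,\bar{\omega},\frac{1}{\epsilon}}\left((m+n)^{1.5}\left(\frac{\bar{\omega}^2}{\epsilon}+\sigma_{\max}(Q)\right)\kappa_{VAQ}+(m+n)^{2.5}\right).$$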

This dependence on dimension is better than any existing quantum or classical algorithm.

5. Discussion

In this work, we present an IF-QIPM for LCQO problems by combining the IF-IPM framework proposed in [5] and the OSS system introduced in [14]. Our algorithm has $n^{1.5}$ dependence on $n$, which is better than any existing algorithm for LCQO problems. The dependence on the accuracy is polynomial, which is worse than classical IPMs. Iterative refinement techniques might help to improve the dependence on the accuracy but they are beyond the scope of this work.

Abbreviations

The following abbreviations are used in this manuscript:

IF-IPM Inexact Feasible Interior Point Method
IF-QIPM Inexact Feasible Quantum Interior Point Method
IPM Interior Point Method
LCQO Linearly Constrained Quadratic Optimization
LO Linear Optimization
OSS Orthogonal Subspace System
QIPM Quantum Interior Point Method
QLSA Quantum Linear System Algorithm
QTA Quantum Tomography Algorithm
SOCO Second-Order Conic Optimization
SVM Support Vector Machine

Appendix A. Block Encoding of the OSS System

In this section, we omit the superscript $k$ for simplicity. As described in Equation (9), we first block encode each of the matrices involved in (10). We assume that $V$, $A$, $S$, and $X$ are given and stored in a quantum-accessible data structure (we ignore the cost of storing the classical information in the quantum machine). For the first matrix

$$M_1=\begin{pmatrix} 0_{n\times n} & 0_{n\times n} & 0_{n\times n}\\ 0_{(n-m)\times n} & V^T & 0_{(n-m)\times n}\\ 0_{m\times n} & 0_{m\times n} & A\end{pmatrix},$$

a $\left(\sqrt{\|V\|_F^2+\|A\|_F^2},\ O(\mathrm{polylog}(n)),\ \epsilon_1\right)$-block-encoding of $M_1$ can be implemented efficiently according to Lemma 50 from [18].
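To make the notion of an $(\alpha, a, \epsilon)$-block-encoding concrete, the following classical sketch embeds $M/\alpha$ as the top-left block of an orthogonal matrix, with $\alpha$ equal to the Frobenius norm of $M$. It is only an illustration with $\epsilon=0$ and one extra qubit: it uses the textbook unitary dilation, not the quantum data-structure construction of Lemma 50 in [18], and a random square stand-in matrix instead of $M_1$. The function names are hypothetical.

```python
import numpy as np

def psd_sqrt(H):
    """Square root of a symmetric positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh((H + H.T) / 2)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def unitary_dilation(M, alpha):
    """Orthogonal U whose top-left block is M / alpha (needs ||M||_2 <= alpha)."""
    B = M / alpha
    I = np.eye(B.shape[0])
    return np.block([[B, psd_sqrt(I - B @ B.T)],
                     [psd_sqrt(I - B.T @ B), -B.T]])

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))          # stand-in square matrix, not M_1 itself
alpha = np.linalg.norm(M, "fro")         # Frobenius norm as normalization factor
U = unitary_dilation(M, alpha)
print(np.allclose(U @ U.T, np.eye(8)), np.allclose(alpha * U[:4, :4], M))
```

Because the Frobenius norm upper-bounds the spectral norm, $M/\alpha$ is a contraction and the dilation is well defined; a larger $\alpha$ is always admissible but makes the encoded block smaller.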

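The second matrix

$$M_2=\begin{pmatrix} 0_{n\times n} & 0_{n\times n}\\ S & 0_{n\times n}\\ 0_{n\times n} & 0_{n\times n}\end{pmatrix}$$

is both one-row-sparse and one-column-sparse. By the definition of $\omega$, each element of $M_2/\omega$ has an absolute value of at most 1. According to Lemma 48 in [18], a $\left(1,\ O(\mathrm{polylog}(n)),\ \epsilon_2\right)$-block-encoding of $M_2/\omega$ can be implemented efficiently.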

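The third matrix

$$M_3=\begin{pmatrix} 0_{n\times n} & 0_{n\times n} & 0_{n\times n}\\ 0_{n\times n} & Q & 0_{n\times n}\\ 0_{n\times n} & 0_{n\times n} & I_{n\times n}\end{pmatrix}$$

can be decomposed into

$$M_3=\begin{pmatrix} 0_{n\times n} & 0_{n\times n} & 0_{n\times n}\\ 0_{n\times n} & Q & 0_{n\times n}\\ 0_{n\times n} & 0_{n\times n} & 0_{n\times n}\end{pmatrix}+\begin{pmatrix} 0_{n\times n} & 0_{n\times n} & 0_{n\times n}\\ 0_{n\times n} & 0_{n\times n} & 0_{n\times n}\\ 0_{n\times n} & 0_{n\times n} & I_{n\times n}\end{pmatrix}.$$

Then, we can block encode the two matrices separately and apply a linear combination to obtain $M_3$. In fact, a $\left(\|Q\|_F,\ O(\mathrm{polylog}(n)),\ \epsilon_3\right)$-block-encoding of the left matrix can be implemented efficiently according to Lemma 50 from [18], and a $\left(1,\ O(\mathrm{polylog}(n)),\ \epsilon_3\right)$-block-encoding of the right matrix can be implemented efficiently according to Lemma 48 in [18]. With the state-preparation cost of the linear combination coefficient vector $(1,1)$ neglected, a $\left(\|Q\|_F+1,\ O(\mathrm{polylog}(n)),\ (\|Q\|_F+1)\epsilon_3\right)$-block-encoding of $M_3$ can be implemented efficiently according to Lemma 52 from [18].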

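The fourth matrix

$$M_4=\begin{pmatrix} 0_{n\times n} & 0_{n\times n}\\ X & 0_{n\times n}\\ X & 0_{n\times n}\end{pmatrix}$$

is one-row-sparse and two-column-sparse. After scaling by $1/\omega$, each element of $M_4/\omega$ has an absolute value of at most 1. According to Lemma 48 in [18], a $\left(\sqrt{2},\ O(\mathrm{polylog}(n)),\ \epsilon_4\right)$-block-encoding of $M_4/\omega$ can be implemented efficiently.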
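The sparsity claims for the second and fourth matrices are easy to confirm on a toy instance. In the sketch below, the block layouts, the values of $x$ and $s$, and the choice of $\omega$ as the largest entry of $x$ and $s$ are all assumptions made for illustration. For a matrix with at most $s_r$ nonzeros per row and $s_c$ per column, the sparse-access construction has normalization $\sqrt{s_r s_c}$, which is consistent with the factors 1 and $\sqrt{2}$ quoted above.

```python
import numpy as np

def sparsity(M):
    """Return (max nonzeros per row, max nonzeros per column)."""
    nz = M != 0
    return int(nz.sum(axis=1).max()), int(nz.sum(axis=0).max())

x = np.array([0.5, 2.0, 1.5])            # hypothetical positive primal values
s = np.array([1.0, 0.25, 0.8])           # hypothetical positive dual slacks
X, S, Z = np.diag(x), np.diag(s), np.zeros((3, 3))
omega = max(x.max(), s.max())            # assumed stand-in for the paper's omega
M2 = np.block([[Z, Z], [S, Z], [Z, Z]])  # assumed block layout
M4 = np.block([[Z, Z], [X, Z], [X, Z]])  # assumed block layout
for name, M in (("M2", M2), ("M4", M4)):
    s_r, s_c = sparsity(M)
    print(name, s_r, s_c, np.sqrt(s_r * s_c), np.abs(M / omega).max() <= 1)
```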

For the matrix multiplication $M_3M_4/\omega$, a $\left(\sqrt{2}\|Q\|_F+\sqrt{2},\ O(\mathrm{polylog}(n)),\ (\|Q\|_F+1)(\sqrt{2}\epsilon_3+\epsilon_4)\right)$-block-encoding can be implemented efficiently according to Lemma 53 from [18].
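The bookkeeping for this product step can be reproduced with a few lines of code. The sketch below tracks only the normalization factor and the error of each block-encoding (the ancilla count is omitted) and applies the product rule used here: multiplying an $(\alpha,\cdot,\delta)$-block-encoding by a $(\beta,\cdot,\varepsilon)$-block-encoding gives normalization $\alpha\beta$ and error $\alpha\varepsilon+\beta\delta$. The class `BlockEncoding` and the numerical values of $\|Q\|_F$, $\epsilon_3$, and $\epsilon_4$ are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class BlockEncoding:
    alpha: float  # normalization factor
    eps: float    # approximation error

def product(A, B):
    """Parameters of the product of two block-encodings (ancilla count omitted)."""
    return BlockEncoding(A.alpha * B.alpha, A.alpha * B.eps + B.alpha * A.eps)

q_F, eps3, eps4 = 3.0, 1e-6, 2e-6                  # hypothetical values
M3 = BlockEncoding(q_F + 1.0, (q_F + 1.0) * eps3)  # parameters quoted for M_3
M4 = BlockEncoding(math.sqrt(2.0), eps4)           # parameters quoted for M_4 / omega
P = product(M3, M4)

print(math.isclose(P.alpha, math.sqrt(2.0) * q_F + math.sqrt(2.0)))
print(math.isclose(P.eps, (q_F + 1.0) * (math.sqrt(2.0) * eps3 + eps4)))
```

Both checks agree with the normalization $\sqrt{2}\|Q\|_F+\sqrt{2}$ and the error $(\|Q\|_F+1)(\sqrt{2}\epsilon_3+\epsilon_4)$ obtained for $M_3M_4/\omega$.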

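For the linear combination $M_2/\omega+M_3M_4/\omega$, the cost for the state preparation of the coefficient vector $(1,1)$ is negligible, and thus a

$$\left(\sqrt{2}\|Q\|_F+\sqrt{2}+1,\ O(\mathrm{polylog}(n)),\ \left(\sqrt{2}\|Q\|_F+\sqrt{2}+1\right)\left(\epsilon_3+\tfrac{1}{\sqrt{2}}\epsilon_4\right)\right)$$

-block-encoding can be implemented efficiently according to Lemma 52 from [18].

For the matrix multiplication $M_1(M_2/\omega+M_3M_4/\omega)$, a

$$\left(\sqrt{\|V\|_F^2+\|A\|_F^2}\left(\sqrt{2}\|Q\|_F+\sqrt{2}+1\right),\ O(\mathrm{polylog}(n)),\ \sqrt{\|V\|_F^2+\|A\|_F^2}\left(\sqrt{2}\|Q\|_F+\sqrt{2}+1\right)\left(\epsilon_3+\tfrac{1}{\sqrt{2}}\epsilon_4\right)+\left(\sqrt{2}\|Q\|_F+\sqrt{2}+1\right)\epsilon_1\right)$$

-block-encoding can be implemented efficiently according to Lemma 53 from [18].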

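Finally, considering that the complexity of the state preparation of the vector

$$\left(\frac{\omega}{\sqrt{2}\|M\|_F},\ \frac{\omega}{\sqrt{2}\|M\|_F}\right)$$

can be neglected, a

$$\left(\sqrt{\|V\|_F^2+\|A\|_F^2}\,\frac{\sqrt{2}\,\omega}{\|M\|_F}\left(\sqrt{2}\|Q\|_F+\sqrt{2}+1\right),\ O(\mathrm{polylog}(n)),\ \sqrt{\|V\|_F^2+\|A\|_F^2}\,\frac{\sqrt{2}\,\omega}{\|M\|_F}\left(\sqrt{2}\|Q\|_F+\sqrt{2}+1\right)^{2}\left(\sqrt{\|V\|_F^2+\|A\|_F^2}\left(\epsilon_3+\tfrac{1}{\sqrt{2}}\epsilon_4\right)+\epsilon_1\right)\right)$$

-block-encoding of the coefficient matrix of system (8) can be implemented efficiently according to Lemma 52 from [18]. We can choose

$$\epsilon_1=\frac{\epsilon_{QLSA}}{\kappa_M^{3}}\cdot\frac{1}{2K},\qquad \epsilon_2=\frac{\epsilon_1}{2\sqrt{\|V\|_F^2+\|A\|_F^2}},\qquad \epsilon_3=\epsilon_2,\qquad \epsilon_4=\sqrt{2}\,\epsilon_2,$$

where $K$ depends on the initial data:

$$K=\sqrt{2}\sqrt{\|V\|_F^2+\|A\|_F^2}\left(\sqrt{2}\|Q\|_F+\sqrt{2}+1\right)^{2}.$$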

All of the block-encoding algorithms that we have used so far have poly-logarithmic dependence on the dimension and accuracy, and, for $i=1,2,3,4$,

$$O\left(\mathrm{polylog}\left(\tfrac{1}{\epsilon_i}\right)\right)=O\left(\mathrm{polylog}(\kappa_M)\right).$$

Since the QLSA has linear dependence on $\kappa_M$, the complexity of block encoding is dominated by the complexity of the QLSA, and we can therefore ignore the complexity of block encoding.

Appendix B. Spectral Analysis for Matrix Ψ

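In this section, we provide the spectral analysis for the matrix

$$\Psi=\begin{pmatrix} S^2+2\mu Q & \mu\theta E\\ \mu\theta E & X^2\end{pmatrix}. \tag{A1}$$

As in the previous section, we omit the superscript $k$ for simplicity. We can perform the following decomposition:

$$\begin{pmatrix} S^2+2\mu Q & \mu\theta E\\ \mu\theta E & X^2\end{pmatrix}=\begin{pmatrix} S^2 & \mu\theta E\\ \mu\theta E & X^2\end{pmatrix}+\begin{pmatrix} 2\mu Q & 0\\ 0 & 0\end{pmatrix}.$$

Let us use the following notation:

$$\Psi_1=\begin{pmatrix} S^2 & \mu\theta E\\ \mu\theta E & X^2\end{pmatrix},\qquad \Psi_2=\begin{pmatrix} 2\mu Q & 0\\ 0 & 0\end{pmatrix}.$$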

It can be proven that $\Psi_1$ is positive definite. The majority of the proof of this conclusion comes from [14]. For the reader's convenience, we provide the complete proof here.

Matrix $\Psi_1$ is a $2\times 2$ block matrix whose four blocks are all diagonal matrices. Thus, we can easily compute the eigenvalues using the characteristic polynomial

$$\det(\Psi_1-qI)=\det\left((X^2-qI)(S^2-qI)-\theta^2\mu^2E^2\right)=\prod_{i=1}^{n}\left[(x_i^2-q)(s_i^2-q)-\theta^2\mu^2E_{ii}^2\right].$$

Clearly, $\det(\Psi_1-qI)=0$ gives $n$ quadratic equations, and each quadratic equation gives two eigenvalues. The two eigenvalues from the $i$th quadratic equation are

$$q_i^+=\frac{1}{2}\left((x_i^2+s_i^2)+\sqrt{(x_i^2+s_i^2)^2-4x_i^2s_i^2+4\theta^2\mu^2E_{ii}^2}\right)$$

and

$$q_i^-=\frac{1}{2}\left((x_i^2+s_i^2)-\sqrt{(x_i^2+s_i^2)^2-4x_i^2s_i^2+4\theta^2\mu^2E_{ii}^2}\right).$$

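Recalling the definition of $E$ in (14), we can write

$$\begin{aligned}
q_i^- &=\frac{1}{2}\left((x_i^2+s_i^2)-\sqrt{(x_i^2+s_i^2)^2-4x_i^2s_i^2+4(x_is_i-\mu)^2}\right)\\
&=\frac{1}{2}\left((x_i^2+s_i^2)-\sqrt{(x_i^2+s_i^2)^2+4(x_is_i-\mu+x_is_i)(x_is_i-\mu-x_is_i)}\right)\\
&=\frac{1}{2}\left((x_i^2+s_i^2)-\sqrt{(x_i^2+s_i^2)^2-4\mu(2x_is_i-\mu)}\right)\\
&=\frac{1}{2}\left((x_i^2+s_i^2)-\sqrt{(x_i^2+s_i^2)^2-4\mu(2\theta\mu E_{ii}+\mu)}\right).
\end{aligned}$$

One can verify that the square root always exists because

$$(x_i^2+s_i^2)^2-4\mu(2x_is_i-\mu)\ \ge\ 4(x_is_i)^2-4\mu(2x_is_i)+4\mu^2=4(x_is_i-\mu)^2\ \ge\ 0.$$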

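With $\theta\in\left(0,\ \min\left\{\frac{1}{3\sqrt{n}},\ \frac{1}{4\|QVV^T\|_F+1}\right\}\right)$, we have

$$\begin{aligned}
q_i^- &=\frac{1}{2}\left((x_i^2+s_i^2)-\sqrt{(x_i^2+s_i^2)^2-4\mu(2\theta\mu E_{ii}+\mu)}\right)\\
&\ge\frac{1}{2}\left((x_i^2+s_i^2)-\sqrt{(x_i^2+s_i^2)^2-4\mu\left(-2\mu\tfrac{1}{3\sqrt{n}}+\mu\right)}\right)\\
&\ge\frac{1}{2}\left((x_i^2+s_i^2)-\sqrt{(x_i^2+s_i^2)^2-\tfrac{4}{3}\mu^2}\right)\\
&=\frac{1}{2}\cdot\frac{\tfrac{4}{3}\mu^2}{(x_i^2+s_i^2)+\sqrt{(x_i^2+s_i^2)^2-\tfrac{4}{3}\mu^2}}\\
&\ge\frac{1}{2}\cdot\frac{\tfrac{4}{3}\mu^2}{(x_i^2+s_i^2)+\sqrt{(x_i^2+s_i^2)^2}}
=\frac{\mu^2}{3(x_i^2+s_i^2)}>0.
\end{aligned}$$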

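This means that matrix $\Psi_1$ is positive definite; since $\Psi_1$ is also real and symmetric, its eigenvalues coincide with its singular values. Analogously, we have

$$\begin{aligned}
q_i^+ &=\frac{1}{2}\left((x_i^2+s_i^2)+\sqrt{(x_i^2+s_i^2)^2-4\mu(2\theta\mu E_{ii}+\mu)}\right)\\
&\le\frac{1}{2}\left((x_i^2+s_i^2)+(x_i^2+s_i^2)+2\mu\sqrt{2\theta E_{ii}+1}\right)\\
&\le\frac{1}{2}\left((x_i^2+s_i^2)+(x_i^2+s_i^2)+2\sqrt{2}\mu\right)=(x_i^2+s_i^2)+\sqrt{2}\mu.
\end{aligned}$$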
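The closed-form eigenvalues and the two bounds can be sanity-checked numerically. The sketch below builds a random point with $\|XSe-\mu e\|_2\le\theta\mu$, sets $\theta\mu E_{ii}=x_is_i-\mu$, and compares the spectrum of $\Psi_1$ computed by NumPy with $q_i^\pm$. The values of $n$, $\mu$, and $\theta$ are hypothetical; the only property the bounds rely on here is $\theta|E_{ii}|\le 1/3$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, mu, theta = 5, 1.0, 0.05                      # hypothetical values
d = rng.standard_normal(n)
d *= 0.9 * theta * mu / np.linalg.norm(d)        # ||XSe - mu e||_2 <= theta * mu
x = rng.uniform(0.5, 2.0, n)
s = (mu + d) / x                                 # so that x_i s_i = mu + d_i
E = np.diag((x * s - mu) / (theta * mu))         # theta * mu * E_ii = x_i s_i - mu
Psi1 = np.block([[np.diag(s**2), mu * theta * E],
                 [mu * theta * E, np.diag(x**2)]])

a = x**2 + s**2
root = np.sqrt(a**2 - 4 * x**2 * s**2 + 4 * (x * s - mu) ** 2)
q_minus, q_plus = 0.5 * (a - root), 0.5 * (a + root)

numeric = np.linalg.eigvalsh(Psi1)               # eigenvalues in ascending order
closed_form = np.sort(np.concatenate([q_minus, q_plus]))
print(np.allclose(numeric, closed_form))
print(np.all(q_minus >= mu**2 / (3 * a)), np.all(q_plus <= a + np.sqrt(2) * mu))
```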

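Thus, the condition number of $\Psi$ satisfies

$$\begin{aligned}
\kappa(\Psi) &\le\frac{\sigma_{\max}(\Psi_1)+\sigma_{\max}(\Psi_2)}{\sigma_{\min}(\Psi_1)+\sigma_{\min}(\Psi_2)}
=\frac{\max_i q_i^+ +\sigma_{\max}(\Psi_2)}{\min_j q_j^- +\sigma_{\min}(\Psi_2)}\\
&\le\frac{\max_i\{x_i^2+s_i^2\}+\sqrt{2}\mu+2\mu\sigma_{\max}(Q)}{\min_j\frac{\mu^2}{3(x_j^2+s_j^2)}}\\
&=\frac{3\max_i\{x_i^2+s_i^2\}\left(\max_i\{x_i^2+s_i^2\}+\sqrt{2}\mu+2\mu\sigma_{\max}(Q)\right)}{\mu^2}\\
&\le\frac{3\omega^2\left(\omega^2+\sqrt{2}\mu+2\mu\sigma_{\max}(Q)\right)}{\mu^2},
\end{aligned}$$

where the last inequality comes from the definition of $\omega$. Since $\omega^2\ge x_is_i\ge(1-\theta)\mu$, we have

$$\kappa(\Psi)=O\left(\frac{\omega^2\left(\omega^2+\mu\sigma_{\max}(Q)\right)}{\mu^2}\right).$$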
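As a rough numerical illustration of this bound, the sketch below assembles $\Psi$ for a random positive semidefinite $Q$ and compares its condition number with the intermediate bound expressed through $\max_i\{x_i^2+s_i^2\}$, which avoids having to fix a definition of $\omega$. All numerical values are hypothetical, and the off-diagonal block is formed directly as $XS-\mu I=\mu\theta E$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, mu, theta = 4, 0.5, 0.1                       # hypothetical values
d = rng.standard_normal(n)
d *= 0.9 * theta * mu / np.linalg.norm(d)        # ||XSe - mu e||_2 <= theta * mu
x = rng.uniform(0.5, 2.0, n)
s = (mu + d) / x
G = rng.standard_normal((n, n))
Q = G @ G.T                                      # random positive semidefinite Q
off = np.diag(x * s - mu)                        # equals mu * theta * E
Psi = np.block([[np.diag(s**2) + 2 * mu * Q, off],
                [off, np.diag(x**2)]])

eigs = np.linalg.eigvalsh(Psi)                   # Psi is symmetric
kappa = eigs[-1] / eigs[0]
a_max = np.max(x**2 + s**2)
sigma_max_Q = np.linalg.eigvalsh(Q)[-1]
bound = 3 * a_max * (a_max + np.sqrt(2) * mu + 2 * mu * sigma_max_Q) / mu**2
print(kappa, bound, kappa <= bound)
```

The bound is typically far from tight, which is expected: it combines the worst-case largest eigenvalue with the worst-case smallest one.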

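Using Lemma 2, we can also bound the condition number of matrix $M$ by

$$\kappa_M=\sqrt{\kappa(M^TM)}\le\sqrt{\kappa(\Psi)}\,\kappa_{VAQ}=O\left(\frac{\omega^2+\mu\sigma_{\max}(Q)}{\mu}\,\kappa_{VAQ}\right).$$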

Author Contributions

Conceptualization, Z.W. and T.T.; methodology, Z.W.; supervision, X.Y. and T.T.; validation, Z.W., M.M., B.A., X.Y. and T.T.; writing (original draft), Z.W.; writing (review and editing), Z.W., M.M., B.A., X.Y. and T.T. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The funder had no role in the design of the study; in the writing of the manuscript; or in the decision to publish the results.

Funding Statement

This work was supported by Defense Advanced Research Projects Agency as part of the project W911NF2010022: The Quantum Computing Revolution and Optimization: Challenges and Opportunities. This work was also supported by National Science Foundation CAREER DMS-2143915.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

References

  • 1. Nocedal J., Wright S.J. Numerical Optimization. Springer; New York, NY, USA: 1999.
  • 2. Boser B.E., Guyon I.M., Vapnik V.N. A training algorithm for optimal margin classifiers. In: Haussler D., editor. Proceedings of the Fifth Annual Workshop on Computational Learning Theory; Pittsburgh, PA, USA. 27–29 July 1992; New York, NY, USA: Association for Computing Machinery; 1992. pp. 144–152.
  • 3. Roos C., Terlaky T., Vial J.P. Theory and Algorithms for Linear Optimization: An Interior Point Approach. John Wiley & Sons; New York, NY, USA: 1997.
  • 4. Pólik I., Terlaky T. Interior point methods for nonlinear optimization. In: Gianni Di Pillo F.S., editor. Nonlinear Optimization. Springer; New York, NY, USA: 2010. pp. 215–276.
  • 5. Gondzio J. Convergence analysis of an inexact feasible interior point method for convex quadratic programming. SIAM J. Optim. 2013;23:1510–1527. doi: 10.1137/120886017.
  • 6. Lu Z., Monteiro R.D., O’Neal J.W. An iterative solver-based infeasible primal-dual path-following algorithm for convex quadratic programming. SIAM J. Optim. 2006;17:287–310. doi: 10.1137/04060771X.
  • 7. Bunch J.R., Parlett B.N. Direct methods for solving symmetric indefinite systems of linear equations. SIAM J. Numer. Anal. 1971;8:639–655. doi: 10.1137/0708060.
  • 8. Schuld M., Sinayskiy I., Petruccione F. Prediction by linear regression on a quantum computer. Phys. Rev. A. 2016;94:022342. doi: 10.1103/PhysRevA.94.022342.
  • 9. Kerenidis I., Prakash A., Szilágyi D. Quantum algorithms for second-order cone programming and support vector machines. Quantum. 2021;5:427. doi: 10.22331/q-2021-04-08-427.
  • 10. Harrow A.W., Hassidim A., Lloyd S. Quantum algorithm for linear systems of equations. Phys. Rev. Lett. 2009;103:150502. doi: 10.1103/PhysRevLett.103.150502.
  • 11. Kerenidis I., Prakash A. A quantum interior point method for LPs and SDPs. ACM Trans. Quantum Comput. 2020;1:1–32. doi: 10.1145/3406306.
  • 12. Mohammadisiahroudi M., Fakhimi R., Terlaky T. Efficient use of quantum linear system algorithms in interior point methods for linear optimization. arXiv. 2022; arXiv:2205.01220.
  • 13. Augustino B., Nannicini G., Terlaky T., Zuluaga L.F. Quantum interior point methods for semidefinite optimization. arXiv. 2021; arXiv:2112.06025.
  • 14. Mohammadisiahroudi M., Fakhimi F., Wu Z., Terlaky T. An Inexact Feasible Interior Point Method for Linear Optimization with High Adaptability to Quantum Computers. Department of ISE, Lehigh University; Bethlehem, PA, USA: 2021. Technical Report: 21T-006.
  • 15. Kojima M., Mizuno S., Yoshise A. A polynomial-time algorithm for a class of linear complementarity problems. Math. Program. 1989;44:1–26. doi: 10.1007/BF01587074.
  • 16. Monteiro R.D., Adler I. Interior path following primal-dual algorithms. Part II: Convex quadratic programming. Math. Program. 1989;44:43–66. doi: 10.1007/BF01587076.
  • 17. Goldfarb D., Liu S. An $O(n^3L)$ primal interior point algorithm for convex quadratic programming. Math. Program. 1990;49:325–340. doi: 10.1007/BF01588795.
  • 18. Gilyén A., Su Y., Low G.H., Wiebe N. Quantum singular value transformation and beyond: Exponential improvements for quantum matrix arithmetics. arXiv. 2018; arXiv:1806.01838.
  • 19. Chakraborty S., Gilyén A., Jeffery S. The power of block-encoded matrix powers: Improved regression techniques via faster Hamiltonian simulation. arXiv. 2018; arXiv:1804.01973.
  • 20. van Apeldoorn J., Cornelissen A., Gilyén A., Nannicini G. Quantum tomography using state-preparation unitaries. arXiv. 2022; arXiv:2207.08800.
  • 21. Horn R.A., Johnson C.R. Matrix Analysis. Cambridge University Press; Cambridge, UK: 2012.
  • 22. Cortes C., Vapnik V. Support-vector networks. Mach. Learn. 1995;20:273–297. doi: 10.1007/BF00994018.
  • 23. Rebentrost P., Mohseni M., Lloyd S. Quantum support vector machine for big data classification. Phys. Rev. Lett. 2014;113:130503. doi: 10.1103/PhysRevLett.113.130503.


Articles from Entropy are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)
