Mathematical Programming. 2021 Sep 8;192(1-2):319–337. doi: 10.1007/s10107-021-01698-z

About the complexity of two-stage stochastic IPs

Kim-Manuel Klein
PMCID: PMC8907148  PMID: 35300153

Abstract

We consider so-called 2-stage stochastic integer programs (IPs) and their generalized form, so-called multi-stage stochastic IPs. A 2-stage stochastic IP is an integer program of the form $\max\{c^Tx\mid Ax=b,\ l\le x\le u,\ x\in\mathbb{Z}^{s+nt}\}$, where the constraint matrix $A\in\mathbb{Z}^{rn\times(s+nt)}$ consists roughly of $n$ repetitions of a matrix $A\in\mathbb{Z}^{r\times s}$ on the vertical line and $n$ repetitions of a matrix $B\in\mathbb{Z}^{r\times t}$ on the diagonal. In this paper we improve upon an algorithmic result by Hemmecke and Schultz from 2003 [Hemmecke and Schultz, Math. Prog. 2003] for solving 2-stage stochastic IPs. The algorithm is based on the Graver augmentation framework, where our main contribution is an explicit doubly exponential bound on the size of the augmenting steps. The previous bound for the size of the augmenting steps relied on non-constructive finiteness arguments from commutative algebra, and therefore only an implicit bound was known, depending on the parameters $r$, $s$, $t$ and $\Delta$, where $\Delta$ is the largest entry of the constraint matrix. Our new, improved bound is obtained by a novel theorem that argues about intersections of paths in a vector space. As a result of the new bound, we obtain an algorithm that solves 2-stage stochastic IPs in time $f(r,s,\Delta)\cdot\mathrm{poly}(n,t)$, where $f$ is a doubly exponential function. To complement our result, we also prove a doubly exponential lower bound on the size of the augmenting steps.

Keywords: Integer programming, Parameterized complexity, Two-stage stochastic, Stochastic programming

Introduction

Integer programming is one of the most fundamental problems in algorithm theory. Many problems in combinatorial optimization and other areas can be modeled by integer programs. An integer program (IP) is of the form

$\max\{c^Tx\mid Ax=b,\ l\le x\le u,\ x\in\mathbb{Z}^n\}$

for some matrix $A\in\mathbb{Z}^{m\times n}$, a right hand side $b\in\mathbb{Z}^m$, a cost vector $c\in\mathbb{Z}^n$ and lower and upper bounds $l,u\in\mathbb{Z}^n$. The famous algorithm of Kannan [22] computes an optimal solution of the IP in time of roughly $n^{O(n)}\cdot\mathrm{poly}(m,\log\Delta)$, where $\Delta$ is the largest entry of $A$ and $b$.

In recent years there was significant progress in the development of algorithms for IPs when the constraint matrix A has a specific structure. Consider for example the class of integer programs with a constraint matrix N of the form

$N=\begin{pmatrix}A&A&\cdots&A\\B&0&\cdots&0\\0&B&&0\\\vdots&&\ddots&\vdots\\0&0&\cdots&B\end{pmatrix}$

for some matrices $A\in\mathbb{Z}^{r\times t}$ and $B\in\mathbb{Z}^{s\times t}$ (so that all blocks act on bricks of $t$ variables). An IP of this specific structure is called an n-fold IP. This class of IPs has found numerous applications in the area of string algorithms [24], computational social choice [25] and scheduling [19, 23]. State-of-the-art algorithms compute a solution of an n-fold IP in time $\Delta^{O(r^2s+s^2)}\cdot(nt)^{1+o(1)}$ [7, 10, 11, 20, 26], where $\Delta$ is the largest entry of the matrices $A$ and $B$.
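The block structure above is easy to materialize. The following is a small pure-Python sketch; the helper `build_n_fold` and the toy blocks are illustrative, not from the paper:

```python
def build_n_fold(A, B, n):
    """Assemble the n-fold matrix: n horizontal copies of A on top,
    n copies of B on the block diagonal below (zeros elsewhere)."""
    r, t = len(A), len(A[0])
    assert all(len(row) == t for row in B)  # A and B share the column count t
    top = [row * n for row in A]            # (A A ... A)
    diag = []
    for i in range(n):                      # i-th diagonal copy of B
        for row in B:
            diag.append([0] * (i * t) + row + [0] * ((n - 1 - i) * t))
    return top + diag

A = [[1, 1]]                # r = 1, t = 2
B = [[1, -1]]               # s = 1, t = 2
N = build_n_fold(A, B, 3)   # r + n*s = 4 rows, n*t = 6 columns
```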

Two-stage stochastic integer programming

Stochastic programming deals with uncertainty in decision making over time [21]. One of the basic models in stochastic programming is 2-stage stochastic programming. In this model, one has to decide on a solution in the first stage; in the second stage, one of $n$ possible scenarios occurs. Each of the $n$ possible scenarios might have a different optimal solution, and the goal is to minimize the cost of the first-stage solution plus the expected cost of the second-stage solution. In the case that these scenarios can be modeled by an (integer) linear program, we speak of 2-stage stochastic (integer) linear programs. 2-stage stochastic linear programs that do not contain any integer variables are well understood (we refer to standard textbooks [3, 21]). In contrast, 2-stage stochastic programs that contain integer variables are hard to solve and are the topic of ongoing research. Typically, those IPs are investigated in the context of decomposition based methods (we refer to a tutorial [27] or a survey [31] on the topic). For progress on 2-stage stochastic programs we refer to [1, 5, 31]. The interest in solving 2-stage stochastic (I)LPs efficiently stems from their wide range of applications, for example in modeling manufacturing processes [9] or energy planning [17].

In this paper we consider 2-stage stochastic IPs with only integral variables. For the extension of the results of this paper to the mixed setting we refer to [4]. Such purely integral 2-stage stochastic IPs have also been considered in the literature from a practical perspective (see [14, 33]). The considered IP is then of the form

$\max\{c^Tx\mid Ax=b,\ l\le x\le u,\ x\in\mathbb{Z}^{s+nt}\}\qquad(1)$

for a given objective vector $c\in\mathbb{Z}^{s+nt}$ and lower and upper bounds $l,u\in\mathbb{Z}^{s+nt}$. The constraint matrix $A$ has the shape

$A=\begin{pmatrix}A^{(1)}&B^{(1)}&0&\cdots&0\\A^{(2)}&0&B^{(2)}&&0\\\vdots&\vdots&&\ddots&\vdots\\A^{(n)}&0&0&\cdots&B^{(n)}\end{pmatrix}$

for given matrices $A^{(1)},\ldots,A^{(n)}\in\mathbb{Z}^{r\times s}$ and $B^{(1)},\ldots,B^{(n)}\in\mathbb{Z}^{r\times t}$.

Typically, 2-stage stochastic IPs are written in a slightly different (equivalent) form that explicitly involves the scenarios of the second stage and their probability distribution. In this form, roughly speaking, the first-stage solution is encoded in the variables corresponding to the vertical matrices $A^{(i)}$, a solution for each second-stage scenario is encoded in the variables corresponding to one of the diagonal matrices $B^{(i)}$, and the expectation over the second-stage scenarios can be encoded in a linear objective function. Since we do not rely on known techniques of stochastic programming in this paper, we omit the technicalities surrounding 2-stage stochastic IPs and simply refer to a survey for further details [31].

Despite their similarity, 2-stage stochastic IPs seem to be significantly harder to solve than n-fold IPs. While Hemmecke and Schultz [18] have shown that a 2-stage stochastic IP with constraint matrix $A$ can be solved in running time of the form $f(r,s,t,\Delta)\cdot\mathrm{poly}(n)$ for some computable function $f$, the actual dependence on the parameters $r,s,t,\Delta$ was unknown (we elaborate on this in the coming section). Their algorithm is based on the augmentation framework, which we also discuss in the following section.

Graver elements and the augmentation framework

Suppose we have an initial feasible solution $z_0$ of an IP $\max\{c^Tx\mid Ax=b,\ l\le x\le u,\ x\in\mathbb{Z}^n\}$ and our goal is to find an optimal solution. The idea behind the augmentation framework (see [11]) is to compute an augmenting integral vector $y$ in the kernel, i.e., $y\in\ker(A)\cap\mathbb{Z}^n$ with $c^Ty>0$. A new solution $z$ with improved objective can then be defined by $z=z_0+\lambda y$ for a suitable step length $\lambda\in\mathbb{Z}_{>0}$. This procedure is iterated until a solution with optimal objective is eventually obtained.

We call an integer vector $y\in\ker(A)$ a cycle. A cycle can be decomposed if there exist integral vectors $u,v\in\ker(A)\setminus\{0\}$ with $y=u+v$ and $u_i\cdot v_i\ge0$ for all $i$ (i.e., the vectors are sign-compatible with $y$). An integral vector $y\in\ker(A)\setminus\{0\}$ that cannot be decomposed is called a Graver element [15], or we simply say that it is indecomposable. The set of all indecomposable elements is called the Graver basis.

For a given bound on the size of the Graver elements of the constraint matrix, an augmenting vector $y\in\ker(A)$ can often be computed by a dynamic program (depending on the structure of the constraint matrix), where the running time of the dynamic program depends on the respective bound. An optimal solution of the corresponding IP can then be obtained using the augmentation framework. For a detailed description of the augmentation framework we refer to the paper by Eisenbrand et al. [11].
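The augmentation loop described above can be sketched as follows. Here the dynamic program for finding an augmenting step is replaced by a brute-force search over a small box, and the step length $\lambda$ is maximized greedily; a toy sketch under these assumptions, not the paper's algorithm:

```python
from itertools import product

def augment(A, l, u, c, z, box=2):
    """Iteratively improve a feasible z by augmenting cycles y
    (Ay = 0, c^T y > 0), each applied with the largest feasible step length."""
    n = len(z)
    matvec = lambda M, x: [sum(Mi[j] * x[j] for j in range(n)) for Mi in M]
    dot = lambda x, y: sum(xi * yi for xi, yi in zip(x, y))
    improved = True
    while improved:
        improved = False
        # search a small box for an augmenting cycle (stands in for the DP)
        for y in product(range(-box, box + 1), repeat=n):
            if any(y) and all(v == 0 for v in matvec(A, y)) and dot(c, y) > 0:
                lam = 0  # largest lam with l <= z + lam*y <= u
                while all(l[i] <= z[i] + (lam + 1) * y[i] <= u[i] for i in range(n)):
                    lam += 1
                if lam > 0:
                    z = [z[i] + lam * y[i] for i in range(n)]
                    improved = True
                    break
    return z

# toy instance: max x1 + x2 s.t. x1 - x2 = 0, 0 <= x <= 3, starting from (0, 0)
z_opt = augment([[1, -1]], [0, 0], [3, 3], [1, 1], [0, 0])
```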

In the case that the constraint matrix has a very specific structure, one can sometimes show improved bounds. Specifically, if the constraint matrix $A$ has 2-stage stochastic shape with identical matrices on the vertical and the diagonal line, then Hemmecke and Schultz [18] were able to prove a bound on the size of Graver elements that depends only on the parameters $r$, $s$, $t$ and $\Delta$. The presented bound is an existential result and uses so-called saturation results from commutative algebra. Their line of proof uses Maclagan's theorem, which only yields a finiteness statement (namely, that there are no infinite antichains in the set of monomial ideals of a polynomial ring in finitely many variables over a field), and no explicit bound is known yet for this quantity. It is only known that the dependence on the parameters is lower bounded by Ackermann's function [28]. This implies that the dependence on the parameters $r$, $s$, $t$ and $\Delta$ in the implicit bound of Hemmecke and Schultz on the size of Graver elements is at least Ackermannian.

Very recently, improved bounds for Graver elements of general matrices and matrices with specific structure like n-fold [10] or 4-block structure [6] were developed.

Lemma 1

(Steinitz [16, 32]) Let $v_1,\ldots,v_n\in\mathbb{R}^m$ be vectors with $\|v_i\|_\infty\le\Delta$ for $1\le i\le n$. If $\sum_{i=1}^n v_i=0$, then there is a permutation $\pi$ such that for each $k\in\{1,\ldots,n\}$ the norm of the partial sum $\sum_{i=1}^k v_{\pi(i)}$ is bounded by $m\Delta$ in the infinity norm.

The Steinitz Lemma was used by Eisenbrand, Hunkenschröder and Klein [10] to bound the size of Graver elements for a given matrix A. As we use the following theorem and its technique in this paper, we give a brief sketch of its proof. The Steinitz Lemma was first used by Eisenbrand and Weismantel [12] in the context of integer programming.

Theorem 1

(Eisenbrand, Hunkenschröder, Klein [10]) Let $A\in\mathbb{Z}^{m\times n}$ be an integer matrix where every entry of $A$ is bounded by $\Delta$ in absolute value. Let $g\in\mathbb{Z}^n$ be an element of the Graver basis of $A$. Then $\|g\|_1\le(2m\Delta+1)^m$.

Proof

Consider the sequence of vectors $v_1,\ldots,v_{\|g\|_1}$ consisting of $g_i$ copies of the $i$th column of $A$ if $g_i$ is positive and $|g_i|$ copies of the negated $i$th column of $A$ if $g_i$ is negative. As $g$ is a kernel element, we obtain $v_1+\cdots+v_{\|g\|_1}=0$. Using the Steinitz Lemma above, there exists a reordering $u_1+\cdots+u_{\|g\|_1}$ of the vectors such that each partial sum satisfies $\|p_k\|_\infty=\|\sum_{i=1}^k u_i\|_\infty\le m\Delta$ for each $k\le\|g\|_1$.

Suppose by contradiction that $\|g\|_1>(2m\Delta+1)^m$. Since each partial sum is an integer point with infinity norm at most $m\Delta$, there are at most $(2m\Delta+1)^m$ possible values, so by the pigeonhole principle two partial sums are equal. The subsequence between them sums to zero and is sign-compatible with $g$. However, this means that $g$ can be decomposed and hence cannot be a Graver element.
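The pigeonhole step of this proof can be made concrete: whenever two prefix sums of a zero-sum sequence coincide (including the empty prefix), the stretch between them sums to zero and splits the sequence into two shorter zero-sum parts. A small sketch with a hypothetical helper; the Steinitz reordering only guarantees that such a collision exists once the sequence is long enough:

```python
def split_zero_sum(vectors):
    """Split a zero-sum sequence of integer vectors into two shorter
    zero-sum parts, if two prefix sums agree in this ordering."""
    d = len(vectors[0])
    seen = {(0,) * d: 0}          # prefix value -> index where it occurred
    prefix = [0] * d
    for k, v in enumerate(vectors, start=1):
        prefix = [a + b for a, b in zip(prefix, v)]
        key = tuple(prefix)
        if key in seen and 0 < k - seen[key] < len(vectors):
            i = seen[key]
            middle = vectors[i:k]             # sums to zero
            rest = vectors[:i] + vectors[k:]  # also sums to zero
            return middle, rest
        seen[key] = k
    return None  # no proper split visible in this ordering

vecs = [(1, 0), (-1, 0), (0, 1), (0, -1)]
mid, rest = split_zero_sum(vecs)
```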

Our results

The main result of this paper is to prove a new structural lemma that enhances the toolset of the augmentation framework. We show that this lemma can be directly used to obtain an explicit bound for Graver elements of the constraint matrix of 2-stage stochastic IPs. But we think that it might also be of independent interest as it provides interesting structural insights into vector sets.

Lemma 2

Given are multisets $T_1,\ldots,T_n\subseteq\mathbb{Z}_{\ge0}^d$ where all elements $t\in T_i$ have bounded size $\|t\|_\infty\le\Delta$. Assuming that the total sum of the elements in each multiset is equal, i.e.,

$\sum_{t\in T_1}t=\cdots=\sum_{t\in T_n}t,$

then there exist nonempty submultisets $S_1\subseteq T_1,\ldots,S_n\subseteq T_n$ of bounded size $|S_i|\le(d\Delta)^{O(d\Delta^{d^2})}$ such that

$\sum_{s\in S_1}s=\cdots=\sum_{s\in S_n}s.$

Note that Lemma 2 only makes sense when we consider the $T_i$ to be multisets, as the number of different sets without allowing multiplicities of vectors would be bounded by $2^{(\Delta+1)^d}$.

A geometric interpretation of Lemma 2 is given in the following figure. On the left side we have $n$ paths, each consisting of a set of vectors, and all paths end at the same point $b$.


Lemma 2 then shows that there always exist permutations of the vectors of each path such that all paths meet at a common point $\bar{b}$ of bounded norm. The bound depends only on $\Delta$ and the dimension $d$ and is thus independent of the number of paths $n$ and of the size of $b$. For the proof of Lemma 2 we need basic properties of the intersections of integer cones. We show that those properties can be obtained by using the Steinitz Lemma.

We show that Lemma 2 has strong implications in the context of integer programming. Using Lemma 2, we can show that the size of Graver elements of the 2-stage stochastic constraint matrix $A$ is bounded by $(rs\Delta)^{O(rs(2r\Delta+1)^{rs^2})}$. Within the framework of Graver augmenting steps, this bound implies that 2-stage stochastic IPs can be solved in time $(rs\Delta)^{O(rs^2(2r\Delta+1)^{rs^2})}\cdot n^2t^2\varphi\log^2(nt)$, where $\varphi$ is the encoding length of the instance (see Theorem 3). With this we improve upon the implicit bound for the size of Graver elements of 2-stage stochastic constraint matrices due to Hemmecke and Schultz [18].

Based on the structural observations of this paper, in a recent work by Cslovjecsek et al. [8] an algorithm was developed that solves 2-stage stochastic IPs with an improved running time of $2^{(2\Delta)^{O(r(r+s))}}\cdot n\log^{O(rs)}n$.

Furthermore, we show that our Lemma can also be applied to bound the size of Graver elements of constraint matrices that have a multi-stage stochastic structure. Multi-stage stochastic IPs are a well known generalization of 2-stage stochastic IPs. By this, we improve upon a result of Aschenbrenner and Hemmecke [2].

To complement our results for the upper bound, we also present in Sect. 3 a lower bound for the size of Graver elements of matrices that have a 2-stage stochastic IP structure. The given lower bound is for the case $s=1$. In this case we show in Theorem 4 a matrix where the Graver elements have a size of $2^{\Omega(\Delta^r)}$.

The complexity of two-stage stochastic IPs

First, we argue about the application of Lemma 2. In the following we show that the infinity-norm of Graver elements of matrices with a 2-stage stochastic structure can be bounded using Lemma 2.

Given the block structure of the IP (1), we define for a vector $y\in\mathbb{Z}^{s+nt}$ with $Ay=0$ the vector $y^{(0)}\in\mathbb{Z}^{s}$, which consists of the entries of $y$ that belong to the vertical matrices $A^{(i)}$, and we define $y^{(i)}\in\mathbb{Z}^{t}$ to be the entries of $y$ that belong to the diagonal matrix $B^{(i)}$.

Theorem 2

Let $y$ be a Graver element of the constraint matrix $A$ of IP (1). Then $\|y\|_\infty$ is bounded by $(rs\Delta)^{O(rs(2r\Delta+1)^{rs^2})}$. More precisely, $\|y^{(i)}\|_1\le(rs\Delta)^{O(rs(2r\Delta+1)^{rs^2})}$ for every $0\le i\le n$.

Proof

Let $y\in\mathbb{Z}^{s+nt}$ be a cycle of IP (1), i.e., $Ay=0$. Consider the submatrix $(A^{(i)}\,B^{(i)})\in\mathbb{Z}^{r\times(s+t)}$ of the matrix $A$, consisting of the matrix $A^{(i)}$ of the vertical line and the matrix $B^{(i)}$ of the diagonal line. Consider further the corresponding variables $v^{(i)}=\binom{y^{(0)}}{y^{(i)}}\in\mathbb{Z}^{s+t}$ of the respective matrices $A^{(i)}$ and $B^{(i)}$. Since $Ay=0$, we also have that $(A^{(i)}\,B^{(i)})v^{(i)}=0$. By using Theorem 1 iteratively, we can decompose $v^{(i)}$ into a multiset $C_i$ of indecomposable elements, i.e., $v^{(i)}=\sum_{z\in C_i}z$ with $\|z\|_1\le(2r\Delta+1)^r$ for each $z\in C_i$.

Since all matrices $(A^{(i)}\,B^{(i)})$ share the same set of variables in the overlapping matrices $A^{(i)}$, we cannot directly derive cycles for the entire matrix $A$ from cycles of the submatrices $(A^{(i)}\,B^{(i)})$: a cycle $v$ of $(A^{(i)}\,B^{(i)})$ and a cycle $v'$ of $(A^{(j)}\,B^{(j)})$ for $j\ne i$ might have conflicting entries in the overlapping part of the vector.

Let $p:\mathbb{Z}^{s+t}\to\mathbb{Z}^{s}$ be the projection that maps a cycle $z$ of a block matrix $(A^{(i)}\,B^{(i)})$ to the variables in the overlapping part, i.e., $p(z)=p\binom{z^{(0)}}{z^{(i)}}=z^{(0)}$.

In the case that $y$ is large, we will show that we can find a cycle $\bar{y}$ of smaller length with $|\bar{y}_i|\le|y_i|$, and therefore show that $y$ can be decomposed. In order to obtain this cycle $\bar{y}$ for the entire matrix $A$, we have to find a multiset of cycles $\bar{C}_i\subseteq C_i$ in each block matrix $(A^{(i)}\,B^{(i)})$ such that the sums of the projected parts are identical, i.e., $\sum_{z\in\bar{C}_1}p(z)=\cdots=\sum_{z\in\bar{C}_n}p(z)$. We apply Lemma 2 to the multisets $p(C_1),\ldots,p(C_n)$, where $p(C_i)=\{p(z)\mid z\in C_i\}$ is the multiset of projected elements in $C_i$; note that $\|p(z)\|_1\le(2r\Delta+1)^r$ holds. Since $\sum_{x\in p(C_1)}x=\cdots=\sum_{x\in p(C_n)}x=y^{(0)}$, the conditions to apply Lemma 2 are fulfilled. Since every $v^{(i)}$ is decomposed in a sign-compatible way, all vectors in $p(C_i)$ agree in sign in each component, and hence we can flip the negative signs in order to apply Lemma 2.

By Lemma 2, there exist submultisets $S_1\subseteq p(C_1),\ldots,S_n\subseteq p(C_n)$ such that $\sum_{x\in S_1}x=\cdots=\sum_{x\in S_n}x$ and $|S_i|\le(s\|z\|_1)^{O(s\|z\|_1^{s^2})}=(rs\Delta)^{O(rs(2r\Delta+1)^{rs^2})}$. As there exist submultisets $\bar{C}_1\subseteq C_1,\ldots,\bar{C}_n\subseteq C_n$ with $p(\bar{C}_1)=S_1,\ldots,p(\bar{C}_n)=S_n$, we can use these submultisets $\bar{C}_i$ to define a solution $\bar{y}$ with $|\bar{y}_i|\le|y_i|$. For $i>0$ let $\bar{y}^{(i)}=\sum_{z\in\bar{C}_i}\bar{p}(z)$, where $\bar{p}$ is the projection that maps a cycle $z\in\bar{C}_i$ to the part that belongs to matrix $B^{(i)}$, i.e., $\bar{p}\binom{z^{(0)}}{z^{(i)}}=z^{(i)}$. Let $\bar{y}^{(0)}=\sum_{z\in\bar{C}_i}p(z)$ for an arbitrary $i>0$, which is well defined as the sum is identical for all multisets $\bar{C}_i$. As the cardinality of the multisets $\bar{C}_i$ is bounded, we know by construction of $\bar{y}$ that the 1-norm of every $\bar{y}^{(i)}$ is bounded by

$\|\bar{y}^{(i)}\|_1\le(2r\Delta+1)^r\cdot(rs\Delta)^{O(rs(2r\Delta+1)^{rs^2})}=(rs\Delta)^{O(rs(2r\Delta+1)^{rs^2})}.$

This directly implies the infinity-norm bound for y as well.

As a consequence of the bound for the size of the Graver elements, we obtain by the framework of augmenting steps an efficient algorithm to compute an optimal solution of a 2-stage stochastic IP. By using the augmentation framework as described in [11] we obtain the following theorem regarding the worst-case complexity for solving 2-stage stochastic IPs.

Theorem 3

A 2-stage stochastic IP of the form (1) can be solved in time

$n^2t^2\varphi\log^2(nt)\cdot(rs\Delta)^{O(rs^2(2r\Delta+1)^{rs^2})},$

where φ is the encoding length of the IP.

Proof

Let $L=(rs\Delta)^{O(rs(2r\Delta+1)^{rs^2})}$ be the bound on $\|y^{(i)}\|_1$ that we obtain from Theorem 2. To find the optimal augmenting step, it is sufficient to solve the so-called augmenting IP

$\max\{c^Tx\mid Ax=0,\ \bar{l}\le x\le\bar{u},\ \|x\|_\infty\le L,\ x\in\mathbb{Z}^{s+nt}\}\qquad(2)$

for some upper and lower bounds l¯,u¯. Having the best augmenting step at hand, one can show that the objective value improves by a certain factor. We refer to Corollary 14 of [11] which shows that IP (1) can be solved if the above augmenting IP (2) can be solved.

In the following we briefly show how to solve the IP (2) in order to compute the augmenting step. The algorithm works as follows:

  • Compute for every $y^{(0)}$ with $\|y^{(0)}\|_1\le L$ the objective value of the cycle $y$ consisting of $y^{(0)},\bar{y}^{(1)},\ldots,\bar{y}^{(n)}$, where the $\bar{y}^{(i)}$ for $i>0$ are optimal solutions of the IP
    $\max\{(c^{(i)})^T\bar{y}^{(i)}\mid B^{(i)}\bar{y}^{(i)}=-A^{(i)}y^{(0)},\ \bar{l}^{(i)}\le\bar{y}^{(i)}\le\bar{u}^{(i)}\},$
    where $\bar{l}^{(i)},\bar{u}^{(i)}$ are the lower and upper bounds for the variables $\bar{y}^{(i)}$ and $c^{(i)}$ is their corresponding objective vector. Note that the equality constraints of these IPs ensure that $Ay=0$. The IPs can be solved with the algorithm of Eisenbrand and Weismantel [12] in time $\Delta^{O(r^2)}$ each.
  • Return the cycle with maximum objective.

As the number of different vectors $y^{(0)}$ with 1-norm at most $L$ is bounded by $(2L+1)^s=(rs\Delta)^{O(rs^2(2r\Delta+1)^{rs^2})}$, step 1 of the algorithm takes time $\Delta^{O(r^2)}\cdot(rs\Delta)^{O(rs^2(2r\Delta+1)^{rs^2})}$.
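The enumeration in step 1 can be sketched as follows. The brute-force search over each block stands in for the Eisenbrand-Weismantel solver, and `best_cycle` together with the toy instance is illustrative, not the paper's implementation:

```python
from itertools import product

def best_cycle(A_blocks, B_blocks, c0, c_blocks, L, box=3):
    """Enumerate first-stage parts y0 with ||y0||_1 <= L; for each, solve the
    block problems B_i y_i = -A_i y0 independently and keep the best cycle."""
    s = len(A_blocks[0][0])
    t = len(B_blocks[0][0])
    dot = lambda x, y: sum(a * b for a, b in zip(x, y))
    mat = lambda M, x: [dot(row, x) for row in M]
    best, best_val = None, 0
    for y0 in product(range(-L, L + 1), repeat=s):
        if sum(map(abs, y0)) > L:
            continue
        total, parts, ok = dot(c0, y0), [], True
        for Ai, Bi, ci in zip(A_blocks, B_blocks, c_blocks):
            rhs = [-v for v in mat(Ai, y0)]
            # brute force over a small box replaces the block IP solver
            cand = [y for y in product(range(-box, box + 1), repeat=t)
                    if mat(Bi, y) == rhs]
            if not cand:
                ok = False
                break
            yi = max(cand, key=lambda y: dot(ci, y))
            parts.append(yi)
            total += dot(ci, yi)
        if ok and total > best_val:
            best, best_val = (y0, parts), total
    return best, best_val

# toy 2-stage instance with r = s = t = 1 and two blocks
best, val = best_cycle([[[1]], [[1]]], [[[1]], [[1]]], [0], [[1], [1]], 2)
```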

About the intersection of integer cones

Before we are ready to prove our main Lemma 2, we need two helpful observations about the intersection of integer cones. An integer cone is defined for a given (finite) generating set $B\subseteq\mathbb{Z}_{\ge0}^d$ of elements by

$\mathrm{int.cone}(B)=\Big\{\sum_{b\in B}\lambda_b b\ \Big|\ \lambda\in\mathbb{Z}_{\ge0}^B\Big\}.$

Note that the intersection of two integer cones is again an integer cone, as the intersection is closed under addition and scalar multiplication of positive integers.

We say that an element $b$ of an integer cone $\mathrm{int.cone}(B)$ is indecomposable if there do not exist elements $b_1,b_2\in\mathrm{int.cone}(B)\setminus\{0\}$ such that $b=b_1+b_2$. We can assume that the generating set $B$ of an integer cone consists just of the set of indecomposable elements, as any decomposable element can be removed from the generating set.

In the following we allow using a vector set $B$ as a matrix and vice versa, where the elements of the set $B$ are the columns of the matrix $B$. This way we can multiply $B$ with a vector, i.e., $B\lambda=\sum_{b\in B}\lambda_b b$ for some $\lambda\in\mathbb{Z}^B$.
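Membership in an integer cone can be tested naively for small instances. The following sketch (hypothetical helper, exponential search with a coefficient cap, valid only for nonnegative generators as in the definition above) illustrates the notion:

```python
def in_int_cone(B, b, max_coeff=50):
    """Decide whether b lies in int.cone(B) = { B*lam : lam >= 0 integral }
    for nonnegative generators B, by searching integer coefficients."""
    if all(v == 0 for v in b):
        return True
    if not B or any(v < 0 for v in b):
        return False
    head, tail = B[0], B[1:]
    for k in range(max_coeff + 1):
        rest = [bi - k * hi for bi, hi in zip(b, head)]
        if any(v < 0 for v in rest):
            break  # generators are nonnegative, so larger k cannot help
        if in_int_cone(tail, rest, max_coeff):
            return True
    return False
```

For example, with generators $(2,0)$ and $(1,1)$, the point $(5,1)$ is in the cone (coefficients $2$ and $1$), while $(1,0)$ is not.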

Lemma 3

Consider integer cones $\mathrm{int.cone}(B^{(1)})$ and $\mathrm{int.cone}(B^{(2)})$ for some generating sets $B^{(1)},B^{(2)}\subseteq\mathbb{Z}^d$ where each element $x\in B^{(1)}\cup B^{(2)}$ has bounded norm $\|x\|_\infty\le\Delta$. Consider the integer cone of the intersection

$\mathrm{int.cone}(\hat{B})=\mathrm{int.cone}(B^{(1)})\cap\mathrm{int.cone}(B^{(2)})$

for some generating set of elements $\hat{B}$. Then for each indecomposable element $b\in\hat{B}$ of the intersection cone with $b=B^{(1)}\lambda=B^{(2)}\gamma$ for some $\lambda\in\mathbb{Z}_{\ge0}^{B^{(1)}}$ and $\gamma\in\mathbb{Z}_{\ge0}^{B^{(2)}}$, we have that $\|\lambda\|_1,\|\gamma\|_1\le(2d\Delta+1)^d$. Furthermore, the norm of $b$ is bounded by $\|b\|_\infty\le\Delta(2d\Delta+1)^d$.

Proof

Consider the representation of a point $b=B^{(1)}\lambda=B^{(2)}\gamma$ in the intersection of $\mathrm{int.cone}(B^{(1)})$ and $\mathrm{int.cone}(B^{(2)})$. The sum $v_1+\cdots+v_{\|\lambda\|_1+\|\gamma\|_1}$ consisting of $\lambda_i$ copies of the $i$th element of $B^{(1)}$ and $\gamma_i$ copies of the negative of the $i$th element of $B^{(2)}$ equals zero. Using Steinitz' Lemma, there exists a reordering $u_1+\cdots+u_{\|\lambda\|_1+\|\gamma\|_1}$ of the vectors such that $\|\sum_{i=1}^{k}u_i\|_\infty\le d\Delta$ for each $k\le\|\lambda\|_1+\|\gamma\|_1$.

If $\|\lambda\|_1+\|\gamma\|_1>(2d\Delta+1)^d$, then by the pigeonhole principle there exist two partial sums of the same value. Hence, there are two subsequences that sum up to zero, i.e., there exist non-zero vectors $\lambda',\lambda''\in\mathbb{Z}_{\ge0}^{B^{(1)}}$ with $\lambda=\lambda'+\lambda''$ and $\gamma',\gamma''\in\mathbb{Z}_{\ge0}^{B^{(2)}}$ with $\gamma=\gamma'+\gamma''$ such that $B^{(1)}\lambda'-B^{(2)}\gamma'=0$ and $B^{(1)}\lambda''-B^{(2)}\gamma''=0$. Hence $B^{(1)}\lambda'=B^{(2)}\gamma'$ and $B^{(1)}\lambda''=B^{(2)}\gamma''$ are elements of the intersection cone. This implies that $b$ can be decomposed in the intersection cone.

Using a similar argument as in the previous lemma, we can consider the intersection of several integer cones. Note that we cannot simply use the above lemma inductively, as this would lead to worse bounds.

Lemma 4

Consider integer cones $\mathrm{int.cone}(B^{(1)}),\ldots,\mathrm{int.cone}(B^{(\ell)})$ for some generating sets $B^{(1)},\ldots,B^{(\ell)}\subseteq\mathbb{Z}_{\ge0}^d$ with $\|x\|_\infty\le\Delta$ for each $x\in B^{(i)}$. Consider the integer cone of the intersection

$\mathrm{int.cone}(\hat{B})=\bigcap_{i=1}^{\ell}\mathrm{int.cone}(B^{(i)})$

for some generating set of elements $\hat{B}$.

Then for each indecomposable element $b\in\hat{B}$ with $B^{(i)}\lambda^{(i)}=b$ for some $\lambda^{(i)}\in\mathbb{Z}_{\ge0}^{B^{(i)}}$ in the intersection cone, we have that $\|\lambda^{(i)}\|_1\le O((d\Delta)^{d(\ell-1)})$ for all $1\le i\le\ell$.

Proof

Given vectors $\lambda^{(1)},\ldots,\lambda^{(\ell)}$ with $\lambda^{(k)}\in\mathbb{Z}_{\ge0}^{B^{(k)}}$ and $B^{(k)}\lambda^{(k)}=b$ for each $k$, consider for each $1\le k\le\ell$ the sum of vectors $v_1^{(k)}+\cdots+v_{\|\lambda^{(k)}\|_1}^{(k)}$ consisting of $\lambda_j^{(k)}$ copies of the $j$th element of $B^{(k)}$. By adding zero vectors to the sums, we can assume without loss of generality that every sequence has the same number of summands $L=\max_{i=1,\ldots,\ell}\|\lambda^{(i)}\|_1$.

Claim: There exists a reordering $u_1^{(k)}+\cdots+u_L^{(k)}$ of each of these sums such that each partial sum $p_m^{(k)}=\sum_{i=1}^m u_i^{(k)}$ is close to the line between $0$ and $b$, more precisely

$\Big\|p_m^{(k)}-\frac{m}{L}b\Big\|_\infty\le4\Delta(d+1)$

for each $m\le L$ and each $k$. To see this, we construct a sequence that consists of the vectors from $B^{(k)}$ together with $L$ subtracted fractional parts $-\frac{1}{L}b$ of the vector $b$. To count the number of vectors, we append to each vector an additional component of weight $\Delta$ and define $\bar{v}_i^{(k)}=\binom{v_i^{(k)}}{\Delta}$ and $\bar{b}=\binom{b}{L\Delta}$. Note that $\|\bar{v}_i^{(k)}\|_\infty,\|\frac{1}{L}\bar{b}\|_\infty\le2\Delta$. The sequence $\bar{v}_1^{(k)}+\cdots+\bar{v}_L^{(k)}-\frac{1}{L}\bar{b}-\cdots-\frac{1}{L}\bar{b}$ sums up to zero, as $v_1^{(k)}+\cdots+v_L^{(k)}=b$. Hence we can apply the Steinitz Lemma to obtain a reordering $\bar{u}_1+\cdots+\bar{u}_{2L}$ of each sequence such that $\|\sum_{i=1}^m\bar{u}_i\|_\infty\le2\Delta(d+1)$ for each $m\le2L$. A partial sum up to index $m$ contains $p$ vectors $\bar{v}_j^{(k)}$ and $q$ vectors $-\frac{1}{L}\bar{b}$ for some $p,q\in\mathbb{Z}_{\ge0}$ with $m=p+q$. Hence $\|\sum_{i=1}^p u_i^{(k)}-\frac{q}{L}b\|_\infty\le2\Delta(d+1)$. Furthermore, the additional $\Delta$-weighted component of each vector guarantees that $|p-q|\le2(d+1)$, which implies the statement of the claim.

Now consider the differences of the partial sums $p_m^{(k)}$ and $p_m^{(1)}$. Using the claim from above, we can argue that $\|p_m^{(1)}-p_m^{(k)}\|_\infty\le8\Delta(d+1)$ for each $m\le L$ and each $k$, as each $p_m^{(k)}$ is close to $\frac{m}{L}b$. Therefore the number of different values of the tuple $(p_m^{(1)}-p_m^{(2)},\ldots,p_m^{(1)}-p_m^{(\ell)})$ is bounded by $(16\Delta(d+1)+1)^{d(\ell-1)}$. Assuming that $L>(16\Delta(d+1)+1)^{d(\ell-1)}$, by the pigeonhole principle there exist indices $m$ and $m'$ with $m>m'$ such that $p_m^{(1)}-p_m^{(k)}=p_{m'}^{(1)}-p_{m'}^{(k)}$ for each $k$. Hence $p_m^{(1)}-p_{m'}^{(1)}=\cdots=p_m^{(\ell)}-p_{m'}^{(\ell)}=:b'$ and $b',b-b'\in\bigcap_{i=1}^{\ell}\mathrm{int.cone}(B^{(i)})$. This implies that $b$ can be decomposed and is therefore not a generating element of $\bigcap_{i=1}^{\ell}\mathrm{int.cone}(B^{(i)})$.

Proof of Lemma 2

Using the results from the previous section, we are now able to prove the main Lemma 2.

We begin with a sketch of the proof for the 1-dimensional case. This will be helpful when we generalize the approach later. In the 1-dimensional case, the multisets $T_i$ consist solely of natural numbers, i.e., $T_1,\ldots,T_n\subseteq\mathbb{Z}_{\ge0}$. Suppose first that each multiset $T_i$ consists only of many copies of a single integral number $x_i\in\{1,\ldots,\Delta\}$. Then it is easy to find a common multiple, as $\frac{\Delta!}{1}\cdot1=\frac{\Delta!}{2}\cdot2=\cdots=\frac{\Delta!}{\Delta}\cdot\Delta$. Hence one can choose the submultiset $S_i$ consisting of $\frac{\Delta!}{x_i}$ copies of $x_i$. Now suppose that the multisets $T_i$ can be arbitrary. If $|T_i|\le\Delta\cdot\Delta!=\Delta^{O(\Delta)}$ for every $i$, we are done. If, on the other hand, $|T_i|>\Delta\cdot\Delta!$, then by the pigeonhole principle there exists for every $T_i$ a single element $x_i\in\{1,\ldots,\Delta\}$ that appears at least $\Delta!$ times. Then we can argue as in the previous case, where we needed at most $\Delta!$ copies of a number $x_i\in\{1,\ldots,\Delta\}$. Note that the cardinalities of the sets $T_i$ have to be of similar size: as the elements of each set sum up to the same value, the cardinalities of two sets $T_i,T_j$ can only differ by a factor of $\Delta$. This proves the lemma in the case $d=1$.
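The one-dimensional argument can be checked computationally: the common value guaranteed by the $\Delta!$ argument can also be found by brute force over achievable subset sums. A sketch with illustrative names:

```python
from math import factorial

def common_subset_sum(multisets, delta):
    """Find the smallest common value realizable as a nonempty submultiset
    sum in every multiset; the proof guarantees one of size <= delta*delta!."""
    bound = delta * factorial(delta)
    def achievable(ms):
        sums = {0}
        for x in ms:
            sums |= {s + x for s in sums if s + x <= bound}
        return sums - {0}
    common = set.intersection(*map(achievable, multisets))
    return min(common) if common else None

# each multiset totals 6, with entries bounded by delta = 3
T = [[3, 3], [2, 2, 2], [1, 2, 3]]
target = common_subset_sum(T, 3)
```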

In the case of higher dimensions, the lemma seems much harder to prove. But in principle we use generalizations of the above techniques. Instead of single natural numbers however, we have to work with bases of corresponding basic feasible LP solutions and the intersections of the integer cones generated by those bases.

In the proof we need the notion of a cone, which is simply the relaxation of an integer cone. For a generating set $B\subseteq\mathbb{Z}_{\ge0}^d$, the cone is defined by

$\mathrm{cone}(B)=\Big\{\sum_{b\in B}\lambda_b b\ \Big|\ \lambda\in\mathbb{R}_{\ge0}^B\Big\}.$

Proof

First, we describe the multisets $T_1,\ldots,T_n\subseteq\mathbb{Z}_{\ge0}^d$ by multiplicity vectors $\lambda^{(1)},\ldots,\lambda^{(n)}\in\mathbb{Z}_{\ge0}^P$, where $P\subseteq\mathbb{Z}_{\ge0}^d$ is the set of non-negative integer points $p$ with $\|p\|_\infty\le\Delta$. Each $\lambda_p^{(i)}$ thereby states the multiplicity of a vector $p$ in $T_i$. Hence $\sum_{t\in T_i}t=\sum_{p\in P}\lambda_p^{(i)}p$, and our objective is to find non-zero vectors $y^{(1)},\ldots,y^{(n)}\in\mathbb{Z}_{\ge0}^P$ with $y^{(i)}\le\lambda^{(i)}$ such that $\sum_{p\in P}y_p^{(1)}p=\cdots=\sum_{p\in P}y_p^{(n)}p$.

Consider the linear program

$\sum_{p\in P}x_p\,p=b,\qquad x\in\mathbb{R}_{\ge0}^P\qquad(3)$

Let $x^{(1)},\ldots,x^{(\ell)}\in\mathbb{R}_{\ge0}^{d}$ be all possible basic feasible solutions of the LP, corresponding to bases $B^{(1)},\ldots,B^{(\ell)}\in\mathbb{Z}_{\ge0}^{d\times d}$, i.e., $B^{(i)}x^{(i)}=b$.

In the following we prove two claims that correspond to the two previously described cases of the one-dimensional argument. First, we consider the case that essentially each multiset $T_i$ corresponds to one of the basic feasible solutions $x^{(j)}$; in the 1-dimensional case this would mean that each set consists only of copies of a single number. Note that the intersection of integer cones in dimension 1 is just given by the least common multiple, i.e., $\mathrm{int.cone}(z_1)\cap\mathrm{int.cone}(z_2)=\mathrm{int.cone}(\mathrm{lcm}(z_1,z_2))$ for some $z_1,z_2\in\mathbb{Z}_{\ge0}$.

Claim 1

If for all $i$ we have $\|x^{(i)}\|_1>d\cdot O((d\Delta)^{d(\ell-1)})$, then there exist non-zero vectors $y^{(1)},\ldots,y^{(\ell)}\in\mathbb{Z}_{\ge0}^d$ with $y^{(1)}\le x^{(1)},\ldots,y^{(\ell)}\le x^{(\ell)}$ and $\|y^{(i)}\|_1\le d\cdot O((d\Delta)^{d(\ell-1)})$ such that $B^{(1)}y^{(1)}=\cdots=B^{(\ell)}y^{(\ell)}$.

Note that all basic feasible solutions $x^{(i)}\in\mathbb{R}_{\ge0}^d$ have to be of similar size. Since $B^{(i)}x^{(i)}=b$ holds for all $1\le i\le\ell$, we know that $\|x^{(i)}\|_1$ and $\|x^{(j)}\|_1$ can only differ by a factor of $d\Delta$ for all $1\le i,j\le\ell$. Hence either all basic feasible solutions $x^{(i)}$ are small or all are large. This claim considers the case that the size of all $x^{(i)}$ is large.

Proof of the claim

Note that $B^{(i)}x^{(i)}=b$ and hence $b\in\mathrm{cone}(B^{(i)})$. In the following, our goal is to find a non-zero point $q\in\mathbb{Z}_{\ge0}^d$ such that $q=B^{(1)}y^{(1)}=\cdots=B^{(\ell)}y^{(\ell)}$ for some vectors $y^{(1)},\ldots,y^{(\ell)}\in\mathbb{Z}_{\ge0}^d$. However, this means that $q$ has to be in the integer cone $\mathrm{int.cone}(B^{(i)})$ for every $1\le i\le\ell$ and therefore in the intersection of all the integer cones, i.e., $q\in\bigcap_{i=1}^{\ell}\mathrm{int.cone}(B^{(i)})$. By Lemma 4 there exists a set of generating elements $\hat{B}$ such that

  • $\mathrm{int.cone}(\hat{B})=\bigcap_{i=1}^{\ell}\mathrm{int.cone}(B^{(i)})$ and $\mathrm{int.cone}(\hat{B})\ne\{0\}$, as $b\in\mathrm{cone}(\hat{B})$, and

  • each generating vector $p\in\hat{B}$ can be represented as $p=B^{(i)}\lambda$ for some $\lambda\in\mathbb{Z}_{\ge0}^d$ with $\|\lambda\|_1\le O((d\Delta)^{d(\ell-1)})$ for each basis $B^{(i)}$.

As $b\in\mathrm{cone}(\hat{B})$, there exists a vector $\hat{x}\in\mathbb{R}_{\ge0}^{\hat{B}}$ with $\hat{B}\hat{x}=b$. Our goal is to show that there exists a non-zero vector $q\in\hat{B}$ with $\hat{x}_q\ge1$. In this case $b$ can simply be written as $b=q+q'$ for some $q'\in\mathrm{cone}(\hat{B})$. As $q$ and $q'$ are contained in the intersection of all the cones, there exist for each generating set $B^{(j)}$ vectors $y^{(j)}\in\mathbb{Z}_{\ge0}^{B^{(j)}}$ and $z^{(j)}\in\mathbb{R}_{\ge0}^{B^{(j)}}$ such that $B^{(j)}y^{(j)}=q$ and $B^{(j)}z^{(j)}=q'$. Hence $x^{(j)}=y^{(j)}+z^{(j)}$, as the columns of the basis $B^{(j)}$ are linearly independent and the representation of $b$ is therefore unique, and we finally obtain $x^{(j)}\ge y^{(j)}$ for $y^{(j)}\in\mathbb{Z}_{\ge0}^{B^{(j)}}$, which shows the claim.

Therefore it only remains to prove the existence of the point $q\in\hat{B}$ with $\hat{x}_q\ge1$. By Lemma 4, each vector $p\in\hat{B}$ can be represented as $p=B^{(i)}x^{(p)}$ for some $x^{(p)}\in\mathbb{Z}_{\ge0}^{B^{(i)}}$ with $\|x^{(p)}\|_1\le O((d\Delta)^{d(\ell-1)})$ for every basis $B^{(i)}$.

As $B^{(i)}x^{(i)}=b=\sum_{p\in\hat{B}}\hat{x}_p p=\sum_{p\in\hat{B}}\hat{x}_p(B^{(i)}x^{(p)})$, every $x^{(i)}$ can be written as $x^{(i)}=\sum_{p\in\hat{B}}x^{(p)}\hat{x}_p$, and we obtain a bound on $\|x^{(i)}\|_1$ assuming that $\hat{x}_p<1$ for every $p\in\hat{B}$:

$\|x^{(i)}\|_1\le\sum_{p\in\hat{B}}\|x^{(p)}\hat{x}_p\|_1\overset{\hat{x}_p<1}{<}\sum_{p\in\hat{B}}\|x^{(p)}\|_1\le d\cdot O((d\Delta)^{d(\ell-1)}).$

The last inequality follows as we can assume by Carathéodory's theorem [30] that the number of non-zero components of $\hat{x}$ is at most $d$. Hence, if $\|x^{(i)}\|_1>d\cdot O((d\Delta)^{d(\ell-1)})$, then there has to exist a vector $q\in\hat{B}$ with $\hat{x}_q\ge1$, which proves the claim.

Claim 2

For every vector $\lambda^{(i)}\in\mathbb{Z}_{\ge0}^P$ with $\sum_{p\in P}\lambda_p^{(i)}p=b$ there exists a basic feasible solution $x^{(k)}$ of LP (3) with basis $B^{(k)}$ such that $\frac{1}{\ell}x^{(k)}\le\lambda^{(i)}$, in the sense that $\frac{1}{\ell}x_p^{(k)}\le\lambda_p^{(i)}$ for every $p\in B^{(k)}$.

Proof of the claim

The claim follows easily, as each multiplicity vector $\lambda^{(i)}$ is also a solution of the linear program (3). By standard LP theory, we know that each solution of the LP is a convex combination of the basic feasible solutions $x^{(1)},\ldots,x^{(\ell)}$. Hence, each multiplicity vector $\lambda^{(i)}$ can be written as a convex combination of $x^{(1)},\ldots,x^{(\ell)}$, i.e., for each $\lambda^{(i)}$ there exists a $t\in\mathbb{R}_{\ge0}^{\ell}$ with $\|t\|_1=1$ such that $\lambda^{(i)}=\sum_{j=1}^{\ell}t_j\bar{x}^{(j)}$, where

$\bar{x}_p^{(j)}=\begin{cases}x_p^{(j)}&\text{if }p\in B^{(j)},\\0&\text{otherwise.}\end{cases}$

By the pigeonhole principle, there exists for each multiplicity vector $\lambda^{(i)}$ an index $k$ with $t_k\ge\frac{1}{\ell}$, which proves the claim.

Using the above two claims, we can now prove the statement of the lemma by showing that for each $\lambda^{(i)}$ there exists a vector $y^{(i)}\le\lambda^{(i)}$ with bounded 1-norm such that $\sum_{p\in P}y_p^{(1)}p=\cdots=\sum_{p\in P}y_p^{(n)}p$.

First, consider the case that there exists a basic feasible solution $x^{(j)}$ of LP (3) with $\|x^{(j)}\|_1\le d\cdot O((d\Delta)^{d(\ell-1)})$. In this case we have for all $1\le i\le n$ that $\|\lambda^{(i)}\|_1\le d^2\Delta\cdot O((d\Delta)^{d(\ell-1)})$, as the sizes of solutions of LP (3) cannot differ by a factor of more than $d\Delta$ (this is because for every $p,p'\in P$ the sizes $\|p\|_1,\|p'\|_1$ cannot differ by a factor of more than $d\Delta$), and we can simply choose $y^{(i)}=\lambda^{(i)}$.

Now, assume that for all basic feasible solutions $x^{(i)}$ we have $\|x^{(i)}\|_1>d\cdot O((d\Delta)^{d(\ell-1)})$. By Claim 2, for each $\lambda^{(i)}$ (with $1\le i\le n$) we find one of the basic feasible solutions $x^{(k)}$ ($1\le k\le\ell$) with $\frac{1}{\ell}x^{(k)}\le\lambda^{(i)}$. As $\|\frac{1}{\ell}x^{(i)}\|_1\ge d\cdot O((d\Delta)^{d(\ell-1)})$ for all $1\le i\le\ell$, we can apply Claim 1 to the vectors $\frac{1}{\ell}x^{(1)},\ldots,\frac{1}{\ell}x^{(\ell)}$ with $\frac{1}{\ell}b=\frac{1}{\ell}B^{(1)}x^{(1)}=\cdots=\frac{1}{\ell}B^{(\ell)}x^{(\ell)}$ and obtain vectors $y^{(1)}\le\frac{1}{\ell}x^{(1)},\ldots,y^{(\ell)}\le\frac{1}{\ell}x^{(\ell)}$ with $B^{(1)}y^{(1)}=\cdots=B^{(\ell)}y^{(\ell)}$. Hence we find for each $\lambda^{(i)}$ a vector $y^{(k)}\in\mathbb{Z}_{\ge0}^{B^{(k)}}$ with $y^{(k)}\le\lambda^{(i)}$.

Finally we obtain that

$\|y^{(j)}\|_1\le d^2\Delta\cdot O((d\Delta)^{d(\ell-1)})=(d\Delta)^{O(d\Delta^{d^2})},$

using that $\ell$ is bounded by $\binom{|P|}{d}\le|P|^d$ and $|P|\le(\Delta+1)^d$.

A lower bound for the size of Graver elements

In this section we prove a lower bound on the size of Graver elements for a matrix where the overlapping part contains only a single variable, i.e., $s=1$.

First, consider the matrix

$A=\begin{pmatrix}-1&2&0&\cdots&0\\-1&0&3&&0\\\vdots&\vdots&&\ddots&\\-1&0&0&\cdots&M\end{pmatrix}.$

This matrix is of 2-stage stochastic structure with $r=1$ and $s=1$. We will argue that every element in $\ker(A)\cap(\mathbb{Z}^{M}\setminus\{0\})$ is large and therefore the Graver elements of the matrix are large as well. We call the variable corresponding to the $i$th column of the matrix $x_i$, where $x_1$ is the variable corresponding to the column with only $-1$ entries and $x_i$ for $i>1$ is the variable corresponding to the column with entry $i$ in one component and $0$ everywhere else. Clearly, for $x\in\mathbb{Z}^{M}$ to be in $\ker(A)$, we know by the first row of matrix $A$ that $x_1$ has to be a multiple of $2$. By the second row of the matrix, we know that $x_1$ has to be a multiple of $3$, and so on. Hence the variable $x_1$ has to be a multiple of all numbers $2,\ldots,M$. Thus $x_1$ is a multiple of the least common multiple of the numbers $1,\ldots,M$, which is divisible by the product of all primes up to $M$. By known bounds on the product of the primes up to $M$ [13], this implies that $|x_1|\ge2^{\Omega(M)}$, which shows that the size of the Graver elements of matrix $A$ is in $2^{\Omega(M)}$.
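The growth of $\mathrm{lcm}(1,\ldots,M)$, and hence of $x_1$, can be checked numerically; by the prime number theorem, $\log_2\mathrm{lcm}(1,\ldots,M)$ grows roughly linearly in $M$. A small sketch (`math.lcm` requires Python 3.9+; the helper name is illustrative):

```python
from math import lcm, log2

def lcm_up_to(M):
    """Least common multiple of 1..M; any x1 in the kernel above
    must be a non-zero multiple of this value."""
    return lcm(*range(1, M + 1))

# log2(lcm(1..M)) grows roughly linearly in M, i.e., lcm(1..M) = 2^(Theta(M))
for M in (10, 20, 30):
    print(M, round(log2(lcm_up_to(M)), 1))
```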

The disadvantage of the matrix above is that its entries are rather large. In the following we reduce the largest entry of the overall matrix by encoding each of the numbers $1,\ldots,M$ into a submatrix. For the encoding we use the matrix

$C=\begin{pmatrix}\Delta&-1&0&\cdots&0\\0&\Delta&-1&&0\\\vdots&&\ddots&\ddots&\\0&0&\cdots&\Delta&-1\end{pmatrix},$

having $r$ rows and $r+1$ columns. For a vector $x\in\ker(C)\cap\mathbb{Z}^{r+1}$, the $i$th row of the matrix yields $x_{i+1}=\Delta\cdot x_i$, and hence $x_i=\Delta^{i-1}x_1$. Now we can encode each number $z\in\{0,\ldots,\Delta^{r+1}-1\}$ in an additional row by $z=\sum_{i=0}^{r}a_i(z)\Delta^i$, where $a_i(z)$ is the $i$th digit of the representation of $z$ in base $\Delta$. Hence, we consider the following matrix:

[Matrix not recoverable from the extraction: copies of $C$ on the diagonal, each extended by an additional row with the base-$\Delta$ digits $a_0(z),\ldots,a_r(z)$ of a number $z$, coupled to a first column of $-1$ entries.]

By the same argumentation as for matrix $A$ above, we know that $x_1$ has to be a multiple of each number $2,\ldots,\Delta^{r+1}-1$. This implies that every non-zero integer vector of $\ker(A)$ has infinity norm of at least $2^{\Omega(\Delta^r)}$. This shows the doubly exponential lower bound for the Graver complexity of 2-stage stochastic IPs and proves the following theorem.
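The kernel structure used in this construction can be verified directly: a vector with $x_i=\Delta^{i-1}x_1$ lies in $\ker(C)$, and the base-$\Delta$ digits recover $z$. A sketch with illustrative helpers:

```python
def make_C(delta, r):
    """r x (r+1) matrix with delta on the diagonal and -1 to its right:
    row i encodes delta * x_i - x_{i+1} = 0."""
    return [[delta if j == i else (-1 if j == i + 1 else 0)
             for j in range(r + 1)] for i in range(r)]

def digits(z, delta, r):
    """Base-delta digits a_0..a_r of z, so z = sum_i a_i * delta^i."""
    a = []
    for _ in range(r + 1):
        a.append(z % delta)
        z //= delta
    return a

delta, r = 3, 2
C = make_C(delta, r)
x = [delta ** i for i in range(r + 1)]   # (1, delta, delta^2) lies in ker(C)
assert all(sum(ci * xi for ci, xi in zip(row, x)) == 0 for row in C)
a = digits(7, delta, r)                  # the extra encoding row for z = 7
assert sum(a[i] * delta ** i for i in range(r + 1)) == 7
```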

Theorem 4

There exists a constraint matrix $A\in\mathbb{Z}^{rn\times(1+nt)}$ such that each Graver element $y\in\ker(A)$ is lower bounded by

$\|y\|_\infty\ge2^{\Omega(\Delta^r)}.$

Multi-stage stochastic IPs

In this section we show that Lemma 2 can also be used to obtain a bound on the Graver elements of matrices with a multi-stage stochastic structure. Multi-stage stochastic IPs are a well-known generalization of 2-stage stochastic IPs. For the stochastic programming background on multi-stage stochastic IPs we refer to [29]. Here we simply show how to solve the equivalent deterministic IP with a large constraint matrix. Regarding the augmentation framework for multi-stage stochastic IPs, it was previously known that a similar implicit bound as for 2-stage stochastic IPs also holds for multi-stage stochastic IPs. This was shown by Aschenbrenner and Hemmecke [2], who built upon the bound for 2-stage stochastic IPs.

In the following we define the shape of the constraint matrix $M$ of a multi-stage stochastic IP. The constraint matrix consists of given matrices $A^{(1)},\dots,A^{(\ell)}$ for some $\ell \in \mathbb{Z}_{>0}$, where each matrix $A^{(i)}$ uses a unique set of columns of $M$. For a given matrix $A^{(i)}$, let $\mathrm{rows}(A^{(i)})$ be the set of rows of $M$ which are used by $A^{(i)}$. A matrix $M$ is of multi-stage stochastic shape if the following conditions are fulfilled:

  • There is a matrix $A^{(i_0)}$ such that for every $1 \leq i \leq \ell$ we have $\mathrm{rows}(A^{(i)}) \subseteq \mathrm{rows}(A^{(i_0)})$.

  • For two matrices $A^{(i)}, A^{(j)}$, either $\mathrm{rows}(A^{(i)}) \subseteq \mathrm{rows}(A^{(j)})$, $\mathrm{rows}(A^{(j)}) \subseteq \mathrm{rows}(A^{(i)})$, or $\mathrm{rows}(A^{(i)}) \cap \mathrm{rows}(A^{(j)}) = \emptyset$ holds.
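The two conditions state that the row sets of the blocks form a laminar family with one root set containing all others. A minimal sketch of such a check, assuming the row sets are given as Python sets (the example row indices are hypothetical):

```python
def is_multi_stage_shape(row_sets):
    """Check the two shape conditions for a list of row sets: one set
    contains all others, and any two sets are nested or disjoint."""
    # Condition 1: a root set containing every other set.
    root = max(row_sets, key=len)
    if not all(s <= root for s in row_sets):
        return False
    # Condition 2: laminar family -- pairwise nested or disjoint.
    for s in row_sets:
        for t in row_sets:
            if not (s <= t or t <= s or not (s & t)):
                return False
    return True

# A tree-like family of row sets versus one with a crossing pair.
tree_like = [set(range(6)), {0, 1, 2}, {3, 4, 5}, {0, 1}, {2}]
crossing = [set(range(6)), {0, 1, 2}, {2, 3}]
```

Here `is_multi_stage_shape(tree_like)` holds, while the pair $\{0,1,2\}$ and $\{2,3\}$ in `crossing` violates the second condition.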

An example of a matrix of multi-stage stochastic structure is given in the following: [Figure: example of a constraint matrix with multi-stage stochastic structure]

Intuitively, the constraint matrix is of multi-stage stochastic shape if the block matrices, ordered by the subset relation on their rows, form a tree (see figure below). [Figure: tree formed by the block matrices] Let $s_i$ be the number of columns used by the matrices in the $i$th level of the tree (starting from level 0 at the leaves). Here we assume that all matrices in the same level of the tree have the same number of columns. Let $r$ be the number of rows used by the matrices that correspond to the leaves of the tree. In the following theorem we show that Lemma 2 can be applied inductively to bound the size of an augmenting step of multi-stage stochastic IPs. The proof is similar to that of Theorem 2.

Theorem 5

Let $y$ be an indecomposable cycle of the matrix $M$. Then $\|y\|_\infty$ is bounded by a function $T(s_1,\dots,s_t,r,\Delta)$, where $t$ is the depth of the tree. The function $T$ involves a tower of $t+1$ exponentials and is recursively defined by

$$T(r,\Delta) = (\Delta r)^{O(r)}, \qquad
T(s_1,\dots,s_i,r,\Delta) = 2^{\left(T(s_1,\dots,s_{i-1},r,\Delta)\right)^{O(s_i^2)}}.$$

Proof

Consider a submatrix $N$ of the constraint matrix $M$ corresponding to a subtree of the tree with depth $t$. Hence, $N$ itself is of multi-stage stochastic structure. Let $A \in \{A^{(1)},\dots,A^{(\ell)}\}$ be the submatrix at the root of the corresponding subtree and let $B^{(1)},\dots,B^{(n)}$ be the submatrices corresponding to the subtrees below the root, with $\mathrm{rows}(B^{(i)}) \subseteq \mathrm{rows}(A)$ for all $1 \leq i \leq n$.

Let $\bar A^{(i)}$ be the submatrix of $A$ which consists only of the rows used by $B^{(i)}$ (recall that $\mathrm{rows}(B^{(i)}) \subseteq \mathrm{rows}(A)$). Now suppose that $y$ is a cycle of $N$, i.e., $Ny=0$, and let $y^{(0)}$ be the subvector of $y$ consisting only of the entries that belong to the matrix $A$. Symmetrically, let $y^{(i)}$ be the subvector of entries of $y$ that belong to the matrix $B^{(i)}$ for $i>0$. Since $Ny=0$ we also know that $\bar A^{(i)} y^{(0)} + B^{(i)} y^{(i)} = (\bar A^{(i)}\; B^{(i)}) \binom{y^{(0)}}{y^{(i)}} = 0$ for every $1 \leq i \leq n$. Each vector $\binom{y^{(0)}}{y^{(i)}}$ can be decomposed into a multiset of indecomposable cycles $C_i$, i.e.,

$$\binom{y^{(0)}}{y^{(i)}} = \sum_{z \in C_i} z,$$

where each cycle $z \in C_i$ is a vector $z = \binom{z^{(0)}}{z^{(i)}}$ consisting of a subvector $z^{(0)}$ of entries that belong to the matrix $A$ and a subvector $z^{(i)}$ of entries that belong to the matrix $B^{(i)}$. Note that the matrix $(\bar A^{(i)}\; B^{(i)})$ has a multi-stage stochastic structure with a corresponding tree of depth $t-1$. Hence, by induction we can assume that each indecomposable cycle $z \in C_i$ is bounded by $\|z\|_\infty \leq T(s_1,\dots,s_{t-1},r,\Delta)$ for all $1 \leq i \leq n$, where $T$ is a function that involves a tower of $t$ exponentials. In the base case $t=0$, where the subtree consists of a single block, we can bound $\|z\|_\infty$ by $(2\Delta r+1)^r$ using Theorem 1. Let $p$ be the projection that maps a cycle to the entries that belong to the matrix $A$, i.e., $p(z) = p\big(\binom{z^{(0)}}{z^{(i)}}\big) = z^{(0)}$.

For each vector $\binom{y^{(0)}}{y^{(i)}}$ and its decomposition into cycles $C_i$, let $p(C_i) = \{p(z) \mid z \in C_i\}$. Since

$$y^{(0)} = \sum_{z \in C_1} p(z) = \dots = \sum_{z \in C_n} p(z),$$

we can apply Lemma 2 to obtain submultisets $S_i \subseteq p(C_i)$ of bounded size

$$|S_i| \leq (s_t T)^{O(s_t \cdot T^{s_t^2})}$$

with $T = T(s_1,\dots,s_{t-1},r,\Delta)$ such that $\sum_{x \in S_1} x = \dots = \sum_{x \in S_n} x$. As $T(s_1,\dots,s_{t-1},r,\Delta)$ is a function involving a tower of $t$ exponentials, the cardinality $|S_i|$ can be bounded by a function involving a tower of $t+1$ exponentials.

There exist submultisets $\bar C_1 \subseteq C_1,\dots,\bar C_n \subseteq C_n$ with $p(\bar C_1) = S_1,\dots,p(\bar C_n) = S_n$. Hence, we can define the solution $\bar y \sqsubseteq y$ by $\bar y^{(i)} = \sum_{z \in \bar C_i} \bar p(z)$ for every $i>0$, where $\bar p$ is the projection that maps a vector to the entries that belong to the matrix $B^{(i)}$, i.e., $\bar p(z) = \bar p\big(\binom{z^{(0)}}{z^{(i)}}\big) = z^{(i)}$. For $i=0$ we define $\bar y^{(0)} = \sum_{z \in \bar C_i} p(z)$. As the sum $\sum_{z \in \bar C_i} p(z)$ is identical for every $1 \leq i \leq n$, the vector $\bar y$ is well defined.

Let $K$ be the constant derived from the $O$-notation of Lemma 2 and $T = T(s_1,\dots,s_{t-1},r,\Delta)$; then the size of $\bar y$ can be bounded by

$$\|\bar y\|_\infty \leq T \cdot \max_i |\bar C_i| \leq T \cdot (s_t T)^{K s_t \cdot T^{s_t^2}} \leq 2^{K s_t \log(s_t T) \cdot T^{s_t^2}} \leq 2^{T^{O(s_t^2)}}.$$

Since $y$ is an indecomposable cycle and $\bar y \sqsubseteq y$, it follows that $\bar y = y$, which yields the claimed bound. □
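The decomposition of a cycle into conformal indecomposable cycles used in the proof can be illustrated on a tiny example. The following brute-force sketch is for illustration only (the matrix and vector are hypothetical toy data, and the enumeration is feasible only for tiny instances):

```python
from itertools import product

def cycles(A, box):
    """All non-zero integer kernel vectors z of A with |z_j| <= box[j]
    (brute force, tiny instances only)."""
    n = len(A[0])
    return [z for z in product(*[range(-b, b + 1) for b in box])
            if any(z) and all(sum(row[j] * z[j] for j in range(n)) == 0
                              for row in A)]

def conformal(z, y):
    """z lies in the same orthant as y and is component-wise no larger."""
    return all(zj * yj >= 0 and abs(zj) <= abs(yj) for zj, yj in zip(z, y))

def decompose(A, y):
    """Write the cycle y as a sum of indecomposable cycles by repeatedly
    splitting off a conformal kernel vector of smallest 1-norm."""
    parts = []
    while any(y):
        piece = min((z for z in cycles(A, [abs(v) for v in y])
                     if conformal(z, y)), key=lambda z: sum(map(abs, z)))
        parts.append(piece)
        y = [a - b for a, b in zip(y, piece)]
    return parts

A = [[1, 1, -1]]  # a single constraint: x_1 + x_2 - x_3 = 0
parts = decompose(A, [2, 1, 3])
```

A conformal piece of smallest 1-norm is necessarily indecomposable, since any conformal summand of it would be a conformal kernel vector of even smaller 1-norm.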

As a consequence of the bound on the Graver elements of the constraint matrix $M$ of multi-stage stochastic IPs, we obtain, using the augmentation framework, an algorithm to solve multi-stage stochastic IPs. Again, we refer to [11] for the details of the augmentation framework.

Theorem 6

A multi-stage stochastic IP with a constraint matrix M that corresponds to a tree of depth t can be solved in time

$$n^2 s_0^2 \varphi \log^2(n s_0) \cdot T(s_1,\dots,s_t,r,\Delta),$$

where $\varphi$ is the encoding length of the IP and $T$ is a function that depends only on the parameters $s_1,\dots,s_t,r,\Delta$ and involves a tower of $t+1$ exponentials.
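To get a feeling for the growth of this bound, the recursion for $T$ can be evaluated for small parameters. The sketch below sets every constant hidden in the $O$-notation to 1, which is an arbitrary choice for illustration (the actual constants come from Lemma 2):

```python
def T(s_levels, r, Delta):
    """Evaluate the recursion for T with all hidden constants set to 1:
        T(r, Delta)                = (Delta * r) ** r
        T(s_1, ..., s_i, r, Delta) = 2 ** (T(s_1, ..., s_{i-1}, r, Delta) ** (s_i ** 2))
    """
    value = (Delta * r) ** r   # base case: a tree of depth 0
    for s in s_levels:         # one additional exponential per tree level
        value = 2 ** (value ** (s * s))
    return value

# Each level of the tree adds one exponential to the tower.
assert T([], 2, 3) == 36
assert T([1], 2, 3) == 2 ** 36
```

Already at depth 2 with the smallest possible parameters, the value $2^{2^{36}}$ is far too large to write down, which matches the tower of $t+1$ exponentials in the theorem.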

Funding

Open Access funding enabled and organized by Projekt DEAL.

Footnotes

This work was partially done during the author’s time at EPFL. The Project was supported by the Swiss National Science Foundation (SNSF) within the Project Convexity, geometry of numbers, and the complexity of integer programming (No. 163071).

An extended abstract of this paper appeared at Integer Programming and Combinatorial Optimization–21st International Conference, IPCO 2020, London.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1.Ahmed S, Tawarmalani M, Sahinidis NV. A finite branch-and-bound algorithm for two-stage stochastic integer programs. Math. Program. 2004;100(2):355–377. doi: 10.1007/s10107-003-0475-6. [DOI] [Google Scholar]
  • 2.Aschenbrenner M, Hemmecke R. Finiteness theorems in stochastic integer programming. Found. Comput. Math. 2007;7(2):183–227. doi: 10.1007/s10208-005-0174-1. [DOI] [Google Scholar]
  • 3.Birge JR, Louveaux F. Introduction to Stochastic Programming. Berlin: Springer; 2011. [Google Scholar]
  • 4.Brand, C., Koutecký, M., Ordyniak, S.: Parameterized algorithms for MILPs with small treedepth. CoRR, arXiv:1912.03501 (2019)
  • 5.Carøe CC, Tind J. L-shaped decomposition of two-stage stochastic programs with integer recourse. Math. Program. 1998;83(1–3):451–464. doi: 10.1007/BF02680570. [DOI] [Google Scholar]
  • 6.Chen, L., Koutecký, M., Xu, L., Shi, W.: New bounds on augmenting steps of block-structured integer programs. In: Grandoni, F., Herman, G., Sanders, P., (eds.), 28th Annual European Symposium on Algorithms, ESA 2020, September 7-9, 2020, Pisa, Italy (Virtual Conference), volume 173 of LIPIcs, pp. 33:1–33:19. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020)
  • 7.Cslovjecsek, J., Eisenbrand, F., Hunkenschröder, C., Rohwedder, L., Weismantel, R.: Block-structured integer and linear programming in strongly polynomial and near linear time. In: Marx, D. (ed.) Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms, SODA 2021, Virtual Conference, January 10 - 13, 2021, pp. 1666–1681. SIAM (2021)
  • 8.Cslovjecsek, J., Eisenbrand, F., Pilipczuk, M., Venzin, M., Weismantel, R.: Efficient sequential and parallel algorithms for multistage stochastic integer programming using proximity. CoRR, arXiv:2012.11742 (2020)
  • 9.Dempster MAH, Fisher M, Jansen L, Lageweg B, Lenstra JK, Rinnooy Kan A. Analytical evaluation of hierarchical planning systems. Oper. Res. 1981;29(4):707–716. doi: 10.1287/opre.29.4.707. [DOI] [Google Scholar]
  • 10.Eisenbrand, F., Hunkenschröder, C., Klein, K.: Faster algorithms for integer programs with block structure. In: 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018, July 9-13, 2018, Prague, Czech Republic, pp. 49:1–49:13 (2018)
  • 11.Eisenbrand, F., Hunkenschröder, C., Klein, K.-M., Koutecký, M., Levin, A., Onn, S.: An algorithmic theory of integer programming (2019)
  • 12.Eisenbrand, F., Weismantel, R.: Proximity results and faster algorithms for integer programming using the Steinitz lemma. In: Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 808–816. SIAM (2018)
  • 13.Erdös, P.: Ramanujan and I. In: Number Theory, Madras 1987, pp. 1–17. Springer (1989)
  • 14.Gade D, Küçükyavuz S, Sen S. Decomposition algorithms with parametric gomory cuts for two-stage stochastic integer programs. Math. Program. 2014;144(1–2):39–64. doi: 10.1007/s10107-012-0615-y. [DOI] [Google Scholar]
  • 15.Graver JE. On the foundations of linear and integer linear programming I. Math. Program. 1975;9(1):207–226. doi: 10.1007/BF01681344. [DOI] [Google Scholar]
  • 16.Grinberg, V.S., Sevast’yanov, S.V.: Value of the Steinitz constant. Funct. Anal. Appl. 14(2), 125–126 (1980)
  • 17.Klein Haneveld, W.K., van der Vlerk, M.H.: Optimizing electricity distribution using two-stage integer recourse models. Stochastic optimization: algorithms and applications. Springer, Boston, MA, pp. 137–154 (2001)
  • 18.Hemmecke R, Schultz R. Decomposition of test sets in stochastic integer programming. Math. Program. 2003;94(2–3):323–341. doi: 10.1007/s10107-002-0322-1. [DOI] [Google Scholar]
  • 19.Jansen, K., Klein, K., Maack, M., Rau, M.: Empowering the configuration-ip - new PTAS results for scheduling with setups times. CoRR arXiv:1801.06460 (2018)
  • 20.Jansen, K., Lassota, A., Rohwedder, L.: Near-linear time algorithm for n-fold ilps via color coding. arXiv preprint arXiv:1811.00950 (2018)
  • 21.Kall P, Wallace SW. Stochastic Programming. Berlin: Springer; 1994. [Google Scholar]
  • 22.Kannan, R.: Minkowski’s convex body theorem and integer programming. Math. Oper. Res. 12(3), 415–440 (1987)
  • 23.Knop D, Koutecký M. Scheduling meets n-fold integer programming. J. Scheduling. 2018;21(5):493–503. doi: 10.1007/s10951-017-0550-0. [DOI] [Google Scholar]
  • 24.Knop, D., Koutecký, M., Mnich, M.: Combinatorial n-fold Integer Programming and Applications. In: Pruhs, K., Sohler, C. (eds.) 25th Annual European Symposium on Algorithms (ESA 2017), volume 87 of Leibniz International Proceedings in Informatics (LIPIcs), pp. 54:1–54:14, Dagstuhl, Germany, 2017. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik
  • 25.Knop, D., Koutecký, M., Mnich, M.: Voting and bribing in single-exponential time. In: 34th Symposium on Theoretical Aspects of Computer Science, STACS 2017, March 8-11, 2017, Hannover, Germany, pp. 46:1–46:14 (2017)
  • 26.Koutecký, M., Levin, A., Onn, S.: A parameterized strongly polynomial algorithm for block structured integer programs. In: 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018, July 9-13, 2018, Prague, Czech Republic, pp. 85:1–85:14 (2018)
  • 27.Küçükyavuz, S., Sen, S.: An introduction to two-stage stochastic mixed-integer programming. In: Leading Developments from INFORMS Communities, pp. 1–27. INFORMS (2017)
  • 28.Pelupessy, F.,Weiermann, A.: Ackermannian lower bounds for lengths of bad sequences of monomial ideals over polynomial rings in two variables. Mathematical Theory and Computational Practice, p. 276 (2009)
  • 29.Römisch, W., Schultz, R.: Multistage stochastic integer programs: An introduction. Online optimization of large scale systems. Springer, Berlin, Heidelberg, pp. 581–600 (2001)
  • 30.Schrijver A. Theory of Linear and Integer Programming. New York: Wiley; 1998. [Google Scholar]
  • 31.Schultz R, Stougie L, Van Der Vlerk MH. Two-stage stochastic integer programming: a survey. Stat. Neerl. 1996;50(3):404–416. doi: 10.1111/j.1467-9574.1996.tb01506.x. [DOI] [Google Scholar]
  • 32.Steinitz E. Bedingt konvergente reihen und konvexe systeme. J. für die reine und angewandte Mathematik. 1913;143:128–176. doi: 10.1515/crll.1913.143.128. [DOI] [Google Scholar]
  • 33.Zhang M, Küçükyavuz S. Finitely convergent decomposition algorithms for two-stage stochastic pure integer programs. SIAM J. Optim. 2014;24(4):1933–1951. doi: 10.1137/13092678X. [DOI] [Google Scholar]

Articles from Mathematical Programming are provided here courtesy of Springer
