. 2017 Jan 11;59(1):3–45. doi: 10.1007/s10817-016-9402-4

Complexity and Resource Bound Analysis of Imperative Programs Using Difference Constraints

Moritz Sinn, Florian Zuleger, Helmut Veith
PMCID: PMC6044401  PMID: 30069066

Abstract

Difference constraints have been used for termination analysis in the literature, where they denote relational inequalities of the form $x' \le y + c$, and describe that the value of $x$ in the current state is at most the value of $y$ in the previous state plus some constant $c \in \mathbb{Z}$. We believe that difference constraints are also a good choice for complexity and resource bound analysis because the complexity of imperative programs typically arises from counter increments and resets, which can be modeled naturally by difference constraints. In this article we propose a bound analysis based on difference constraints. We make the following contributions: (1) our analysis handles bound analysis problems of high practical relevance which current approaches cannot handle: we extend the range of bound analysis to a class of challenging but natural loop iteration patterns which typically appear in parsing and string-matching routines. (2) We advocate the idea of using bound analysis to infer invariants: our soundness proven algorithm obtains invariants through bound analysis, the inferred invariants are in turn used for obtaining bounds. Our bound analysis therefore does not rely on external techniques for invariant generation. (3) We demonstrate that difference constraints are a suitable abstract program model for automatic complexity and resource bound analysis: we provide efficient abstraction techniques for obtaining difference constraint programs from imperative code. (4) We report on a thorough experimental comparison of state-of-the-art bound analysis tools: we set up a tool comparison on (a) a large benchmark of real-world C code, (b) a benchmark built of examples taken from the bound analysis literature and (c) a benchmark of challenging iteration patterns which we found in real source code. (5) Our analysis is more scalable than existing approaches: we discuss how we achieve scalability.

Electronic supplementary material

The online version of this article (doi:10.1007/s10817-016-9402-4) contains supplementary material, which is available to authorized users.

Keywords: Bound analysis, Complexity analysis, Amortized analysis, Difference constraints, Static analysis, Resource bound analysis, Automatic complexity analysis, Cost analysis

Introduction

Automated program analysis for inferring program complexity and resource bounds is a very active area of research. Amongst others, approaches have been developed for analyzing functional programs [16], C# [15], C [2, 7, 29, 35], Java [1] and Integer Transition Systems [6, 10]. Below we sketch applications in the areas of verification and program understanding. For additional motivation we refer the reader to the cited papers.

Verification In many applications such as embedded systems there is a hard constraint on the availability of resources such as CPU time, memory, bandwidth, etc. It is an important part of functional correctness that programs stay within their given resource limits. As a concrete example we mention that considerable effort has been invested to analyze the worst case execution time (WCET) of hard real-time systems [33]. Another application domain is security, where the goal is to derive a bound on how much secret information is leaked in order to decide whether this leakage is acceptable [31].

Static Profiling and Program Understanding Standard profilers report numbers such as how often certain program locations are visited and how much time is spent inside certain functions; however, no information is provided on how these numbers are related to the program input. Recently, new profiling approaches have been proposed that apply curve fitting techniques for deriving a cost function, which relates size measures on the program input to the measured program performance [9, 34]. We believe that automated complexity and resource bound analysis lends itself naturally as a static profiling technique, because it provides the user with a symbolic expression that relates the program performance to the program input. In the same way, complexity and resource bound analysis can be used to explore unfamiliar code or to annotate library functions with their performance characteristics; we note that a substantial number of performance bugs can be attributed to a “wrong understanding of API performance features” [22].

As a final remark we discuss the relationship to termination analysis, which has been intensively studied in the last decade in the computer-aided verification community: complexity and resource bound analysis can be understood as a quantitative variant of termination analysis, where not only a qualitative “yes” answer is provided, but also a symbolic upper bound on the run-time of the program.

Difference constraints (DCs) have been introduced by Ben-Amram for termination analysis in [4], where they denote relational inequalities of the form $x' \le y + c$, and describe that the value of $x$ in the current state is at most the value of $y$ in the previous state plus some constant $c \in \mathbb{Z}$. We call a program whose transitions are given by a set of difference constraints a difference constraint program (DCP).

We advocate the use of DCs for program complexity and resource bound analysis. Our key insight is that DCs provide a natural abstraction of the standard manipulations of counters in imperative programs: counter increments and decrements, i.e., $x := x + c$, and resets, i.e., $x := y$, can be modeled by the DCs $x' \le x + c$ and $x' \le y$, respectively (see Sect. 6 on program abstraction). The approach we discuss in this article exploits the expressive strength of DCs and distinguishes between counter resets and counter increments in the reasoning. In contrast, previous approaches [1, 2, 6, 10, 15, 29, 35] to bound analysis are not able to track increments and resets on the same level of precision and therefore often fail to infer tight bounds for a class of nested loop constructs which we identified during our experiments on real-world code (demonstrated by our experimental evaluation in Sect. 8.3). In this article we make the following contributions:

  1. Our analysis handles bound analysis problems of high practical relevance which current approaches cannot handle: we extend the range of bound analysis to a class of challenging but natural loop iteration patterns which typically appear in parsing and string-matching routines as we discuss in Sect. 2. At the same time our analysis is general and can handle most of the bound analysis problems which are discussed in the literature. Both claims are supported by our experiments.

  2. We advocate the idea of using bound analysis to infer invariants: we state a clear and concise formulation of invariant analysis by bound analysis on the basis of our abstract program model: our algorithm (Sect. 3), which we prove sound, obtains invariants through bound analysis, and the inferred invariants are in turn used for obtaining bounds. Our bound analysis therefore does not rely on external techniques for invariant generation.

  3. We demonstrate that difference constraints are a suitable abstract program model for automatic complexity and resource bound analysis: we develop appropriate techniques for abstracting imperative programs to DCPs in Sect. 6.

  4. We report on a thorough experimental comparison of state-of-the-art bound analysis tools (Sect. 8): we set up a tool comparison on (a) a large benchmark of real-world C-code (Sect. 8.1), (b) a benchmark built of examples taken from the bound analysis literature (Sect. 8.2) and (c) a benchmark of challenging iteration patterns which we found in real source code (Sect. 8.3).

  5. We have designed our analysis with the goal of scalability: our experiments demonstrate that our implementation outperforms the state-of-the-art with respect to scalability. We give a detailed discussion on how we achieve scalability in Sect. 10.

This article is an extension of the conference version presented at FMCAD 2015 [27]. Besides making the material more accessible through additional explanations and discussions, it adds the following contributions: (1) a discussion on the instrumentation of our analysis for resource bound analysis (Sect. 2.2). (2) A more detailed discussion and presentation of our context-sensitive bound algorithm (Sect. 3.3). (3) A more detailed discussion on how we determine local bounds and an extension to sets of local bounds (Sect. 4). (4) A complete example (Sect. 7). (5) A discussion on the relation to amortized complexity analysis (Sect. 9). (6) Additional experimental results (Sects. 8.2 and 8.3). (7) In Electronic Supplementary Material we state the soundness proofs omitted in the conference version.

Motivation and Related Work

Example xnuSimple stated in Fig. 1 is representative of a class of loops that we found in parsing and string matching routines during our experiments. In these loops the inner loop iterates over disjoint partitions of an array or string, where the partition sizes are determined by the program logic of the outer loop. For an illustration of this iteration scheme see Example xnu in Fig. 9 (Sect. 7), which contains a snippet of the source code after which we have modeled Example xnuSimple. Example xnuSimple has the linear complexity $2n$ (we define complexity here as the total number of loop iterations, for alternative definitions see the discussion in Sect. 2.2), because the inner loop as well as the outer loop can be iterated at most $n$ times (as argued in the next paragraph). In the following, we give an overview of how our approach infers the linear complexity for Example xnuSimple:

  1. Program Abstraction We abstract the program to a DCP over $\mathbb{N}$ as shown in Fig. 1. The abstract variable $[x]$ represents the program expression $\max(x, 0)$. We discuss our algorithm for abstracting imperative programs to DCPs based on symbolic execution in Sect. 6.

  2. Finding Local Bounds We identify $[p]$ as a variable that limits the number of executions of transition $\tau_3$: we have that $[p]$ decreases on each execution of $\tau_3$ ($[p]$ takes values over $\mathbb{N}$). We call $[p]$ a local bound for $\tau_3$. Accordingly we identify $[x]$ as a local bound for the transitions $\tau_1, \tau_2, \tau_4, \tau_5, \tau_6$.

  3. Bound Analysis Our algorithm (stated in Sect. 3) computes transition bounds, i.e., (symbolic) upper bounds on the number of times program transitions can be executed, and variable bounds, i.e., (symbolic) upper bounds on variable values. For both types of bounds, the main idea of our algorithm is to reason how much and how often the value of the local bound resp. the variable value may increase during a program run. Our algorithm is based on a mutual recursion between variable bound analysis (“how much”, function $VB(v)$) and transition bound analysis (“how often”, function $TB(\tau)$). Next, we give an intuition how our algorithm computes transition bounds: for $\tau \in \{\tau_1, \tau_2, \tau_4, \tau_5, \tau_6\}$ our algorithm computes $TB(\tau) = [n] = n$ (note that $[n] = n$ because $n$ has type unsigned) because the local bound $[x]$ is initially set to $[n]$ and never increased or reset. Our algorithm computes $TB(\tau_3)$ ($\tau_3$ corresponds to the loop at $l_3$) as follows: $\tau_3$ has local bound $[p]$; $[p]$ is reset to $[r]$ on $\tau_2$; our algorithm detects that before each execution of $\tau_2$, $[r]$ is reset to $[0]$ on either $\tau_0$ or $\tau_4$, which we call the context under which $\tau_2$ is executed; our algorithm establishes that between being reset and flowing into $[p]$ the value of $[r]$ can be incremented up to $TB(\tau_1)$ times by 1; our algorithm obtains $TB(\tau_1) = n$ by a recursive call; finally, our algorithm calculates $TB(\tau_3) = [0] + TB(\tau_1) \times 1 = 0 + n \times 1 = n$. We give an example for the mutual recursion between TB and VB in Sect. 2.1.
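The counting argument in the three steps above can be checked on small inputs with a C sketch. Fig. 1 is not reproduced in this text, so the loop below is only a reconstruction from the description (x starts at n, r grows by 1 per outer iteration, p is reset to r, and r is reset to 0 after the inner loop). The names `xnu_simple_counts`, `LoopCounts` and the `take_branch` array, which resolves the non-deterministic choice, are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Iteration counts of the outer and the inner loop on one run. */
typedef struct {
    unsigned outer;
    unsigned inner;
} LoopCounts;

/*
 * Hypothetical reconstruction of the xnuSimple loop pattern.
 * take_branch[i] != 0 makes the i-th outer iteration enter the inner loop;
 * a NULL pointer means the inner loop is never entered.
 * If non-NULL, the array must hold at least n entries.
 */
LoopCounts xnu_simple_counts(unsigned n, const int *take_branch)
{
    LoopCounts counts = {0, 0};
    unsigned x = n; /* plays the role of the local bound of the outer loop */
    unsigned r = 0; /* incremented by 1 on every outer iteration */

    while (x > 0) {
        x--;
        r++;
        if (take_branch != NULL && take_branch[counts.outer]) {
            unsigned p = r; /* reset: p takes the current value of r */
            while (p > 0) { /* inner loop, limited by p */
                p--;
                counts.inner++;
            }
            r = 0; /* reset: the next partition starts from 0 */
        }
        counts.outer++;
        /* Each inner iteration consumes one earlier increment of r,
         * so inner iterations plus the pending r equal the outer count. */
        assert(counts.inner + r == counts.outer);
    }
    return counts;
}
```

Under these assumptions both counters stay at or below n on every run, so the total number of iterations never exceeds 2n, whichever branch choices are supplied.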

Fig. 1. Running Example xnuSimple; the asterisk symbol (*) denotes non-determinism (arising from conditions not modeled in the analysis)

Fig. 9. (a) Example xnu; (b) formal representation of xnu by an LTS

Invariants and Bound Analysis

We motivate the need for invariants in bound analysis and sketch how our algorithm infers invariants by bound analysis. Consider Example twoSCCs in Fig. 2. It is easy to infer $x$ as a bound for the possible number of iterations of the loop at $l_3$. However, in order to obtain a bound in the function parameters the difficulty lies in finding an invariant of form $x \le \mathit{expr}(n, m_1, m_2)$, where $\mathit{expr}(n, m_1, m_2)$ denotes an expression over the function parameters $n, m_1, m_2$. We show how our algorithm obtains the invariant $x \le \max(m_1, m_2) + 2n$ by means of bound analysis:

Fig. 2. Running Example twoSCCs

Our algorithm computes a transition bound for the loop at $l_3$ (with the single transition $\tau_5$) by

\[
\begin{aligned}
TB(\tau_5) &= TB(\tau_4) \times VB([x]) = 1 \times VB([x]) = VB([x]) \\
&= TB(\tau_3) \times 2 + \max([m_1], [m_2]) \\
&= (TB(\tau_0) \times [n]) \times 2 + \max([m_1], [m_2]) \\
&= (1 \times [n]) \times 2 + \max([m_1], [m_2]) = 2n + \max(m_1, m_2)
\end{aligned}
\]

(note that $[n] = n$, $[m_1] = m_1$ and $[m_2] = m_2$ because $n, m_1, m_2$ have type unsigned). We point out the mutual recursion between TB and VB: $TB(\tau_5)$ has called $VB(x)$, which in turn called $TB(\tau_3)$. We highlight that the variable bound $VB(x)$ (corresponding to the invariant $x \le \max(m_1, m_2) + 2n$) has been established during the computation of $TB(\tau_5)$.
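The closed form can be compared against concrete runs with a C sketch. Fig. 2 is not reproduced in this text, so the program shape below is an assumption inferred from the derivation: x is reset to m1 or m2, incremented by 2 in a loop that runs n times, and then copied into the counter of the last loop. The function names and the `pick_m1` flag are hypothetical.

```c
#include <assert.h>

/* The closed form 2n + max(m1, m2) obtained in the text. */
unsigned long long two_sccs_bound(unsigned n, unsigned m1, unsigned m2)
{
    unsigned long long m = m1 > m2 ? m1 : m2;
    return 2ULL * n + m;
}

/*
 * Hypothetical reconstruction of Example twoSCCs.
 * Returns how often the last loop iterates when the non-deterministic
 * choice selects m1 (pick_m1 != 0) or m2 (pick_m1 == 0).
 */
unsigned long long two_sccs_last_loop(unsigned n, unsigned m1, unsigned m2,
                                      int pick_m1)
{
    unsigned long long x = pick_m1 ? m1 : m2; /* reset of x */
    unsigned long long z;
    unsigned long long iterations = 0;
    unsigned y = n;

    while (y > 0) { /* first loop: runs n times, adds 2 to x each time */
        y--;
        x += 2;
    }
    z = x;          /* the counter of the last loop is reset to x */
    while (z > 0) {
        z--;
        iterations++;
    }
    /* The observed count must respect the bound from the text. */
    assert(iterations <= two_sccs_bound(n, m1, m2));
    return iterations;
}
```

For n = 3, m1 = 4 and m2 = 7 the two choices give 10 and 13 iterations, and the larger value coincides with 2n + max(m1, m2) = 13.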

We call the kind of invariants that our algorithm infers upper bound invariants (Definition 6). We compare our reasoning to classical invariant analysis in Sect. 2.3.

Resource Bound Analysis

We briefly discuss how resource bound analysis can be naturally formulated within our framework. We introduce a fresh variable $c$ and add the initialization $c = 0$ to the beginning of the program under scrutiny. We add an increment/decrement $c = c + k$ at program locations where a resource of cost $k$ is consumed ($k$ is positive) or freed ($k$ is negative). Resource bound analysis is then equivalent to computing an upper bound on the value of the variable $c$. We can run our algorithm $VB(c)$ to compute a symbolic upper bound for $c$.
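A minimal C sketch of this instrumentation follows. The program, the costs 3 and -1, and the name `peak_resource_use` are made up for illustration; the sketch only records the largest value that the counter c reaches on a concrete run and does not reproduce what VB(c) computes.

```c
#include <assert.h>

/*
 * Hypothetical instrumented program: every loop iteration consumes a
 * resource of cost 3 and afterwards frees one unit (cost -1).
 * Returns the largest value the counter c takes on the run.
 */
long peak_resource_use(unsigned n)
{
    long c = 0;    /* fresh counter, initialised at program entry */
    long peak = 0; /* observation only, not part of the encoding */
    unsigned i;

    for (i = 0; i < n; i++) {
        c = c + 3;       /* resource of cost k = 3 is consumed */
        if (c > peak) {
            peak = c;
        }
        c = c - 1;       /* k = -1: one unit is freed again */
        assert(c >= 0);  /* never more freed than consumed */
    }
    return peak;
}
```

Any sound symbolic upper bound on c has to dominate this observed peak, which is 2n + 1 for n >= 1 in this made-up program.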

In the same way we can encode related bound analysis problems: reachability bounds [15] (visits to a single location), visits to multiple transitions, loop bounds or complexity analysis. For each of these bound analysis problems one can add a counter increment at the program locations of interest.

We illustrate the suggested encoding on the problem of computing loop bounds: for a given loop we add increments of the counter variable c to every back edge of the loop. Calling VB(c) then returns the sum of the transition bounds of all back edges of the loop. This example also illustrates how transition bounds are used for computing variable bounds in our approach.

Related Work

Termination In [4] it is shown that termination of DCPs is undecidable in general but decidable for the natural syntactic subclass of fan-in free DCPs (see Definition 12), which is the class of DCPs we use in this article. It is an open question for future work whether there is a complete algorithm for bound analysis of fan-in free DCPs.

Bound Analysis In [35] a bound analysis based on so-called size-change constraints $x' \lhd y$ is proposed, where $\lhd \in \{<, \le\}$. Size-change constraints form a strict syntactic subclass of DCs. However, termination is decidable even for size-change programs that are not fan-in free and a complete algorithm for deciding the complexity of size-change programs has been developed [8]. For reasoning about inner loops [35] computes disjunctive loop summaries while such summaries are not computed by the approach discussed in this work.

In [29] a bound analysis based on constraints of the form $x' \le x + c$ is proposed, where $c$ is either an integer or a symbolic constant. Because the constraints in [29, 35] cannot model both increments and resets, the resulting bound analyses cannot infer the linear complexity of Example xnuSimple and need to rely on external techniques for invariant analysis.

The COSTA project (e.g. [1]) obtains recurrence relations from so-called cost equations using invariant analysis based on the polyhedra abstract domain and approaches from the literature for synthesizing linear ranking functions. Closed-form solutions for the obtained recurrence relations are inferred by means of computer algebra.

The technique discussed in [10] is based on the COSTA approach and formulated in terms of cost equations. Further, paper [10] is inspired by the counter instrumentation-based approach [14] and applies the techniques [3, 25] for inferring linear ranking functions. The technique of [10] achieves a high precision of the inferred bounds by means of control-flow refinement (see also Ref. [11]).

The technique discussed in [2] over-approximates the reachable states by abstract interpretation based on the polyhedra abstract domain. This information is used for generating a linear constraint problem from which a multi-dimensional linear ranking function is obtained. A bound on the number of values which can be taken by the ranking function is then obtained from the previously computed approximation of the reachable states. Importantly, the number of dimensions of the ranking function determines the degree of the bound polynomial. The approach of [2] therefore aims at inferring a ranking function with a minimal number of dimensions and thus depends on a minimal solution to the linear constraint problem which is obtained by linear optimization (Technique [2] instruments the LP-solver with an objective function).

The technique discussed in [6] applies approaches from the literature for synthesizing ranking functions thereby inferring bounds on the number of times the execution of isolated program parts can be repeated. These bounds, called time bounds, are then used to compute bounds on the absolute value of variables, so-called variable size bounds. Additional information is inferred through abstract interpretation based on the octagon abstract domain. An overall complexity bound is deduced by alternating between time bound and variable size bound analysis. In each alternation bounds for larger program parts are obtained based on the previously computed information.

In Sect. 8 we compare our implementation against the techniques [2, 6, 7, 10, 29].

Amortized Complexity Analysis We note that inferring the linear complexity $2n$ for Example xnuSimple, even though the inner loop can already be iterated $n$ times within one iteration of the outer loop, is an instance of amortized complexity analysis [32]: the cost of executing the inner loop, averaged over all $n$ iterations of the outer loop, is 1. Most previous approaches [1, 6, 10, 15, 29, 35] can establish only a quadratic bound for Example xnuSimple. A typical reasoning which fails to establish the linear complexity of Example xnuSimple is as follows: (1) the outer loop can be iterated at most $n$ times, (2) the inner loop can be iterated at most $n$ times within one iteration of the outer loop (because the inner loop has a local loop bound $p$ and $p \le n$ is an invariant), (3) the loop bound $n^2$ is obtained from (1) and (2) by multiplication.

The recent paper [7] discusses an interesting alternative for amortized complexity analysis of imperative programs: a system of linear inequalities is derived using Hoare-style proof-rules. Solutions to the system represent valid linear resource bounds. Since bound analysis typically does not aim at some bound but tries to infer a tight bound, Ref. [7] uses linear optimization (an LP-solver instrumented by an objective function) in order to obtain a minimum solution to the problem. Interestingly, Ref. [7] is able to compute the linear bound for l3 of Example xnuSimple but fails to deduce the bound for the original source code (discussed in Sect. 7). Moreover, Ref. [7] is restricted to linear bounds, while our approach derives bounds which are polynomial (see, e.g., the results in Table 12) and which contain the maximum operator (e.g., Example twoSCCs). We compare our implementation to the implementation of Ref. [7] in Sect. 8.

Table 12. Tool results on 23 challenging loop iteration patterns from cBench and SPEC CPU 2006 Benchmarks

loopus’15 loopus’14 CoFloCo KoAT Rank C4B
cf_decode_eol O(n) × O(n2) × ×
cryptRandWriteFile O(n) O(n2) O(n2)
encode_mcu_AC_refine O(n) × O(n2) ×
hc_compute O(n2) O(n3) O(n3) TO ×
inflated_stored O(n) × × ×
PackBitsEncode O(n) × TO O(n2) ×
s_SFD_process O(n) × O(n2) × ×
send_tree O(n) O(n2) O(n2) O(n2) ×
sendMTFValues O(n) O(n2) O(n2)
set_color_ht O(n2) O(n3) O(n3) × ×
subsetdump O(n) O(n2) O(n2) × ×
zwritehexstring_at O(n) O(n2) O(n2)
analyse_other O(n3) O(n4) O(n4) × O(n13) × ×
ApplyBndRobin O(n4) ×
asctoeg O(n2) O(n3) ×
Configure O(n) O(n2) O(n2) O(n2) O(n2) ×
load_mems O(n) O(n3) × O(n3) ×
local_alloc O(n) O(n2) O(n2)
ParseFile O(n) TO O(n3) ×
Perl_scan_vstring O(n) O(n2) × O(n2) ×
SingleLinkCluster O(n2) × TO × ×
xdr3dfcoord O(n) O(n2) O(n2) ×
XNU O(n) O(n2) O(n2) O(n2) × ×
Total tight 21 12 6 2 7 6
Total Over-approx. 2 10 7 18 2 0
Total fail 0 1 8 1 10 16
Total timed Out 0 0 2 2 0 0
Total time 2 s 1 s 41 m 74 m 28 s 20 m
Total time w/o TO 2 s 1 s 1 m 34 m 28 s 20 m

The time out was 20 min, a longer time out did not yield additional results

Invariants and Bound Analysis The powerful idea of expressing locally computed bounds in terms of the function parameters by alternating between bound analysis and variable upper bound analysis has previously been applied in [6, 12, 28]. Since Refs. [12, 28] do not give a general algorithm but deal with specific cases, we focus our discussion on [6] and highlight some important differences. The technique discussed in [6] computes upper bound invariants only for the absolute values of variables; in many cases, this does not allow distinguishing between variable increments and decrements: consider the program foo(int x, int y) { while (y > 0) { x--; y--; } while (x > 0) x--; }. The algorithm described in [6] infers the bound $|x| + |y|$ for the second loop, whereas our analysis infers the bound $\max(x, 0)$. The approach of [6] depends on global invariant analysis. E.g., given a decrement $x := x - 1$, the technique of [6] needs to check whether $x \ge 0$ holds. If $x \ge 0$ cannot be ensured, the decrement can actually increment the absolute value of $x$, and will thus be interpreted as $|x| = |x| + 1$. This can lead either to gross over-approximations or to failure of the bound computation if the increment of $|x|$ cannot be bounded. Since our approach does not track the absolute value but the value itself, it is not affected by this problem. The technique discussed in [6] does not support amortized analysis: e.g., the technique of [6] fails to compute the linear bounds for Example xnuSimple (Fig. 1), Example xnu (Fig. 9) and other examples we discuss in this article (see also the results in Sect. 8.3). On the other hand, Ref. [6] can infer bounds for functions with multiple recursive calls, which is not supported by the analysis we present in this article.
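The difference between the two bounds for foo can be observed on concrete inputs with a C sketch. The helper names below are hypothetical, and the arithmetic is widened to `long long` (an assumption of this sketch) so that repeated decrements cannot overflow.

```c
#include <assert.h>

/* Bound max(x, 0) on the second loop, as stated in the text. */
long long bound_max_x(int x)
{
    return x > 0 ? (long long)x : 0;
}

/* Bound |x| + |y| on the second loop, as stated for the technique of [6]. */
long long bound_abs_xy(int x, int y)
{
    long long ax = x < 0 ? -(long long)x : (long long)x;
    long long ay = y < 0 ? -(long long)y : (long long)y;
    return ax + ay;
}

/* Runs foo from the text and returns the iteration count of its second loop. */
long long foo_second_loop_iterations(int x_in, int y_in)
{
    long long x = x_in;
    long long y = y_in;
    long long iterations = 0;

    while (y > 0) { /* first loop: decrements x and y together */
        x--;
        y--;
    }
    while (x > 0) { /* second loop: the one both bounds refer to */
        x--;
        iterations++;
    }
    /* max(x, 0) over the initial x bounds the count, and it never
     * exceeds |x| + |y|. */
    assert(iterations <= bound_max_x(x_in));
    assert(bound_max_x(x_in) <= bound_abs_xy(x_in, y_in));
    return iterations;
}
```

For x = 5 and y = 2 the second loop runs 3 times, against the bounds 5 and 7. For x = -3 and y = 4 it does not run at all: max(x, 0) gives 0, while |x| + |y| still gives 7.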

Comparison to Invariant Analysis We contrast our previously discussed approach for computing a bound for the loop at $l_3$ of Example xnuSimple with classical invariant analysis: assume that we have added a counter $c$ which counts the number of inner loop iterations (i.e., $c$ is initialized to 0 and incremented in the inner loop). For inferring $c \le n$ through invariant analysis the invariant $c + x + r \le n$ is needed for the outer loop, and the invariant $c + x + p \le n$ for the inner loop. Both relate 3 variables and cannot be expressed as (parametrized) octagons (e.g., [26]). Further, the expressions $c + x + r$ and $c + x + p$ do not appear in the program, which is challenging for template based approaches to invariant analysis.

We now contrast our variable bound analysis (function VB) with classical invariant analysis: reconsider Example twoSCCs in Fig. 2. We have discussed how our algorithm obtains the invariant $x \le \max(m_1, m_2) + 2n$ by means of bound analysis in the course of computing a bound for the loop at $l_3$. Note that the invariant $x \le \max(m_1, m_2) + 2n$ cannot be computed by standard abstract domains such as octagon or polyhedra: these domains are convex and cannot express non-convex relations such as maximum. The most precise approximation of $x$ in the polyhedra domain is $x \le m_1 + m_2 + 2n$. Unfortunately, it is well-known that the polyhedra abstract domain does not scale to larger programs and needs to rely on heuristics for termination. Standard abstract domains such as octagon or polyhedra propagate information forward until a fixed point is reached, greedily computing all possible invariants expressible in the abstract domain at every location of the program. In contrast, our method $VB(x)$ infers the invariant $x \le \max(m_1, m_2) + 2n$ by modular reasoning: local information about the program (i.e., local bounds and increments/resets of variables) is combined to a global program property. Moreover, our variable and transition bound analysis is demand-driven: our algorithm performs only those recursive calls that are indeed needed to derive the desired bound. We believe that our analysis complements existing techniques for invariant analysis and will find applications outside of bound analysis.

Program Model and Algorithm

In this section we present our algorithm for computing worst-case upper bounds on the number of executions of a given transition (transition bound) and on the value of a given program expression (variable bound and upper bound invariant).

Definition 1

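(Program) Let $\Sigma$ be a set of states. A program over $\Sigma$ is a directed labeled graph $P = (L, T, l_b, l_e)$, where $L$ is a finite set of locations, $l_b \in L$ is the entry location, $l_e \in L$ is the exit location and $T \subseteq L \times 2^{\Sigma \times \Sigma} \times L$ is a finite set of transitions. We write $l_1 \xrightarrow{\lambda} l_2$ to denote a transition $(l_1, \lambda, l_2) \in T$. We call $\lambda \in 2^{\Sigma \times \Sigma}$ a transition relation. A path of $P$ is a sequence $l_0 \xrightarrow{\lambda_0} l_1 \xrightarrow{\lambda_1} \cdots$ with $l_i \xrightarrow{\lambda_i} l_{i+1} \in T$ for all $i$. A run of $P$ is a sequence $\rho = (l_b, \sigma_0) \xrightarrow{\lambda_0} (l_1, \sigma_1) \xrightarrow{\lambda_1} \cdots$ such that $l_b \xrightarrow{\lambda_0} l_1 \xrightarrow{\lambda_1} \cdots$ is a path of $P$ and for all $0 < i$ it holds that $(\sigma_{i-1}, \sigma_i) \in \lambda_{i-1}$. A run $\rho$ is complete if it ends at $l_e$.

Note that a run of $P = (L, T, l_b, l_e)$ starts at location $l_b$. Further note that we call an edge $l_1 \xrightarrow{\lambda} l_2 \in T$ of the program a transition, whereas $\lambda$ is its transition relation. In the following we will refer to transitions by $\tau$ and to transition relations by $\lambda$.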

Transition bounds are at the core of our analysis: we infer bounds on the number of loop iterations, on computational complexity, on resource consumption, etc., by computing bounds on the number of times that one or several transitions can be executed. Before we formally define our notion of a transition bound we have to introduce some notation.

Definition 2

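(Counter Notation I) Let $P = (L, T, l_b, l_e)$ be a program over $\Sigma$. Let $\tau \in T$. Let $\rho = (l_b, \sigma_0) \xrightarrow{\lambda_0} (l_1, \sigma_1) \xrightarrow{\lambda_1} \cdots$ be a run of $P$. By $\sharp(\tau, \rho)$ we denote the number of times that $\tau$ occurs on $\rho$.

In the following, we denote by ‘$\infty$’ a value s.t. $a < \infty$ for all $a \in \mathbb{Z}$ (infinity).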

Definition 3

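(Transition Bound) Let $P = (L, T, l_b, l_e)$ be a program over states $\Sigma$. Let $\tau \in T$. A value $b \in \mathbb{N}_0 \cup \{\infty\}$ is a bound for $\tau$ on a run $\rho = (l_b, \sigma_0) \xrightarrow{\lambda_0} (l_1, \sigma_1) \xrightarrow{\lambda_1} (l_2, \sigma_2) \xrightarrow{\lambda_2} \cdots$ of $P$ iff $\sharp(\tau, \rho) \le b$, i.e., iff $\tau$ appears not more than $b$ times on $\rho$. A function $b : \Sigma \to \mathbb{N}_0 \cup \{\infty\}$ is a bound for $\tau$ iff for all runs $\rho$ of $P$ it holds that $b(\sigma_0)$ is a bound for $\tau$ on $\rho$, where $\sigma_0$ denotes the initial state of $\rho$.

Given a program transition $\tau$, our bound algorithm (which we define below) computes a bound for $\tau$. If possible, the bound computed by our algorithm should be precise or tight; in particular, the trivial bound $\infty$ is (most often) of no value to us.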

Definition 4

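(Precise Transition Bound) Let $P = (L, T, l_b, l_e)$ be a program over states $\Sigma$. Let $\tau \in T$. We say that a transition bound $b : \Sigma \to \mathbb{N}_0 \cup \{\infty\}$ for $\tau$ is precise iff for each $\sigma_0 \in \Sigma$ there is a run $\rho = (l_b, \sigma_0) \xrightarrow{\lambda_0} (l_1, \sigma_1) \xrightarrow{\lambda_1} \cdots$ such that $\sharp(\tau, \rho) = b(\sigma_0)$.

Informally, a transition bound is precise if it can be reached for all initial states $\sigma_0$. Note that there is exactly one precise transition bound.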

Definition 5

(Tight Transition Bound) Let $P = (L, T, l_b, l_e)$ be a program over states $\Sigma$. Let $\tau \in T$. We say that a transition bound $b : \Sigma \to \mathbb{N}_0 \cup \{\infty\}$ is tight iff there is a $c > 0$ such that either (1) for all $\sigma \in \Sigma$ we have $b(\sigma) < c$ ($b$ is bounded), or (2) there is a family of states $(\sigma_i)_{i \in \mathbb{N}}$ with $\lim_{i \to \infty} b(\sigma_i) = \infty$ ($b$ is unbounded) such that for all $\sigma_i$ there is a run $\rho_i$ starting in $\sigma_i$ with $b(\sigma_i) \le c \times \sharp(\tau, \rho_i)$.

Informally, a transition bound is tight if it is in the same asymptotic class as the precise transition bound: let $\tau \in T$. For the special case $\Sigma = \mathbb{N}$ we have the following: let $f : \mathbb{N} \to \mathbb{N}$ denote the precise transition bound for $\tau$. Let $g : \mathbb{N} \to \mathbb{N}$ be some transition bound for $\tau$. Trivially $f \in O(g)$ ($f$ does not grow faster than $g$). Now, $g$ is tight if also $f \in \Omega(g)$ ($f$ does not always grow slower than $g$). With $f \in O(g)$ and $f \in \Omega(g)$ we have that $f \in \Theta(g)$. The same can be formulated for general state sets $\Sigma$ by mapping $\Sigma$ to the natural numbers.

We discussed in Sect. 2.1 that in the course of computing transition bounds, our analysis computes invariants of a special shape. We now formally define the form of the invariants that our analysis infers.

Definition 6

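(Upper Bound Invariant) Let $P = (L, T, l_b, l_e)$ be a program over $\Sigma$. Let $e : \Sigma \to \mathbb{Z}$. Let $l \in L$. Let $\rho = (l_b, \sigma_0) \xrightarrow{\lambda_0} (l_1, \sigma_1) \xrightarrow{\lambda_1} (l_2, \sigma_2) \xrightarrow{\lambda_2} \cdots$ be a run of $P$. A value $b \in \mathbb{Z} \cup \{\infty\}$ is an upper bound invariant for $e$ at $l$ on $\rho$ iff $e(\sigma_i) \le b$ holds for all $i$ on $\rho$ with $l_i = l$. A function $b : \Sigma \to \mathbb{Z} \cup \{\infty\}$ is an upper bound invariant for $e$ at $l$ iff for all runs $\rho$ of $P$ it holds that $b(\sigma_0)$ is an upper bound invariant for $e$ at $l$ on $\rho$, where $\sigma_0$ denotes the initial state of $\rho$.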

We now formally define the notion local bound that we motivated in Sect. 2.

Definition 7

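(Counter Notation II) Let $P = (L, T, l_b, l_e)$ be a program over $\Sigma$. Let $\rho = (l_b, \sigma_0) \xrightarrow{\lambda_0} (l_1, \sigma_1) \xrightarrow{\lambda_1} \cdots$ be a run of $P$. Let $e : \Sigma \to \mathbb{Z}$ be a norm. By $\downarrow(e, \rho)$ we denote the number of times that the value of $e$ decreases on $\rho$, i.e., $\downarrow(e, \rho) = |\{i \mid e(\sigma_i) > e(\sigma_{i+1})\}|$.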

Definition 8

(Norm) Let $\Sigma$ be a set of states. A norm $e : \Sigma \to \mathbb{Z}$ over $\Sigma$ is a function that maps the states to the integers.

Definition 9

(Local Bound) Let $P = (L, T, l_b, l_e)$ be a program over $\Sigma$. Let $\tau \in T$. Let $e : \Sigma \to \mathbb{N}$ be a norm that takes values in the natural numbers. Let $\rho = (l_b, \sigma_0) \xrightarrow{u_0} (l_1, \sigma_1) \xrightarrow{u_1} \cdots$ be a run of $P$. $e$ is a local bound for $\tau$ on $\rho$ if it holds that $\sharp(\tau, \rho) \le \downarrow(e, \rho)$. We call $e$ a local bound for $\tau$ if $e$ is a local bound for $\tau$ on all runs of $P$.

Discussion A natural number valued norm $e$ is a local bound for $\tau$ on a run $\rho$ if $\tau$ appears not more often on $\rho$ than the number of times the value of $e$ decreases. I.e., a local bound $e$ for $\tau$ limits the number of executions of $\tau$ on a run $\rho$ as long as certain program parts (those where $e$ increases) are not executed. We argue in Sect. 9 that in our analysis local bounds play the role of potential functions in classical amortized complexity analysis [32]. We discuss how we obtain local bounds in Sect. 4.
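The two counters behind Definition 9 can be written down directly for a finite run prefix. In the following C sketch a run is given by the sequence of transition identifiers taken and the sequence of norm values in the visited states; this encoding and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of positions i with norm_values[i] > norm_values[i + 1],
 * i.e. how often the norm decreases along the run. */
size_t count_decreases(const unsigned long *norm_values, size_t num_states)
{
    size_t i;
    size_t decreases = 0;

    assert(norm_values != NULL || num_states == 0);
    for (i = 0; i + 1 < num_states; i++) {
        if (norm_values[i] > norm_values[i + 1]) {
            decreases++;
        }
    }
    return decreases;
}

/* Number of times the transition with identifier tau occurs on the run. */
size_t count_occurrences(const int *transitions, size_t num_steps, int tau)
{
    size_t i;
    size_t occurrences = 0;

    assert(transitions != NULL || num_steps == 0);
    for (i = 0; i < num_steps; i++) {
        if (transitions[i] == tau) {
            occurrences++;
        }
    }
    return occurrences;
}

/* Returns 1 if the norm is a local bound for tau on this run prefix:
 * tau occurs at most as often as the norm decreases.
 * A run with num_steps transitions visits num_steps + 1 states. */
int is_local_bound_on_run(const unsigned long *norm_values,
                          const int *transitions, size_t num_steps, int tau)
{
    return count_occurrences(transitions, num_steps, tau)
           <= count_decreases(norm_values, num_steps + 1);
}
```

Such a check only examines one concrete run; a local bound in the sense of Definition 9 has to satisfy the inequality on all runs of the program.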

Difference Constraint Programs

As discussed in the introduction, we base our algorithm on the abstract program model of difference constraint programs, which we now formally define in Definition 12. We discuss in Sect. 6 how we abstract a given program to a DCP.

Definition 10

(Variables, Symbolic Constants, Atoms) By $V$ we denote a finite set of variables. By $C$ we denote a finite set of symbolic constants. $A = V \cup C$ is the set of atoms.

Definition 11

(Difference Constraints) A difference constraint over $A$ is an inequality of form $x' \le y + c$ with $x \in V$, $y \in A$ and $c \in \mathbb{Z}$. By $DC(A)$ we denote the set of all difference constraints over $A$.

Notation We often write $x' \le y$ as a shorthand for the difference constraint $x' \le y + 0$.

Definition 12

(Difference Constraint Program, Syntax) A difference constraint program (DCP) over $A$ is a directed labeled graph $\Delta P = (L, E, l_b, l_e)$, where $L$ is a finite set of vertices, $l_b \in L$ and $l_e \in L$ and $E \subseteq L \times 2^{DC(A)} \times L$ is a finite set of edges. We write $l_1 \xrightarrow{u} l_2$ to denote an edge $(l_1, u, l_2) \in E$ labeled by a set of difference constraints $u \in 2^{DC(A)}$. We use the notation $l_1 \to l_2$ to denote an edge that is labeled by the empty set of difference constraints. $\Delta P$ is fan-in-free, if for every edge $l_1 \xrightarrow{u} l_2 \in E$ and every $v \in V$ there is at most one $a \in A$ and $c \in \mathbb{Z}$ s.t. $v' \le a + c \in u$.

Example Figure 10b shows a fan-in free DCP.
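The fan-in-free condition of Definition 12 can be checked edge by edge. The C sketch below assumes a hypothetical encoding in which atoms are small integer identifiers and an edge label is an array of constraints; the names `DiffConstraint` and `edge_is_fan_in_free` are not taken from the article.

```c
#include <assert.h>
#include <stddef.h>

/* One difference constraint lhs' <= rhs + c.
 * lhs is a variable id, rhs is an atom id (variable or symbolic constant). */
typedef struct {
    int lhs;
    int rhs;
    long c;
} DiffConstraint;

/* Returns 1 if the edge label u is fan-in free: no variable has two
 * different constraints v' <= a + c in u. Repeated identical entries
 * denote the same element of the set and are therefore tolerated. */
int edge_is_fan_in_free(const DiffConstraint *u, size_t num_constraints)
{
    size_t i, j;

    assert(u != NULL || num_constraints == 0);
    for (i = 0; i < num_constraints; i++) {
        for (j = i + 1; j < num_constraints; j++) {
            int same_constraint = u[i].rhs == u[j].rhs && u[i].c == u[j].c;
            if (u[i].lhs == u[j].lhs && !same_constraint) {
                return 0; /* two distinct upper bounds for one variable */
            }
        }
    }
    return 1;
}
```

A whole DCP is fan-in free if this check succeeds for the label of every edge.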

Fig. 10. (a) Abstraction I: DCP with guards for Example xnu. (b) Abstraction II: DCP for Example xnu, assuming $p, q, r, x \in V$ and [e-k]=p, [e-b]=q, [i-b]=r, [l-i]=x

Definition 13

(Difference Constraint Program, Semantics) The set of valuations of A is the set ValA=AN of mappings from A to the natural numbers. Let u2DC(A). We define u2(ValA×ValA) s.t. (σ,σ)u iff for all xy+cu it holds that (i) σ(x)σ(y)+c and (ii) for all sC σ(s)=σ(s). A DCP ΔP=(L,E,lb,le) is a program over the set of states ValA with locations L, entry location lb, exit location le and transitions T={l1ul2l1ul2E}.

Discussion A DCP is a program (Definition 1) whose transition relations are solely specified by conjunctions of difference constraints. Note that variables in difference constraint programs take values only over the natural numbers. Further note that we refer to the syntactic representation of the transition relation in form of a set of difference constraints by $u$, whereas by $[\![u]\!]$ we refer to the transition relation itself.

Definition 14

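(Well-defined DCP) Let $\Delta P = (L, E, l_b, l_e)$ be a DCP over atoms $A$. We say that a variable $x \in V$ is defined at $l$ if $x \in def(l)$, where $def: L \to 2^A$ is defined by $def(l) = \bigcup_{l_1 \xrightarrow{u} l \in E} \{ x \mid \exists y \in V\ \exists c \in \mathbb{Z} \text{ s.t. } x' \le y + c \in u \} \cup C$.

We say that a variable $x$ is used at $l$ if $x \in use(l)$, where $use: L \to 2^A$ is defined by $use(l) = \bigcup_{l \xrightarrow{u} l_1 \in E} \{ y \mid \exists x \in A\ \exists c \in \mathbb{Z} \text{ s.t. } x' \le y + c \in u \}$.

$\Delta P$ is well-defined iff $l_b$ has no incoming edges and for all $l \in L$ it holds that $use(l) \subseteq def(l)$.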

Discussion A DCP $\Delta P$ is well-defined if $l_b$ has no incoming edges and for all $v \in V$ it holds that $v$ is defined at all locations at which $v$ is used (symbolic constants are always defined). Note that for well-defined programs we in particular require $use(l_b) \subseteq def(l_b)$. Because $l_b$ has no incoming edges we have $def(l_b) = C$. Thus only symbolic constants can be used at $l_b$.
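The well-definedness condition can be checked directly from the edge list. The following is a small sketch under assumptions: edges are triples `(source, constraints, target)` with constraints encoded as `(x, y, c)` for $x' \le y + c$, and a variable counts as defined at a location as soon as some incoming edge constrains it (the reading given in the discussion above). The function name `is_well_defined` is hypothetical.

```python
def is_well_defined(edges, constants, entry):
    """Check that entry has no incoming edges and use(l) is a subset of def(l) everywhere."""
    locations = {l for (l1, _, l2) in edges for l in (l1, l2)} | {entry}
    defined = {l: set(constants) for l in locations}  # constants are always defined
    used = {l: set() for l in locations}
    for l1, u, l2 in edges:
        for x, y, c in u:
            defined[l2].add(x)  # x is constrained on an edge entering l2
            used[l1].add(y)     # y is read on an edge leaving l1
    no_incoming = all(l2 != entry for (_, _, l2) in edges)
    return no_incoming and all(used[l] <= defined[l] for l in locations)

# Illustrative DCP: x is initialised from the constant n, then decremented in a loop.
good = [("lb", [("x", "n", 0)], "l1"), ("l1", [("x", "x", -1)], "l1")]
# Here the variable y is read at the entry location, where only constants are defined.
bad = [("lb", [("x", "y", 0)], "l1")]

print(is_well_defined(good, {"n"}, "lb"))
print(is_well_defined(bad, {"n"}, "lb"))
```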

Throughout this work we will only consider DCPs that are fan-in free and well-defined.

Let $\Delta P(L, E, l_b, l_e)$ be a DCP over $A$. Our bound algorithm, which we start to develop in the next section, computes a bound for a given transition $\tau \in E$ in form of an expression over $A$ which involves the operators $+$, $\times$, $/$, $\min$, $\max$ and the floor function $\lfloor \cdot \rfloor$. However, note that the norms, which are treated as atoms (elements of $A$) in the abstraction, can involve arbitrary operators (see Sect. 6).

Definition 15

(Expressions over $A$) By $Expr(A)$ we denote the set of expressions over $A \cup \mathbb{Z} \cup \{\infty\}$ that are formed using the arithmetical operators addition ($+$), multiplication ($\times$), maximum ($\max$), minimum ($\min$) and integer division of form $\lfloor \frac{expr}{c} \rfloor$ where $expr \in Expr(A)$ and $c \in \mathbb{N}$. The semantics function $[\![\cdot]\!]: Expr(A) \to (Val_A \to \mathbb{Z} \cup \{\infty\})$ evaluates an expression $expr \in Expr(A)$ over a state $\sigma \in Val_A$ using the usual operator semantics (we have $a + \infty = \infty$, $\min(a, \infty) = a$, etc.).

Our bound algorithm, which we define next, computes a special case of an upper bound invariant which we call a variable bound.

Definition 16

(Variable Bound) Let $\Delta P(L, E, l_b, l_e)$ be a DCP over $A$. Let $a \in A$. We call $b$ a variable bound for $a$ if $b$ is an upper bound invariant for $a$ at all $l \in L$ with $a \in def(l)$.

Let variable x of the abstract program represent the expression expr of the concrete program. Note that by computing a variable bound for x in the abstract program, we compute an upper bound invariant for expr in the concrete program.

Algorithm

Our bound algorithm computes a bound for a given transition $\tau \in E$ based on a mapping $\zeta: E \to Expr(A)$ (called local bound mapping) which assigns each transition $\tau \in E$ either (1) a bound for $\tau$ in form of an expression over the symbolic constants (i.e., $\zeta(\tau) \in Expr(C)$) or (2) a local bound for $\tau$ in form of a variable (i.e., $\zeta(\tau) \in V$). Note that $V \cap Expr(C) = \emptyset$. In Case (1) our algorithm (Definition 19) returns $TB(\tau) = \zeta(\tau)$. In Case (2) a transition bound $TB(\tau) \in Expr(C)$ is computed by inferring how often and by how much the local transition bound $\zeta(\tau) \in V$ of $\tau$ may increase during a program run.

Definition 17

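(Local Bound Mapping) Let $\Delta P(L, E, l_b, l_e)$ be a DCP over $A$. Let $\rho = (l_b, \sigma_0) \xrightarrow{u_0} (l_1, \sigma_1) \xrightarrow{u_1} \cdots$ be a run of $\Delta P$. We call a function $\zeta: E \to Expr(A)$ a local bound mapping for $\rho$ if for all $\tau \in E$ it holds that either

  1. $\zeta(\tau) \in Expr(C)$ and $[\![\zeta(\tau)]\!](\sigma_0)$ is a bound for $\tau$ on $\rho$ or

  2. $\zeta(\tau) \in V$ and $\zeta(\tau)$ is a local bound for $\tau$ on $\rho$.

    We say that $\zeta$ is a local bound mapping for $\Delta P$ if $\zeta$ is a local bound mapping for all runs of $\Delta P$.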

Further, our bound algorithm is based on a syntactic distinction between two kinds of updates that modify the value of a given variable $v \in V$: we identify transitions which increment $v$ and transitions which reset $v$.

Definition 18

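(Resets and Increments) Let $\Delta P = (L, E, l_b, l_e)$ be a DCP over $A$. Let $v \in V$. We define the resets $R(v)$ and increments $I(v)$ of $v$ as follows:

$$R(v) = \{ (l_1 \xrightarrow{u} l_2, a, c) \in E \times A \times \mathbb{Z} \mid v' \le a + c \in u,\ a \ne v \}$$

$$I(v) = \{ (l_1 \xrightarrow{u} l_2, c) \in E \times \mathbb{N} \mid v' \le v + c \in u,\ c > 0 \}$$

Given a path $\pi$ of $\Delta P$ we say that $v$ is reset on $\pi$ if there is a transition $\tau$ on $\pi$ such that $(\tau, a, c) \in R(v)$ for some $a \in A$ and $c \in \mathbb{Z}$. We say that $v$ is incremented on $\pi$ if there is a transition $\tau$ on $\pi$ such that $(\tau, c) \in I(v)$ for some $c \in \mathbb{N}$.

I.e., we have that $(\tau, a, c) \in R(v)$ if variable $v$ is reset to a value smaller or equal to $a + c$ when executing the transition $\tau$. Accordingly we have $(\tau, c) \in I(v)$ if variable $v$ is incremented by a value smaller or equal to $c$ when executing the transition $\tau$.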
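Both sets can be read off the constraint sets syntactically. The sketch below assumes edges are given as a list of triples `(source, constraints, target)`, identifies a transition by its list index, and encodes $x' \le y + c$ as `(x, y, c)`; the function name and the sample program are hypothetical.

```python
def resets_and_increments(edges, v):
    """Return (R(v), I(v)) for variable v, identifying transitions by their index."""
    resets, increments = [], []
    for tau, (l1, u, l2) in enumerate(edges):
        for x, a, c in u:
            if x != v:
                continue
            if a != v:
                resets.append((tau, a, c))   # v' <= a + c with a != v
            elif c > 0:
                increments.append((tau, c))  # v' <= v + c with c > 0
            # v' <= v + c with c <= 0 is neither a reset nor an increment
    return resets, increments

# Illustrative program: r is set to n, incremented by 2, decremented, and copied into p.
edges = [
    ("lb", [("r", "n", 0)], "l1"),
    ("l1", [("r", "r", 2)], "l1"),
    ("l1", [("r", "r", -1)], "l1"),
    ("l1", [("p", "r", 0)], "l2"),
]

print(resets_and_increments(edges, "r"))
print(resets_and_increments(edges, "p"))
```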

Our algorithm in Definition 19 is built on a mutual recursion between the two functions VB(v) and TB(τ), where VB(v) infers a variable bound for variable v and TB(τ) infers a transition bound for the transition τ.

Definition 19

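(Bound Algorithm) Let $\Delta P(L, E, l_b, l_e)$ be a DCP over $A$. Let $\zeta: E \to Expr(A)$. We define $VB: A \to Expr(A)$ and $TB: E \to Expr(A)$ as:

$$
\begin{aligned}
VB(a) &= a, \quad \text{if } a \in C, \text{ else} \\
VB(v) &= Incr(v) + \max_{(\_,a,c) \in R(v)} (VB(a) + c) \\
TB(\tau) &= \zeta(\tau), \quad \text{if } \zeta(\tau) \notin V, \text{ else} \\
TB(\tau) &= Incr(\zeta(\tau)) + \sum_{(t,a,c) \in R(\zeta(\tau))} TB(t) \times \max(VB(a) + c, 0)
\end{aligned}
$$

where $Incr(v) = \sum_{(\tau,c) \in I(v)} TB(\tau) \times c$ (we set $Incr(v) = 0$ for $I(v) = \emptyset$)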

Discussion We first explain the subroutine $Incr(v)$: with $(\tau, c) \in I(v)$ we have that a single execution of $\tau$ increments the value of $v$ by not more than $c$. $Incr(v)$ multiplies the bound for $\tau$ with the increment $c$ in order to summarize the total amount by which $v$ may be incremented over all executions of $\tau$. $Incr(v)$ thus computes a bound on the total amount by which the value of $v$ may be incremented during a program run.

The function $VB(v)$ computes a variable bound for $v$: after executing a transition $\tau$ with $(\tau, a, c) \in R(v)$, the value of $v$ is bounded by $VB(a) + c$. As long as $v$ is not reset, its value cannot increase by more than $Incr(v)$.

The function $TB(\tau)$ computes a transition bound for $\tau$ based on the following reasoning: (1) the total amount by which the local bound $\zeta(\tau)$ of transition $\tau$ can be incremented is bounded by $Incr(\zeta(\tau))$. (2) We consider a reset $(t, a, c) \in R(\zeta(\tau))$; in the worst case, a single execution of $t$ resets the local bound $\zeta(\tau)$ to $VB(a) + c$, adding $\max(VB(a) + c, 0)$ to the potential number of executions of $\tau$; in total all $TB(t)$ possible executions of $t$ add up to $TB(t) \times \max(VB(a) + c, 0)$ to the potential number of executions of $\tau$.
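The mutual recursion between VB and TB can be sketched numerically. The code below is an illustration under several assumptions, not the authors' implementation: it evaluates the bounds for concrete values of the symbolic constants instead of building symbolic expressions, it represents a DCP only by the constraint sets of its named transitions, it returns infinity when a recursive call cycles, and it uses 0 as the maximum over an empty reset set. The sample DCP and the names `make_bound_algorithm` and `times` are hypothetical; the example is loosely modeled on a loop whose counter `p` is reset to `r`.

```python
import math

def times(a, b):
    # Avoid inf * 0 = nan: a factor of 0 always yields 0.
    return 0 if a == 0 or b == 0 else a * b

def make_bound_algorithm(edges, constants, zeta):
    """edges: name -> list of (x, y, c) meaning x' <= y + c.
    constants: name -> concrete value. zeta: name -> variable, constant name or number."""
    variables = {x for u in edges.values() for (x, _, _) in u}
    active = set()  # calls currently being evaluated, used for cycle detection

    def resets(v):
        return [(t, a, c) for t, u in edges.items() for (x, a, c) in u if x == v and a != v]

    def increments(v):
        return [(t, c) for t, u in edges.items() for (x, a, c) in u if x == v and a == v and c > 0]

    def guarded(key, compute):
        if key in active:          # the recursion cycles: no finite bound is derived
            return math.inf
        active.add(key)
        try:
            return compute()
        finally:
            active.discard(key)

    def incr(v):
        return sum(times(tb(t), c) for t, c in increments(v))

    def vb(a):
        if a in constants:
            return constants[a]
        return guarded(("VB", a),
                       lambda: incr(a) + max((vb(b) + c for _, b, c in resets(a)), default=0))

    def tb(tau):
        lb = zeta[tau]
        if lb not in variables:    # bound given directly over the constants
            return constants.get(lb, lb)
        return guarded(("TB", tau),
                       lambda: incr(lb) + sum(times(tb(t), max(vb(a) + c, 0))
                                              for t, a, c in resets(lb)))

    return vb, tb

# Hypothetical DCP: t0 initialises x and r with n, t2 copies r into p,
# t3 decrements p (inner loop), t4 resets r to the constant "0".
edges = {
    "t0": [("x", "n", 0), ("r", "n", 0)],
    "t1": [("x", "x", -1)],
    "t2": [("p", "r", 0)],
    "t3": [("p", "p", -1)],
    "t4": [("r", "0", 0)],
}
zeta = {"t0": 1, "t1": "x", "t2": "x", "t3": "p", "t4": "x"}
vb, tb = make_bound_algorithm(edges, {"n": 5, "0": 0}, zeta)
print(tb("t3"))  # context-free reasoning: TB(t2) * VB(r)
```

For `n = 5` the sketch evaluates the bound of the inner transition to `n * n`, which mirrors the imprecision of context-free reasoning that reset chains are meant to address.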

Example We want to infer a bound for the loop at $l_3$ in Fig. 2. We thus compute a transition bound for $\tau_5$ (the single back edge of the loop at $l_3$). See Table 1 for details on the computation. We get $TB(\tau_5) = \max([m_1], [m_2]) + [n] \times 2$. Thus $\max(m_1, m_2) + 2n$ is a bound for the loop at $l_3$ ($m_1$, $m_2$ and $n$ have type unsigned).

Table 1.

Computation of TB(τ5) for Example twoSCCs (Fig. 2) by Definition 19


Termination Our algorithm does not terminate iff recursive calls cycle, i.e., if a call to $TB(\tau)$ resp. $VB(v)$ (indirectly) leads to a recursive call to $TB(\tau)$ resp. $VB(v)$. This can be detected easily; in this case we return the expression '$\infty$'.

We distinguish three cases of cyclic computation: (1) there is a variable $v \in V$ such that the computation of $VB(v)$ ends up calling $VB(v)$ over a number of recursive calls to $VB$. (2) There is a transition $\tau \in E$ such that the computation of $TB(\tau)$ ends up calling $TB(\tau)$ over a number of recursive calls to $TB$. (3) There is a variable $v \in V$ and a transition $\tau \in E$ such that the computation of $TB(\tau)$ calls $VB(v)$ which in turn ends up calling $TB(\tau)$ over a number of recursive calls to $VB$ and $TB$.

Case (1) occurs iff there is a cycle in the reset graph (Definition 20 in Sect. 3.3) of ΔP. In Sect. 3.4 we discuss a preprocessing that ensures absence of cycles in the reset graph and thus absence of Case (1) by renaming the program variables appropriately.

Case (2) occurs iff there is a transition $\tau_1$ with local bound $x$ that increases the local bound $y$ of a transition $\tau_2$ which in turn increases $x$. We conclude that absence of Case (2) is ensured if for all strongly connected components (SCC) $SCC$ of $\Delta P$ we can find an ordering $\tau_1, \dots, \tau_n$ of the transitions of $SCC$ such that the local bound of transition $\tau_i$ is not increased on any transition $\tau_j$ with $n \ge j > i \ge 1$. Note that the existence of such an ordering for each SCC of $\Delta P$ proves termination of $\Delta P$: it allows us to directly compose a termination proof in form of a lexicographic ranking function by ordering the respective local transition bounds accordingly.

An example for Case (3) is given in Fig. 3a. Let $\tau_1$ be the transition on which $y$ is reset to $a$. Let $\tau_2$ be the single transition of the inner loop. Assume we want to compute a loop bound for the inner loop, i.e., a transition bound for $\tau_2$ with local bound $y$. This triggers a variable bound computation for $a$ because $y$ is reset to $a$. Since $a$ is incremented on $\tau_2$, the variable bound computation for $a$ will in turn trigger a transition bound computation for $\tau_2$. Note, however, that the loop bound for the inner loop is exponential ($2^n$). We consider exponential loop bounds very rare; we did not encounter an exponential loop bound during our experiments.

Fig. 3

a Example with an exponential loop bound, b Example for which we obtain a bound expression of exponential size, transitions $\tau_{1 \le i \le n}$ have source- and target-location $l$

Complexity Our algorithm can be implemented efficiently (polynomial in the number of variables and transitions of the abstract program) using caches (dynamic programming): we set $\zeta(\tau) = TB(\tau)$ after having computed $TB(\tau)$. Accordingly we introduce a cache to store the result of a $VB$-computation. When $VB(v)$ is called we first check if the result is already in the cache before performing the computation. The computed bound expressions, however, can be of exponential size: consider the DCP $\Delta P = (\{l_b, l\}, \{\tau_0, \tau_1, \dots, \tau_n\}, l_b, l_e)$ over variables $\{x_1, x_2, \dots, x_n\}$ and constants $\{m_1, m_2, \dots, m_n\}$ shown in Fig. 3b. In fact, $TB(\tau_n) = \sum_{S \in 2^{\{1,2,\dots,n\}}} m_0 \times \prod_{i \in S} m_i$ is precise for Fig. 3b. However, the example is artificial. In our experience the computed bound expressions can, in practice, be reduced to human readable size by applying basic rules of arithmetic.

Theorem 1

(Soundness) Let $\Delta P(L, E, l_b, l_e)$ be a well-defined and fan-in free DCP over atoms $A$. Let $\zeta: E \to Expr(A)$ be a local bound mapping for $\Delta P$. Let $a \in A$ and $\tau \in E$. Let $TB(\tau)$ and $VB(a)$ be as defined in Definition 19. We have: (1) $TB(\tau)$ is a transition bound for $\tau$. (2) $VB(a)$ is a variable bound for $a$.

In the following we describe two straightforward improvements of the algorithm stated in Definition 19.

Improvement I Let $\tau \in E$. Let $v \in V$ be a local bound for $\tau$, i.e., for all runs $\rho$ of $\Delta P$ we have that $\sharp(\tau, \rho) \le {\downarrow}(v, \rho)$. Let $c \in \mathbb{N}$. Let ${\downarrow}(v, c, \rho)$ denote the number of times that the value of $v$ decreases on a run $\rho$ of $\Delta P$ by at least $c$ (refines Definition 7). If for all runs $\rho$ of $\Delta P$ we have that $\sharp(\tau, \rho) \le {\downarrow}(v, c, \rho)$ (refines Definition 9) then $\lfloor \frac{TB(\tau)}{c} \rfloor$ is a bound for $\tau$ (assuming $\zeta(\tau) = v$). In our simple abstract program model, $c \in \mathbb{N}$ is obtained syntactically from a constraint $v' \le v - c$. See Sect. 4 on how we determine relevant constraints. More details on the discussed improvement are given in [30].

Improvement II Let $\tau_1, \tau_2 \in E$ be two transitions with the same local bound, i.e., $\zeta(\tau_1) = \zeta(\tau_2)$. If $\tau_1$ and $\tau_2$ cannot be executed without decreasing the common local bound $\zeta(\tau_1)$ twice, once for $\tau_1$ and once for $\tau_2$ (e.g., $\tau_2$ and $\tau_5$ in xnuSimple, Fig. 1), we have that $\sharp(\tau_1, \rho) + \sharp(\tau_2, \rho) \le [\![TB(\tau_1)]\!](\sigma_0) = [\![TB(\tau_2)]\!](\sigma_0)$. Thus, $TB(\tau_1)$ is a bound on the number of times that $\tau_1$ and $\tau_2$ can be executed on any run of $\Delta P$. We exploit this observation: assume some $v \in V$ is incremented by $c_1$ on $\tau_1$ and by $c_2$ on $\tau_2$. For computing $Incr(v)$ we only add $TB(\tau_1) \times \max\{c_1, c_2\}$ instead of $TB(\tau_1) \times c_1 + TB(\tau_2) \times c_2$. This idea can be generalized to multiple transitions. Further details on the discussed improvement are given in [30].

Reasoning Based on Reset Chains

Consider Fig. 4. The precise bound for the loop at $l_3$ is $n$: Initially $r$ has value $n$, after we have iterated the loop at $l_3$, $r$ is set to 0. Thus the loop can only be executed in at most one iteration of the outer loop. However, our algorithm from Definition 19 infers a quadratic bound for the loop at $l_3$: as shown in Table 2 we have $TB(\tau_3) = [n] \times [n]$. We thus get $n^2$ ($n$ has type unsigned) as bound for the loop at $l_3$ in the concrete program.

Fig. 4

a Example, b Abstraction, c Reset Graph

Table 2.

Computation of $TB(\tau_3)$ for Fig. 4b by Definition 19 (without calls to $Incr$ because $I(v) = \emptyset$ and thus $Incr(v) = 0$ for Fig. 4b)

Our algorithm from Definition 19 does not take into account that r is reset to 0 after executing the loop at l3. In the following we discuss an extension of our algorithm which overcomes this imprecision by taking the context under which a transition is executed into account: we say that a transition τ2 is executed under context τ1 if transition τ1 was executed before the current execution of τ2 and after the previous execution of τ2 (if any).

As an example, consider Fig. 4b, the abstraction of Fig. 4a. We have that τ2 is always executed either under context τ0 or under context τ4. When executing τ2 under context τ0, p is set to n. But when executing τ2 under context τ4, p is set to 0. Moreover, τ2 can only be executed once under context τ0 because τ0 is executed only once.

We define the notion of a reset graph as a means to reason systematically about the context under which resets can be executed.

Definition 20

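(Reset Chain Graph) Let $\Delta P(L, E, l_b, l_e)$ be a DCP over $A$. The reset chain graph or reset graph of $\Delta P$ is the directed graph $G$ with node set $A$ and edges $\mathcal{E} = \{ (y, \tau, c, x) \mid (\tau, y, c) \in R(x) \} \subseteq A \times E \times \mathbb{Z} \times V$, i.e., each edge has a label in $E \times \mathbb{Z}$. We call $G(A, \mathcal{E})$ a reset chain DAG or reset DAG if $G(A, \mathcal{E})$ is acyclic. We call $G(A, \mathcal{E})$ a reset chain forest or reset forest if the sub-graph $G(V, \mathcal{E})$ (recall that $V \subseteq A$) is a forest. We call a finite path $\kappa = a_n \xrightarrow{\tau_n, c_n} a_{n-1} \xrightarrow{\tau_{n-1}, c_{n-1}} \cdots a_0$ in $G$ with $n > 0$ a reset chain of $\Delta P$. We say that $\kappa$ is a reset chain from $a_n$ to $a_0$. Let $n \ge i \ge j \ge 0$. By $\kappa_{[i,j]}$ we denote the sub-path of $\kappa$ that starts at $a_i$ and ends at $a_j$. We define $in(\kappa) = a_n$, $c(\kappa) = \sum_{i=1}^{n} c_i$, $trn(\kappa) = \{\tau_n, \tau_{n-1}, \dots, \tau_1\}$, and $atm(\kappa) = \{a_{n-1}, \dots, a_0\}$. $\kappa$ is sound if for all $1 \le i < n$ it holds that $a_i$ is reset on all paths from the target location of $\tau_1$ to the source location of $\tau_i$ in $\Delta P$. $\kappa$ is optimal if $\kappa$ is sound and there is no sound reset chain $\varkappa$ of length $n + 1$ s.t. $\varkappa_{[n,0]} = \kappa$. Let $v \in V$, by $\mathfrak{R}(v)$ we denote the set of optimal reset chains ending in $v$.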

Example Figure 4c shows the reset graph of Fig. 4b.
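Reset chains are simply paths in the reset graph, so they can be enumerated by walking edges backwards from the variable of interest. The sketch below assumes an acyclic reset graph given as a list of edges `(y, tau, c, x)` and uses the three edges that the text attributes to Fig. 4c. It does not check soundness or optimality, which depend on the control flow of the DCP; the function names are hypothetical.

```python
def reset_chains(reset_edges, v):
    """Enumerate all reset chains ending in v (the reset graph is assumed acyclic).
    A chain is a list of edges (a_i, tau_i, c_i, a_{i-1}) ordered from a_n down to a_0 = v."""
    chains = []

    def extend(chain):
        chains.append(chain)
        head = chain[0][0]  # current start atom a_n of the chain
        for e in reset_edges:
            if e[3] == head:  # an edge flowing into the start atom extends the chain
                extend([e] + chain)

    for e in reset_edges:
        if e[3] == v:
            extend([e])
    return chains

def summary(chain):
    """Compute in(kappa), c(kappa), trn(kappa) and atm(kappa) for a chain."""
    return {
        "in": chain[0][0],
        "c": sum(e[2] for e in chain),
        "trn": {e[1] for e in chain},
        "atm": {e[3] for e in chain},
    }

# Reset graph with edges [0] -t4,0-> [r], [n] -t0,0-> [r], [r] -t2,0-> [p].
reset_edges = [("0", "t4", 0, "r"), ("n", "t0", 0, "r"), ("r", "t2", 0, "p")]
chains = reset_chains(reset_edges, "p")
for chain in chains:
    print(summary(chain))
```

The enumeration yields three chains ending in `p`, one of length 1 and two of length 2.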

We elaborate on the notions sound and optimal below. Let us first state a basic intuition on how we employ reset chains to enhance the precision of our reasoning:

For a given reset $(\tau, a, c) \in R(v)$, the reset graph determines which atom flows into variable $v$ under which context. For example, consider Fig. 4b and its reset graph in Fig. 4c: when executing the reset $(\tau_2, [r], 0) \in R([p])$ under the context $\tau_4$, $[p]$ is set to $[0]$, if the same reset is executed under the context $\tau_0$, $[p]$ is set to $[n]$. Note that the reset graph does not represent increments of variables. We discuss how we handle increments in Sect. 3.3.1.

Let $v \in V$. Given a reset chain $\kappa$ of length $n$ that ends at $v$, we say that $(trn(\kappa), in(\kappa), c(\kappa))$ is a reset of $v$ with context of length $n - 1$. I.e., $R(v)$ from Definition 18 is the set of context-free resets of $v$ (context of length 0), because $(trn(\kappa), in(\kappa), c(\kappa)) \in R(v)$ iff $\kappa$ ends at $v$ and has length 1. Our previously defined algorithm from Definition 19 uses only context-free resets; we say that it reasons context free. For reasoning with context, we substitute the term

$$\sum_{(t,a,c) \in R(\zeta(\tau))} TB(t) \times \max(VB(a) + c, 0)$$

in Definition 19 by the term

$$\sum_{\kappa \in \mathfrak{R}(\zeta(\tau))} TB(trn(\kappa)) \times \max(VB(in(\kappa)) + c(\kappa), 0),$$

where $\mathfrak{R}(v)$ denotes the set of optimal reset chains ending in $v$ (Definition 20).

Note that we can compute a bound on the number of times that a sequence $\tau_1, \tau_2, \dots, \tau_n$ of transitions may occur on a run by computing $\min_{1 \le i \le n} TB(\tau_i)$.

We now discuss how our algorithm infers the linear bound for $\tau_3$ of Fig. 4 when applying the described modification to Definition 19: the reset graph of Fig. 4b is shown in Fig. 4c. There are 3 reset chains ending in $[p]$: $\kappa_1 = [0] \xrightarrow{\tau_4, 0} [r] \xrightarrow{\tau_2, 0} [p]$, $\kappa_2 = [n] \xrightarrow{\tau_0, 0} [r] \xrightarrow{\tau_2, 0} [p]$ and $\kappa_3 = [r] \xrightarrow{\tau_2, 0} [p]$. However, $\kappa_3$ is a sub-path of $\kappa_1$ and $\kappa_2$. Note that $\kappa_1$ and $\kappa_2$ are sound by Definition 20 because $[r]$ is reset on all paths from the target location $l_3$ of $\tau_2$ to the source location $l_2$ of $\tau_2$ in Fig. 4b (namely on $\tau_4$). $\kappa_1$ and $\kappa_2$ are both optimal because they are sound and of maximal length (we discuss the notions sound and optimal next). Thus $\mathfrak{R}([p]) = \{\kappa_1, \kappa_2\}$. Basing our analysis on the optimal reset chains $\mathfrak{R}([p])$ rather than the context-free resets $R([p])$, our approach reasons as shown in Table 3. We get $TB(\tau_3) = [n]$, i.e., we get the bound $n$ ($n$ has type unsigned) for the loop at $l_3$ in the concrete program (Fig. 4a).

Table 3.

Computation of $TB(\tau_3)$ for Fig. 4b by Definition 21 (without calls to $Incr$ because $I(v) = \emptyset$ and thus $Incr(v) = 0$ for Fig. 4b)

Sound and Optimal Reset Paths A given reset chain $a_n \xrightarrow{\tau_n, c_n} a_{n-1} \xrightarrow{\tau_{n-1}, c_{n-1}} \cdots \xrightarrow{\tau_1, c_1} a_0$ is sound if in between any two executions of $\tau_1$ all atoms on the path (but not necessarily $a_n$ where the path starts and $a_0$ where it ends) are reset: Assume that $r$ in Fig. 4a would not be reset after executing the inner loop. Then we could repeat the reset of $p$ to $r$ without resetting $r$ to 0, and the inner loop would have a quadratic loop bound. For the abstract program the described modification replaces the constraint $[r]' \le [0]$ on $\tau_4$ in Fig. 4b by $[r]' \le [r]$. In the modified program, $[r]$ is not reset between two executions of $\tau_2$, i.e., the reset chain $[n] \xrightarrow{\tau_0} [r] \xrightarrow{\tau_2} [p]$ is not sound. Our algorithm therefore reasons based on the reset chain $[r] \xrightarrow{\tau_2} [p]$ and obtains a quadratic bound for $\tau_3$: $TB(\tau_3) = TB(\tau_2) \times VB(r) = [n] \times [n]$. I.e., if $r$ is not reset on the outer loop this is modeled in our analysis by considering the reset chain $[r] \xrightarrow{\tau_2} [p]$ rather than the maximal reset chain $[n] \xrightarrow{\tau_0} [r] \xrightarrow{\tau_2} [p]$. Considering the maximal reset chain $[n] \xrightarrow{\tau_0} [r] \xrightarrow{\tau_2} [p]$ would be unsound in the described scenario: $\min(TB(\tau_0), TB(\tau_2)) \times [n] = [n]$ is not a valid transition bound for $\tau_3$ if $r$ is not reset to 0 between two executions of the inner loop. The optimal reset chains are the sound reset chains with maximal context, i.e., those reset chains that are sound and cannot be extended without becoming unsound.

Algorithm Based on Reset Chain Forests

In the presence of cycles in the reset graph we get infinitely many reset chains. Let us for now assume that the given program has a reset forest, i.e., that the sub-graph of the reset graph, which has nodes only in $V$, is a forest (Definition 20). Then also the complete reset graph is acyclic because $A = V \cup C$ and the nodes in $C$ cannot have incoming edges (Definition 20).

Definition 21

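(Bound Algorithm using Reset Chains (reset forest)) Let $\zeta: E \to Expr(A)$ be a local bound mapping for $\Delta P$. Let $VB: A \to Expr(A)$ be as defined in Definition 19. We override the definition of $TB: E \to Expr(A)$ in Definition 19 by stating:

$$
\begin{aligned}
TB(\tau) &= \zeta(\tau), \quad \text{if } \zeta(\tau) \notin V, \text{ else} \\
TB(\tau) &= Incr\Big(\bigcup_{\kappa \in \mathfrak{R}(\zeta(\tau))} atm(\kappa)\Big) + \sum_{\kappa \in \mathfrak{R}(\zeta(\tau))} TB(trn(\kappa)) \times \max(VB(in(\kappa)) + c(\kappa), 0)
\end{aligned}
$$

where $TB(\{\tau_1, \tau_2, \dots, \tau_n\}) = \min_{1 \le i \le n} TB(\tau_i)$ and

$Incr(\{a_1, a_2, \dots, a_n\}) = \sum_{1 \le i \le n} Incr(a_i)$ with $Incr(\emptyset) = 0$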

Discussion and Example We have discussed above why we replace the term $TB(t) \times \max(VB(a) + c, 0)$ from Definition 19 by the term $TB(trn(\kappa)) \times \max(VB(in(\kappa)) + c(\kappa), 0)$. We further discuss the term $Incr(\bigcup_{\kappa \in \mathfrak{R}(\zeta(\tau))} atm(\kappa))$ which replaces the term $Incr(\zeta(\tau))$ from Definition 19: consider Example xnuSimple in Fig. 1. Note that $r$ may be incremented on $\tau_1$ between the reset of $r$ to 0 on $\tau_0$ resp. $\tau_4$ and the reset of $p$ to $r$ on $\tau_2$. The term $Incr(\bigcup_{\kappa \in \mathfrak{R}(\zeta(\tau))} atm(\kappa))$ takes care of such increments which may increase the value that finally flows into $\zeta(\tau)$ (in the example $p$) when the last transition on $\kappa$ (in the example $\tau_2$) is executed. In Table 4 the details of the bound computation are given. We get $TB(\tau_3) = [n]$, i.e., we have the bound $n$ for the loop at $l_3$ in the concrete program (Fig. 1a, $n$ has type unsigned).

Table 4.

Computation of TB(τ3) for Example xnuSimple (Fig. 1) by Definition 21


Soundness Definition 21 for DCPs with a reset forest is a special case of Definition 23 for DCPs with a reset DAG. We prove soundness of Definition 23 in Electronic Supplementary Material.

Complexity The nodes of a reset forest are the variables and constants of the abstract program (the elements of A). Since the number of paths of a forest is polynomial in the number of nodes, the run time of our algorithm remains polynomial.

Algorithm Based on Reset Chain DAGs

The examples we considered so far had reset forests. (Note that the definition of a reset forest (Definition 20) only requires the sub-graph over the variables, i.e., the reset graph without the nodes that are symbolic constants, to be a forest.) In the following we generalize Definition 21 to reset DAGs. We discuss in Sect. 3.4 how we ensure that the reset graph is acyclic.

Consider the Example shown in Fig. 5. The outer loop (at l1) can be executed n times. The loop at l4 resp. transition τ6 can be executed 2n times, e.g., by executing the program as depicted in Table 5:

Fig. 5

a Example, b Abstraction, c Reset Graph

Table 5.

Run of Figure 5


The first row counts the number of iterations of the outer loop, the second row shows the transitions that are executed and in the last two rows the values of r resp. p are tracked. The execution switches between two iteration schemes of the outer loop: an uneven iteration increments r twice (by executing τ2 twice) and afterward assigns r to p by executing τ5. We can then execute τ6 two times. Afterward the value of r is “saved” in p for the next (even) iteration of the outer loop before r is set to 0 on τ1. Therefore τ6 can be executed again two times in the next, even iteration though r is not incremented on that iteration.

Consider the abstracted DCP in Fig. 5b and its reset graph in Fig. 5c. We have that $\kappa_2 = [0] \xrightarrow{\tau_1} [r] \xrightarrow{\tau_5} [p]$ and $\kappa_3 = [0] \xrightarrow{\tau_1} [r] \xrightarrow{\tau_7} [p]$ are two reset chains ending in $[p]$ (see Fig. 5c). Observe that both are sound, i.e., between any two executions of $\tau_7$ resp. $\tau_5$ $[r]$ is reset. However, $[r]$ is not necessarily reset between the execution of $\tau_5$ and $\tau_7$, therefore the accumulated value 2 of $r$ is used twice to increase the local bound $[p]$ of $\tau_6$.

I.e., since there are two paths from $[r]$ to $[p]$ in the reset graph (Fig. 5c) we have to count the increments of $[r]$ twice: once for $\kappa_2$ and once for $\kappa_3$. Definition 22 distinguishes between nodes that have a single resp. multiple path(s) to a given variable in the reset graph. This is used in Definition 23 for a sound handling of the latter case.

Definition 22

($atm_1(\kappa)$ and $atm_2(\kappa)$) Let $\Delta P(L, E, l_b, l_e)$ be a DCP over $A$. Let $P(a, v)$ denote the set of paths from $a$ to $v$ in the reset graph of $\Delta P$. Let $v \in V$. Let $\kappa$ be a reset chain ending in $v$. We define $atm_1(\kappa) = \{ a \in atm(\kappa) \mid |P(a, v)| \le 1 \}$ and $atm_2(\kappa) = \{ a \in atm(\kappa) \mid |P(a, v)| > 1 \}$, where $|S|$ denotes the number of elements in $S$.
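The split of Definition 22 amounts to counting paths in the reset DAG. The following sketch assumes reset graph edges of the form `(y, tau, c, x)` and uses the three edges that the text describes for Fig. 5c; the edge constants (all 0 here) and the function names are assumptions, and the trivial path from a node to itself is counted as one path.

```python
from functools import lru_cache

def path_counter(reset_edges):
    """Return a function counting the paths from a to v in an acyclic reset graph."""
    @lru_cache(maxsize=None)
    def count(a, v):
        if a == v:
            return 1  # the trivial path; a DAG has no other path from v to v
        return sum(count(e[3], v) for e in reset_edges if e[0] == a)
    return count

def split_atoms(atoms, v, count):
    """Split atm(kappa) into atm1 (at most one path to v) and atm2 (several paths to v)."""
    atm1 = {a for a in atoms if count(a, v) <= 1}
    atm2 = {a for a in atoms if count(a, v) > 1}
    return atm1, atm2

# Edges [0] -t1-> [r], [r] -t5-> [p], [r] -t7-> [p]; the constants 0 are assumed.
reset_edges = [("0", "t1", 0, "r"), ("r", "t5", 0, "p"), ("r", "t7", 0, "p")]
count = path_counter(reset_edges)
print(count("r", "p"))
print(split_atoms({"r", "p"}, "p", count))
```

Because `r` reaches `p` along two different edges, it lands in the second set, so its increments are counted once per chain.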

Definition 23

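(Bound Algorithm Based on Reset Chains (reset DAG)) Let $\Delta P(L, E, l_b, l_e)$ be a DCP over $A$. Let $\zeta: E \to Expr(A)$ be a local bound mapping for $\Delta P$. Let $VB: A \to Expr(A)$ be as defined in Definition 19. We override the definition of $TB: E \to Expr(A)$ in Definition 19 by stating:

$$
\begin{aligned}
TB(\tau) &= \zeta(\tau), \quad \text{if } \zeta(\tau) \notin V, \text{ else} \\
TB(\tau) &= Incr\Big(\bigcup_{\kappa \in \mathfrak{R}(\zeta(\tau))} atm_1(\kappa)\Big) + \sum_{\kappa \in \mathfrak{R}(\zeta(\tau))} \Big( TB(trn(\kappa)) \times \max(VB(in(\kappa)) + c(\kappa), 0) + Incr(atm_2(\kappa)) \Big)
\end{aligned}
$$

where $TB(\{\tau_1, \tau_2, \dots, \tau_n\}) = \min_{1 \le i \le n} TB(\tau_i)$ and

$Incr(\{a_1, a_2, \dots, a_n\}) = \sum_{1 \le i \le n} Incr(a_i)$ with $Incr(\emptyset) = 0$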

Discussion If $atm_2(\kappa) = \emptyset$ for all reset chains $\kappa$, Definition 23 is equal to Definition 21. This is the case for all DCPs with a reset forest (all examples in this article except Fig. 5). Definition 23 thus is a generalization of Definition 21.

Example As shown in Table 6 we get TB(τ6)=[n]+[n] for Fig. 5 by Definition 23. I.e., we get the precise bound 2n for the loop at l4 in Fig. 5 (n has type unsigned).

Table 6.

Computation of TB(τ6) for Fig. 5 by Definition 23


Theorem 2

(Soundness of Bound Algorithm using Reset Chains) Let $\Delta P(L, E, l_b, l_e)$ be a well-defined and fan-in free DCP over atoms $A$. Let $\zeta: E \to Expr(A)$ be a local bound mapping for $\Delta P$. Let $TB$ and $VB$ be defined as in Definition 23. Let $\tau \in E$ and $a \in A$. If $\Delta P$ has a reset DAG then (1) $TB(\tau)$ is a transition bound for $\tau$ and (2) $VB(a)$ is a variable bound for $a$.

Proof

See Electronic Supplementary Material.

Complexity A DAG can have exponentially many paths in the number of nodes. Thus there can be exponentially many reset chains in $\mathfrak{R}(v)$ (exponential in the number of variables and constants of the abstract program, i.e., the norms generated during the abstraction process, see Sect. 6). However, in our experiments enumeration of (optimal) reset chains did not affect performance. (See also our discussion on scalability in Sect. 10.1.)

Preprocessing: Transforming a Reset Graph into a Reset DAG

Consider the DCP shown in Fig. 6a. Figure 6a has a cyclic reset graph as shown in Fig. 6b. In the following we describe an algorithm which transforms Fig. 6a into Fig. 6d by renaming the program variables. Figure 6d has an acyclic reset graph (a reset DAG).

Fig. 6

a Example, b Reset Graph, c Variable Flow Graph, d Variables Renamed

Definition 24

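(Variable Flow Graph) Let $\Delta P(L, E, l_b, l_e)$ be a DCP over $A$. We call the graph with node set $V \times L$ and edge set

$$\{ (y, l_1) \to (x, l_2) \mid l_1 \xrightarrow{u} l_2 \in E \wedge x' \le y + c \in u \text{ with } x, y \in V \}$$

the variable flow graph.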

For an example see Fig. 6c.

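Let $\Delta P(L, E, l_b, l_e)$ be a DCP. Let $\{SCC_1, SCC_2, \dots, SCC_n\}$ be the strongly connected components of its variable flow graph. For each SCC $SCC_i$ we choose a fresh variable $v_i \in V$. Let $\varsigma: V \times L \to V$ be the mapping $\varsigma(v, l) = v_i$, where $i$ s.t. $(v, l) \in SCC_i$. We extend $\varsigma$ to $A \times L \to A$ by defining $\varsigma(s, l) = s$ for all $l \in L$ and $s \in C$.

We obtain $\Delta P'(L, E', l_b, l_e)$ from $\Delta P$ by setting $E' = \{ l_1 \xrightarrow{u'} l_2 \mid l_1 \xrightarrow{u} l_2 \in E \}$, where $u'$ is obtained from $u$ by generating the constraint $\varsigma(x, l_2)' \le \varsigma(y, l_1) + c$ from a constraint $x' \le y + c \in u$.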

Examples Figure 6d is obtained from Fig. 6a by applying the described transformation using the mapping ς(x,l1)=ς(y,l2)=z.

Soundness Soundness of the described variable renaming is obvious if there are no two (different) variables $v_1$ and $v_2$ that are renamed to the same fresh variable at some location $l$. This is the case if in each SCC of the variable flow graph each location $l \in L$ appears at most once, i.e., if there is no SCC $SCC$ in the variable flow graph of the program such that there is a location $l \in L$ and variables $v_1, v_2 \in V$ with $v_1 \ne v_2$ and $(v_1, l) \in SCC$ and $(v_2, l) \in SCC$. In the literature, a program with this property is called stratifiable (e.g., [5]). A fan-in free DCP that is not stratifiable can be transformed into a stratifiable and fan-in free DCP by introducing appropriate case distinctions into the control flow of the program. Details are given in [30]. In the worst case, however, this transformation can cause an exponential blow up of the number of transitions in the program (the size of the control flow graph).
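The renaming can be sketched as follows. This is an illustration under assumptions: SCCs of the variable flow graph are found by a simple mutual-reachability test rather than a linear-time algorithm, fresh variables are named `z0`, `z1`, ..., and the sketch does not check whether the program is stratifiable. The two-location example is hypothetical; it has a cyclic reset graph (`x` is reset to `y` and `y` to `x`).

```python
def rename_variables(edges, variables):
    """Rename each (variable, location) pair to a fresh variable per SCC of the variable flow graph."""
    nodes = {(v, l) for (l1, _, l2) in edges for l in (l1, l2) for v in variables}
    succ = {n: set() for n in nodes}
    for l1, u, l2 in edges:
        for x, y, c in u:
            if x in variables and y in variables:
                succ[(y, l1)].add((x, l2))  # y at l1 flows into x at l2

    def reachable(start):
        seen, stack = {start}, [start]
        while stack:
            for m in succ[stack.pop()]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        return seen

    reach = {n: reachable(n) for n in nodes}
    name = {}
    for n in sorted(nodes):
        if n not in name:
            fresh = f"z{len(set(name.values()))}"
            for m in nodes:
                if m in reach[n] and n in reach[m]:  # m and n are in the same SCC
                    name[m] = fresh

    def sigma(a, l):
        return name[(a, l)] if a in variables else a  # constants keep their name

    return [(l1, [(sigma(x, l2), sigma(y, l1), c) for x, y, c in u], l2)
            for l1, u, l2 in edges]

# Hypothetical DCP: y' <= x on l1 -> l2 and x' <= y on l2 -> l1.
edges = [("l1", [("y", "x", 0)], "l2"), ("l2", [("x", "y", 0)], "l1")]
renamed = rename_variables(edges, {"x", "y"})
print(renamed)
```

After renaming, both constraints have the form `z0' <= z0`, so the renamed program contains no resets between distinct variables and its reset graph is acyclic.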

Finding Local Bounds

In this section we describe our algorithm for finding local bounds.

Intuition Let $\tau = l_1 \xrightarrow{u} l_2 \in E$ and $v \in V$. Clearly, $v$ is a local bound for $\tau$ if $v$ decreases when executing $\tau$, i.e., if $v' \le v + c \in u$ for some $c < 0$. Moreover, $v$ is a local bound for $\tau$, if every time $\tau$ is executed also some other transition $t \in E$ is executed and $v$ is a local bound for $t$. This is, e.g., the case if $t$ is always executed either before each execution of $\tau$ or after each execution of $\tau$.

Algorithm The above intuition can be turned into a simple three-step algorithm. Let $\Delta P(L, E, l_b, l_e)$ be a DCP. (1) We set $\zeta(\tau) = 1$ for all transitions $\tau$ that do not belong to a strongly connected component (SCC) of $\Delta P$. (2) Let $v \in V$. We define $\xi(v) \subseteq E$ to be the set of all transitions $\tau = l_1 \xrightarrow{u} l_2 \in E$ such that $v' \le v + c \in u$ for some $c < 0$. For all $\tau \in \xi(v)$ we set $\zeta(\tau) = v$. (3) Let $v \in V$ and $\tau \in E$. Assume $\tau$ was not yet assigned a local bound by (1) or (2). We set $\zeta(\tau) = v$ if $\tau$ does not belong to a strongly connected component (SCC) of the directed graph $(L, E')$ where $E' = E \setminus \xi(v)$ (the control flow graph of $\Delta P$ without the transitions in $\xi(v)$).

If there are $v_1 \ne v_2$ s.t. $\tau \in \xi(v_1) \cap \xi(v_2)$ then $\zeta(\tau)$ is assigned either $v_1$ or $v_2$ non-deterministically. An alternative way of handling this case is as follows: we generate two local bound mappings, $\zeta_1$ and $\zeta_2$ where $\zeta_1(\tau) = v_1$ and $\zeta_2(\tau) = v_2$. This way we can systematically enumerate all possible choices: we apply our bound algorithm once based on $\zeta_1$, once based on $\zeta_2$, etc., and finally take the minimum over all computed bounds. In our implementation, however, we follow the aforementioned greedy approach based on non-deterministic choice.

Discussion on Soundness Soundness of Steps (1) and (2) is obvious. We discuss soundness of Step (3): let $\tau \in E$. If $\tau$ does not belong to an SCC of $(L, E \setminus \xi(v))$ we have that some transition in $\xi(v)$ (which decreases $v$) has to be executed in between any two executions of $\tau$. It remains to ensure that there is a decrease of $v$ also for the last execution of $\tau$: for special cases this is unfortunately not the case. Consider Fig. 8b (Sect. 5). The above stated algorithm sets $\zeta(\tau_1) = [x]$. However, $[x]$ is not a local bound for $\tau_1$ of Fig. 8b because there is no decrease of $[x]$ for the last execution of $\tau_1$ (before executing $\tau_3$).

Fig. 8

a Example with a “break”-statement, b DCP obtained by abstraction

It is straightforward to ensure soundness of the algorithm: adding an edge from $l_e$ to $l_b$ forces the algorithm to take the last execution of a transition into account. I.e., we set $E' = E \cup \{l_e \to l_b\} \setminus \xi(v)$. Now our algorithm fails to find a local bound for $\tau_1$ of Fig. 8b, which is sound. We discuss how we handle the example in Fig. 8 in Sect. 4.1.

Complexity Steps (1) and (2) can be implemented in linear time. Step (3): for each $v \in V$ we need to compute the SCCs of $(L, E \setminus \xi(v))$. It is well known that SCCs can be computed in linear time (linear in the number of edges and nodes). Since we need to perform one SCC computation per variable, Step (3) is quadratic.
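The three steps, including the additional edge from the exit to the entry location, can be sketched as follows. The sketch rests on assumptions: transitions are named and stored as `(source, constraints, target)`, a transition is considered part of an SCC iff its source is reachable from its target, ties between candidate variables are broken by alphabetical order instead of non-deterministically, and reachability is recomputed per query rather than via one linear-time SCC computation per variable. The example DCP and the name `find_local_bounds` are hypothetical.

```python
def find_local_bounds(edges, variables, entry, exit_):
    """edges: name -> (source, constraints, target); constraints are (x, y, c) for x' <= y + c."""

    def on_cycle(tau, removed, add_back_edge):
        # tau lies on a cycle iff its source is reachable from its target.
        src, _, dst = edges[tau]
        succ = {}
        for t, (a, _, b) in edges.items():
            if t not in removed:
                succ.setdefault(a, set()).add(b)
        if add_back_edge:
            succ.setdefault(exit_, set()).add(entry)
        seen, stack = {dst}, [dst]
        while stack:
            for m in succ.get(stack.pop(), ()):
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        return src in seen

    # xi(v): transitions that decrease v
    xi = {v: {t for t, (_, u, _) in edges.items()
              if any(x == v and y == v and c < 0 for x, y, c in u)}
          for v in variables}

    zeta = {}
    for tau in edges:
        if not on_cycle(tau, set(), False):       # Step (1): not part of any loop
            zeta[tau] = 1
            continue
        decreasing = [v for v in sorted(variables) if tau in xi[v]]
        if decreasing:                            # Step (2): tau itself decreases v
            zeta[tau] = decreasing[0]
            continue
        for v in sorted(variables):               # Step (3): removing xi(v) breaks all cycles
            if not on_cycle(tau, xi[v], True):
                zeta[tau] = v
                break
    return zeta

# Hypothetical loop: t1 decrements x, t2 is the back edge, t3 leaves the loop.
edges = {
    "t0": ("lb", [("x", "n", 0)], "l1"),
    "t1": ("l1", [("x", "x", -1)], "l2"),
    "t2": ("l2", [], "l1"),
    "t3": ("l1", [], "le"),
}
zeta = find_local_bounds(edges, {"x"}, "lb", "le")
print(zeta)
```

Transitions for which no local bound is found are simply missing from the returned mapping; a caller would have to treat them as unbounded or fall back to local bound sets.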

Generalizing Local Bounds to Sets of Local Bounds

Consider the example in Fig. 7. In Fig. 7b the DCP obtained by abstraction (Sect. 6) from the program in Fig. 7a is shown. We have that x is a local bound for τ1 and y is a local bound for τ2. However, it is not straightforward to find a local bound for τ3: in order to form a local bound for τ3 we need to combine x and y to a linear combination, e.g., 2x+y. It is unclear how to automatically come up with such expressions.

Fig. 7

a Example, b DCP obtained by abstraction

In the following we discuss a simple generalization of our algorithm by which we avoid an explicit composition of local bounds.

We generalize the local bound mapping $\zeta: E \to Expr(A)$ (Definition 17) to a local bound set mapping $\zeta: E \to 2^{Expr(A)}$.

Definition 25

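(Local Bound Set Mapping) Let $\Delta P(L, E, l_b, l_e)$ be a DCP over $A$. Let $\rho = (l_b, \sigma_0) \xrightarrow{u_0} (l_1, \sigma_1) \xrightarrow{u_1} \cdots$ be a run of $\Delta P$. We call a function $\zeta: E \to 2^{Expr(A)}$ a local bound set mapping for $\rho$ if for all $\tau \in E$ it holds that

$$\sharp(\tau, \rho) \le \Big( \sum_{v \in \zeta(\tau) \cap V} {\downarrow}(v, \rho) \Big) + \sum_{expr \in \zeta(\tau) \setminus V} [\![expr]\!](\sigma_0).$$

We say that $\zeta$ is a local bound set mapping for $\Delta P$ if $\zeta$ is a local bound set mapping for all runs of $\Delta P$.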

Example For Fig. 7b we have that $\zeta: E \to 2^{Expr(A)}$ with $\zeta(\tau_0) = \{1\}$, $\zeta(\tau_1) = \{[y]\}$, $\zeta(\tau_2) = \{[x]\}$ and $\zeta(\tau_3) = \{[x], [y]\}$ is a local bound set mapping.

We generalize the transition bound algorithm $TB$ to local bound set mappings by summing up over all $expr \in \zeta(\tau)$. We exemplify the generalization by extending Definition 19.

Definition 26

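(Bound Algorithm based on Local Bound Sets) Let $\Delta P(L, E, l_b, l_e)$ be a DCP over $A$. Let $\zeta: E \to 2^{Expr(A)}$. Let $VB: A \to Expr(A)$ be defined as in Definition 19. We define $TB: E \to Expr(A)$ as:

$$
\begin{aligned}
TB(\tau) &= \sum_{\mathit{lb} \in \zeta(\tau)} TB(\mathit{lb}) \\
TB(\mathit{lb}) &= \mathit{lb}, \quad \text{if } \mathit{lb} \notin V, \text{ else} \\
TB(\mathit{lb}) &= Incr(\mathit{lb}) + \sum_{(t,a,c) \in R(\mathit{lb})} TB(t) \times \max(VB(a) + c, 0)
\end{aligned}
$$

where $Incr(v) = \sum_{(\tau,c) \in I(v)} TB(\tau) \times c$ (we set $Incr(v) = 0$ for $I(v) = \emptyset$).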

Example For Fig. 7 we get $TB(\tau_3) = 2n$; details are shown in Table 7 ($[n] = n$ because $n$ has type unsigned).

Table 7.

Computation of TB(τ3) for Fig. 7b by Definition 26


Inferring a Local Bound Set Mapping The algorithm for finding local bounds can be easily extended for finding local bound sets: steps (1) and (2) remain unchanged. Step (3) is generalized as follows: let v1,,vkV and τ=l1ul2E. We set ζ(τ)={v1,,vk} if it holds that for each execution of τ a transition in ξ(v1)ξ(vk) is executed. This can be implemented by checking, if τ does not belong to a strongly connected component (SCC) SCC of the directed graph (L,E) where E=E{lelb}\(ξ(v1)ξ(vk)).

Note that Step (3) is parametrized in the number kN of variables considered. For obvious reasons it is preferable to find local bound sets of minimal size. Given a transition τ, we therefore first try to find a local bound set of size k=1 for τ and increment k only if the search fails. With a fixed limit for k the complexity of our procedure for finding local bounds remains polynomial. To our experience limiting k to 3 is sufficient in practice.

Handling break statements Consider Fig. 8a. The loop (resp. its back-edge) can be executed $n$ times; the skip instruction (a placeholder for some code of interest), however, can be executed $n+1$ times. Consider the abstraction shown in Fig. 8b. Our algorithm for finding local bounds, as discussed so far, fails to find a local bound (set) for $\tau_1$ (modeling the skip instruction). We extend the algorithm as follows: we set $\xi(1)=\{\tau \in E \mid \tau \text{ is not part of any SCC}\}$. I.e., for Fig. 8b we set $\xi(1)=\{\tau_0,\tau_3\}$. We add $1 \in Expr(A)$ to the set of “variables” $v_1,\dots,v_k$. I.e., for our example we have $v_1=[x]$ and $v_2=1$. The algorithm now computes $\zeta(\tau_1)=\{[x],1\}$ for $k=2$ given that $\xi([x])=\{\tau_2\}$. Based on $\zeta(\tau_1)=\{[x],1\}$ our algorithm from Definition 26 correctly infers $TB(\tau_1)=[n]+1=n+1$ ($n$ has type unsigned).
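To make the summation over local bound sets concrete, the following Python sketch evaluates Definition 26 numerically on a hypothetical DCP shaped like the break example (it is not a transcription of Fig. 8b). Assumptions of the sketch: symbolic constants are given concrete values, all resets take their value from a symbolic constant so that $VB$ is just a lookup (the general $VB$ of Definition 19 is not reproduced), and the dictionaries `zeta`, `resets` and `increments` are written by hand.

```python
def make_transition_bound(zeta, resets, increments, variables, constants):
    """Build TB for a DCP given as plain dictionaries (illustrative only)."""

    def value(expr):
        # Integer literals stand for themselves, symbolic constants are looked up.
        return expr if isinstance(expr, int) else constants[expr]

    def vb(atom):
        if atom in variables:
            raise NotImplementedError("VB for variables follows Definition 19")
        return value(atom)

    def incr(v):
        # Incr(v) = sum of TB(t) * c over all increments (t, c) of v
        return sum(tb(t) * c for t, c in increments.get(v, []))

    def tb_local(lb):
        if lb not in variables:
            return value(lb)
        return incr(lb) + sum(tb(t) * max(vb(a) + c, 0)
                              for t, a, c in resets.get(lb, []))

    def tb(tau):
        # Definition 26: sum over all members of the local bound set
        return sum(tb_local(lb) for lb in zeta[tau])

    return tb

# Hypothetical DCP: t0 resets x to n, t2 decrements x, t1 is the loop body
# that runs once more than the back-edge, t3 leaves the loop.
zeta = {"t0": [1], "t1": ["x", 1], "t2": ["x"], "t3": [1]}
resets = {"x": [("t0", "n", 0)]}   # x' <= n + 0 on t0
increments = {}                    # x is never incremented
tb = make_transition_bound(zeta, resets, increments,
                           variables={"x"}, constants={"n": 5})

print(tb("t2"), tb("t1"))
```

With $n=5$ the back-edge `t2` is bounded by $n$ and the loop body `t1` by $n+1$, matching the informal argument above.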

Combined Bound Algorithm

We have developed our algorithm for computing transition bounds and variable bounds on DCPs step-wise in Sect. 3 (Definitions 19, 21, 23), in each step adding new features to the algorithm. In this sense Definition 23 subsumes Definitions 21 and 19. We now combine Definition 23 with the extension to sets of local bounds (Sect. 4.1) and obtain Definition 27.

Definition 27

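(Combined Bound Algorithm) Let $\Delta P(L,E,l_b,l_e)$ be a DCP over $A$. Let $\zeta: E \rightarrow 2^{Expr(A)}$. Let $VB: A \rightarrow Expr(A)$ be defined as in Definition 19. We override the definition of $TB: E \rightarrow Expr(A)$ in Definition 19 by stating:

$$TB(\tau)=\sum_{\mathit{lb} \in \zeta(\tau)} TB(\mathit{lb})$$

$$TB(\mathit{lb})=\mathit{lb}, \text{ if } \mathit{lb} \notin V, \text{ else}$$

$$TB(\mathit{lb})=Incr\Big(\bigcup_{\kappa\in\mathfrak{R}(\mathit{lb})} atm_1(\kappa)\Big)+\sum_{\kappa\in\mathfrak{R}(\mathit{lb})} TB(trn(\kappa))\times\max(VB(in(\kappa))+c(\kappa),0)+Incr(atm_2(\kappa))$$

where $TB(\{\tau_1,\tau_2,\dots,\tau_n\})=\min_{1\le i\le n} TB(\tau_i)$ and

$Incr(\{a_1,a_2,\dots,a_n\})=\sum_{1\le i\le n} Incr(a_i)$ (we set $Incr(\emptyset)=0$) and

$Incr(v)=\sum_{(\tau,c)\in \mathcal{I}(v)} TB(\tau)\times c$ (we set $Incr(v)=0$ for $\mathcal{I}(v)=\emptyset$).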

We introduced and discussed the terms from which Definition 27 is composed in Sects. 3 and 4.1.

Soundness Soundness of Definition 27 results from Theorem 2 (proven in the Appendix) and the discussion in Sect. 4.1. Note that Definition 27 is only sound for DCPs that have a reset DAG. We have described in Sect. 3.4 how to transform a given DCP into a DCP with a reset DAG.

Program Abstraction

In the following we discuss how we abstract a given program to a DCP.

Definition 28

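(Difference Constraint Invariants) Let $P(L,T,l_e,l_b)$ be a program over states $\Sigma$. Let $e_1,e_2,e_3$ be norms, i.e., $e_1,e_2,e_3: \Sigma \rightarrow \mathbb{Z}$, and let $c \in \mathbb{Z}$ be some integer. We say $e_1' \le e_2+e_3$ is invariant on a transition $l_1\xrightarrow{\lambda}l_2 \in T$, if $e_1(\sigma_2) \le e_2(\sigma_1)+e_3(\sigma_1)$ holds for all $(\sigma_1,\sigma_2) \in \lambda$.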

Definition 29

(DCP Abstraction of a Program) Let $P=(L,T,l_b,l_e)$ be a program and let $N$ be a set of functions from the states to the natural numbers, i.e., $N \in 2^{\Sigma\rightarrow\mathbb{N}}$. A DCP $\Delta P=(L,E,l_b,l_e)$ over atoms $N$ is an abstraction of the program $P$ iff for each transition $l_1\xrightarrow{\lambda}l_2 \in T$ there is a transition $l_1\xrightarrow{u}l_2 \in E$ s.t. every $e_1' \le e_2+c \in u$ is invariant on $l_1\xrightarrow{\lambda}l_2$.

Our abstraction algorithm proceeds in two steps: we first abstract a given concrete program to a DCP with integer semantics, in a second step we then further abstract the integer-DCP to a DCP over the natural numbers (as defined in Definition 12).

Abstraction I: DCPs with Integer Semantics

We extend our abstract program model from Definition 12 to the non-well-founded domain $\mathbb{Z}$ by adding guards to the transitions of the program.

Syntax of DCPs with guards The edges $E$ of a DCP with guards $\Delta P_G(L,E,l_b,l_e)$ are a subset of $L\times 2^{V}\times 2^{DC(A)}\times L$. I.e., an edge of a DCP with guards is of form $l_1\xrightarrow{g,u}l_2$ with $l_1,l_2 \in L$, $g \in 2^{V}$ and $u \in 2^{DC(A)}$.

Example See Fig. 10a in Sect. 7.1 for an example.

Semantics of DCPs with guards We extend the range of the valuations $Val_A$ of $A$ from $\mathbb{N}$ to $\mathbb{Z}$. Let $u \in 2^{DC(A)}$. Let $[\![u]\!]$ be as defined in Definition 13. Let $g \in 2^{V}$. We define $[\![g,u]\!]=\{(\sigma_1,\sigma_2) \in [\![u]\!] \mid \sigma_1(v)>0 \text{ for all } v \in g\}$. A guarded DCP $\Delta P_G=(L,E,l_b,l_e)$ is a program over the set of states $Val_A$ with locations $L$, entry location $l_b$, exit location $l_e$ and transitions $T=\{l_1\xrightarrow{[\![g,u]\!]}l_2 \mid l_1\xrightarrow{g,u}l_2 \in E\}$.

I.e., a transition $l_1\xrightarrow{g,u}l_2$ of a DCP with guards can only be executed if the values of all $v \in g$ are greater than 0.
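The guarded semantics can be illustrated with a small membership test. The sketch below is an assumption-laden simplification: valuations are Python dictionaries, a difference constraint $x' \le y+c$ is the triple `(x, y, c)`, and only the constraints in $u$ and the guards in $g$ are checked (any further side conditions of Definition 13, which is not reproduced here, are ignored). The function names are made up for the example.

```python
def satisfies_constraints(u, sigma1, sigma2):
    """(sigma1, sigma2) satisfies every x' <= y + c in u."""
    return all(sigma2[x] <= sigma1[y] + c for x, y, c in u)

def in_guarded_relation(g, u, sigma1, sigma2):
    """Membership in [[g, u]]: constraints hold and all guards are positive before the step."""
    return (satisfies_constraints(u, sigma1, sigma2)
            and all(sigma1[v] > 0 for v in g))

# A guarded decrement x' <= x - 1 with guard {x}
g = {"x"}
u = [("x", "x", -1)]

step_ok = in_guarded_relation(g, u, {"x": 3}, {"x": 2})
step_blocked = in_guarded_relation(g, u, {"x": 0}, {"x": -1})
print(step_ok, step_blocked)
```

The second step satisfies the difference constraint but is rejected because the guard $x>0$ does not hold in the pre-state. This is what makes the decrement usable for termination arguments over $\mathbb{Z}$.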

Definition 30

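(Guard) Let $P(L,T,l_e,l_b)$ be a program over states $\Sigma$. Let $e$ be a norm ($e: \Sigma \rightarrow \mathbb{Z}$), let $c \in \mathbb{Z}$. We say $e$ is a guard of $l_1\xrightarrow{\lambda}l_2 \in T$ if $e(\sigma_1)>0$ holds for all $(\sigma_1,\sigma_2) \in \lambda$.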

We abstract a program $P=(L,T,l_b,l_e)$ to a DCP with guards $\Delta P_G=(L,E,l_b,l_e)$ as follows:

  1. Choosing an initial set of Norms We aim at creating a suitable abstract program for bound analysis. In our non-recursive setting complexity evolves from iterating loops. Therefore we search for expressions which limit the number of loop iterations. We consider conditions of form $a>b$ resp. $a \ge b$ found in loop headers or on loop-paths if they involve loop counter variables, i.e., variables which are incremented and/or decremented inside the loop. Such conditions are likely to limit the consecutive execution of single or multiple loop-paths. From each condition of form $a>b$ we create the integer expression $a-b$, from each condition of form $a \ge b$ we create the integer expression $a+1-b$. These expressions form our initial set of norms $N$. Note that on those transitions on which $a>b$ holds, $a-b>0$ must hold, whereas with $a \ge b$ we have $a+1-b>0$.

    In $\Delta P_G$ we interpret a norm $e \in N$ from our initial set of norms $N$ as variable, i.e., we have $e \in V$ for all $e \in N$.

  2. Abstracting Transitions For each transition $l_1\xrightarrow{\lambda}l_2 \in T$ we generate a set $u_\lambda$ of difference constraints: initially we set $u_\lambda=\emptyset$ for all transitions $l_1\xrightarrow{\lambda}l_2 \in T$.

    We repeat the following construction until the set of norms $N$ becomes stable: For each $e_1 \in N$ and for each $l_1\xrightarrow{\lambda}l_2 \in T$, such that all variables in $e_1$ are defined at $l_2$, we check whether there is a difference constraint of form $e_1' \le e_2+c$ with $e_2 \in N$ and $c \in \mathbb{Z}$ in $u_\lambda$. If not, we derive a difference constraint $e_1' \le e_2+c$ as follows: we symbolically execute $\lambda$ for deriving $e_1'$ from $e_1$: e.g., let $e_1=x+y$ and assume $x$ is assigned $x+1$ on $l_1\xrightarrow{\lambda}l_2$ while $y$ stays unchanged. We get $e_1'=x+1+y$ through symbolic execution. In order to keep the number of norms low, we first try
    (a) to find a norm $e_2 \in N$ and $c \in \mathbb{Z}$ s.t. $e_1' \le e_2+c$ is invariant on $l_1\xrightarrow{\lambda}l_2$ (see Definition 28). If we succeed we add the predicate $e_1' \le e_2+c$ to $u_\lambda$. E.g., for $e_1=x+y$ and $e_1'=x+1+y$ we get the transition invariant $(x+y)' \le (x+y)+1$ and will thus add $e_1' \le e_1+1$ to $u_\lambda$. In general, we find a norm $e_2$ and a constant $c$ by separating constant parts in the expression $e_1'$ using associativity and commutativity, thereby forming an expression $e_3$ over variables and program parameters and an integer constant $c$. E.g., given $e_1'=5+z$ we set $e_3=z$ and $c=5$. We then search a norm $e_2 \in N$ with $e_2=e_3$ where the check on equality is performed modulo associativity and commutativity.
    (b) If (a) fails, i.e., no such $e_2 \in N$ exists, we add $e_3$ to $N$ and derive the predicate $e_1' \le e_3+c$. In $\Delta P_G$ we interpret $e_3$ as atom, i.e., $e_3 \in A$. We interpret $e_3$ as a symbolic constant, i.e., $e_3 \in C$, only if $e_3$ is purely built over the program’s input parameters and constants. Note that this step increases the number of norms.
  3. Inferring Guards For each transition $l_1\xrightarrow{\lambda}l_2$ we generate a set $g_\lambda$ of guards: initially we set $g_\lambda=\emptyset$ for all transitions $l_1\xrightarrow{\lambda}l_2$. For each $e \in N$ and each transition $l_1\xrightarrow{\lambda}l_2$ we check if $e$ is a guard of $l_1\xrightarrow{\lambda}l_2$. If so, we add $e$ to $g_\lambda$. We use an SMT solver to perform this check. E.g., let $e=x+y$ and assume that $l_1\xrightarrow{\lambda}l_2$ is guarded by the conditions $x \ge 0$ and $y>x$. An SMT solver supporting linear arithmetic proves that $x \ge 0 \wedge y>x$ implies $x+y>0$.

  4. We set $E=\{l_1\xrightarrow{g_\lambda,u_\lambda}l_2 \mid l_1\xrightarrow{\lambda}l_2 \in T\}$.

Note that SMT reasoning is applied only locally to single transitions to check if an expression is greater than 0 on that transition.
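Step 2 can be sketched for linear norms as follows. This is a simplified illustration, not the actual implementation: norms are constant-free linear expressions stored as frozensets of (variable, coefficient) pairs, so equality modulo associativity and commutativity becomes set equality; a transition is a map from assigned variables to linear right-hand sides; the invariance check of case (a) is replaced by a syntactic lookup. All names (`canon`, `symbolic_step`, `abstract_transition`) and the assignment `k = b` are hypothetical.

```python
def canon(coeffs):
    """Canonical form of a linear expression without constant part."""
    return frozenset((v, k) for v, k in coeffs.items() if k != 0)

def symbolic_step(norm, update):
    """Symbolically execute `update` on `norm`; return (e3, c) with e1' = e3 + c."""
    coeffs, const = {}, 0
    for var, k in norm:
        # Unassigned variables keep their value.
        new_coeffs, new_const = update.get(var, ({var: 1}, 0))
        const += k * new_const
        for w, kw in new_coeffs.items():
            coeffs[w] = coeffs.get(w, 0) + k * kw
    return canon(coeffs), const

def abstract_transition(norm, update, norms):
    """Return the constraint (e1, e2, c) meaning e1' <= e2 + c; may extend `norms`."""
    e3, c = symbolic_step(norm, update)
    if e3 not in norms:      # case (b): no matching norm, so add e3
        norms.add(e3)
    return norm, e3, c       # case (a) when e3 was already a norm

# Case (a): norm x + y under the assignment x = x + 1
x_plus_y = canon({"x": 1, "y": 1})
norms = {x_plus_y}
increment = abstract_transition(x_plus_y, {"x": ({"x": 1}, 1)}, norms)

# Case (b): norm e - k under an assumed assignment k = b introduces the norm e - b
e_minus_k = canon({"e": 1, "k": -1})
norms.add(e_minus_k)
reset = abstract_transition(e_minus_k, {"k": ({"b": 1}, 0)}, norms)
print(increment[2], reset[2], len(norms))
```

The first call reproduces the example $(x+y)' \le (x+y)+1$ without enlarging $N$. The second call shows how a reset adds a new norm, which then has to be abstracted in turn.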

Propagation of Guards We improve the precision of our abstraction by propagating guards: consider a transition $l_3\xrightarrow{g_3,u_3}l_4$. Assume $l_3$ has the incoming edges $l_1\xrightarrow{g_1,u_1}l_3$ and $l_2\xrightarrow{g_2,u_2}l_3$. If $y \in g_1\cap g_2$ (i.e., $y$ is a guard on both incoming edges) and $y$ does not decrease on the corresponding concrete transitions $l_1\xrightarrow{\lambda_1}l_3$ and $l_2\xrightarrow{\lambda_2}l_3$ (checked by symbolic execution) then $y$ is also a guard on $l_3\xrightarrow{g_3,u_3}l_4$ and we add $y$ to $g_3$.

Well-defined and Fan-in free DCPs generated by our algorithm are always fan-in free by construction: for each transition we get at most one predicate $e' \le e_2+c$ for each $e \in N$ because we check whether there is already a predicate for $e$ before a predicate is inferred resp. added. We ensure well-definedness of our abstraction by a final clean-up: we iterate over all $l \in L$ and check if $use(l) \subseteq def(l)$ holds. If this check fails we remove all difference constraints $x' \le y+c$ with $y \in use(l)\setminus def(l)$ from all outgoing edges of $l$. We repeat this iteration until well-definedness is established, i.e., until $use(l) \subseteq def(l)$ holds for all $l \in L$.

Termination We have to ensure the termination of our abstraction procedure, since case (b) in step “2. Abstracting Transitions” triggers a recursive abstraction for the newly added norm: note that we can always stop the abstraction process at any point, getting a sound abstraction of the original program. We therefore ensure termination of the abstraction algorithm by limiting the chain of recursive abstraction steps that is triggered by entering case (2.b).

Non-linear Iterations We can handle counter updates such as $x=2x$ or $x=x/2$ as follows: (1) We add the expression $\log x$ to our set of norms. (2) We derive the difference constraint $(\log x)' \le (\log x)-1$ from the update $x=x/2$ if $x>1$ holds. Symmetrically we get $(\log x)' \le (\log x)+1$ from the update $x=2x$ if $x>0$ holds.
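The two derived constraints can be checked mechanically on a finite range. The sketch below assumes an integer reading of the norm, namely $\lfloor\log_2 x\rfloor$ computed via `int.bit_length`, and integer division for $x/2$; both choices are assumptions of this illustration and are not fixed by the text.

```python
def log2_floor(x):
    """floor(log2(x)) for a positive integer x."""
    return x.bit_length() - 1

# (log x)' <= (log x) - 1 for the update x = x / 2, provided x > 1
halving_ok = all(log2_floor(x // 2) <= log2_floor(x) - 1
                 for x in range(2, 10_000))

# (log x)' <= (log x) + 1 for the update x = 2x, provided x > 0
doubling_ok = all(log2_floor(2 * x) <= log2_floor(x) + 1
                  for x in range(1, 10_000))

print(halving_ok, doubling_ok)
```

The side conditions matter: for $x=1$ the halved value is 0, for which the norm is undefined, so the decrement can only be derived under the guard $x>1$.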

Data Structures In previous publications [13, 24] it has been described how to abstract programs with data structures to pure integer programs by making use of appropriate norms such as the length of a list or the number of elements in a tree. In our implementation we follow these approaches using a light-weight abstraction based on optimistic aliasing assumptions (see [30] for details). Once the program is transformed to an integer program, our abstraction algorithm is applied as described above for obtaining a difference constraint program.

Abstraction II: From the Integers to the Natural Numbers

We now discuss how we abstract a DCP with guards $\Delta P_G=(L,E,l_b,l_e)$ to a DCP $\Delta P=(L,E',l_b,l_e)$ over $\mathbb{N}$ (Definition 12):

Let $e \in N$. By $[e]: \Sigma \rightarrow \mathbb{N}$ we denote the function $[e](\sigma)=\max(e(\sigma),0)$. Recall that $e$ is interpreted as atom in $\Delta P_G$, i.e., $e \in A$. In $\Delta P$, we interpret $[e]$ as variable (i.e., $[e] \in V$) if $e \in V$. We interpret $[e]$ as symbolic constant (i.e., $[e] \in C$) if $e \in C$.

Let $l_1\xrightarrow{g,u}l_2 \in E$. We create a transition $l_1\xrightarrow{u'}l_2 \in E'$ as follows: let $e_1' \le e_2+c \in u$. If $c \ge 0$, we add $[e_1]' \le [e_2]+c$ to $u'$. If $c<0$ and $e_2 \in g$ we add the constraint $[e_1]' \le [e_2]-1$ to $u'$. If $c<0$ and $e_2 \notin g$ we add the constraint $[e_1]' \le [e_2]+0$ to $u'$.
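The case distinction of Abstraction II translates almost literally into code. In the following sketch a difference constraint $e_1' \le e_2+c$ is the triple `(e1, e2, c)` with norms written as strings, and `[e]` is represented by wrapping the string in brackets; this representation and the function name are assumptions made for illustration only.

```python
def abstract_to_naturals(g, u):
    """Map the constraints u of a guarded transition to constraints over [e]."""
    u_abs = []
    for e1, e2, c in u:
        if c >= 0:
            new_c = c       # increments and resets are kept as they are
        elif e2 in g:
            new_c = -1      # guarded decrease: at least one step down
        else:
            new_c = 0       # unguarded decrease: only "does not increase"
        u_abs.append((f"[{e1}]", f"[{e2}]", new_c))
    return u_abs

# A guarded decrement, an unguarded decrement by 3, and an increment
guards = {"l-i"}
constraints = [("l-i", "l-i", -1), ("p", "q", -3), ("i-b", "i-b", 1)]
abstracted = abstract_to_naturals(guards, constraints)
print(abstracted)
```

Note that a guarded decrease by more than 1 is also mapped to $-1$ here; the refinement for larger decrements is discussed separately below.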

Discussion Soundness of Abstraction II is due to the following observation: consider a transition $l_1\xrightarrow{g,u}l_2$ of $\Delta P_G$. Let $e_1' \le e_2+c \in u$, i.e., $e_1' \le e_2+c$ is invariant for the corresponding transition $\tau$ of the concrete program. Then $[e_1]' \le [e_2]+0$ is also invariant for $\tau$. Further: if $c \ge 0$ then $[e_1]' \le [e_2]+c$ is invariant for $\tau$. And if $c<0$ and $e_2 \in g$ (i.e., $e_2>0$ must hold before executing $\tau$), then $[e_1]' \le [e_2]-1$ is invariant for $\tau$.

Modeling arbitrary Decrements Consider a transition $l_1\xrightarrow{g,u}l_2$ of $\Delta P_G$. Assume $e_1 \in g$ and $e_1' \le e_1-2 \in u$. Our abstraction procedure, as discussed so far, adds $[e_1]' \le [e_1]-1$ to $u'$, where $l_1\xrightarrow{u'}l_2$ is the corresponding transition of $\Delta P$. We discuss how we can model a decrease by 2 in $\Delta P$ (such decreases are handled by Improvement I in Sect. 3.2): let $\tau=l_1\xrightarrow{\lambda}l_2$ denote the corresponding transition of the concrete program. We make the following observation: since $e_1$ is a guard invariant for $\tau$ ($e_1>0$ before executing $\tau$) and $e_1' \le e_1-2$ is invariant for $\tau$ we have that $[e_1+1]' \le [e_1+1]-2$ is invariant for $\tau$. We thus add the predicate $[e_1+1]' \le [e_1+1]-2$ to $u'$. Further: let $\tau' \ne \tau$ be some transition of the concrete program. If $[e_1]' \le [e_1]+c$ is invariant for $\tau'$ then $[e_1+1]' \le [e_1+1]+c$ is also invariant for $\tau'$. Let $e_1 \ne e_2$. If $[e_1]' \le [e_2]+c$ is invariant for $\tau'$ then $[e_1+1]' \le [e_2]+(c+1)$ is also invariant for $\tau'$. We add the corresponding predicates to $\Delta P$.

We can handle decrements greater than 2 accordingly: e.g., if $e_1' \le e_1-3 \in u$ we add the predicate $[e_1+2]' \le [e_1+2]-3$ to $u'$, etc.
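Why the norm has to be shifted can be seen from a brute-force check over a finite range of values. The sketch below is illustrative only: it enumerates all pairs $(e_1, e_1')$ with $e_1>0$ and $e_1' \le e_1-2$ (resp. $e_1-3$) in a small window and tests the claimed invariants on $[\cdot]=\max(\cdot,0)$. The helper names are invented.

```python
def clamp(value):
    """[e] = max(e, 0)"""
    return max(value, 0)

def invariant_holds(e, e_next, shift, decrement):
    """Check [e' + shift] <= [e + shift] - decrement for one pair of values."""
    return clamp(e_next + shift) <= clamp(e + shift) - decrement

# All pairs with guard e > 0 and concrete decrease e' <= e - 2
by_two = [(e, e_next) for e in range(1, 50) for e_next in range(-50, e - 1)]
# All pairs with guard e > 0 and concrete decrease e' <= e - 3
by_three = [(e, e_next) for e in range(1, 50) for e_next in range(-50, e - 2)]

shifted_two_ok = all(invariant_holds(e, n, 1, 2) for e, n in by_two)
unshifted_two_ok = all(invariant_holds(e, n, 0, 2) for e, n in by_two)
shifted_three_ok = all(invariant_holds(e, n, 2, 3) for e, n in by_three)
print(shifted_two_ok, unshifted_two_ok, shifted_three_ok)
```

The unshifted constraint $[e_1]' \le [e_1]-2$ already fails for $e_1=1$, $e_1'=-1$, since $[e_1]'=0$ but $[e_1]-2=-1$. Shifting by one keeps the right-hand side non-negative whenever the guard holds.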

A Complete Example

Example xnu in Fig. 9a contains a snippet of the source code after which we have modeled Example xnuSimple in Fig. 1. The full version of Example xnu can be found in the SPEC CPU2006 benchmark, in function XNU of 456.hmmer/src/masks.c. The outer loop in Example xnu partitions the interval [0, len] into disjoint sub-intervals [beg, end]. The inner loop iterates over the sub-intervals. Therefore the inner loop has an overall linear iteration count. Example xnu is a natural example for amortized complexity: though a single visit to the inner loop can cost len (if beg=0 and end=len), several visits together cannot cost more than len either, since in each visit the loop iterates over a disjoint sub-interval. We therefore have: the amortized cost of a visit to the inner loop, i.e., the cost of executing the inner loop within an iteration of the outer loop averaged over all len iterations of the outer loop, is 1. Here, we refer by cost to the number of consecutive back jumps in the inner loop. But in general, any resource consumption inside the inner loop can, in total, only be repeated up to max(len,0) times.

Together with the loop bound max(len,0) of the outer loop, our observation yields an overall complexity of 2×max(len,0).

Our experimental results (Sect. 8.3) demonstrate that state-of-the-art bound analyses fail to infer tight bounds for Example xnu and similar problems.

Abstraction

We give a formal representation of the concrete program semantics of Example xnu in form of a labeled transition system (LTS) shown in Fig. 9b. Each edge in the LTS is labeled by a formula which encodes the transition relation. Consider, e.g., the edge from $l_1$ to $l_2$ ($\tau_1$) labeled by the formula $i<l \wedge b'=b \wedge e'=e \wedge i'=i+1 \wedge l'=l$. This formula induces the transition relation $\lambda_1=\{(\sigma,\sigma') \in \Sigma\times\Sigma \mid \sigma(i)<\sigma(l) \wedge \sigma'(b)=\sigma(b) \wedge \sigma'(e)=\sigma(e) \wedge \sigma'(i)=\sigma(i)+1 \wedge \sigma'(l)=\sigma(l)\}$.

We now discuss how our abstraction algorithm from Sect. 6 abstracts Example xnu to the DCP shown in Fig. 10b. Recall abstraction Step I discussed in Sect. 6.1.

  1. Choosing an initial set of Norms Our described heuristic adds the expressions $l-i$ and $e-k$, generated from the conditions $i<l$ and $k<e$, to the initial set of norms $N$. Thus our initial set of norms is $N=\{l-i,e-k\}$.

  2. Abstracting Transitions
    • We check how $l-i$ changes on the transitions $\tau_0$, $\tau_1$, $\tau_{2a}$, $\tau_{2b}$, $\tau_{3a}$, $\tau_{3b}$, $\tau_{3c}$, $\tau_4$, $\tau_5$, $\tau_6$:
      • $\tau_0$: we derive $[l-i]' \le l$ (reset), we add $l$ to $N$. Since $l$ is an input parameter we have $l \in C$.
      • $\tau_1$: we derive $[l-i]' \le [l-i]-1$ (decrement)
      • $\tau_{2a},\tau_{2b},\tau_{3a},\tau_{3b},\tau_{3c},\tau_4,\tau_5,\tau_6$: $l-i$ unchanged
    • We check how $e-k$ changes on the transitions $\tau_{3a},\tau_4$ ($k$ is only defined at $l_4$):
      • $\tau_{3a}$: we derive $[e-k]' \le [e-b]$ (reset), we add $e-b$ to $N$
      • $\tau_4$: we derive $[e-k]' \le [e-k]-1$ (decrement)
    • We check how $e-b$ changes on the transitions $\tau_0$, $\tau_1$, $\tau_{2a}$, $\tau_{2b}$, $\tau_{3a}$, $\tau_{3b}$, $\tau_{3c}$, $\tau_4$, $\tau_5$, $\tau_6$:
      • $\tau_0$: we derive $[e-b]' \le 0$ (reset), we add 0 to $N$. Since 0 is a constant we have $0 \in C$.
      • $\tau_{2a}$: we derive $[e-b]' \le [i-b]$, we add $i-b$ to $N$.
      • $\tau_{3c}$: we derive $[e-b]' \le 0$ (reset)
      • $\tau_5$: we derive $[e-b]' \le 0$ (reset)
      • $\tau_1,\tau_{2b},\tau_{3a},\tau_{3b},\tau_4,\tau_6$: $e-b$ unchanged
    • We check how $i-b$ changes on the transitions $\tau_0$, $\tau_1$, $\tau_{2a}$, $\tau_{2b}$, $\tau_{3a}$, $\tau_{3b}$, $\tau_{3c}$, $\tau_4$, $\tau_5$, $\tau_6$:
      • $\tau_0$: we derive $[i-b]' \le 0$ (reset)
      • $\tau_1$: we derive $[i-b]' \le [i-b]+1$ (increment)
      • $\tau_{3c}$: we derive $[i-b]' \le 0$ (reset)
      • $\tau_5$: we derive $[i-b]' \le 0$ (reset)
      • $\tau_{2a},\tau_{2b},\tau_{3a},\tau_{3b},\tau_4,\tau_6$: $i-b$ unchanged
    • We have processed all norms in $N$
  3. Inferring Guards We add the guard $[l-i]$ to $\tau_1$ in $\Delta P_G$ because $l-i$ is a guard of $\tau_1$ in $P$ (due to the condition $i<l$), we add the guard $[e-k]$ to $\tau_4$ in $\Delta P_G$ because $e-k$ is a guard of $\tau_4$ in $P$ (due to the condition $k<e$).

  4. The resulting DCP with guards is shown in Fig. 10a.

Applying abstraction Step II discussed in Sect. 6.2 gives us the DCP shown in Fig. 10b. In the depiction of the abstraction we assume $p,q,r,x \in V$ and $[e-k]=p$, $[e-b]=q$, $[i-b]=r$, $[l-i]=x$.

Bound Computation

In Fig. 11 the reset graph of Fig. 10b is shown. Table 8 shows how our bound algorithm from Sect. 3 infers the linear bound max(l,0) for the inner loop at l4 of Example xnu by computing TB(τ4)=[l] on the abstraction shown in Fig. 10b. Recall that the abstract variable [l] represents the expression max(l,0) in the concrete program.

Fig. 11. Reset graph

Table 8. Computation of $TB(\tau_4)$ for Example xnu (Fig. 9) by Definition 21 resp. Definition 23 (the table is available only as an image in the source)

Note that the DCP in Fig. 10b has a reset forest. Therefore atm1(κ)=atm(κ) for all reset paths κ of Fig. 10b, as discussed in Sect. 3.3.2. The computation traces of Definitions 21 and 23 are thus equivalent for Fig. 10b.

Experiments

Implementation The presented analysis defines the core ideas and techniques of our implementation loopus. A complete description of the implemented techniques, including a path-sensitive extension of our bound algorithm, is given in [30]. Our implementation is open-source and available at [18]. loopus reads in the LLVM [23] intermediate representation and performs an intra-procedural analysis. It is capable of computing bounds for loops as well as analyzing the complexity of non-recursive functions.

In the following we discuss three experimental setups and tool comparisons. Our first experiment, which we discuss in Sect. 8.1, is performed on a benchmark of open-source C programs. For our second experiment (Sect. 8.2), we assembled a benchmark of challenging programs from the literature on automatic bound analysis. The third experiment (Sect. 8.3) was performed on a set of interesting loop iteration patterns that we found in real source code.

Evaluation on Real World C Code

Experimental Setup We base our experiment on the program and compiler optimization benchmark Collective Benchmark [17] (cBench), which contains a total of 1027 different C files (after removing code duplicates) with 211,892 lines of code. We set up the first comparison of complexity analysis tools on real-world code. For comparing our tool (loopus’15) we chose the three most promising tools from recent publications: the tool KoAT implementing the approach of [6], the tool CoFloCo implementing [10] and our own earlier implementation loopus’14 [29]. Note that we compared against the most recent versions of KoAT and CoFloCo (download 01/23/15). We were not able to evaluate Rank (implementing [2]) and C4B (implementing [7]) on our benchmark because both tools support only a limited subset of C. The experiments were performed on a Linux system with an Intel dual-core 3.2 GHz processor and 16 GB memory. The task was to perform a complexity analysis on function level. We used the following experimental setup:

  1. We compiled all 1027 C files in the benchmark into the LLVM intermediate representation using clang.

  2. We extracted all 1751 functions which contain at least one loop using the tool llvm-extract (comes with the LLVM tool suite). Extracting the functions to single files guarantees an intra-procedural setting for all tools.

  3. We used the tool llvm2kittel [20] to translate the 1751 LLVM modules into 1751 text files in the integer transition system (ITS) format that is read in by KoAT.

  4. We used the transformation described in [10] to translate the ITS format of KoAT into the cost equations representation that is read in by CoFloCo. This last step is necessary because there exists no direct way for translating C or the LLVM intermediate representation into the CoFloCo input format.

  5. We decided to exclude the 91 recursive functions from the benchmark set because we were not able to run CoFloCo on these examples (the transformation tool does not support recursion), KoAT was not successful on any of them, and loopus does not support recursion. In total our example set thus comprises 1659 functions.

Evaluation Table 9 shows the results of all four tools on our benchmark using a time out of 60 s (Table 10 shows the results on the subset of those functions on which no tool timed out). The first column shows the number of functions which were successfully bounded by the respective tool, the last column shows the number of time outs. On the remaining examples (not shown in the table) the respective tool did not time out but was also not able to compute a bound. The column Time shows the total time used by the respective tool to process the benchmark. loopus’15 obtains results for about twice as many functions as KoAT, CoFloCo, and loopus’14 while needing an order of magnitude less time than KoAT and CoFloCo, and significantly less time than loopus’14. We conclude that our implementation is both more scalable and, on real C code, more successful than implementations of other state-of-the-art approaches. However, while the experiment clearly demonstrates that our implementation outperforms the competitors with respect to scalability, it does not allow us to compare the strengths of the different bound analyses conclusively: we observed that llvm2kittel, the only tool available for translating C code resp. the LLVM intermediate representation into the ITS format of KoAT, loses information that is kept by our analysis. As a result, it is unclear if a failure to compute a bound is due to different analysis strength or due to information loss during translation (we have not seen such an information loss for our second and third experiment, on which we report in Sects. 8.2 and 8.3, where the considered benchmarks consist of rather small, pure integer programs for which llvm2kittel works well).

Table 9. Tool results on analyzing the complexity of 1659 functions in the cBench benchmark; none of the tools infers log bounds

Tool        Succ.  1    n    n^2  n^3  n^{>3}  2^n  Total time  # Time outs
loopus’15   806    205  489  97   13   2       0    15 m        6
loopus’14   431    200  188  43   0    0       0    40 m        20
KoAT        430    253  138  35   2    0       2    5.6 h       161
CoFloCo     386    200  148  38   0    0       0    4.7 h       217

Table 10. Tool results on analyzing the complexity of the subset of those functions in the cBench benchmark on which no tool timed out

Tool        Succ.  1    n    n^2  n^3  n^{>3}  2^n  Total time  # Time outs
loopus’15   753    196  466  84   7    0       0    9 m         0
loopus’14   414    192  181  41   0    0       0    20 m        0
KoAT        420    245  136  35   2    0       2    2.9 h       0
CoFloCo     382    198  146  38   0    0       0    1.1 h       0

We hope that our experiment motivates the development of better tools for bound analysis of real-world code and drives the research towards solving realistic complexity analysis problems. We want to add that, in our experience, working with C programs instead of integer transition systems is very helpful for developing and debugging a complexity analysis tool: looking at C code, we can use our own intuition as programmers about the expected complexity of the analyzed code and compare it to the complexity reported by the tool.

Pointers and Shapes Even loopus’15 computed bounds for only about half of the functions in the benchmark. Studying the benchmark code we concluded that for many functions pointer alias and/or shape analysis is needed for inferring functional complexity. In our experimental comparison such information was not available to the tools. Using optimistic (but unsound) assumptions on pointer aliasing and heap layout, our tool loopus’15 was able to compute the complexity of 1185 out of the 1659 functions in the benchmark, using 28 min total time. A discussion of our optimistic pointer aliasing and heap layout assumptions and of the reasons for failure can be found in [30].

The benchmark and more details on our experimental results can be found on [18] where our tool is also offered for download.

Evaluation on Examples from the Literature

In order to evaluate the precision of our approach on a number of known challenges to bound analysis, we performed a tool comparison on 110 examples from the literature. Our example set comprises those examples from the tool evaluation in [6, 29] that were available as imperative code (C or pseudo code, in total 89 examples), and additionally the examples used for the evaluation of Ref. [7] (15 examples) as well as the running examples of Ref. [27] (6 examples). We added the tools Rank (implementing [2]) and C4B (implementing [7]) to the comparison, because we were able to formulate the examples over the restricted C subset that is supported by these two tools (this was not possible for our experiment on real-world code).

The results of our evaluation are shown in Table 11. Our two tools loopus’15 and loopus’14 compute the highest number of linear bounds and are also significantly faster than the other tools, in particular than KoAT and CoFloCo. On the other hand, KoAT computes the highest number of bounds in total (4 more than loopus). CoFloCo computes, in total, 1 bound more than our tool. The comparably low number of bounds computed by C4B is also due to the fact that the approach implemented in C4B is limited to linear bounds.

Table 11. Tool results on analyzing examples from the literature; none of the tools infers log bounds

Tool        Succ.  1  n   n^2  n^3  n^{>3}  2^n  Time w/o TO  # Time outs
loopus’15   86     2  51  27   1    5       0    4 s          0
loopus’14   86     2  50  28   2    4       0    4 s          0
CoFloCo     87     3  45  34   2    3       0    1 min 40 s   1
Rank        78     3  49  21   3    2       0    20 s         0
C4B         36     0  36  0    0    0       0    6 s          0
KoAT        90     3  43  36   3    5       0    3 min 50 s   3

The time out was 120 s. A higher time out did not yield additional results

In summary, our second evaluation shows that our approach is not only successful on the class of problems on which we focused in this article, but also solves many other bound analysis problems from the literature. Note that in contrast to our first evaluation, our second benchmark contains small examples from academia (1293 LOC, on average 12 lines per file). On these examples our implementation is comparable in strength to the implementation of other state-of-the-art approaches to bound analysis. Given that our tool is a prototype implementation, there is room for improvement; concrete suggestions are discussed in [30]. More details on the results computed by each tool can be found at [19].

Evaluation on Challenging Iteration Patterns from Real Code

Scanning through two C-code benchmarks (cBench [17] and SPEC CPU 2006 [21]), we found 23 different loop iteration patterns which we consider to be particularly challenging for state-of-the-art bound analyses. The 23 patterns have the following properties in common: (1) there is an inner loop L with loop counter c, such that c is increased on an outer loop of L. (2) Nevertheless, the amortized cost of L (the overall worst-case cost of executing L, averaged over the number of executions of its outer loop) is lower than the worst-case cost of a single execution (a single instance of consecutive iterations) of L.

Example xnu (discussed in Sect. 7) is a natural example for the described behaviour.

The complete benchmark is available at [19]. For each pattern we link its origin in the header of the respective file. Note that for some patterns we found several instances.

Table 12 states the results that were obtained by loopus’15, loopus’14, CoFloCo, KoAT, Rank and C4B: ‘’ denotes that the bound computed by the respective tool is tight (in the same asymptotic class as the precise bound, see Definition 5), ‘$O(n^x)$’ denotes that the respective tool did not infer a tight bound but a bound in the asymptotic class $O(n^x)$, ‘×’ denotes that no bound was inferred, ‘TO’ denotes that the tool timed out (the time out limit was 20 min, a longer time out did not yield additional results), ‘’ denotes that we were not able to translate the example into the input format of the tool. For each file we annotate its asymptotic complexity (an asymptotic bound on the total number of loop iterations, determined manually) behind its file name in Table 12.

We explain the last 5 rows of Table 12: ‘Total Tight’ states the number of examples for which the respective tool inferred a tight bound (see Definition 5). ‘Total Over-approx.’ states the number of examples for which the respective tool inferred a bound that is not tight. ‘Total Fail’ states the number of examples for which the respective tool did not report a bound, but returned within the time out limit of 20 min. ‘Total Timed Out’ states the total number of examples on which the respective tool timed out (the time out limit was 20 min). ‘Total Time’ states the overall time consumed by the respective tool for processing the complete benchmark. ‘Total Time w/o TO’ states the overall time consumed by the respective tool on those examples on which the tool did not time out.

loopus’15 fails to infer a tight bound only for Configure and analyse_other. For both examples a precise bound can be obtained by an improvement of our variable bound function (VB) described in [30] which is not yet implemented in our tool. loopus’15 is far more successful in inferring tight bounds for the examples than any of the competitors. loopus’15 infers 21, loopus’14 12, Rank 7, C4B 6, CoFloCo 6 and KoAT 2 tight bounds. There are 9 examples for which only our tool loopus (loopus’15 and loopus’14) infers a tight bound: cf_decode_eol, PackBitsEncode, s_SFD_process, send_tree, subsetdump, ParseFile, SingleLinkCluster, xdr3dfcoord, and XNU.

The experiment demonstrates that our bound analysis complements the state-of-the-art by inferring tight bounds for a class of real-world loop iterations on which existing techniques mostly fail or obtain coarse over-approximations.

Technical remarks (1) We counted the time needed by the tool Aspic (a preprocessor for Rank which performs invariant generation) into the time of the bound analysis performed by Rank. (2) Rank reported an unsound bound and an error message for the examples s_SFD_process.c, load_mems.c and SingleLinkCluster.c. On these examples we therefore assessed Rank’s return value as fail (‘×’).

Amortized Complexity Analysis

In the following we discuss how our approach relates to amortized complexity analysis as introduced by Tarjan in his influential paper [32]. We recall Tarjan’s idea of using potential functions for amortized analysis in Sect. 9. In Sect. 9.1 we explain how our approach can be viewed as an instantiation of amortized analysis via potential functions.

Amortized Analysis using Potential Functions Amortized complexity analysis [32] aims at inferring the worst-case average cost over a sequence of calls to an operation or function rather than the worst-case cost of a single call. In (resource) bound analysis the difference between the single worst-case cost and the amortized cost is relevant, e.g., if a function f is called inside a loop: assume the loop bound is $n$ and the single worst-case cost of a call to f is also $n$. The cost of a single call to f amortized over all $n$ calls might, however, be lower than $n$, e.g., 2. In this case the total worst-case cost of iterating the loop is $2n$ rather than $n^2$. Note that in our non-recursive setting function calls can always be inlined. The amortized analysis problem thus boils down to the problem of inferring the cost of executing an inner loop averaged over all executions of the outer loop.

Tarjan [32] motivates amortized complexity analysis on the example of a program which executes $n$ stack operations StackOp. Each StackOp operation consists of a push instruction, adding an element to the stack, followed by a pop instruction, removing an arbitrary number of elements from the stack. Initially the stack is empty. The cost of a single push is 1 and the cost of a single pop is the number of elements removed from the stack. Tarjan points out that the worst-case cost of a single pop is $n$: the $n$th pop instruction may pop $n$ elements (cost $n$) from the stack, if the previous pop instructions did not remove any elements from the stack, i.e., the worst-case cost of a single StackOp operation is $n+1$. Nevertheless all $n$ operations StackOp cannot cost more than $2n$ in total since we cannot remove more elements from the stack than have been added to the stack and thus the overall cost of the pop instructions is bounded by the total number of push instructions ($n$ by assumption). The amortized cost of StackOp, i.e., the cost of StackOp averaged over the sequence of all $n$ operations, is therefore 2.

Potential Function As a means to reason about the amortized cost of an operation or a sequence of operations, Tarjan introduces the notion of a potential function. A potential function is a function Φ:ΣZ from the program states to the integers. Let Cop(σ) denote the cost of executing operation op at program state σΣ. Let Φ be a potential function. Tarjan defines the amortized cost CopA(σ) as CopA(σ)=Cop(σ)+Φ(σ)-Φ(σ) where σ denotes the program state before and σ denotes the program state after executing op. I.e., the amortized cost is the cost plus the decrease resp. increase in the value of the potential. Consider a sequence of n operations, let opi denote the ith operation in the sequence. Let σi denote the program state before executing operation opi, σi+1 is the program state after executing opi. In general, the total cost of executing all n operations is:

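$$\sum_{i=1}^{n} C_{op_i}(\sigma_i) = \sum_{i=1}^{n} \left( C^A_{op_i}(\sigma_i) - \Phi(\sigma_{i+1}) + \Phi(\sigma_i) \right) = \Phi(\sigma_1) - \Phi(\sigma_{n+1}) + \sum_{i=1}^{n} C^A_{op_i}(\sigma_i) \qquad (1)$$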

“That is, the total time of the operations equals the sum of their amortized times plus the net decrease in potential from the initial to the final configuration. [...] In most cases of interest, the initial potential is zero and the potential is always non-negative. In such a situation the total amortized time is an upper bound on the total time” [32]. I.e., if $\Phi_i \geq 0$ and $\Phi_0 = 0$ (with the indexing of (1): $\Phi(\sigma_i) \geq 0$ for all $i$ and $\Phi(\sigma_1) = 0$), then

$$\sum_{i=1}^{n} C_{op_i}(\sigma_i) \leq \sum_{i=1}^{n} C^A_{op_i}(\sigma_i) \qquad (2)$$
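The step from the middle to the right-hand side of (1), and from (1) to (2), is a telescoping argument. Spelled out with the symbols used above (a worked restatement, not an additional result):

```latex
\begin{align*}
\sum_{i=1}^{n} \bigl(\Phi(\sigma_i) - \Phi(\sigma_{i+1})\bigr)
  &= \Phi(\sigma_1) - \Phi(\sigma_{n+1}), \\
\sum_{i=1}^{n} C_{op_i}(\sigma_i)
  &= \sum_{i=1}^{n} C^{A}_{op_i}(\sigma_i) - \Phi(\sigma_{n+1})
  && \text{if } \Phi(\sigma_1) = 0, \\
  &\leq \sum_{i=1}^{n} C^{A}_{op_i}(\sigma_i)
  && \text{if additionally } \Phi(\sigma_{n+1}) \geq 0.
\end{align*}
```

Only the initial and the final potential matter for the inequality; all intermediate potentials cancel.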

Reconsider Tarjan’s previously discussed example of a sequence of $n$ executions of operation StackOp. Let $j$ denote the stack size, i.e., $\sigma_i(j)$ is the size of the stack in program state $\sigma_i$. The cost of executing StackOp in program state $\sigma_i$ is $C_{StackOp}(\sigma_i) = 1 + (\sigma_i(j) + 1 - \sigma_{i+1}(j))$, where $\sigma_i(j) + 1 - \sigma_{i+1}(j)$ is the cost of the pop operation. Tarjan proposes to use the stack size $j$ as a potential function, i.e., we choose $\Phi(\sigma_i) = \sigma_i(j)$. We have

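$$C^A_{StackOp}(\sigma_i) = C_{StackOp}(\sigma_i) + \Phi(\sigma_{i+1}) - \Phi(\sigma_i) = C_{StackOp}(\sigma_i) + \sigma_{i+1}(j) - \sigma_i(j) = 1 + (\sigma_i(j) + 1 - \sigma_{i+1}(j)) + \sigma_{i+1}(j) - \sigma_i(j) = 2$$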

With (2) we get:

$$\sum_{i=1}^{n} C_{StackOp_i}(\sigma_i) \leq \sum_{i=1}^{n} C^A_{StackOp_i}(\sigma_i) = \sum_{i=1}^{n} 2 = 2n$$
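The calculation can also be replayed on concrete runs. The sketch below is a minimal illustration under assumptions: the function name `amortized_costs` and the list-of-pop-requests encoding are made up for this example. It records, for each StackOp, the actual cost and the amortized cost obtained with the stack size as potential.

```python
def amortized_costs(pops):
    """Return (cost, amortized cost) for each StackOp in a sequence.

    pops[k] is the number of elements the k-th StackOp asks to pop after
    its push (hypothetical encoding), capped by the current stack size.
    The potential is the stack size, as in Tarjan's choice of Phi.
    """
    size = 0
    result = []
    for want in pops:
        phi_before = size
        size += 1                   # push: cost 1
        removed = min(want, size)   # pop: cost = removed elements
        size -= removed
        cost = 1 + removed
        # amortized cost = cost + Phi(after) - Phi(before)
        result.append((cost, cost + size - phi_before))
    return result
```

Every operation has amortized cost 2 regardless of how many elements it pops, so the sum of the actual costs never exceeds 2n.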

Amortized Analysis in our Algorithm

Example tarjan in Fig. 12 is a model of Tarjan’s motivating example (discussed above): variable $j$ models the stack size. The push instruction is modeled by increasing the stack size $j$ by 1. The pop instruction is modeled by decreasing the stack size. Further, all calls to StackOp, push and pop are inlined. Consider the labeled transition system of Example tarjan shown in Fig. 12b. Transition $\tau_1$ models the push instruction, increasing the stack size $j$ by 1; a sequence of transitions $\tau_2$ models the pop instruction, decreasing the stack size by an arbitrary number of elements. A complete run $\rho$ of Example tarjan can be decomposed into the initial transition $\tau_0$ and a number of sub-runs $\rho[i_k, i_{k+1}]$ with $1 \leq i_1 < i_2 < \cdots$ such that each $\rho[i_k, i_{k+1}]$ consists of a single transition $\tau_1$ (push), followed by a sequence of transitions $\tau_2$ (pop), followed by a single execution of transition $\tau_3$. Each sub-run $\rho[i_k, i_{k+1}]$ models Tarjan’s StackOp operation. We thus have that the amortized cost of a sub-run $\rho[i_k, i_{k+1}]$ is 2. Given that $\tau_1$ cannot be executed more than $n$ times and each $\rho[i_k, i_{k+1}]$ contains exactly one $\tau_1$, we get that the overall cost of executing Example tarjan is bounded by $n \times 2 = 2n$.

Fig. 12. (a) Example tarjan, (b) LTS of Example tarjan, (c) DCP obtained by abstraction from tarjan

In the following we argue that our transition bound algorithm TB is an instantiation of amortized analysis using potential functions. We base our discussion on the concrete semantics of Example tarjan given by the LTS in Fig. 12b. Note, however, that our algorithm runs on the abstracted DCP in Fig. 12c, where the same reasoning applies. Suppose we want to compute the transition bound of transition $\tau_2$ in order to compute the total cost of the pop instructions. Let $\rho = (\sigma_0, l_0) \xrightarrow{\lambda_0} (\sigma_1, l_1) \xrightarrow{\lambda_1} \cdots$ be a run of Example tarjan. Let $len(\rho)$ denote the length of $\rho$ (i.e., the total number of transitions on $\rho$). We define the cost of executing $\tau_2$ in program state $\sigma_i$ as $C_{\tau_2}(\sigma_i) = 1$ and the cost of executing $\tau_1$ and $\tau_3$ as $C_{\tau_1}(\sigma_i) = C_{\tau_3}(\sigma_i) = 0$, since we are only interested in $\tau_2$. We have

$$\sharp(\tau_2, \rho) = \sum_{i=1}^{len(\rho)-1} C_{\rho(i)}(\sigma_i)$$

where $\rho(i)$ denotes the $(i+1)$th transition $l_i \xrightarrow{u_i} l_{i+1}$ on $\rho$. Our algorithm reduces the question “how often can $\tau_2$ be executed?” to the question “how often can the local bound ‘j’ of $\tau_2$ be increased on $\tau_1$?”. This reasoning uses the local bound $j$ of $\tau_2$ as a potential function, as we show next: we get the following amortized costs for executing $\tau_2$, $\tau_1$ and $\tau_3$, respectively:

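$$
\begin{aligned}
C^A_{\tau_2}(\sigma_i) &= C_{\tau_2}(\sigma_i) + \sigma_{i+1}(j) - \sigma_i(j) = 1 + \sigma_{i+1}(j) - \sigma_i(j) = 0 \\
C^A_{\tau_1}(\sigma_i) &= C_{\tau_1}(\sigma_i) + \sigma_{i+1}(j) - \sigma_i(j) = 0 + \sigma_{i+1}(j) - \sigma_i(j) = 1 \\
C^A_{\tau_3}(\sigma_i) &= C_{\tau_3}(\sigma_i) + \sigma_{i+1}(j) - \sigma_i(j) = 0 + \sigma_{i+1}(j) - \sigma_i(j) = 0
\end{aligned}
$$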

With $\sigma_i(j) \geq 0$, $\sigma_1(j) = 0$ and (2) we have:

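$$\sharp(\tau_2, \rho) = \sum_{i=1}^{len(\rho)-1} C_{\rho(i)}(\sigma_i) \leq \sum_{i=1}^{len(\rho)-1} C^A_{\rho(i)}(\sigma_i) = \sharp(\tau_1, \rho) \times 1$$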

We point out that choosing the local bound $j$ of $\tau_2$ as potential function causes the amortized cost of executing $\tau_2$ to be 0 and reduces the question of how often $\tau_2$ can be executed to the question of how often the potential $j$ can be incremented on $\tau_1$.

Using $\sharp(\tau_1, \rho) \leq \sharp(\tau_0, \rho) \times n = n$ one obtains the upper bound $n$ for the total cost of the pop instructions.
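The accounting above can be replayed on simulated runs. Since Fig. 12 is not reproduced in this text, the loop below is a hypothetical reconstruction from the prose only: a counter `i` (an assumed name) limits the push transition to n executions, and the nondeterministic choice of how many elements to pop is modeled by coin flips. The sketch counts how often each transition is taken and sums the amortized costs obtained with `j` as potential.

```python
import random


def run_tarjan(n, rng):
    """Simulate one run of a loop shaped like Example tarjan (assumed shape).

    Returns (transition counts, summed amortized cost, final value of j).
    Only tau2 has cost 1; the potential is the value of j.
    """
    counts = {"tau0": 0, "tau1": 0, "tau2": 0, "tau3": 0}
    amortized = 0

    def step(name, cost, j_before, j_after):
        nonlocal amortized
        counts[name] += 1
        amortized += cost + j_after - j_before  # cost + change in potential

    i, j = n, 0
    counts["tau0"] += 1                  # tau0: initialisation, j = 0
    while i > 0:
        step("tau1", 0, j, j + 1)        # push: j increases by 1
        i, j = i - 1, j + 1
        while j > 0 and rng.random() < 0.5:
            step("tau2", 1, j, j - 1)    # pop one element: j decreases by 1
            j -= 1
        step("tau3", 0, j, j)            # leave the inner loop
    return counts, amortized, j
```

On every run the summed amortized cost equals the number of `tau1` executions, and the number of `tau2` executions plus the final value of `j` equals that same number, so `tau2` is executed at most n times.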

Conclusion

We presented a new approach to (resource) bound analysis. Our approach complements existing approaches in several aspects, as discussed in Sect. 2.3. Our analysis handles bound analysis problems of high practical relevance which current approaches cannot handle: current techniques [6, 7, 10, 29] fail on Example xnu and similar problems. We have argued that such problems occur naturally, e.g., in parsing and string matching routines. During our experiments on real-world source code, we found 23 different iteration patterns that pose a challenge for similar reasons as Example xnu: in these patterns, the worst-case cost of the inner loop averaged over the iterations of the outer loop is lower than the worst-case cost of a single inner loop execution. Our implementation obtains tight bounds for 21 out of these 23 iteration patterns (Sect. 8.3).

Our algorithm (Sect. 3) obtains invariants by means of bound analysis and does not rely on external techniques for invariant generation. This is in contrast to current bound analysis techniques (see discussion on related work in Sect. 2). We have compared our algorithm to classical invariant analysis and argued that we can efficiently compute invariants which are difficult to obtain by standard abstract domains such as octagon or polyhedra (Sect. 2). We have demonstrated that the limited form of invariants (upper bound invariants) that our algorithm obtains is sufficient for the bound analysis of a large class of real-world programs.

We have demonstrated that difference constraints are a suitable abstract program model for automatic complexity and resource bound analysis. Despite their syntactic simplicity, difference constraints are expressive enough to model the complexity-related aspects of many imperative programs. In particular, difference constraints make it possible to model amortized complexity problems such as the bound analysis challenge posed by Example xnu (discussed in Sect. 7). We developed appropriate techniques for abstracting imperative programs to DCPs (Sect. 6): we described how to extract norms (integer-valued expressions over the program state) from imperative programs and showed how to use these norms as variables in DCPs.

Our approach deals with many of the challenges that bound analysis is known to be confronted with: in Sect. 8.2 we compared our tool on a benchmark of challenging problems from publications on bound analysis. The results show that our prototype implementation can handle most of these problems. Our implementation is comparable in strength to other implementations of state-of-the-art bound analysis techniques, but performs the task significantly faster than its competitors. The results obtained by our prototype tool could be further enhanced by extending our implementation with additional techniques discussed in [30].

We stress that our approach is more scalable than existing approaches. We presented empirical evidence of the good performance characteristics of our analysis by a large experiment and tool comparison on real source code in Sect. 8.1. We discuss the main technical reasons for scalability of our analysis in Sect. 10.1.

We think that the abstract program model of difference constraint programs is worth further investigation: given that difference constraints can model standard counter manipulations (counter increments, decrements and resets), further research on complexity analysis of difference constraint programs is of high value. We consider DCPs to be a very suitable program model for studying the principal challenges of automated complexity and resource bound analysis for imperative programs.

Discussion on the Scalability of Our Analysis

In the following we state what we consider to be the main technical reasons that make our analysis scale:

First of all, we achieve scalability by local reasoning: note that our abstraction procedure relies on purely local information, i.e., information that is available on single program transitions. In particular, we do not apply global invariant analysis. Further, the sets I(v) and R(v), by which our main algorithm is parametrized, are built by categorizing the difference constraints on single (abstract) program transitions based on simple syntactic criteria. Our algorithm for computing the local bound mapping ζ (Sect. 4) is polynomial even in the generalized case (Sect. 4.1).

We use bound analysis to infer bounds on variable values (variable bounds). Unlike classical invariant analysis this approach is demand-driven and does not perform a fixed point iteration (see discussion in Sect. 2.3).

Note that the only general-purpose reasoner we employ is an SMT solver. Further, the SMT solver is only employed in the program abstraction phase. In terms of size, the problems we feed to the SMT solver are small, namely simple linear arithmetic formulas composed of the arithmetic of single transitions. Our approach uses the SMT solver only for yes/no answers; no optimal solution (e.g., a minimum or a minimal unsatisfiable core) is required.

Our basic bound algorithm (Definition 19) runs in polynomial time. The reasoning based on reset chains (Definition 23), however, has exponential worst-case complexity, resulting from the potentially exponential number of paths in the program (exponential in the number of program transitions). We did not experience this to be an issue in practice because the simplicity of our abstract program model allows us to take straightforward engineering measures: program slicing reduces the number of paths in the program significantly; further, merging of similar paths can be applied (details are given in [28]).
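To see why the number of paths can be exponential in the number of transitions, consider a control-flow graph made of k consecutive if/else diamonds. This graph is a hypothetical example chosen for illustration, not one taken from the benchmarks; the helper names `diamond_chain` and `count_paths` are likewise assumptions. The graph has 4k transitions but 2^k paths from entry to exit.

```python
from functools import lru_cache


def diamond_chain(k):
    """Successor lists of a graph with k consecutive if/else diamonds."""
    edges = {}
    for d in range(k):
        entry, then_, else_, exit_ = f"n{d}", f"t{d}", f"e{d}", f"n{d + 1}"
        edges.setdefault(entry, []).extend([then_, else_])  # branch
        edges[then_] = [exit_]                              # join
        edges[else_] = [exit_]
    edges.setdefault(f"n{k}", [])                           # final exit node
    return edges


def count_paths(edges, source, target):
    """Count the paths from source to target in an acyclic graph."""
    @lru_cache(maxsize=None)
    def go(node):
        if node == target:
            return 1
        return sum(go(succ) for succ in edges.get(node, []))
    return go(source)
```

With k = 10 the graph has 40 transitions and 1024 paths. Counting the paths is cheap thanks to memoization; enumerating them one by one is what becomes expensive, which is why reducing the number of paths by slicing and merging matters.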

Acknowledgements

Open access funding provided by TU Wien.

Footnotes

Electronic supplementary material

The online version of this article (doi:10.1007/s10817-016-9402-4) contains supplementary material, which is available to authorized users.

Supported by the Austrian National Research Network S11403-N23 (RiSE) and by the Vienna Science and Technology Fund (WWTF) through grant ICT12-059.

The tragic death of Helmut Veith prevented him from approving the final version. All faults and inaccuracies belong to his co-authors.

Contributor Information

Moritz Sinn, Email: sinn@forsyte.at.

Florian Zuleger, Email: zuleger@forsyte.at.

References

  • 1.Albert, E., Arenas, P., Genaim, S., Puebla, G., Zanardini, D.: Cost analysis of object-oriented bytecode programs. Theor. Comput. Sci. 413(1), 142–159 (2012). doi:10.1016/j.tcs.2011.07.009
  • 2.Alias, C., Darte, A., Feautrier, P., Gonnord, L.: Multi-dimensional rankings, program termination, and complexity bounds of flowchart programs. In: Cousot, R., Martel, M. (eds.) Static Analysis: 17th International Symposium, SAS 2010, Perpignan, France, September 14–16, 2010. Proceedings. Springer Berlin Heidelberg, Berlin (2010)
  • 3.Bagnara, R., Mesnard, F., Pescetti, A., Zaffanella, E.: A new look at the automatic synthesis of linear ranking functions. Inf. Comput. 215, 47–67 (2012). doi:10.1016/j.ic.2012.03.003
  • 4.Ben-Amram, A.M.: Size-change termination with difference constraints. ACM Trans. Program. Lang. Syst. TOPLAS, 30(3), (2008). doi:10.1145/1353445.1353450
  • 5.Ben-Amram, A.M., Lee, C.S.: Program termination analysis in polynomial time. ACM Trans. Program. Lang. Syst. TOPLAS 29(1), 5 (2007). doi:10.1145/1180475.1180480
  • 6.Brockschmidt, M., Emmes, F., Falke, S., Fuhs, C., Giesl, J.: Analyzing runtime and size complexity of integer programs. ACM Trans. Program. Lang. Syst. 38(4), 13:1–13:50 (2016). doi:10.1145/2866575
  • 7.Carbonneaux, Q., Hoffmann, J., Shao, Z.: Compositional certified resource bounds. In: PLDI (2015)
  • 8.Colcombet, T., Daviaud, L., Zuleger, F.: Size-change abstraction and max-plus automata. In: MFCS, pp. 208–219 (2014)
  • 9.Coppa, E., Demetrescu, C., Finocchi, I.: Input-sensitive profiling. In: PLDI, pp. 89–98 (2012)
  • 10.Flores-Montoya, A., Hähnle, R.: Resource analysis of complex programs with cost equations. In: APLAS, pp. 275–295 (2014)
  • 11.Gulwani, S., Jain, S., Koskinen, E.: Control-flow refinement and progress invariants for bound analysis. In: PLDI, pp. 375–385 (2009)
  • 12.Gulwani, S., Juvekar, S.: Bound analysis using backward symbolic execution. Technical Report MSR-TR-2004-95, Microsoft Research (2009)
  • 13.Gulwani, S., Lev-Ami, T., Sagiv, M.: A combination framework for tracking partition sizes. In: POPL, pp. 239–251 (2009)
  • 14.Gulwani, S., Mehra, K.K., Chilimbi, T.M.: Speed: precise and efficient static estimation of program computational complexity. In: POPL, pp. 127–139 (2009)
  • 15.Gulwani, S., Zuleger, F.: The reachability-bound problem. In: PLDI, pp. 292–304 (2010)
  • 16.Hoffmann, J., Aehlig, K., Hofmann, M.: Multivariate amortized resource analysis. ACM Trans. Program. Lang. Syst. 34(3), 14 (2012). doi:10.1145/2362389.2362393
  • 17.http://ctuning.org/wiki/index.php/CTools:CBench
  • 18.http://forsyte.at/software/loopus/
  • 19.http://forsyte.at/static/people/sinn/loopusJAR/
  • 20.https://github.com/s-falke/llvm2kittel
  • 21.https://www.spec.org/cpu2006/
  • 22.Jin, G., Song, L., Shi, X., Scherpelz, J., Lu, S.: Understanding and detecting real-world performance bugs. In: PLDI, pp. 77–88 (2012)
  • 23.Lattner, C., Adve, V.S.: LLVM: a compilation framework for lifelong program analysis and transformation. In: CGO, pp. 75–88 (2004)
  • 24.Magill, S., Tsai, M.-H., Lee, P., Tsay, Y.-K.: Automatic numeric abstractions for heap-manipulating programs. In: POPL, pp. 211–222 (2010)
  • 25.Podelski, A., Rybalchenko, A.: A complete method for the synthesis of linear ranking functions. In: VMCAI, pp. 239–251 (2004)
  • 26.Seidl, H., Gawlitza, T.M., Schwarz, M.: Parametric strategy iteration. In: Kutsia, T., Voronkov, A. (eds.) SCSS 2014. 6th International Symposium on Symbolic Computation in Software Science, Volume 30 of EPiC Series in Computing, pp. 62–76. EasyChair (2014)
  • 27.Sinn, M., Zuleger, F., Veith, H.: Difference constraints: an adequate abstraction for complexity analysis of imperative programs. In: FMCAD, pp. 144–151 (2015)
  • 28.Sinn, M., Zuleger, F., Veith, H.: A simple and scalable static analysis for bound analysis and amortized complexity analysis. CoRR, abs/1401.5842 (2014)
  • 29.Sinn, M., Zuleger, F., Veith, H.: A simple and scalable static analysis for bound analysis and amortized complexity analysis. In: CAV, pp. 745–761. Springer (2014)
  • 30.Sinn, M.: Automated complexity analysis for imperative programs. Ph.D. thesis, TU Wien, Faculty of Informatics, Wien (2016)
  • 31.Smith, G.: On the foundations of quantitative information flow. In: FOSSACS, pp. 288–302 (2009)
  • 32.Tarjan, R.E.: Amortized computational complexity. SIAM J. Algebraic Discrete Methods 6(2), 306–318 (1985). doi:10.1137/0606031
  • 33.Wilhelm, R., Engblom, J., Ermedahl, A., Holsti, N., Thesing, S., Whalley, D., Bernat, G., Ferdinand, C., Heckmann, R., Mitra, T., Mueller, F., Puaut, I., Puschner, P., Staschulat, J., Stenström, P.: The worst-case execution-time problem—overview of methods and survey of tools. ACM Trans. Embed. Comput. Syst. 7(3), 36 (2008). doi:10.1145/1347375.1347389
  • 34.Zaparanuks, D., Hauswirth, M.: Algorithmic profiling. In: PLDI, pp. 67–76 (2012)
  • 35.Zuleger, F., Gulwani, S., Sinn, M., Veith, H.: Bound analysis of imperative programs with the size-change abstraction. In: SAS, pp. 280–297 (2011)

Articles from Journal of Automated Reasoning are provided here courtesy of Springer
