2020 Jun 16; 12225:101–125. doi: 10.1007/978-3-030-53291-8_7

Global Guidance for Local Generalization in Model Checking

Hari Govind Vediramana Krishnan, YuTing Chen, Sharon Shoham, Arie Gurfinkel
Editors: Shuvendu K. Lahiri, Chao Wang

Abstract

SMT-based model checkers, especially IC3-style ones, are currently the most effective techniques for verification of infinite state systems. They infer global inductive invariants via local reasoning about a single step of the transition relation of a system, while employing SMT-based procedures, such as interpolation, to mitigate the limitations of local reasoning and allow for better generalization. Unfortunately, these mitigations intertwine model checking with heuristics of the underlying SMT-solver, negatively affecting stability of model checking.

In this paper, we propose to tackle the limitations of locality in a systematic manner. We introduce explicit global guidance into the local reasoning performed by IC3-style algorithms. To this end, we extend the SMT-IC3 paradigm with three novel rules, designed to mitigate fundamental sources of failure that stem from locality. We instantiate these rules for the theory of Linear Integer Arithmetic and implement them on top of the Spacer solver in Z3. Our empirical results show that GSpacer, Spacer extended with global guidance, is significantly more effective than both Spacer and global reasoning alone, and, furthermore, is insensitive to interpolation.



Introduction

SMT-based Model Checking algorithms that combine SMT-based search for bounded counterexamples with interpolation-based search for inductive invariants are currently the most effective techniques for verification of infinite state systems. They are widely applicable, including for verification of synchronous systems, protocols, parameterized systems, and software.

The Achilles heel of these approaches is the mismatch between the local reasoning used to establish absence of bounded counterexamples and a global reason for absence of unbounded counterexamples (i.e., existence of an inductive invariant). This is particularly apparent in IC3-style algorithms  [7], such as Spacer   [18]. IC3-style algorithms establish bounded safety by repeatedly computing predecessors of error (or bad) states, blocking them by local reasoning about a single step of the transition relation of the system, and, later, using the resulting lemmas to construct a candidate inductive invariant for the global safety proof. The whole process is driven by the choice of local lemmas. Good lemmas lead to quick convergence, bad lemmas make even simple-looking problems difficult to solve.

The effect of local reasoning is somewhat mitigated by the use of interpolation in lemma construction. In addition to the usual inductive generalization by dropping literals from a blocked bad state, interpolation is used to further generalize the blocked state using theory-aware reasoning. For example, when blocking a bad state Inline graphic, inductive generalization would infer a sub-clause of Inline graphic as a lemma, while interpolation might infer Inline graphic – a predicate that might be required for the inductive invariant. Spacer, which is based on this idea, is extremely effective, as demonstrated by its performance in recent CHC-COMP competitions  [10]. The downside, however, is that the approach leads to a highly unstable procedure that is extremely sensitive to syntactic changes in the system description, changes in interpolation algorithms, and any algorithmic changes in the underlying SMT-solver.

An alternative approach, often called invariant inference, is to focus on the global safety proof, i.e., an inductive invariant. This has long been advocated by such approaches as Houdini  [15], and, more recently, by a variety of machine-learning inspired techniques, e.g., FreqHorn  [14], LinearArbitrary  [28], and ICE-DT  [16]. The key idea is to iteratively generate positive (i.e., reachable states) and negative (i.e., states that reach an error) examples and to compute a candidate invariant that separates these two sets. The reasoning is focused on the invariant, and the search is restricted by predicates, templates, grammars, or some combination thereof. Invariant inference approaches are particularly good at finding simple inductive invariants. However, they do not generalize well to a wide variety of problems. In practice, they are often used to complement other SMT-based techniques.

In this paper, we present a novel approach that extends what we call the local reasoning of IC3-style algorithms with global guidance inspired by the invariant inference algorithms described above. Our main insight is that the set of lemmas maintained by IC3-style algorithms hints towards a potential global proof. However, these hints are lost in existing approaches. We observe that letting the current set of lemmas, which represent candidate global invariants, guide local reasoning by introducing new lemmas and states to be blocked is often sufficient to direct IC3 towards a better global proof.

We present and implement our results in the context of Spacer—a solver for Constrained Horn Clauses (CHC)—implemented in the Z3 SMT-solver  [13]. Spacer is used by multiple software model checking tools, performed remarkably well in CHC-COMP competitions  [10], and is open-sourced. However, our results are fundamental and apply to any other IC3-style algorithm. While our implementation works with arbitrary CHC instances, we simplify the presentation by focusing on infinite state model checking of transition systems.

We illustrate the pitfalls of local reasoning using three examples shown in Fig. 1. All three examples are small, simple, and have simple inductive invariants. All three are challenging for Spacer. While these examples are based on Spacer-specific design choices, each exhibits a fundamental deficiency that stems from local reasoning. We believe they can be adapted for any other IC3-style verification algorithm. The examples assume basic familiarity with the IC3 paradigm. Readers who are not familiar with it may find it useful to read the examples after reading Sect. 2.

Fig. 1. Verification tasks to illustrate sources of divergence for Spacer. The call nd() non-deterministically returns a Boolean value.

Myopic Generalization. Spacer diverges on the example in Fig. 1(a) by iteratively learning lemmas of the form Inline graphic for different values of k, where a, b, c, d are the program variables. These lemmas establish that there are no counterexamples of longer and longer lengths. However, the process never converges to the desired lemma Inline graphic, which excludes counterexamples of any length. The lemmas are discovered using interpolation, based on proofs found by the SMT-solver. A close examination of the corresponding proofs shows that the relationship between Inline graphic and Inline graphic does not appear in the proofs, making it impossible to find the desired lemma by tweaking local interpolation reasoning. On the other hand, looking at the global proof (i.e., the set of lemmas discovered to refute a bounded counterexample), it is almost obvious that Inline graphic is an interesting generalization to try. Amusingly, a small, syntactic, but semantics-preserving change of swapping line 2 for line 3 in Fig. 1(a) changes the SMT-solver proofs, affects local interpolation, and makes the instance trivial for Spacer.

Excessive (Predecessor) Generalization. Spacer diverges on the example in Fig. 1(b) by computing an infinite sequence of lemmas of the form Inline graphic, where a and b are program variables, and Inline graphic and Inline graphic are integers. The root cause is excessive generalization in predecessor computation. The Inline graphic states are Inline graphic, and their predecessors are states such as Inline graphic, Inline graphic, etc., or, more generally, regions Inline graphic, Inline graphic, etc. Spacer always attempts to compute the most general predecessor states. This is the best local strategy, but blocking these regions by learning their negation leads to the aforementioned lemmas. According to the global proof, these lemmas do not converge to a linear invariant. An alternative strategy that under-approximates the problematic regions by (numerically) simpler regions and, as a result, learns simpler lemmas is desired (and is effective on this example). For example, region Inline graphic can be under-approximated by Inline graphic, eventually leading to a lemma Inline graphic, which is part of the final invariant: Inline graphic.

Stuck in a Rut. Finally, Spacer converges on the example in Fig. 1(c), but only after unrolling the system for 100 iterations. During the first 100 iterations, Spacer learns that program states with Inline graphic are not reachable because a is bounded by 1 in the first iteration, by 2 in the second, and so on. In each iteration, the global proof is updated by replacing a lemma of the form Inline graphic by a lemma of the form Inline graphic for different values of k. Again, the strategy is good locally – the total number of lemmas does not grow and the bounded proof is improved. Yet, globally, it is clear that no progress is made since the same set of bad states is blocked again and again in slightly different ways. An alternative strategy is to abstract the literal Inline graphic from the formula that represents the bad states, and, instead, conjecture that no states in Inline graphic are reachable.

Our Approach: Global Guidance. As shown in the examples above, in all the cases that Spacer diverges, the missteps are not obvious locally, but are clear when the overall proof is considered. We propose three new rules, Subsume, Concretize, and Conjecture, that provide global guidance, by considering existing lemmas, to mitigate the problems illustrated above. Subsume introduces a lemma that generalizes existing ones, Concretize under-approximates partially-blocked predecessors to focus on repeatedly unblocked regions, and Conjecture over-approximates a predecessor by abstracting away regions that are repeatedly blocked. The rules are generic, and apply to arbitrary SMT theories. Furthermore, we propose an efficient instantiation of the rules for the theory of Linear Integer Arithmetic.

We have implemented the new strategy, called GSpacer, in Spacer and compared it to the original implementation of Spacer. We show that GSpacer outperforms Spacer in benchmarks from CHC-COMP 2018 and 2019. More significantly, we show that the performance is independent of interpolation. While Spacer is highly dependent on interpolation parameters, and performs poorly when interpolation is disabled, the results of GSpacer are virtually unaffected by interpolation. We also compare GSpacer to LinearArbitrary  [28], a tool that infers invariants using global reasoning. GSpacer outperforms LinearArbitrary on the benchmarks from  [28]. These results indicate that global guidance mitigates the shortcomings of local reasoning.

The rest of the paper is structured as follows. Sect. 2 presents the necessary background. Sect. 3 introduces our global guidance as a set of abstract inference rules. Sect. 4 describes an instantiation of the rules to Linear Integer Arithmetic (LIA). Sect. 5 presents our empirical evaluation. Finally, Sect. 6 describes related work, and Sect. 7 concludes the paper.

Background

Logic. We consider first order logic modulo theories, and adopt the standard notation and terminology. A first-order language modulo theory Inline graphic is defined over a signature Inline graphic that consists of constant, function and predicate symbols, some of which may be interpreted by Inline graphic. As always, terms are constant symbols, variables, or function symbols applied to terms; atoms are predicate symbols applied to terms; literals are atoms or their negations; cubes are conjunctions of literals; and clauses are disjunctions of literals. Unless otherwise stated, we only consider closed formulas (i.e., formulas without any free variables). As usual, we use sets of formulas and their conjunctions interchangeably.

MBP. Given a set of constants Inline graphic, a formula Inline graphic and a model Inline graphic, Model Based Projection (MBP) of Inline graphic over the constants Inline graphic, denoted Inline graphic, computes a model-preserving under-approximation of Inline graphic projected onto Inline graphic. That is, Inline graphic is a formula over Inline graphic such that Inline graphic and any model Inline graphic can be extended to a model Inline graphic by providing an interpretation for Inline graphic. There are polynomial time algorithms for computing MBP in Linear Arithmetic  [5, 18].
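To make the definition concrete, the following z3py sketch checks the two defining properties of a model-based projection on a toy formula. It is an illustration only: the formula, the model, and the candidate projection are hand-picked for this example and are not produced by Spacer's MBP routine.

```python
from z3 import Ints, Exists, And, Not, Solver, sat, unsat

x, y = Ints('x y')              # x is projected out, y is kept
phi = And(y >= 2 * x, x >= 1)   # formula over {x, y}
psi = y >= 2                    # hand-picked candidate MBP of phi w.r.t. the
                                # model {x: 1, y: 3} (which satisfies phi)

# (1) psi under-approximates the projection: psi /\ not(Exists x. phi) is unsat.
s = Solver()
s.add(psi, Not(Exists(x, phi)))
assert s.check() == unsat

# (2) psi is model-preserving: the chosen model of phi also satisfies psi.
s = Solver()
s.add(x == 1, y == 3, phi, psi)
assert s.check() == sat
```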

Interpolation. Given an unsatisfiable formula A ∧ B, an interpolant, denoted ITP(A, B), is a formula I over the shared signature of A and B such that A ⇒ I and I ∧ B is unsatisfiable.
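The defining properties of an interpolant can be checked in the same way. In the z3py sketch below, the interpolant is hand-picked (Z3 is used only to verify the properties, not to compute the interpolant):

```python
from z3 import Ints, And, Not, Solver, unsat

def is_unsat(*formulas):
    s = Solver()
    s.add(*formulas)
    return s.check() == unsat

x, y = Ints('x y')
A = And(x >= 5, y == x + 1)   # mentions x and y
B = y <= 2                    # mentions only the shared symbol y
I = y >= 6                    # candidate interpolant over the shared signature {y}

assert is_unsat(A, B)         # A /\ B is unsatisfiable
assert is_unsat(A, Not(I))    # A => I is valid
assert is_unsat(I, B)         # I /\ B is unsatisfiable
```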

Safety Problem. A transition system is a pair ⟨Init, Tr⟩, where Init is a formula over Σ and Tr is a formula over Σ ∪ Σ′, where Σ′ = {s′ | s ∈ Σ}. The states of the system correspond to structures over Σ, Init represents the initial states and Tr represents the transition relation, where Σ is used to represent the pre-state of a transition, and Σ′ is used to represent the post-state. For a formula φ over Σ, we denote by φ′ the formula obtained by substituting each s ∈ Σ by s′ ∈ Σ′. A safety problem is a triple ⟨Init, Tr, Bad⟩, where ⟨Init, Tr⟩ is a transition system and Bad is a formula over Σ representing a set of bad states.

The safety problem ⟨Init, Tr, Bad⟩ has a counterexample of length k if the following formula is satisfiable: Init^0 ∧ (⋀_{i=0}^{k−1} Tr^i) ∧ Bad^k, where φ^i is defined over Σ^i (a copy of the signature used to represent the state of the system after the execution of i steps) and is obtained from φ by substituting each s ∈ Σ by s^i ∈ Σ^i, and Tr^i is obtained from Tr by substituting each s ∈ Σ by s^i and each s′ ∈ Σ′ by s^{i+1}. The transition system is safe if the safety problem has no counterexample, of any length.

Inductive Invariants. An inductive invariant is a formula Inv over Σ such that (i) Init ⇒ Inv, (ii) Inv ∧ Tr ⇒ Inv′, and (iii) Inv ⇒ ¬Bad. If such an inductive invariant exists, then the transition system is safe.
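For a toy transition system, the three conditions can be discharged directly by an SMT solver. The z3py sketch below is a hand-rolled illustration (the counter system and its invariant are chosen for this example):

```python
from z3 import Ints, And, Not, Implies, Solver, unsat, substitute

x, xp = Ints('x xp')             # x: pre-state, xp: the primed (post-state) copy

Init = x == 0                    # initial states
Tr   = xp == x + 1               # transition relation: x' = x + 1
Bad  = x < 0                     # bad states
Inv  = x >= 0                    # candidate inductive invariant

def valid(f):
    s = Solver()
    s.add(Not(f))
    return s.check() == unsat

Inv_p = substitute(Inv, (x, xp))            # Inv with x replaced by x'
assert valid(Implies(Init, Inv))            # (i)   Init => Inv
assert valid(Implies(And(Inv, Tr), Inv_p))  # (ii)  Inv /\ Tr => Inv'
assert valid(Implies(Inv, Not(Bad)))        # (iii) Inv => not Bad
```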

Spacer. The safety problem defined above is an instance of a more general problem, CHC-SAT, of satisfiability of Constrained Horn Clauses (CHC). Spacer is a semi-decision procedure for CHC-SAT. However, to simplify the presentation, we describe the algorithm only for the particular case of the safety problem. We stress that Spacer, as well as the developments of this paper, apply to the more general setting of CHCs (both linear and non-linear). We assume that the only uninterpreted symbols in Σ are constant symbols, which we denote Inline graphic. Typically, these represent program variables. Without loss of generality, we assume that Bad is a cube.

Algorithm 1 presents the key ingredients of Spacer as a set of guarded commands (or rules). It maintains the following: the current unrolling depth N at which a counterexample is searched (there are no counterexamples with depth less than N); a trace Inline graphic of frames, such that each frame Inline graphic is a set of lemmas, and each lemma Inline graphic is a clause; a queue of proof obligations Q, where each proof obligation (pob) in Q is a pair Inline graphic of a cube Inline graphic and a level number i, Inline graphic; and an under-approximation Inline graphic of reachable states. Intuitively, each frame Inline graphic is a candidate inductive invariant such that Inline graphic over-approximates states reachable up to i steps from Init. The latter is ensured since Inline graphic, the trace is monotone, i.e., Inline graphic, and each frame is inductive relative to its previous one, i.e., Inline graphic. Each pob Inline graphic in Q corresponds to a suffix of a potential counterexample that has to be blocked in Inline graphic, i.e., has to be proven unreachable in i steps.

The Candidate rule adds an initial pob Inline graphic to the queue. If a pob Inline graphic cannot be blocked because Inline graphic is reachable from frame Inline graphic, the Predecessor rule generates a predecessor Inline graphic of Inline graphic using MBP and adds Inline graphic to Q. The Successor rule updates the set of reachable states if the pob is reachable. If the pob is blocked, the Conflict rule strengthens the trace Inline graphic by using interpolation to learn a new lemma Inline graphic that blocks the pob, i.e., Inline graphic implies Inline graphic. The Induction rule strengthens a lemma by inductive generalization and the Propagate rule pushes a lemma to a higher frame. If the Bad state has been blocked at N, the Unfold rule increments the depth of unrolling N. In practice, the rules are scheduled to ensure progress towards finding a counterexample.
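The same toy safety problem can be handed to Spacer through Z3's fixedpoint (CHC) interface. The z3py sketch below is a minimal usage example rather than part of the algorithm description; an unsat answer to the query means that the bad states are unreachable, i.e., the system is safe:

```python
from z3 import Fixedpoint, Function, IntSort, BoolSort, Ints, And

fp = Fixedpoint()
fp.set(engine='spacer')                  # select the Spacer engine

inv = Function('inv', IntSort(), BoolSort())   # unknown invariant / predicate
x, xp = Ints('x xp')
fp.register_relation(inv)
fp.declare_var(x, xp)

fp.rule(inv(x), x == 0)                  # Init:  x = 0
fp.rule(inv(xp), [inv(x), xp == x + 1])  # Tr:    x' = x + 1
res = fp.query(And(inv(x), x < 0))       # Bad:   x < 0
print(res)                               # unsat: Bad is unreachable (safe)
print(fp.get_answer())                   # certificate for the unsat answer
```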

Global Guidance of Local Proofs

As illustrated by the examples in Fig. 1, while Spacer is generally effective, its local reasoning is easily confused. Its effectiveness is highly dependent on the local computation of predecessors using model-based projection, and of lemmas using interpolation. In this section, we extend Spacer with three additional global reasoning rules. The rules are inspired by the deficiencies illustrated by the motivating examples in Fig. 1. Here, we present the rules abstractly, independent of any underlying theory, focusing on pre- and post-conditions. In Sect. 4, we specialize the rules for Linear Integer Arithmetic, and show how they are scheduled with the other rules of Spacer in an efficient verification algorithm. The new global rules are summarized in Algorithm 2. We use the same guarded command notation as in the description of Spacer in Algorithm 1. Note that the rules supplement, and do not replace, the ones in Algorithm 1.

Subsume is the most natural rule to explain. It says that if there is a set of lemmas Inline graphic at level i, and there exists a formula Inline graphic such that (a) Inline graphic is stronger than every lemma in Inline graphic, and (b) Inline graphic over-approximates states reachable in at most k steps, where Inline graphic, then Inline graphic can be added to the trace to subsume Inline graphic. This rule reduces the size of the global proof – that is, the total number of lemmas that are not subsumed. Note that the rule allows Inline graphic to be at a level k that is higher than i. The choice of Inline graphic is left open. The details are likely to be specific to the theory involved. For example, when instantiated for LIA, Subsume is sufficient to solve the example in Fig. 1(a). Interestingly, Subsume is not likely to be effective for propositional IC3. In that case, Inline graphic is a clause and the only way for it to be stronger than Inline graphic is for Inline graphic to be a syntactic sub-sequence of every lemma in Inline graphic, but such an Inline graphic is already explored by local inductive generalization (rule Induction in Algorithm 1).

Concretize applies to a pob, unlike Subsume. It is motivated by the example in Fig. 1(b) that highlights the problem of excessive local generalization. Spacer always computes predecessors that are as general as possible. This is necessary for refutational completeness since in an infinite state system there are infinitely many potential predecessors. Computing the most general predecessor ensures that Spacer finds a counterexample, if it exists. However, this also forces Spacer to discover more general, and sometimes more complex, lemmas than might be necessary for an inductive invariant. Without a global view of the overall proof, it is hard to determine when the algorithm generalizes too much. The intuition for Concretize is that generalization is excessive when there is a single pob Inline graphic that is not blocked, yet, there is a set of lemmas Inline graphic such that every lemma Inline graphic partially blocks Inline graphic. That is, for any Inline graphic, there is a sub-region Inline graphic of pob Inline graphic that is blocked by Inline graphic (i.e., Inline graphic), and there is at least one state Inline graphic that is not blocked by any existing lemma in Inline graphic (i.e., Inline graphic). In this case, Concretize computes an under-approximation Inline graphic of Inline graphic that includes some not-yet-blocked state s. The new pob is added to the lowest level at which Inline graphic is not yet blocked. Concretize is useful to solve the example in Fig. 1(b).

Conjecture guides the algorithm away from being stuck in the same part of the search space. A single pob Inline graphic might be blocked by a different lemma at each level that Inline graphic appears in. This indicates that the lemmas are too strong, and cannot be propagated successfully to a higher level. The goal of the Conjecture rule is to identify such a case to guide the algorithm to explore alternative proofs with a better potential for generalization. This is done by abstracting away the part of the pob that has been blocked in the past. The pre-condition for Conjecture is the existence of a pob Inline graphic such that Inline graphic is split into two (not necessarily disjoint) sets of literals, Inline graphic and Inline graphic. Second, there must be a set of lemmas Inline graphic, at a (typically much lower) level Inline graphic such that every lemma Inline graphic blocks Inline graphic, and, moreover, blocks Inline graphic by blocking Inline graphic. Intuitively, this implies that while there are many different lemmas (i.e., all lemmas in Inline graphic) that block Inline graphic at different levels, all of them correspond to a local generalization of Inline graphic that could not be propagated to block Inline graphic at higher levels. In this case, Conjecture abstracts the pob Inline graphic into Inline graphic, hoping to generate an alternative way to block Inline graphic. Of course, Inline graphic is conjectured only if it is not already blocked and does not contain any known reachable states. Conjecture is necessary for a quick convergence on the example in Fig. 1(c). In some respect, Conjecture is akin to widening in Abstract Interpretation  [12] – it abstracts a set of states by dropping constraints that appear to prevent further exploration. Of course, it is also quite different since it does not guarantee termination. While Conjecture is applicable to propositional IC3 as well, it is much more significant in SMT-based setting since in many FOL theories a single literal in a pob might result in infinitely many distinct lemmas.

Each of the rules can be applied by itself, but they are most effective in combination. For example, Concretize creates less general predecessors, which, in the worst case, lead to many simple lemmas. At the same time, Subsume combines lemmas together into more complex ones. The interaction of the two produces lemmas that neither one can produce in isolation. Finally, Conjecture helps the algorithm get unstuck from a single unproductive pob, allowing the other rules to take effect.

Global Guidance for Linear Integer Arithmetic

In this section, we present a specialization of our general rules, shown in Algorithm 2, to the theory of Linear Integer Arithmetic (LIA). This requires solving two problems: identifying subsets of lemmas for pre-conditions of the rules (clearly using all possible subsets is too expensive), and applying the rule once its pre-condition is met. For lemma selection, we introduce a notion of syntactic clustering based on anti-unification. For rule application, we exploit basic properties of LIA for an effective algorithm. Our presentation is focused on LIA exclusively. However, the rules extend to combinations of LIA with other theories, such as the combined theory of LIA and Arrays.

The rest of this section is structured as follows. We begin with a brief background on LIA in Sect. 4.1. We then present our lemma selection scheme, which is common to all the rules, in Sect. 4.2, followed by a description of how the rules Subsume (in Sect. 4.3), Concretize (in Sect. 4.4), and Conjecture (in Sect. 4.5) are instantiated for LIA. We conclude in Sect. 4.6 with an algorithm that integrates all the rules together.

Linear Integer Arithmetic: Background

In the theory of Linear Integer Arithmetic (LIA), formulas are defined over a signature that includes interpreted function symbols Inline graphic, −, Inline graphic, interpreted predicate symbols <, Inline graphic, Inline graphic, interpreted constant symbols Inline graphic, and uninterpreted constant symbols Inline graphic. We write Inline graphic for the set of interpreted constant symbols, and call them integers. We use constants to refer exclusively to the uninterpreted constants (these are often called variables in the LIA literature). Terms (and accordingly formulas) in LIA are restricted to be linear, that is, multiplication is never applied to two constants.

We write Inline graphic for the fragment of LIA that excludes divisibility (d ∣ h) predicates. A literal in Inline graphic is a linear inequality; a cube is a conjunction of such inequalities, that is, a polytope. We find it convenient to use matrix-based notation for representing cubes in Inline graphic. A ground cube Inline graphic with p inequalities (literals) over k (uninterpreted) constants is written as Inline graphic, where A is a Inline graphic matrix of coefficients in Inline graphic, Inline graphic is a column vector that consists of the (uninterpreted) constants, and Inline graphic is a column vector in Inline graphic. For example, the cube Inline graphic is written as Inline graphic. In the sequel, all vectors are column vectors, super-script T denotes transpose, a dot is used for the dot product, and Inline graphic stands for a matrix of column vectors Inline graphic and Inline graphic.

Lemma Selection

A common pre-condition for all of our global rules in Algorithm 2 is the existence of a subset of lemmas Inline graphic of some frame Inline graphic. Attempting to apply the rules for every subset of Inline graphic is infeasible. In practice, we use syntactic similarity between lemmas as a predictor that one of the global rules is applicable, and restrict Inline graphic to subsets of syntactically similar lemmas. In the rest of this section, we formally define what we mean by syntactic similarity, and how syntactically similar subsets of lemmas, called clusters, are maintained efficiently throughout the algorithm.

Syntactic Similarity. A formula Inline graphic with free variables is called a pattern. Note that we do not require Inline graphic to be in LIA. Let Inline graphic be a substitution, i.e., a mapping from variables to terms. We write Inline graphic for the result of replacing all occurrences of free variables in Inline graphic with their mapping under Inline graphic. A substitution Inline graphic is called numeric if it maps every variable to an integer, i.e., the range of Inline graphic is Inline graphic. We say that a formula Inline graphic numerically matches a pattern Inline graphic iff there exists a numeric substitution Inline graphic such that Inline graphic. Note that, as usual, the equality is syntactic. For example, consider the pattern Inline graphic with free variables Inline graphic and Inline graphic and uninterpreted constants a and b. The formula Inline graphic matches Inline graphic via a numeric substitution Inline graphic. However, Inline graphic, while semantically equivalent to Inline graphic, does not match Inline graphic. Similarly, Inline graphic does not match Inline graphic either.

Matching is extended to patterns in the usual way by allowing a substitution Inline graphic to map variables to variables. We say that a pattern Inline graphic is more general than a pattern Inline graphic if Inline graphic matches Inline graphic. A pattern Inline graphic is a numeric anti-unifier for a pair of formulas Inline graphic and Inline graphic if both Inline graphic and Inline graphic match Inline graphic numerically. We write Inline graphic for a most general numeric anti-unifier of Inline graphic and Inline graphic. We say that two formulas Inline graphic and Inline graphic are syntactically similar if there exists a numeric anti-unifier between them (i.e., Inline graphic is defined). Anti-unification is extended to sets of formulas in the usual way.

Clusters. We use anti-unification to define clusters of syntactically similar formulas. Let Inline graphic be a fixed set of formulas, and Inline graphic a pattern. A cluster, Inline graphic, is a subset of Inline graphic such that every formula Inline graphic numerically matches Inline graphic. That is, Inline graphic is a numeric anti-unifier for Inline graphic. In the implementation, we restrict the pre-conditions of the global rules so that a subset of lemmas Inline graphic is a cluster for some pattern Inline graphic, i.e., Inline graphic.

Clustering Lemmas. We use the following strategy to efficiently keep track of available clusters. Let Inline graphic be a new lemma to be added to Inline graphic. Assume there is at least one lemma Inline graphic that numerically anti-unifies with Inline graphic via some pattern Inline graphic. If such an Inline graphic does not belong to any cluster, a new cluster Inline graphic is formed, where Inline graphic. Otherwise, for every lemma Inline graphic that numerically matches Inline graphic and every cluster Inline graphic containing Inline graphic, Inline graphic is added to Inline graphic if Inline graphic matches Inline graphic, or a new cluster is formed using Inline graphic, Inline graphic, and any other lemmas in Inline graphic that anti-unify with them. Note that a new lemma Inline graphic might belong to multiple clusters.

For example, suppose Inline graphic, and there is already a cluster Inline graphic. Since Inline graphic anti-unifies with each of the lemmas in the cluster, but does not match the pattern Inline graphic, a new cluster that includes all of them is formed w.r.t. a more general pattern: Inline graphic.
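The following sketch illustrates numeric anti-unification on a toy term representation (nested tuples rather than Z3 ASTs; both the representation and the example lemmas are chosen for illustration). Differing numerals are replaced by fresh pattern variables, while any structural mismatch means the two lemmas are not syntactically similar:

```python
def anti_unify(t1, t2, fresh):
    """Most general numeric anti-unifier of two ground terms represented as
    nested tuples (operator, arg, ...), strings (constants) and ints (numerals).
    Differing numerals become fresh pattern variables; any other mismatch fails."""
    if t1 == t2:
        return t1
    if isinstance(t1, int) and isinstance(t2, int):
        v = '?%d' % fresh[0]
        fresh[0] += 1
        return v
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and len(t1) == len(t2) and t1[0] == t2[0]):
        parts = [anti_unify(a, b, fresh) for a, b in zip(t1, t2)]
        return None if None in parts else tuple(parts)
    return None   # the two terms are not syntactically similar

# The clauses not(x <= 4) \/ not(y <= 8) and not(x <= 6) \/ not(y <= 12)
# cluster together under the pattern not(x <= ?0) \/ not(y <= ?1):
l1 = ('or', ('>', 'x', 4), ('>', 'y', 8))
l2 = ('or', ('>', 'x', 6), ('>', 'y', 12))
print(anti_unify(l1, l2, [0]))   # ('or', ('>', 'x', '?0'), ('>', 'y', '?1'))
```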

In the presentation above, we assumed that anti-unification is completely syntactic. This is problematic in practice since it significantly limits the applicability of the global rules. Recall, for example, that Inline graphic and Inline graphic do not anti-unify numerically according to our definitions, and, therefore, do not cluster together. In practice, we augment syntactic anti-unification with simple rewrite rules that are applied greedily. For example, we normalize all Inline graphic terms, take care of implicit multiplication by 1, and of associativity and commutativity of addition. In the future, it is interesting to explore how advanced anti-unification algorithms, such as  [8, 27], can be adapted for our purpose.

Subsume Rule for LIA

Recall that the Subsume rule (Algorithm 2) takes a cluster of lemmas Inline graphic and computes a new lemma Inline graphic that subsumes all the lemmas in Inline graphic, that is Inline graphic. We find it convenient to dualize the problem. Let Inline graphic be the dual of Inline graphic, clearly Inline graphic iff Inline graphic. Note that Inline graphic is a set of clauses, Inline graphic is a set of cubes, Inline graphic is a clause, and Inline graphic is a cube. In the case of Inline graphic, this means that Inline graphic represents a union of convex sets, and Inline graphic represents a convex set that the Subsume rule must find. The strongest such Inline graphic in Inline graphic exists, and is the convex closure of Inline graphic. Thus, applying Subsume in the context of Inline graphic is reduced to computing a convex closure of a set of (negated) lemmas in a cluster. Full LIA extends Inline graphic with divisibility constraints. Therefore, Subsume obtains a stronger Inline graphic by adding such constraints.

Example 1

For example, consider the following cluster:


The convex closure of Inline graphic in Inline graphic is Inline graphic. However, a stronger over-approximation exists in LIA: Inline graphic.

In the sequel, we describe subsumeCube (Algorithm 3) which computes a cube Inline graphic that over-approximates Inline graphic. Subsume is then implemented by removing from Inline graphic lemmas that are already subsumed by existing lemmas in Inline graphic, dualizing the result into Inline graphic, invoking subsumeCube on Inline graphic and returning Inline graphic as a lemma that subsumes Inline graphic.

Recall that Subsume is tried only in the case Inline graphic. We further require that the negated pattern, Inline graphic, is of the form Inline graphic, where A is a coefficients matrix, Inline graphic is a vector of constants and Inline graphic is a vector of p free variables. Under this assumption, Inline graphic (the dual of Inline graphic) is of the form Inline graphic, where Inline graphic, and for each Inline graphic, Inline graphic is a numeric substitution to Inline graphic from which one of the negated lemmas in Inline graphic is obtained. That is, Inline graphic. In Example 1, Inline graphic and


Each cube Inline graphic is equivalent to Inline graphic. Finally, Inline graphic. Thus, computing the over-approximation of Inline graphic is reduced to (a) computing the convex hull H of a set of points Inline graphic, (b) computing divisibility constraints D that are satisfied by all the points, (c) substituting Inline graphic for the disjunction in the equation above, and (d) eliminating the variables Inline graphic. Both the computation of Inline graphic and the elimination of Inline graphic may be prohibitively expensive. We, therefore, over-approximate them. Our approach for doing so is presented in Algorithm 3, and explained in detail below.

Computing the convex hull of Inline graphic. lines 3 to 8 compute the convex hull of Inline graphic as a formula over Inline graphic, where variable Inline graphic, for Inline graphic, represents the Inline graphic coordinates in the vectors (points) Inline graphic. Some of the coordinates, Inline graphic, in these vectors may be linearly dependent upon others. To simplify the problem, we first identify such dependencies and compute a set of linear equalities that expresses them (L in line 4). To do so, we consider a matrix Inline graphic, where the Inline graphic row consists of Inline graphic. The Inline graphic column in N, denoted Inline graphic, corresponds to the Inline graphic coordinate, Inline graphic. The rank of N is the number of linearly independent columns (and rows). The other columns (coordinates) can be expressed by linear combinations of the linearly independent ones. To compute these linear combinations we use the kernel of Inline graphic (N appended with a column vector of 1’s), which is the set of all vectors Inline graphic such that Inline graphic, where Inline graphic is the zero vector. Let Inline graphic be a basis for the kernel of Inline graphic. Then Inline graphic, and for each vector Inline graphic, the linear equality Inline graphic holds in all the rows of N (i.e., all the given vectors satisfy it). We accumulate these equalities, which capture the linear dependencies between the coordinates, in L. Further, the equalities are used to compute Inline graphic coordinates (columns in N) that are linearly independent and, modulo L, uniquely determine the remaining coordinates. We denote by Inline graphic the subset of Inline graphic that consists of the linearly independent coordinates. We further denote by Inline graphic the projection of Inline graphic to these coordinates and by Inline graphic the projection of N to the corresponding columns. We have that Inline graphic.
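The kernel computation can be prototyped with an off-the-shelf linear algebra package. The sketch below uses sympy's nullspace on a hypothetical numeral matrix; as noted in Sect. 5, our actual implementation only computes pairwise linear dependencies:

```python
from sympy import Matrix, ones

# Rows of N are the numeric substitutions n_i.  Appending a column of 1's and
# computing the kernel of [N | 1] recovers the affine dependencies between the
# coordinates: each kernel vector w gives an equality w . (x_1, ..., x_p, 1) = 0
# satisfied by every row of N.
N = Matrix([[0, 0],
            [2, 1],
            [4, 2]])                 # hypothetical numeral matrix
N1 = N.row_join(ones(N.rows, 1))     # [N | 1]

for w in N1.nullspace():
    print(w.T)                       # proportional to (1, -2, 0), i.e. x_1 = 2*x_2
```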

In Example 1, the numeral matrix is Inline graphic, for which Inline graphic. Therefore, L is the conjunction of equalities Inline graphic, or, equivalently Inline graphic, Inline graphic, and


Next, we compute the convex closure of Inline graphic, and conjoin it with L to obtain H, the convex closure of Inline graphic.

If the dimension of Inline graphic is one, as is the case in the example above, convex closure, C, of Inline graphic is obtained by bounding the sole element of Inline graphic based on its values in Inline graphic (line 6). In Example 1, we obtain Inline graphic.

If the dimension of Inline graphic is greater than one, just computing the bounds of one of the constants is not sufficient. Instead, we use the concept of syntactic convex closure from  [2] to compute the convex closure of Inline graphic as Inline graphic where Inline graphic is a vector that consists of q fresh rational variables and C is defined as follows (line 8): Inline graphic. C states that Inline graphic is a convex combination of the rows of Inline graphic, or, in other words, Inline graphic is a convex combination of Inline graphic.

To illustrate the syntactic convex closure, consider a second example with a set of cubes: Inline graphic. The coefficient matrix A, and the numeral matrix N are then: Inline graphic and Inline graphic. Here, Inline graphic is empty – all the columns are linearly independent, hence, Inline graphic and Inline graphic. Therefore, syntactic convex closure is applied to the full matrix N, resulting in


The convex closure of Inline graphic is then Inline graphic, which is Inline graphic here.
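The syntactic convex closure can be encoded directly as constraints over fresh rational multipliers. The z3py sketch below does so for a hypothetical set of points (chosen for illustration, not taken from the examples above):

```python
from z3 import Reals, Ints, ToReal, Sum, And, Solver

# x is a convex combination of the rows of N iff there exist alpha_i >= 0 with
# sum alpha_i = 1 and x_j = sum_i alpha_i * N[i][j] for every coordinate j.
N = [[0, 0], [2, 1], [4, 2]]            # hypothetical points (rows)
q, p = len(N), len(N[0])

alphas = Reals(' '.join('a%d' % i for i in range(q)))   # fresh rational multipliers
xs = Ints(' '.join('x%d' % j for j in range(p)))        # coordinates

C = And(
    And(*[a >= 0 for a in alphas]),
    Sum(alphas) == 1,
    And(*[ToReal(xs[j]) == Sum([alphas[i] * N[i][j] for i in range(q)])
          for j in range(p)])
)

s = Solver()
s.add(C, xs[0] == 2, xs[1] == 1)        # (2, 1) lies in the convex hull
print(s.check())                        # sat
```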

Divisibility Constraints. Inductive invariants for verification problems often require divisibility constraints. We, therefore, use such constraints, denoted D, to obtain a stronger over-approximation of Inline graphic than the convex closure. To add a divisibility constraint for Inline graphic, we consider the column Inline graphic that corresponds to Inline graphic in Inline graphic. We find the largest positive integer d such that each integer in Inline graphic leaves the same remainder when divided by d; namely, there exists Inline graphic such that Inline graphic for every Inline graphic. This means that Inline graphic is satisfied by all the points Inline graphic. Note that such r always exists for Inline graphic. To avoid this trivial case, we add the constraint Inline graphic only if Inline graphic (line 12). We repeat this process for each Inline graphic.

In Example 1, all the elements in the (only) column of the matrix Inline graphic, which corresponds to Inline graphic, are divisible by 2, and no larger d has a corresponding r. Thus, line 12 of Algorithm 3 adds the divisibility condition Inline graphic to D.
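Finding the divisibility constraint of a column reduces to a gcd computation: the largest d under which all entries share the same remainder is the gcd of their pairwise differences. The following sketch is illustrative only:

```python
from math import gcd
from functools import reduce

def divisibility_constraint(column):
    """Largest d such that all values in the column leave the same remainder
    modulo d, together with that remainder.  Such a d is the gcd of the
    differences to the first entry; d = 1 yields no useful constraint."""
    d = reduce(gcd, (abs(v - column[0]) for v in column[1:]), 0)
    if d <= 1:
        return None
    return d, column[0] % d       # constraint:  x mod d = r

print(divisibility_constraint([0, 2, 4]))   # (2, 0), i.e.  x mod 2 = 0
print(divisibility_constraint([1, 4, 6]))   # None: only d = 1 works
```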

Eliminating Existentially Quantified Variables Using MBP. By combining the linear equalities exhibited by N, the convex closure of Inline graphic and the divisibility constraints on Inline graphic, we obtain Inline graphic as an over-approximation of Inline graphic. Accordingly, Inline graphic, where Inline graphic, is an over-approximation of Inline graphic (line 13). In order to get a LIA cube that overapproximates Inline graphic, it remains to eliminate the existential quantifiers. Since quantifier elimination is expensive, and does not necessarily generate convex formulas (cubes), we approximate it using MBP. Namely, we obtain a cube Inline graphic that under-approximates Inline graphic by applying MBP on Inline graphic and a model Inline graphic. We then use an SMT solver to drop literals from Inline graphic until it over-approximates Inline graphic, and hence also Inline graphic (lines 16 to 19). The result is returned by Subsume as an over-approximation of Inline graphic.

Models Inline graphic that satisfy Inline graphic and do not satisfy any of the cubes in Inline graphic are preferred when computing MBP (line 14) as they ensure that the result of MBP is not subsumed by any of the cubes in Inline graphic.

Note that the Inline graphic are rational variables and Inline graphic are integer variables, which means we require MBP to support a mixture of integer and rational variables. To achieve this, we first relax all constants to be rationals and apply MBP over LRA to eliminate Inline graphic. We then adjust the resulting formula back to integer arithmetic by multiplying each atom by the least common multiple of the denominators of the coefficients in it. Finally, we apply MBP over the integers to eliminate Inline graphic.

Considering Example 1 again, we get that Inline graphic (the first three conjuncts correspond to Inline graphic). Note that in this case we do not have rational variables Inline graphic since Inline graphic. Depending on the model, the result of MBP can be one of


However, we prefer a model that does not satisfy any cube in Inline graphic, which rules out the two possibilities on the right. None of these cubes covers Inline graphic, hence generalization is used.

If the first cube is obtained by MBP, it is generalized into Inline graphic; the second cube is already an over-approximation; the third cube is generalized into Inline graphic. Indeed, each of these cubes over-approximates Inline graphic.

Concretize Rule for LIA

The Concretize rule (Algorithm 2) takes a cluster of lemmas Inline graphic and a pob Inline graphic such that each lemma in Inline graphic partially blocks Inline graphic, and creates a new pob Inline graphic that is still not blocked by Inline graphic, but Inline graphic is more concrete, i.e., Inline graphic. In our implementation, this rule is applied when Inline graphic is in Inline graphic. We further require that the pattern, Inline graphic, of Inline graphic is non-linear, i.e., some of the constants appear in Inline graphic with free variables as their coefficients. We denote these constants by U. An example is the pattern Inline graphic, where Inline graphic. Having such a cluster is an indication that attempting to block Inline graphic in full with a single lemma may require tracking non-linear correlations between the constants, which is impossible to do in LIA. In such cases, we identify the coupling of the constants in U in pobs (and hence in lemmas) as the potential source of non-linearity. Hence, we concretize (strengthen) Inline graphic into a pob Inline graphic where the constants in U are no longer coupled to any other constant.

Coupling. Formally, constants u and v are coupled in a cube c, denoted Inline graphic, if there exists a literal Inline graphic in c such that both u and v appear in Inline graphic (i.e., their coefficients in Inline graphic are non-zero). For example, x and y are coupled in Inline graphic whereas neither of them are coupled with z. A constant u is said to be isolated in a cube c, denoted Inline graphic, if it appears in c but it is not coupled with any other constant in c. In the above cube, z is isolated.
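Both notions are purely syntactic and easy to compute once literals are normalized. The sketch below uses a simplified representation of a cube as a list of literals, each mapping constants to their non-zero coefficients (an assumption made for illustration):

```python
def coupled(u, v, cube):
    """u and v are coupled in the cube iff some literal mentions both.
    A cube is a list of literals; each literal maps constants to non-zero
    coefficients, e.g. {'x': 3, 'y': -4} for 3x - 4y <= 10."""
    return any(u in lit and v in lit for lit in cube)

def isolated(u, cube):
    """u is isolated iff it appears in the cube but is never coupled."""
    appears = any(u in lit for lit in cube)
    alone = all(set(lit) == {u} for lit in cube if u in lit)
    return appears and alone

# 3x - 4y <= 10  /\  z <= 1   (coefficients only; bounds omitted)
cube = [{'x': 3, 'y': -4}, {'z': 1}]
print(coupled('x', 'y', cube))   # True
print(coupled('x', 'z', cube))   # False
print(isolated('z', cube))       # True
```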

Concretization by Decoupling. Given a pob Inline graphic (a cube) and a cluster Inline graphic, Algorithm 4 presents our approach for concretizing Inline graphic by decoupling the constants in U—those that have variables as coefficients in the pattern of Inline graphic (line 2). Concretization is guided by a model Inline graphic, representing a part of Inline graphic that is not yet blocked by the lemmas in Inline graphic (line 3). Given such M, we concretize Inline graphic into a model-preserving under-approximation that isolates all the constants in U and preserves all other couplings. That is, we find a cube Inline graphic, such that

φ^cr ⇒ φ,   M ⊨ φ^cr,   ∀u ∈ U. isolated(u, φ^cr),   ∀u, v ∉ U. coupled(u, v, φ) ⇒ coupled(u, v, φ^cr)      (1)

Note that Inline graphic is not blocked by Inline graphic since M satisfies both Inline graphic and Inline graphic. For example, if Inline graphic and Inline graphic, then Inline graphic is a model preserving under-approximation that isolates Inline graphic.

Algorithm 4 computes such a cube Inline graphic by a point-wise concretization of the literals of Inline graphic followed by the removal of subsumed literals. Literals that do not contain constants from U remain unchanged. A literal of the form Inline graphic, where Inline graphic (recall that every literal in Inline graphic can be normalized to this form), that includes constants from U is concretized into a cube by (1) isolating each of the summands Inline graphic in t that include U from the rest, and (2) for each of the resulting sub-expressions creating a literal that uses its value in M as a bound. Formally, t is decomposed to Inline graphic, where Inline graphic. The concretization of Inline graphic is the cube Inline graphic, where Inline graphic denotes the interpretation of Inline graphic in M. Note that Inline graphic since the bounds are stronger than the original bound on t: Inline graphic. This ensures that Inline graphic, obtained by the conjunction of literal concretizations, implies Inline graphic. It trivially satisfies the other conditions of Eq. (1).

For example, the concretization of the literal Inline graphic with respect to Inline graphic and Inline graphic is the cube Inline graphic. Applying concretization in a similar manner to all the literals of the cube Inline graphic from the previous example, we obtain the concretization Inline graphic. Note that the last literal is not concretized as it does not include y.
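The per-literal concretization can be sketched as follows, representing a literal by its coefficient map and bound, and a model by a map from constants to values. All names are illustrative; soundness relies on the model satisfying the original literal, as guaranteed by the algorithm:

```python
def concretize_literal(coeffs, bound, U, M):
    """Concretize  sum(c_i * x_i) <= bound  by decoupling the constants in U:
    each summand c*u with u in U is bounded separately by its value in the
    model M, and so is the remaining sub-term.  The conjunction of the new
    bounds implies the original literal because the group values sum to
    M(t) <= bound (M is assumed to satisfy the literal)."""
    assert sum(c * M[x] for x, c in coeffs.items()) <= bound
    lits, rest = [], {}
    for x, c in coeffs.items():
        if x in U:
            lits.append(({x: c}, c * M[x]))           # c*x <= M(c*x)
        else:
            rest[x] = c
    if rest:
        lits.append((rest, sum(c * M[x] for x, c in rest.items())))
    return lits

# Decouple y from x in  x + y <= 5  under the model  M = {x: 1, y: 2}:
print(concretize_literal({'x': 1, 'y': 1}, 5, {'y'}, {'x': 1, 'y': 2}))
# [({'y': 1}, 2), ({'x': 1}, 1)]   i.e.  y <= 2  /\  x <= 1
```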

Conjecture Rule for LIA

The Conjecture rule (see Algorithm 2) takes a set of lemmas Inline graphic and a pob Inline graphic such that all lemmas in Inline graphic block Inline graphic, but none of them blocks Inline graphic, where Inline graphic does not include any known reachable states. It returns Inline graphic as a new pob.

For LIA, Conjecture is applied when the following conditions are met: (1) the pob Inline graphic is of the form Inline graphic, where Inline graphic, and Inline graphic and Inline graphic are any cubes. The sub-cube Inline graphic acts as Inline graphic, while the sub-cube Inline graphic acts as Inline graphic. (2) The cluster Inline graphic consists of Inline graphic, where Inline graphic and Inline graphic. This means that each of the lemmas in Inline graphic blocks Inline graphic, and they may be ordered as a sequence of increasingly stronger lemmas, indicating that they were created by trying to block the pob at different levels, leading to lemmas that are too strong and failed to propagate to higher levels. (3) The formula Inline graphic is satisfiable, that is, none of the lemmas in Inline graphic block Inline graphic, and (4) Inline graphic, that is, no state in Inline graphic is known to be reachable. If all four conditions are met, we conjecture Inline graphic. This is implemented by conjecture, which returns Inline graphic (or Inline graphic when the pre-conditions are not met).


For example, consider the pob Inline graphic and a cluster of lemmas Inline graphic. In this case, Inline graphic, Inline graphic, Inline graphic, and Inline graphic. Each of the lemmas in Inline graphic block Inline graphic but none of them block Inline graphic. Therefore, we conjecture Inline graphic: Inline graphic.
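The semantic pre-conditions (3) and (4) are plain satisfiability checks. The z3py sketch below checks them on a hypothetical instance in the spirit of Fig. 1(c); the lemmas, the conjectured pob, and the reachable states are assumptions made for illustration:

```python
from z3 import Ints, And, Not, Solver, sat, unsat

def can_conjecture(lemmas, alpha, reach):
    """Semantic pre-conditions of Conjecture:
    (3) no existing lemma blocks the conjectured pob alpha, and
    (4) alpha contains no known reachable state."""
    s = Solver()
    s.add(And(*lemmas), alpha)
    if s.check() != sat:          # (3) lemmas /\ alpha must be satisfiable
        return False
    s = Solver()
    s.add(reach, alpha)
    return s.check() == unsat     # (4) reach /\ alpha must be unsatisfiable

# Hypothetical instance: the pob is a >= 100 /\ c >= 1, the lemmas keep bounding
# c, and we conjecture the abstraction  alpha = a >= 100.
a, c = Ints('a c')
lemmas = [Not(And(a >= 100, c >= 1, c <= 3)),
          Not(And(a >= 100, c >= 1, c <= 6))]
alpha  = a >= 100
reach  = And(a == 0, c == 0)      # under-approximation of the reachable states
print(can_conjecture(lemmas, alpha, reach))   # True
```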

Putting It All Together

Having explained the implementation of the new rules for LIA, we now put all the ingredients together into an algorithm, GSpacer. In particular, we present our choices as to when to apply the new rules, and on which clusters of lemmas and pobs. As can be seen in Sect. 5, this implementation works very well on a wide range of benchmarks.

Algorithm 5 presents GSpacer. The comments to the right of a line refer to the abstract rules in Algorithms 1 and 2. Just like Spacer, GSpacer iteratively computes predecessors (line 10) and blocks them (line 14) in an infinite loop. Whenever a pob is proven to be reachable, the reachable states are updated (line 38). If Inline graphic intersects with a reachable state, GSpacer terminates and returns unsafe (line 12). If one of the frames is an inductive invariant, GSpacer terminates with safe (line 20).

When a pob Inline graphic is handled, we first apply the Concretize rule, if possible (line 7). Recall that Concretize (Algorithm 4) takes as input a cluster that partially blocks Inline graphic and has a non-linear pattern. To obtain such a cluster, we first find, using Inline graphic, a cluster Inline graphic, where Inline graphic, that includes some lemma (from frame k) that blocks Inline graphic; if none exists, Inline graphic. We then filter out from Inline graphic lemmas that completely block Inline graphic as well as lemmas that are irrelevant to Inline graphic, i.e., we obtain Inline graphic by keeping only lemmas that partially block Inline graphic. We apply Concretize on Inline graphic to obtain a new pob that under-approximates Inline graphic if (1) the remaining sub-cluster, Inline graphic, is non-empty, (2) the pattern, Inline graphic, is non-linear, and (3) Inline graphic is satisfiable, i.e., a part of Inline graphic is not blocked by any lemma in Inline graphic.

Once a pob is blocked, and a new lemma that blocks it, Inline graphic, is added to the frames, an attempt is made to apply the Subsume and Conjecture rules on a cluster that includes Inline graphic. To that end, the function Inline graphic finds a cluster Inline graphic to which Inline graphic belongs (Sect. 4.2). Note that the choice of cluster is arbitrary. The rules are applied on Inline graphic if the required pre-conditions are met (line 49 and line 53, respectively). When applicable, Subsume returns a new lemma that is added to the frames, while Conjecture returns a new pob that is added to the queue. Note that the latter is a may pob, in the sense that some of the states it represents may not lead to a safety violation.

Ensuring Progress. Spacer always makes progress: as its search continues, it establishes absence of counterexamples of deeper and deeper depths. However, GSpacer does not ensure progress. Specifically, unrestricted application of the Concretize and Conjecture rules can make GSpacer diverge even on executions of a fixed bound. In our implementation, we ensure progress by allotting a fixed amount of gas to each pattern, Inline graphic, that forms a cluster. Each time Concretize or Conjecture is applied to a cluster with Inline graphic as the pattern, Inline graphic loses some gas. Whenever Inline graphic runs out of gas, the rules are no longer applied to any cluster with Inline graphic as the pattern. There are finitely many patterns (assuming LIA terms are normalized). Thus, in each bounded execution of GSpacer, the Concretize and Conjecture rules are applied only a finite number of times, thereby, ensuring progress. Since the Subsume rule does not hinder progress, it is applied without any restriction on gas.
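A minimal sketch of this gas bookkeeping is shown below; the budget value and the pattern representation are assumptions, and the actual implementation may differ:

```python
from collections import defaultdict

INITIAL_GAS = 10          # hypothetical budget per pattern

class GasManager:
    """Progress guard: every application of Concretize or Conjecture on a
    cluster consumes gas of the cluster's pattern; once a pattern runs out,
    the two rules are disabled for all clusters with that pattern."""
    def __init__(self):
        self.gas = defaultdict(lambda: INITIAL_GAS)

    def may_apply(self, pattern):
        return self.gas[pattern] > 0

    def consume(self, pattern, amount=1):
        self.gas[pattern] -= amount

gm = GasManager()
pattern = ('>=', 'a', '?0')           # a pattern identified by its normal form
if gm.may_apply(pattern):
    gm.consume(pattern)               # ... apply Concretize/Conjecture here ...
```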

Evaluation

We have implemented GSpacer (Algorithm 5) as an extension to Spacer. To reduce the dimension of a matrix (in subsume, Sect. 4.3), we compute pairwise linear dependencies between all pairs of columns instead of computing the full kernel. This does not necessarily reduce the dimension of the matrix to its rank, but is sufficient for our benchmarks. We have experimented with computing the full kernel using SageMath  [25], but the overall performance did not improve. Clustering is implemented by anti-unification. LIA terms are normalized using default Z3 simplifications. Our implementation also supports global generalization for non-linear CHCs. We have also extended our work to the theory of LRA. We defer the details of this extension to an extended version of the paper.

To evaluate our implementation, we have conducted two sets of experiments. All experiments were run on an Intel E5-2690 v2 CPU at 3 GHz with 128 GB of memory, with a timeout of 10 min. First, to evaluate the performance of local reasoning with global guidance against pure local reasoning, we have compared GSpacer with the latest Spacer, to which we refer as the baseline. We took the benchmarks from CHC-COMP 2018 and 2019  [10]. We compare to Spacer because it dominated the competition by solving Inline graphic of the benchmarks in CHC-COMP 2019 (Inline graphic more than the runner up) and Inline graphic of the benchmarks in CHC-COMP 2018 (Inline graphic more than the runner up). Our evaluation shows that GSpacer outperforms Spacer both in terms of the number of solved instances and, more importantly, in overall robustness.

Second, to examine the performance of local reasoning with global guidance compared to solely global reasoning, we have compared GSpacer with an ML-based data-driven invariant inference tool LinearArbitrary  [28]. Compared to other similar approaches, LinearArbitrary stands out by supporting invariants with arbitrary Boolean structure over arbitrary linear predicates. It is completely automated and does not require user-provided predicates, grammars, or any other guidance. For the comparison with LinearArbitrary, we have used both the CHC-COMP benchmarks, as well as the benchmarks from the artifact evaluation of  [28]. The machine and timeout remain the same. Our evaluation shows that GSpacer is superior in this case as well.

Comparison with Spacer. Table 1 summarizes the comparison between Spacer and GSpacer on CHC-COMP instances. Since both tools can use a variety of interpolation strategies during lemma generalization (Line 45 in Algorithm 5), we compare three different configurations of each: bw and fw stand for two interpolation strategies, backward and forward, respectively, already implemented in Spacer, and sc stands for turning interpolation off and generalizing lemmas only by subset clauses computed by inductive generalization.

Table 1. Comparison between Spacer and GSpacer on CHC-COMP.

Any configuration of GSpacer solves significantly more instances than even the best configuration of Spacer. Figure 2 provides a more detailed comparison between the best configurations of both tools in terms of running time and depth of convergence. There is no clear trend in terms of running time on instances solved by both tools. This is not surprising—SMT-solving run time is highly non-deterministic and any change in strategy has a significant impact on performance of SMT queries involved. In terms of depth, it is clear that GSpacer converges at the same or lower depth. The depth is significantly lower for instances solved only by GSpacer.

Fig. 2. Best configurations: GSpacer versus Spacer.

Moreover, the performance of GSpacer is not significantly affected by the interpolation strategy used. In fact, the configuration sc in which interpolation is disabled performs the best in CHC-COMP 2018, and only slightly worse in CHC-COMP 2019! In comparison, disabling interpolation hurts Spacer significantly.

Figure 3 provides a detailed comparison of GSpacer with and without interpolation. Interpolation makes no difference to the depth of convergence. This implies that lemmas that are discovered by interpolation are discovered as efficiently by the global rules of GSpacer. On the other hand, interpolation significantly increases the running time. Interestingly, the time spent in interpolation itself is insignificant. However, the lemmas produced by interpolation tend to slow down other aspects of the algorithm. Most of the slow down is in increased time for inductive generalization and in computation of predecessors. The comparison between the other interpolation-enabled strategy and GSpacer (sc) shows a similar trend.

Fig. 3. Comparing GSpacer with different interpolation tactics.

Comparison with LinearArbitrary. In  [28], the authors show that LinearArbitrary, to which we refer as LArb for short, significantly outperforms Spacer on a curated subset of benchmarks from SV-COMP  [24] competition.

At first, we attempted to compare LArb against GSpacer on the CHC-COMP benchmarks. However, LArb did not perform well on them. Even the baseline Spacer has outperformed LArb significantly. Therefore, for a more meaningful comparison, we have also compared Spacer, LArb and GSpacer on the benchmarks from the artifact evaluation of  [28]. The results are summarized in Table 2. As expected, LArb outperforms the baseline Spacer on the safe benchmarks. On unsafe benchmarks, Spacer is significantly better than LArb. In both categories, GSpacer dominates, solving more safe benchmarks than either Spacer or LArb, while matching the performance of Spacer on unsafe instances. Furthermore, GSpacer remains orders of magnitude faster than LArb on benchmarks that are solved by both. This comparison shows that combining local reasoning with global guidance not only mitigates its shortcomings but also surpasses global data-driven reasoning.

Table 2. Comparison with LArb. [Table reproduced as an image in the original publication; not available in this text version.]

Related Work

The limitations of local reasoning in SMT-based infinite state model checking are well known. Most commonly, they are addressed with either (a) different strategies for local generalization in interpolation (e.g., [1, 6, 19, 23]), or (b) shifting the focus to global invariant inference by learning an invariant of a restricted shape (e.g., [9, 14–16, 28]).

Interpolation Strategies. Albarghouthi and McMillan  [1] suggest minimizing the number of literals in an interpolant, arguing that simpler interpolants (i.e., ones with fewer half-spaces) are more likely to generalize. This helps with myopic generalizations (Fig. 1(a)), but not with excessive generalizations (Fig. 1(b)). Conversely, Blicha et al.  [6] decompose interpolants to be numerically simpler (but with more literals), which helps with excessive, but not with myopic, generalizations. Deciding locally between these two techniques, or on their combination (i.e., some parts of an interpolant might need to be split while others are combined), seems impossible. Schindler and Jovanović  [23] propose local interpolation that bounds the number of lemmas generated from a single pob (which helps with Fig. 1(c)), but only if inductive generalization is disabled. Finally,  [19] suggests using external guidance, in the form of predicates or terms, to guide interpolation. In contrast, GSpacer uses global guidance, based on the current proof, to direct different local generalization strategies. Thus, the guidance is automatically tuned to the specific instance at hand rather than to a domain of problems.
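
The tension between these two styles can be seen on a small example of our own construction (it does not appear in [1] or [6]):

    % Toy illustration (ours); I_1 follows the spirit of [1], I_2 of [6].
    \[
      I_1 = (x + y \le 1), \qquad
      I_2 = (x \le 0) \wedge (y \le 0).
    \]
    % Both are valid interpolants separating a pre-state region with
    % x <= 0 and y <= 0 from the bad state x = 1, y = 1. I_1 uses a single
    % half-space but also admits states such as x = 5, y = -7, which may be
    % an excessive generalization; I_2 keeps each variable bounded but uses
    % more literals and may be too myopic. Which form is useful depends on
    % the global invariant being built, which local reasoning cannot see.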

Global Invariant Inference. An alternative to inferring lemmas for the inductive invariant by blocking counterexamples is to enumerate the space of potential candidate invariants  [9, 14–16, 28]. This approach does not suffer from the pitfalls of local reasoning, but it is only effective when the search space is constrained. While these approaches perform well on their target domain, they do not generalize well to a diverse set of benchmarks, as illustrated by the results of CHC-COMP and our empirical evaluation in Sect. 5.
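
As a simple illustration of this style of inference, the sketch below follows a Houdini-like scheme in the spirit of [15]: the invariant shape is restricted to conjunctions over a fixed, finite pool of candidate predicates, and candidates that are not preserved are discarded until a fixpoint is reached. The callbacks holds_initially and is_preserved are hypothetical stand-ins for the corresponding SMT queries over the initial states and the transition relation.

    # Simplified Houdini-style inference (in the spirit of [15]); not the
    # code of any tool cited above. The invariant is restricted to a
    # conjunction over a fixed candidate pool, which is what makes the
    # search space tractable.
    def houdini(candidates, holds_initially, is_preserved):
        inv = {c for c in candidates if holds_initially(c)}
        changed = True
        while changed:
            changed = False
            for c in list(inv):
                # Drop c if the remaining conjunction does not re-establish
                # it after one step of the transition relation.
                if not is_preserved(inv, c):
                    inv.discard(c)
                    changed = True
        return inv  # strongest inductive conjunction over the pool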

Locality in SMT and IMC. Local reasoning is also a known issue in SMT and, in particular, in DPLL(T) (e.g.,  [22]). However, we are not aware of global guidance techniques for SMT solvers. Interpolation-based Model Checking (IMC)  [20, 21], which uses interpolants from proofs, inherits the problem. Compared to IMC, the propagation phase and inductive generalization of IC3  [7] can be seen as providing global guidance using lemmas found in other parts of the search space. In contrast, GSpacer magnifies such global guidance by exploiting patterns within the lemmas themselves.
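
For reference, a schematic version of that propagation (pushing) phase is sketched below. Frames are modeled as sets of lemmas, and is_inductive_at(i, lemma) is a hypothetical stand-in for the SMT check that the lemma holds after one transition step from frame i; the sketch abstracts away from how any particular IC3 implementation represents frames.

    # Schematic IC3 propagation phase (not tied to any specific tool):
    # lemmas proved at frame i are pushed to frame i+1 when they remain
    # inductive relative to frame i, spreading facts learned in one part
    # of the search space to the rest of the proof.
    def propagate(frames, is_inductive_at):
        for i in range(len(frames) - 1):
            for lemma in list(frames[i]):
                if lemma not in frames[i + 1] and is_inductive_at(i, lemma):
                    frames[i + 1].add(lemma)
        # If two adjacent frames coincide, their lemmas form an inductive
        # invariant and the algorithm can stop.
        for i in range(len(frames) - 1):
            if frames[i] == frames[i + 1]:
                return frames[i]
        return None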

IC3-SMT-based Model Checkers. There are a number of IC3-style SMT-based infinite state model checkers, including  [11, 17, 18]. To our knowledge, none of them extends the IC3-SMT framework with global guidance. A rule similar to Subsume is suggested in  [26] for the theory of bit-vectors and in  [4] for LRA, but in both cases without global guidance. In  [4], it is implemented via a combination of syntactic closure with interpolation, whereas we use MBP instead of interpolation. Refinement State Mining in  [3] uses insights similar to our Subsume rule to refine predicate abstraction.
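
For intuition only, the following schematic example (ours; it is not the formal definition of Subsume, nor taken from [4] or [26]) shows how a family of lemmas can be subsumed by one stronger lemma obtained from a convex closure followed by elimination of the numeric bounds, e.g., by projection or MBP:

    % Schematic illustration only; not the formal Subsume rule.
    \begin{align*}
      \text{learned lemmas:}\quad & L_k = \neg\,(x = k \wedge y = k), \quad k = 0, 1, 2,\\
      \text{convex closure of the excluded points:}\quad & 0 \le x \le 2 \;\wedge\; x = y,\\
      \text{after dropping the bounds (projection/MBP):}\quad & L = \neg\,(x = y).
    \end{align*}
    % L entails each L_k, so the single lemma L subsumes the whole family
    % and blocks every state on the diagonal x = y.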

Conclusion and Future Work

This paper introduces global guidance to mitigate the limitations of the local reasoning performed by SMT-based IC3-style model checking algorithms. Global guidance is necessary to redirect such algorithms away from divergence due to persistent local reasoning. To this end, we present three general rules that introduce new lemmas and pobs by taking a global view of the lemmas learned so far. The new rules are not theory-specific and, as demonstrated by Algorithm 5, can be incorporated into IC3-style solvers without modifying the existing architecture. We instantiate and implement the rules for LIA in GSpacer, which extends Spacer.

Our evaluation shows that global guidance brings significant improvements to local reasoning and surpasses invariant inference based solely on global reasoning. More importantly, global guidance decouples the algorithm from its dependence on the interpolation strategy: GSpacer performs almost equally well under all three interpolation schemes we consider. As such, using global guidance in the context of theories for which no good interpolation procedure exists, with bit-vectors being a primary example, is a promising direction for future research.

Acknowledgements

We thank Xujie Si for running the LArb experiments and collecting results. We thank the ERC starting Grant SYMCAR 639270 and the Wallenberg Academy Fellowship TheProSE for supporting the research visit. This research was partially supported by the United States-Israel Binational Science Foundation (BSF) grant No. 2016260, and the Israeli Science Foundation (ISF) grant No. 1810/18. This research was partially supported by grants from Natural Sciences and Engineering Research Council Canada.

Footnotes

1

In fact, a primed copy is introduced only for the uninterpreted symbols of the signature; interpreted symbols remain the same in the primed version.

3

Detailed experimental results including the effectiveness of each rule, and the extensions to non-linear CHCs and LRA can be found at https://hgvk94.github.io/gspacer/.

Contributor Information

Shuvendu K. Lahiri, Email: shuvendu.lahiri@microsoft.com

Chao Wang, Email: wang626@usc.edu.

Hari Govind Vediramana Krishnan, Email: hgvk94@gmail.com.

References

  • 1.Albarghouthi A, McMillan KL. Beautiful interpolants. In: Sharygina N, Veith H, editors. Computer Aided Verification; Heidelberg: Springer; 2013. pp. 313–329. [Google Scholar]
  • 2.Benoy F, King A, Mesnard F. Computing convex hulls with a linear solver. TPLP. 2005;5(1–2):259–271. [Google Scholar]
  • 3.Birgmeier J, Bradley AR, Weissenbacher G. Counterexample to induction-guided abstraction-refinement (CTIGAR). In: Biere A, Bloem R, editors. Computer Aided Verification; Cham: Springer; 2014. pp. 831–848. [Google Scholar]
  • 4.Bjørner N, Gurfinkel A. Property directed polyhedral abstraction. In: D’Souza D, Lal A, Larsen KG, editors. Verification, Model Checking, and Abstract Interpretation; Heidelberg: Springer; 2015. pp. 263–281. [Google Scholar]
  • 5.Bjørner, N., Janota, M.: Playing with quantified satisfaction. In: 20th International Conferences on Logic for Programming, Artificial Intelligence and Reasoning - Short Presentations, LPAR 2015, Suva, Fiji, 24–28 November 2015, pp. 15–27 (2015)
  • 6.Blicha M, Hyvärinen AEJ, Kofroň J, Sharygina N. Decomposing Farkas interpolants. In: Vojnar T, Zhang L, editors. Tools and Algorithms for the Construction and Analysis of Systems; Cham: Springer; 2019. pp. 3–20. [Google Scholar]
  • 7.Bradley AR. SAT-based model checking without unrolling. In: Jhala R, Schmidt D, editors. Verification, Model Checking, and Abstract Interpretation; Heidelberg: Springer; 2011. pp. 70–87. [Google Scholar]
  • 8.Bulychev PE, Kostylev EV, Zakharov VA. Anti-unification algorithms and their applications in program analysis. In: Pnueli A, Virbitskaite I, Voronkov A, editors. Perspectives of Systems Informatics; Heidelberg: Springer; 2010. pp. 413–423. [Google Scholar]
  • 9.Champion A, Chiba T, Kobayashi N, Sato R. ICE-based refinement type discovery for higher-order functional programs. In: Beyer D, Huisman M, editors. Tools and Algorithms for the Construction and Analysis of Systems; Cham: Springer; 2018. pp. 365–384. [Google Scholar]
  • 10.CHC-COMP. https://chc-comp.github.io
  • 11.Cimatti A, Griggio A, Mover S, Tonetta S. Infinite-state invariant checking with IC3 and predicate abstraction. Formal Methods Syst. Des. 2016;49(3):190–218. doi: 10.1007/s10703-016-0257-4. [DOI] [Google Scholar]
  • 12.Cousot, P., Cousot, R.: Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: Conference Record of the Fourth ACM Symposium on Principles of Programming Languages, Los Angeles, California, USA, January 1977, pp. 238–252 (1977)
  • 13.de Moura L, Bjørner N. Z3: an efficient SMT solver. In: Ramakrishnan CR, Rehof J, editors. Tools and Algorithms for the Construction and Analysis of Systems; Heidelberg: Springer; 2008. pp. 337–340. [Google Scholar]
  • 14.Fedyukovich, G., Kaufman, S.J., Bodík, R.: Sampling invariants from frequency distributions. In: 2017 Formal Methods in Computer Aided Design, FMCAD 2017, Vienna, Austria, 2–6 October 2017, pp. 100–107 (2017)
  • 15.Flanagan C, Leino KRM. Houdini, an annotation assistant for ESC/Java. In: Oliveira JN, Zave P, editors. FME 2001: Formal Methods for Increasing Software Productivity; Heidelberg: Springer; 2001. pp. 500–517. [Google Scholar]
  • 16.Garg, P., Neider, D., Madhusudan, P., Roth, D.: Learning invariants using decision trees and implication counterexamples. In: Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2016, St. Petersburg, FL, USA, 20–22 January 2016, pp. 499–512 (2016)
  • 17.Jovanovic, D., Dutertre, B.: Property-directed k-induction. In: 2016 Formal Methods in Computer-Aided Design, FMCAD 2016, Mountain View, CA, USA, 3–6 October 2016, pp. 85–92 (2016)
  • 18.Komuravelli A, Gurfinkel A, Chaki S. SMT-based model checking for recursive programs. In: Biere A, Bloem R, editors. Computer Aided Verification; Cham: Springer; 2014. pp. 17–34. [Google Scholar]
  • 19.Leroux J, Rümmer P, Subotić P. Guiding Craig interpolation with domain-specific abstractions. Acta Informatica. 2015;53(4):387–424. doi: 10.1007/s00236-015-0236-z. [DOI] [Google Scholar]
  • 20.McMillan KL. Interpolation and SAT-based model checking. In: Hunt WA, Somenzi F, editors. Computer Aided Verification; Heidelberg: Springer; 2003. pp. 1–13. [Google Scholar]
  • 21.McMillan KL. Lazy abstraction with interpolants. In: Ball T, Jones RB, editors. Computer Aided Verification; Heidelberg: Springer; 2006. pp. 123–136. [Google Scholar]
  • 22.McMillan KL, Kuehlmann A, Sagiv M. Generalizing DPLL to richer logics. In: Bouajjani A, Maler O, editors. Computer Aided Verification; Heidelberg: Springer; 2009. pp. 462–476. [Google Scholar]
  • 23.Schindler T, Jovanović D. Selfless interpolation for infinite-state model checking; Verification, Model Checking, and Abstract Interpretation; Cham: Springer; 2018. pp. 495–515. [Google Scholar]
  • 24.SV-COMP. https://sv-comp.sosy-lab.org/
  • 25.The Sage Developers. SageMath, the Sage Mathematics Software System (Version 8.1.0) (2017). https://www.sagemath.org
  • 26.Welp, T., Kuehlmann, A.: QF_BV model checking with property directed reachability. In: Design, Automation and Test in Europe, DATE 13, Grenoble, France, 18–22 March 2013, pp. 791–796 (2013)
  • 27.Yernaux G, Vanhoof W. Anti-unification in constraint logic programming. TPLP. 2019;19(5–6):773–789. [Google Scholar]
  • 28.Zhu, H., Magill, S., Jagannathan, S.: A data-driven CHC solver. In: Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2018, Philadelphia, PA, USA, 18–22 June 2018, pp. 707–721 (2018)
