. 2020 Jun 16;12225:512–538. doi: 10.1007/978-3-030-53291-8_27

PrIC3: Property Directed Reachability for MDPs

Kevin Batz 10, Sebastian Junges 11, Benjamin Lucien Kaminski 12, Joost-Pieter Katoen 10, Christoph Matheja 13, Philipp Schröer 10
Editors: Shuvendu K Lahiri8, Chao Wang9
PMCID: PMC7363441

Abstract

IC3 has been a leap forward in symbolic model checking. This paper proposes PrIC3 (pronounced pricy-three), a conservative extension of IC3 to symbolic model checking of MDPs. Our main focus is to develop the theory underlying PrIC3. Alongside, we present a first implementation of PrIC3 including the key ingredients from IC3 such as generalization, repushing, and propagation.



Introduction

IC3. Also known as property-directed reachability (PDR) [23], IC3 [13] is a symbolic approach for verifying finite transition systems (TSs) against safety properties like “bad states are unreachable”. It combines bounded model checking (BMC) [12] and inductive invariant generation. In short, IC3 either proves that a set B of bad states is unreachable by finding a set of non-B states closed under reachability (called an inductive invariant) or refutes unreachability of B by a counterexample path reaching B. Rather than unrolling the transition relation (as in BMC), IC3 attempts to incrementally strengthen the invariant “no state in B is reachable” into an inductive one. In addition, it applies aggressive abstraction to the explored state space, so-called generalization [36]. These aspects together with the enormous advances in modern SAT solvers have led to IC3’s success. IC3 has been extended [27, 38] and adapted to software verification [19, 44]. This paper develops a quantitative IC3 framework for probabilistic models.

MDPs. Markov decision processes (MDPs) extend TSs with discrete probabilistic choices. They are central in planning, AI as well as in modeling randomized distributed algorithms. A key question in verifying MDPs is quantitative reachability: “is the (maximal) probability to reach B at most Inline graphic?”. Quantitative reachability  [5, 6] reduces to solving linear programs (LPs). Various tools support MDP model checking, e.g., Prism  [43], Storm  [22], modest  [34], and EPMC  [31]. The LPs are mostly solved using (variants of) value iteration  [8, 28, 35, 51]. Symbolic BDD-based MDP model checking originated two decades ago  [4] and is rather successful.

Towards IC3  for MDPs. Despite the success of BDD-based symbolic methods in tools like Prism, IC3 has not penetrated probabilistic model checking yet. The success of IC3 and the importance of quantitative reachability in probabilistic model checking raises the question whether and how IC3 can be adapted—not just utilized—to reason about quantitative reachability in MDPs. This paper addresses the challenges of answering this question. It extends IC3 in several dimensions to overcome these hurdles, making PrIC3—to our knowledge—the first IC3 framework for quantitative reachability in MDPs1. Notably, PrIC3 is conservative: For a threshold Inline graphic, PrIC3 solves the same qualitative problem and behaves (almost) the same as standard IC3. Our main contribution is developing the theory underlying PrIC3, which is accompanied by a proof-of-concept implementation.

Challenge 1

(Leaving the Boolean domain). IC3 iteratively computes frames, which are over-approximations of sets of states that can reach B in a bounded number of steps. For MDPs, Boolean reachability becomes a quantitative reachability probability. This requires a shift: frames become real-valued functions rather than sets of states. Thus, there are infinitely many possible frames, even for finite-state MDPs, just as for infinite-state software [19, 44] and hybrid systems [54]. Additionally, whereas in TSs a state reachable within k steps remains reachable on increasing k, the reachability probability in MDPs may increase with k. This complicates ensuring termination of an IC3 algorithm for MDPs.   Inline graphic

Challenge 2

(Counterexamples ≠ single paths). For TSs, a single cycle-free path2 to B suffices to refute that “B is not reachable”. This is not true in the probabilistic setting [32]. Instead, proving that the probability of reaching B exceeds the threshold Inline graphic requires a set of possibly cyclic paths (e.g., represented as a sub-MDP [15]) whose probability mass exceeds Inline graphic. Handling sets of paths as counterexamples in the context of IC3 is new.   Inline graphic

Challenge 3

(Strengthening). This key IC3 technique intuitively turns a proof obligation of type (i) “state s is unreachable from the initial state Inline graphic” into type (ii) “s’s predecessors are unreachable from Inline graphic”. A first issue is that in the quantitative setting, the standard characterization of reachability probabilities in MDPs (the Bellman equations) inherently reverses the direction of reasoning (cf. “reverse” IC3 [53]). Hence, strengthening turns (i) “s cannot reach Inline graphic” into (ii) “s’s successors cannot reach Inline graphic”.

A much more challenging issue, however, is that in the quantitative setting obligations of type (i) read “s is reachable with at most probability Inline graphic”. However, the strengthened type (ii) obligation must then read: “the weighted sum over the reachability probabilities of the successors of s is at most Inline graphic”. In general, there are infinitely many possible choices of subobligations for the successors of s in order to satisfy the original obligation, because—grossly simplified—there are infinitely many possibilities for a and b to satisfy weighted sums such as Inline graphic. While we only need one choice of subobligations, picking a good one is approximately as hard as solving the entire problem altogether. We hence require a heuristic, which is guided by a user-provided oracle.   Inline graphic

Challenge 4

(Generalization). “One of the key components of IC3 is [inductive] generalization” [13]. Generalization [36] abstracts single states. It makes IC3 scale, but is not essential for correctness. To facilitate generalization, systems should be encoded symbolically, i.e., integer-valued program variables describe states. Frames thus map variables to probabilities. A first aspect is how to effectively present them to an SMT-solver. Conceptually, we use uninterpreted functions and universal quantifiers (encoding program behavior) together with linear real arithmetic to encode the weighted sums occurring when reasoning about probabilities. A second aspect is more fundamental: Abstractly, IC3’s generalization guesses an unreachable set of states. We, however, need to guess this set and a probability for each state. To be effective, these guesses should moreover eventually yield an inductive frame, which is often highly nonlinear. We propose three SMT-guided interpolation variants for guessing these maps.   Inline graphic

Structure of this Paper. We develop PrIC3 gradually: We explain the underlying rationale in Sect. 3. We also describe the core of PrIC3, called Inline graphic, which closely resembles the main loop of standard IC3, but uses adapted frames and termination criteria (Challenge 1). In line with Challenge 3, Inline graphic is parameterized by a heuristic Inline graphic which is applied whenever we need to select one out of infinitely many probabilities. No requirements on the quality of Inline graphic are imposed. Inline graphic is sound and always terminates: If it returns Inline graphic, then the maximal reachability probability is bounded by Inline graphic. Without additional assumptions about Inline graphic, Inline graphic is incomplete: on returning Inline graphic, it is unknown whether the returned sub-MDP is indeed a counterexample (Challenge 2). Section 4 details strengthening (Challenge 3). Section 5 presents a sound and complete algorithm Inline graphic on top of Inline graphic. Section 6 presents a prototype, discusses our chosen heuristics, and addresses Challenge 4. Section 7 shows some encouraging experiments, but also illustrates the need for further progress.

Related Work. Just like IC3 has been a symbiosis of different approaches, PrIC3 has been inspired by several existing techniques from the verification of probabilistic systems.

BMC. Adaptations of BMC to Markov chains (MCs) with a dedicated treatment of cycles have been pursued in [57]. The encoding in [24] annotates sub-formulae with probabilities. The integrated SAT solving process implicitly unrolls all paths, leading to an exponential blow-up. In [52], this is circumvented by grouping paths, discretizing them, and using an encoding with quantifiers and bit-vectors, but without numerical values. Recently, [56] extends this idea to a PAC algorithm by purely propositional encodings and (approximate) model counting [17]. These approaches focus on MCs and are not mature yet.

Invariant Synthesis. Quantitative loop invariants are key in analyzing probabilistic programs whose operational semantics are (possibly infinite) MDPs  [26]. A quantitative invariant I maps states to probabilities. I is shown to be an invariant by comparing I to the result of applying the MDP’s Bellman operator to I. Existing approaches for invariant synthesis are, e.g., based on weakest pre-expectations  [33, 39, 40, 42, 46], template-based constraint solving  [25], notions of martingales  [3, 9, 16, 55], and solving recurrence relations  [10]. All but the last technique require user guidance.

Abstraction. To combat state-space explosion, abstraction is often employed. CEGAR for MDPs  [37] deals with explicit sets of paths as counterexamples. Game-based abstraction  [30, 41] and partial exploration  [14] exploit that not all paths have to be explored to prove bounds on reachability probabilities.

Statistical Methods and (deep) Reinforcement Learning. Finally, an avenue that avoids storing a (complete) model are simulation-based approaches (statistical model checking  [2]) and variants of reinforcement learning, possibly with neural networks. For MDPs, these approaches yield weak statistical guarantees  [20], but may provide good oracles.

Problem Statement

Our aim is to prove that the maximal probability of reaching a set Inline graphic of bad states from the initial state Inline graphic of a Markov decision process Inline graphic is at most some threshold Inline graphic. Below, we give a formal description of our problem. We refer to  [7, 50] for a thorough introduction.

Definition 1

(MDPs). A Markov decision process (MDP) is a tuple Inline graphic, where S is a finite set of states, Inline graphic is the initial state, Inline graphic is a finite set of actions, and Inline graphic is a transition probability function. For state s, let Inline graphic be the enabled actions at s. For all states Inline graphic, we require Inline graphic and Inline graphic.    Inline graphic

For this paper, we fix an MDP Inline graphic, a set of bad states Inline graphic, and a threshold Inline graphic. The maximal3 (unbounded) reachability probability to eventually reach a state in Inline graphic from a state s is denoted by Inline graphic. We characterize Inline graphic using the so-called Bellman operator. Let Inline graphic denote the set of functions from N to M. Anticipating IC3 terminology, we call a function Inline graphic a frame. We denote by F[s] the evaluation of frame F for state s.

Definition 2

(Bellman Operator). For a set of actions Inline graphic, we define the Bellman operator for A as a frame transformer Inline graphic with

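$$\Phi_A(F)[s] \;=\; \begin{cases} 1, & \text{if } s \in B,\\ \max\limits_{a \in A} \sum\limits_{s' \in S} P(s, a, s') \cdot F[s'], & \text{if } s \notin B. \end{cases}$$

Here $\Phi_A$ is the operator being defined, $P$ is the transition probability function, and $B$ is the set of bad states.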

We write Inline graphic for Inline graphic, Inline graphic for Inline graphic, and call Inline graphic simply the Bellman operator.    Inline graphic
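To make Definition 2 concrete, the following minimal sketch applies the Bellman operator to frames over a small, hypothetical MDP (it is not the MDP of Fig. 1). The names `P`, `BAD`, and `bellman` are illustrative assumptions, and exact rationals are used so that no rounding obscures the arithmetic.

```python
from fractions import Fraction as Fr

# Hypothetical MDP for illustration (not the MDP of Fig. 1).
# P[s][a] maps each successor of s under action a to its probability.
P = {
    "s0": {"a": {"s1": Fr(1, 2), "s2": Fr(1, 2)},
           "b": {"bad": Fr(1, 4), "s2": Fr(3, 4)}},
    "s1": {"u": {"bad": Fr(1, 2), "s0": Fr(1, 2)}},
    "s2": {"u": {"s2": Fr(1)}},
    "bad": {"u": {"bad": Fr(1)}},
}
BAD = {"bad"}


def bellman(F, actions=None):
    """Apply the Bellman operator for the given action set to frame F.

    actions=None stands for the full action set. A state with no enabled
    action in `actions` gets value 0 (an assumption of this sketch).
    """
    G = {}
    for s, choices in P.items():
        if s in BAD:
            G[s] = Fr(1)  # bad states always get value 1
            continue
        values = [sum(p * F[t] for t, p in succ.items())
                  for a, succ in choices.items()
                  if actions is None or a in actions]
        G[s] = max(values, default=Fr(0))
    return G


ZERO = {s: Fr(0) for s in P}
F0 = bellman(ZERO)  # 1 on bad states, 0 elsewhere
F1 = bellman(F0)    # one further application
```

In this sketch, `F1["s0"]` is 1/4 because action `b` reaches the bad state in one step with probability 1/4, whereas action `a` cannot reach it in one step at all.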

For every state s, the maximal reachability probability Inline graphic is then given by the least fixed point of the Bellman operator Inline graphic. That is,

graphic file with name M61.gif

where the underlying partial order on frames is a complete lattice with ordering

$$F \le G \quad\text{iff}\quad F[s] \le G[s] \ \text{ for all } s \in S.$$

In terms of the Bellman operator, our formal problem statement reads as follows: prove or refute that the least fixed point of the Bellman operator, evaluated at the initial state, is at most the threshold. Whenever Inline graphic indeed holds, we say that the MDP Inline graphic is safe (with respect to the set of bad states Inline graphic and threshold Inline graphic); otherwise, we call it unsafe.

Recovery Statement 1

For Inline graphic, our problem statement is equivalent to the qualitative reachability problem solved by (reverse) standard IC3, i.e., prove or refute that all bad states in Inline graphic are unreachable from the initial state Inline graphic.

Example 1

The MDP Inline graphic in Fig. 1 consists of 6 states with initial state Inline graphic and bad states Inline graphic. In Inline graphic, actions a and b are enabled; in all other states, one unlabeled action is enabled. We have Inline graphic. Hence, Inline graphic is safe for all thresholds Inline graphic and unsafe for Inline graphic. In particular, Inline graphic is unsafe for Inline graphic as Inline graphic is reachable from Inline graphic.   Inline graphic

Fig. 1.


The MDP Inline graphic serving as a running example.

The Core PrIC3 Algorithm

The purpose of PrIC3 is to prove or refute that the maximal probability to reach a bad state in Inline graphic from the initial state Inline graphic of the MDP Inline graphic is at most Inline graphic. In this section, we explain the rationale underlying PrIC3. Moreover, we describe the core of PrIC3—called Inline graphic—which bears close resemblance to the main loop of standard IC3 for TSs.

Because of the inherent direction of the Bellman operator, we build PrIC3 on reverse IC3 [53], cf. Challenge 3. Reversing constitutes a shift from reasoning along the direction initial-to-bad to bad-to-initial. While this shift is mostly inessential to the fundamentals underlying IC3, the reverse direction is unavoidable in the probabilistic setting. Whenever we draw a connection to standard IC3, we thus generally mean reverse IC3.

Inductive Frames

IC3 for TSs operates on (qualitative) frames representing sets of states of the TS at hand. A frame F can hence be thought of as a mapping4 from states to Inline graphic. In PrIC3 for MDPs, we need to move from a Boolean to a quantitative regime. Hence, a (quantitative) frame is a mapping from states to probabilities in [0, 1].

For a given TS, consider the frame transformer T that adds to a given input frame Inline graphic all bad states in Inline graphic and all predecessors of the states contained in Inline graphic. The rationale of standard (reverse) IC3 is to find a frame Inline graphic such that (i) the initial state Inline graphic does not belong to F and (ii) applying T takes us down in the partial order on frames, i.e.,

graphic file with name M92.gif

Intuitively, (i) postulates the hypothesis that Inline graphic cannot reach Inline graphic and (ii) expresses that F is closed under adding bad states and taking predecessors, thus affirming the hypothesis.

Analogously, the rationale of PrIC3 is to find a frame Inline graphic such that (i) F postulates that the probability of Inline graphic to reach Inline graphic is at most the threshold Inline graphic and (ii) applying the Bellman operator Inline graphic to F takes us down in the partial order on frames, i.e.,

graphic file with name M100.gif

Frames satisfying the above conditions are called inductive invariants in IC3. We adopt this terminology. By Park’s Lemma  [48], which in our setting reads

$$\Phi(F) \le F \quad\text{implies}\quad \operatorname{lfp}\,\Phi \le F \qquad (\Phi \text{ being the Bellman operator}),$$

an inductive invariant F would indeed witness that Inline graphic, because

graphic file with name M103.gif
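The role of Park's Lemma can be illustrated by a small check: a frame certifies safety if its value at the initial state is at most the threshold and one application of the Bellman operator does not increase any entry. The sketch below is a minimal illustration on a hypothetical MDP (not the one of Fig. 1); `P`, `BAD`, `INIT`, `is_inductive`, and `witnesses_safety` are assumed names.

```python
from fractions import Fraction as Fr

# Hypothetical MDP (not Fig. 1): P[s][a] maps successors to probabilities.
P = {
    "s0": {"a": {"s1": Fr(1, 2), "s2": Fr(1, 2)},
           "b": {"bad": Fr(1, 4), "s2": Fr(3, 4)}},
    "s1": {"u": {"bad": Fr(1, 2), "s0": Fr(1, 2)}},
    "s2": {"u": {"s2": Fr(1)}},
    "bad": {"u": {"bad": Fr(1)}},
}
BAD, INIT = {"bad"}, "s0"


def bellman(F):
    """Bellman operator over all actions."""
    return {s: (Fr(1) if s in BAD else
                max(sum(p * F[t] for t, p in succ.items())
                    for succ in acts.values()))
            for s, acts in P.items()}


def is_inductive(F):
    """Check that applying the Bellman operator takes us down: bellman(F) <= F."""
    G = bellman(F)
    return all(G[s] <= F[s] for s in P)


def witnesses_safety(F, lam):
    """F bounds the initial state by lam and is inductive."""
    return F[INIT] <= lam and is_inductive(F)


# An inductive frame that is coarser than the exact probabilities (1/3 at s0).
coarse = {"s0": Fr(1, 2), "s1": Fr(3, 4), "s2": Fr(0), "bad": Fr(1)}
# A frame that underestimates s1 and therefore is not inductive.
too_low = {"s0": Fr(1, 4), "s1": Fr(1, 2), "s2": Fr(0), "bad": Fr(1)}
```

The frame `coarse` witnesses safety for the threshold 1/2 but says nothing about the threshold 1/3, although the hypothetical MDP is safe for that threshold as well; a tighter inductive frame would be needed.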

If no inductive invariant exists, then standard IC3 will find a counterexample: a path from the initial state Inline graphic to a bad state in Inline graphic, which serves as a witness to refute. Analogously, PrIC3 will find a counterexample, but of a different kind: Since single paths are insufficient as counterexamples in the probabilistic realm (Challenge 2), PrIC3 will instead find a subsystem of states of the MDP witnessing Inline graphic.

The PrIC3 Invariants

Analogously to standard IC3, PrIC3 aims to find the inductive invariant by maintaining a sequence of frames Inline graphic such that Inline graphic overapproximates the maximal probability of reaching B from s within at most i steps. This i-step-bounded reachability probability Inline graphic can be characterized using the Bellman operator: Inline graphic is the 0-step probability; it is 1 for every Inline graphic and 0 otherwise. For any Inline graphic, we have

graphic file with name 501999_1_En_27_Equ18_HTML.gif

where Inline graphic, the frame that maps every state to 0, is the least frame of the underlying complete lattice. For a finite MDP, the unbounded reachability probability is then given by the limit

graphic file with name 501999_1_En_27_Equ19_HTML.gif

where Inline graphic is a consequence of the well-known Kleene fixed point theorem  [45].
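The step-bounded probabilities can be computed by iterating the Bellman operator from the zero frame. The following sketch does so for a hypothetical MDP with a cycle (not the MDP of Fig. 1); all names are illustrative assumptions. In this example the unbounded probability at `s0` is 1/3, which the iterates approach from below without reaching it after finitely many steps.

```python
from fractions import Fraction as Fr

# Hypothetical MDP (not Fig. 1): P[s][a] maps successors to probabilities.
P = {
    "s0": {"a": {"s1": Fr(1, 2), "s2": Fr(1, 2)},
           "b": {"bad": Fr(1, 4), "s2": Fr(3, 4)}},
    "s1": {"u": {"bad": Fr(1, 2), "s0": Fr(1, 2)}},
    "s2": {"u": {"s2": Fr(1)}},
    "bad": {"u": {"bad": Fr(1)}},
}
BAD = {"bad"}


def bellman(F):
    """Bellman operator over all actions."""
    return {s: (Fr(1) if s in BAD else
                max(sum(p * F[t] for t, p in succ.items())
                    for succ in acts.values()))
            for s, acts in P.items()}


def step_bounded(n):
    """Return the first n + 1 iterates of the Bellman operator on the zero frame.

    Entry i is the (i + 1)-fold application, i.e. the i-step-bounded probabilities.
    """
    frames, F = [], {s: Fr(0) for s in P}
    for _ in range(n + 1):
        F = bellman(F)
        frames.append(F)
    return frames


frames = step_bounded(60)
first_values = [F["s0"] for F in frames[:5]]
```

Here `first_values` is 0, 1/4, 1/4, 5/16, 5/16: the sequence never decreases, but it may stay constant for a step before increasing again.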

The sequence Inline graphic maintained by PrIC3 should frame-wise overapproximate the increasing sequence Inline graphic. Pictorially:

[Diagram: each frame of the maintained sequence lies above the corresponding element of the increasing sequence.]

However, the sequence Inline graphic will never explicitly be known to PrIC3. Instead, PrIC3 will ensure the above frame-wise overapproximation property implicitly by enforcing the so-called PrIC3 invariants on the frame sequence Inline graphic. Apart from allowing for a threshold Inline graphic on the maximal reachability probability, these invariants coincide with the standard IC3 invariants (where Inline graphic is fixed). Formally:

Definition 3

( Inline graphicInvariants). Frames Inline graphic, for Inline graphic, satisfy the PrIC3 invariants, a fact we will denote by Inline graphic, if all of the following hold:

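$$\begin{aligned}
&1.\ \text{Initiality:} && F_0 = \Phi(\mathbf{0}) \\
&2.\ \text{Chain property:} && \forall\, i < k\colon\ F_i \le F_{i+1} \\
&3.\ \text{Frame-safety:} && \forall\, i \le k\colon\ F_i[s_I] \le \lambda \\
&4.\ \text{Relative inductivity:} && \forall\, i < k\colon\ \Phi(F_i) \le F_{i+1}
\end{aligned}$$

Here $\Phi$ denotes the Bellman operator, $\mathbf{0}$ the frame mapping every state to 0, $s_I$ the initial state, and $\lambda$ the threshold.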

   Inline graphic

The PrIC3 invariants enforce the above picture: The chain property ensures Inline graphic. We have Inline graphic by initiality. Assuming Inline graphic as induction hypothesis, monotonicity of Inline graphic and relative inductivity imply Inline graphic.
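The four invariants named above (initiality, chain property, frame-safety, relative inductivity) can be checked mechanically on an explicit frame sequence. The sketch below does this for a hypothetical MDP (not the MDP of Fig. 1); the function `pric3_invariants` and all other names are assumptions made for illustration.

```python
from fractions import Fraction as Fr

# Hypothetical MDP (not Fig. 1): P[s][a] maps successors to probabilities.
P = {
    "s0": {"a": {"s1": Fr(1, 2), "s2": Fr(1, 2)},
           "b": {"bad": Fr(1, 4), "s2": Fr(3, 4)}},
    "s1": {"u": {"bad": Fr(1, 2), "s0": Fr(1, 2)}},
    "s2": {"u": {"s2": Fr(1)}},
    "bad": {"u": {"bad": Fr(1)}},
}
BAD, INIT = {"bad"}, "s0"


def bellman(F):
    """Bellman operator over all actions."""
    return {s: (Fr(1) if s in BAD else
                max(sum(p * F[t] for t, p in succ.items())
                    for succ in acts.values()))
            for s, acts in P.items()}


def leq(F, G):
    """Pointwise order on frames."""
    return all(F[s] <= G[s] for s in P)


def pric3_invariants(frames, lam):
    """Report which of the four invariants hold for frames[0..k]."""
    k = len(frames) - 1
    zero = {s: Fr(0) for s in P}
    return {
        "initiality": frames[0] == bellman(zero),
        "chain": all(leq(frames[i], frames[i + 1]) for i in range(k)),
        "frame-safety": all(F[INIT] <= lam for F in frames),
        "relative inductivity": all(leq(bellman(frames[i]), frames[i + 1])
                                    for i in range(k)),
    }


zero = {s: Fr(0) for s in P}
one = {s: Fr(1) for s in P}
F0 = bellman(zero)

# A freshly added top frame violates only frame-safety ...
before = pric3_invariants([F0, dict(one)], Fr(1, 2))
# ... and lowering its value at the initial state repairs it here.
after = pric3_invariants([F0, dict(one, s0=Fr(1, 2))], Fr(1, 2))
```

In this example, lowering the initial state's entry in the last frame suffices because the one-step value of `s0` (1/4) is below the threshold; in general, lowering an entry may break relative inductivity.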

By overapproximating Inline graphic, the frames Inline graphic in effect bound the maximal step-bounded reachability probability of every state:

Lemma 1

Let frames Inline graphic satisfy the PrIC3 invariants. Then

graphic file with name M134.gif

In particular, Lemma 1 together with frame-safety ensures that the maximal step-bounded reachability probability of the initial state Inline graphic to reach Inline graphic is at most the threshold Inline graphic.

As for proving that the unbounded reachability probability is also at most Inline graphic, it suffices to find two consecutive frames, say Inline graphic and Inline graphic, that coincide:

Lemma 2

Let frames Inline graphic satisfy the PrIC3 invariants. Then

graphic file with name M142.gif

Proof

Inline graphic and relative inductivity yield Inline graphic, rendering Inline graphic inductive. By Park’s lemma (cf. Sect. 3.1), we obtain Inline graphic and—by frame-safety—conclude

graphic file with name M147.gif

   Inline graphic

Operationalizing the PrIC3 Invariants for Proving Safety

Lemma 2 gives us a clear angle of attack for proving an MDP safe: Repeatedly add and refine frames approximating step-bounded reachability probabilities for more and more steps while enforcing the PrIC3 invariants (cf. Definition 3.2) until two consecutive frames coincide.

[Algorithm 1: the core loop of PrIC3 (listing not reproduced).]

Analogously to standard IC3, this approach is taken by the core loop Inline graphic depicted in Algorithm 1; differences to the main loop of IC3 (cf. [23, Fig. 5]) are highlighted in Inline graphic. A particular difference is that Inline graphic is parameterized by a heuristic Inline graphic for finding suitable probabilities (see Challenge 3). Since the precise choice of Inline graphic is irrelevant for the soundness of Inline graphic, we defer a detailed discussion of suitable heuristics to Sect. 4.

As input, Inline graphic takes an MDP Inline graphic, a set Inline graphic of bad states, and a threshold Inline graphic. Since the input is never changed, we assume it to be globally available, also to subroutines. As output, Inline graphic returns Inline graphic if two consecutive frames become equal. We hence say that Inline graphic is sound if it only returns Inline graphic if Inline graphic is safe.

We will formalize soundness using Hoare triples. For precondition Inline graphic, postcondition Inline graphic, and program P, the triple Inline graphic is valid (for partial correctness) if, whenever program P starts in a state satisfying precondition Inline graphic and terminates in some state Inline graphic, then Inline graphic satisfies postcondition Inline graphic. Soundness of Inline graphic then means validity of the triple

graphic file with name 501999_1_En_27_Equ20_HTML.gif

Let us briefly go through the individual steps of Inline graphic in Algorithm 1 and convince ourselves that it is indeed sound. After that, we discuss why Inline graphic terminates and what happens if it is unable to prove safety by finding two equal consecutive frames.

How Inline graphic works. Recall that Inline graphic maintains a sequence of frames Inline graphic which is initialized in l. 1 with Inline graphic, Inline graphic, and Inline graphic, where the frame Inline graphic maps every state to 1. Every time upon entering the while-loop in l. 2, the initial segment Inline graphic satisfies all PrIC3 invariants (cf. Definition 3), whereas the full sequence Inline graphic potentially violates frame-safety as it is possible that Inline graphic.

In l. 3, procedure Inline graphic—detailed in Sect. 4—is called to restore all PrIC3 invariants on the entire frame sequence: It either returns Inline graphic if successful or returns Inline graphic and a counterexample (in our case a subsystem of the MDP) if it was unable to do so. To ensure soundness of Inline graphic, it suffices that Inline graphic restores the PrIC3 invariants whenever it returns Inline graphic. Formally, Inline graphic must meet the following specification:

Definition 4

Procedure Inline graphic is sound if the following Hoare triple is valid:

graphic file with name M191.gif

If Inline graphic returns Inline graphic, then a new frame Inline graphic is created in l. 5. After that, the (now initial) segment Inline graphic again satisfies all PrIC3 invariants, whereas the full sequence Inline graphic potentially violates frame-safety at Inline graphic. Propagation (l. 6) aims to speed up termination by updating Inline graphic by Inline graphic iff this does not violate relative inductivity. Consequently, the previously mentioned properties remain unchanged.

If Inline graphic returns Inline graphic, the PrIC3 invariants (premises to Lemma 2 for witnessing safety) cannot be restored and Inline graphic terminates returning Inline graphic (l. 4). Returning Inline graphic (also possible in l. 8) has by specification no effect on soundness of Inline graphic.

In l. 7, we check whether there exist two identical consecutive frames. If so, Lemma 2 yields that the MDP is safe; consequently, Inline graphic returns Inline graphic. Otherwise, we increment k and are in the same setting as upon entering the loop, now with an increased frame sequence; Inline graphic then performs another iteration. In summary, we obtain:

Theorem 1

(Soundness of Inline graphic). If Inline graphic is sound and Inline graphic does not affect the PrIC3 invariants, then Inline graphic is sound, i.e., the following triple is valid:

graphic file with name 501999_1_En_27_Equ21_HTML.gif

Inline graphic Terminates for Unsafe MDPs. If the MDP is unsafe, then there exists a step-bound n, such that Inline graphic. Furthermore, any sound implementation of Inline graphic (cf. Definition 4) either immediately terminates Inline graphic by returning Inline graphic or restores the PrIC3 invariants for Inline graphic. If the former case never arises, then Inline graphic will eventually restore the PrIC3 invariants for a frame sequence of length Inline graphic. By Lemma 1, we have Inline graphic contradicting frame-safety.

Inline graphic Terminates for Safe MDPs. Standard IC3 terminates on safe finite TSs as there are only finitely many different frames, making every ascending chain of frames eventually stabilize. For us, frames map states to probabilities (Challenge 1), yielding infinitely many possible frames even for finite MDPs. Hence, Inline graphic need not ever yield a stabilizing chain of frames. If it continuously fails to stabilize while repeatedly reasoning about the same set of states, we give up. Inline graphic checks this by comparing the subsystem Inline graphic operates on with the one it operated on in the previous loop iteration (l. 8).

Theorem 2

If Inline graphic and Propagate terminate, then Inline graphic terminates.

Recovery Statement 2

For qualitative reachability (Inline graphic), Inline graphic never terminates in l. 8.

Inline graphic is Incomplete. Standard IC3 either proves safety or returns Inline graphic and a counterexample—a single path from the initial to a bad state. As single paths are insufficient as counterexamples in MDPs (Challenge 2), Inline graphic instead returns a subsystem of the MDP Inline graphic provided by Inline graphic. However, as argued above, we cannot trust Inline graphic to provide a stabilizing chain of frames. Reporting Inline graphic thus only means that the given MDP may be unsafe; the returned subsystem has to be analyzed further.

The full PrIC3 algorithm presented in Sect. 5 addresses this issue. Exploiting the subsystem returned by Inline graphic, PrIC3 returns Inline graphic if the MDP is safe; otherwise, it returns Inline graphic and provides a true counterexample witnessing that the MDP is unsafe.

Example 2

We conclude this section with two example executions of Inline graphic on a simplified version of the MDP in Fig. 1. Assume that action b has been removed. Then, for every state, exactly one action is enabled, i.e., we consider a Markov chain. Figure 2 depicts the frame sequences computed by Inline graphic (for a reasonable Inline graphic) on that Markov chain for two thresholds: Inline graphic and Inline graphic. In particular, notice that proving the coarser bound of Inline graphic requires fewer frames than proving the exact bound of Inline graphic.    Inline graphic

Fig. 2.


Two runs of Inline graphic on the Markov chain induced by selecting action a in Fig. 1. For every iteration, frames are recorded after invocation of Inline graphic.

Strengthening in Inline graphic

When the main loop of Inline graphic has created a new frame Inline graphic in its previous iteration, this frame may violate frame-safety (Definition 3.3) because of Inline graphic. The task of Inline graphic is to restore the PrIC3 invariants on all frames Inline graphic. To this end, our first obligation is to lower the value in frame Inline graphic for state Inline graphic to Inline graphic. We denote such an obligation by Inline graphic. Observe that implicitly Inline graphic in the qualitative case, i.e., when proving unreachability. An obligation Inline graphic is resolved by updating the values assigned to state s in all frames Inline graphic to at most Inline graphic. That is, for all Inline graphic, we set Inline graphic to the minimum of Inline graphic and the original value Inline graphic. Such an update affects neither initiality nor the chain property (Definitions 3.1, 3.2). It may, however, violate relative inductivity (Definition 3.4), i.e., Inline graphic. Before resolving obligation Inline graphic, we may thus have to further decrease some entries in Inline graphic as well. Hence, resolving obligations may spawn additional obligations which have to be resolved first to maintain relative inductivity. In this section, we present a generic instance of Inline graphic meeting its specification (Definition 4) and discuss its correctness.
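Resolving an obligation and the accompanying relative-inductivity check can be sketched as follows. The sketch uses a hypothetical MDP (not the MDP of Fig. 1) and assumed names (`violates_relative_inductivity`, `resolve`); it also assumes that the frame with index 0 is left untouched, since initiality pins it.

```python
from fractions import Fraction as Fr

# Hypothetical MDP (not Fig. 1): P[s][a] maps successors to probabilities.
P = {
    "s0": {"a": {"s1": Fr(1, 2), "s2": Fr(1, 2)},
           "b": {"bad": Fr(1, 4), "s2": Fr(3, 4)}},
    "s1": {"u": {"bad": Fr(1, 2), "s0": Fr(1, 2)}},
    "s2": {"u": {"s2": Fr(1)}},
    "bad": {"u": {"bad": Fr(1)}},
}
BAD = {"bad"}


def bellman(F):
    """Bellman operator over all actions."""
    return {s: (Fr(1) if s in BAD else
                max(sum(p * F[t] for t, p in succ.items())
                    for succ in acts.values()))
            for s, acts in P.items()}


def violates_relative_inductivity(frames, i, s, delta):
    """Would capping frames[i][s] at delta break bellman(frames[i-1]) <= frames[i]?"""
    return bellman(frames[i - 1])[s] > delta


def resolve(frames, i, s, delta):
    """Resolve obligation (i, s, delta): cap the value of s in frames 1..i."""
    for j in range(1, i + 1):
        frames[j][s] = min(frames[j][s], delta)


LAM = Fr(1, 3)
ZERO = {s: Fr(0) for s in P}
ONE = {s: Fr(1) for s in P}
frames = [bellman(ZERO), dict(ONE), dict(ONE)]

# Capping s0 in frame 2 right away is not possible: its successors are still at 1.
blocked = violates_relative_inductivity(frames, 2, "s0", LAM)
# Capping s0 in frame 1 is fine, since the one-step value of s0 is 1/4.
fine = not violates_relative_inductivity(frames, 1, "s0", LAM)
resolve(frames, 1, "s0", LAM)
```

After the call to `resolve`, the obligation for frame 2 is still blocked in this example, because the successors of `s0` keep the value 1 in frame 1; this is the situation in which further obligations for successor states are spawned.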

Inline graphic by Example. Inline graphic is given by the pseudo code in Algorithm 2; differences to standard IC3 (cf. [23, Fig. 6]) are highlighted in Inline graphic. Intuitively, Inline graphic attempts to recursively resolve all obligations until either both frame-safety and relative inductivity are restored for all frames or it detects a potential counterexample justifying why it is unable to do so. We first consider an execution where the latter does not arise.

[Algorithm 2: the strengthening procedure (listing not reproduced).]

Example 3

We zoom in on Example 2: Prior to the second iteration, we have created the following three frames assigning values to the states Inline graphic:

graphic file with name 501999_1_En_27_Equ22_HTML.gif

To keep track of unresolved obligations Inline graphic, Inline graphic employs a priority queue Q which pops obligations with minimal frame index i first. Our first step is to ensure frame-safety of Inline graphic, i.e., alter Inline graphic so that Inline graphic; we thus initialize the queue Q with the initial obligation Inline graphic (l. 1). To do so, we check whether updating Inline graphic to Inline graphic would invalidate relative inductivity (l. 6). This is indeed the case:

graphic file with name 501999_1_En_27_Equ23_HTML.gif

To restore relative inductivity, Inline graphic spawns one new obligation for each relevant successor of Inline graphic. These have to be resolved before retrying to resolve the old obligation.5

In contrast to standard IC3, spawning obligations involves finding suitable probabilities Inline graphic (l. 7). In our example this means we have to spawn two obligations Inline graphic and Inline graphic such that Inline graphic. There are infinitely many choices for Inline graphic and Inline graphic satisfying this inequality. Assume some heuristic Inline graphic chooses Inline graphic and Inline graphic; we push obligations Inline graphic, Inline graphic, and Inline graphic (ll. 8, 9). In the next iteration, we first pop obligation Inline graphic (l. 3) and find that it can be resolved without violating relative inductivity (l. 6). Hence, we set Inline graphic to Inline graphic (l. 11); no new obligation is spawned. Obligation Inline graphic is resolved analogously; the updated frame is Inline graphic. Thereafter, our initial obligation Inline graphic can be resolved; relative inductivity is restored for Inline graphic. Hence, Inline graphic returns Inline graphic together with the updated frames.    Inline graphic
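The two ingredients of this example, choosing successor bounds and processing obligations with the smallest frame index first, can be sketched as follows. The heuristic `spawn_bounds` is purely hypothetical (it scales the guesses of an assumed oracle by a common factor) and is not one of the heuristics discussed in the paper; the states, probabilities, and the bound 1/4 are likewise made up for illustration.

```python
import heapq
from fractions import Fraction as Fr


def spawn_bounds(succ, oracle, delta):
    """Pick one bound per successor so that sum(p * bound) <= delta.

    Hypothetical heuristic: scale the oracle's guesses by a common factor.
    Bounds stay in [0, 1] as long as the oracle values do.
    """
    guess = sum(p * oracle[t] for t, p in succ.items())
    if guess == 0:
        return {t: Fr(0) for t in succ}  # oracle deems the bad states unreachable
    scale = min(Fr(1), delta / guess)
    return {t: scale * oracle[t] for t in succ}


# Suppose obligation (2, "s0", 1/4) failed the relative-inductivity check
# for an action with two equally likely successors.
succ = {"s1": Fr(1, 2), "s2": Fr(1, 2)}
oracle = {"s1": Fr(3, 4), "s2": Fr(1, 4)}  # hypothetical guesses
bounds = spawn_bounds(succ, oracle, Fr(1, 4))

# Obligations are tuples (frame index, state, bound); heapq pops the
# smallest tuple, so obligations with minimal frame index come first.
queue = []
heapq.heappush(queue, (2, "s0", Fr(1, 4)))  # retried after its successors
for t, d in bounds.items():
    heapq.heappush(queue, (1, t, d))        # successors, one frame lower
first = heapq.heappop(queue)
```

With these numbers the heuristic yields the bounds 3/8 and 1/8, whose weighted sum is exactly 1/4. The choice 1/2 and 0 would satisfy the same inequality, which illustrates why a heuristic is needed to select one of infinitely many options.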

Inline graphic is Sound. Let us briefly discuss why Algorithm 2 meets the specification of a sound implementation of Inline graphic (Definition 4): First, we observe that Algorithm 2 alters the frames (and thus potentially invalidates the PrIC3 invariants) only in l. 11 by resolving an obligation Inline graphic with Inline graphic (due to the check in l. 6).

Let Inline graphic denote the frame F in which F[s] is set to Inline graphic, i.e.,

graphic file with name M295.gif

Indeed, resolving obligation Inline graphic in l. 11 lowers the values assigned to state s to at most Inline graphic without invalidating the PrIC3 invariants:

Lemma 3

Let Inline graphic be an obligation and Inline graphic, for Inline graphic, be frames with Inline graphic. Then Inline graphic

graphic file with name 501999_1_En_27_Equ24_HTML.gif

Crucially, the precondition of Definition 4 guarantees that all PrIC3 invariants except frame safety hold initially. Since these invariants are never invalidated due to Lemma 3, Algorithm 2 is a sound implementation of Inline graphic if it restores frame safety whenever it returns Inline graphic, i.e., once it leaves the loop with an empty obligation queue Q (ll. 12–13). Now, an obligation Inline graphic is only popped from Q in l. 3. As Inline graphic is added to Q upon reaching l. 9, the size of Q can only ever be reduced (without returning Inline graphic) by resolving Inline graphic in l. 11. Hence, Algorithm 2 does not return Inline graphic unless it restored frame safety by resolving, amongst all other obligations, the initial obligation Inline graphic. Consequently:

Lemma 4

Procedure Inline graphic is sound, i.e., it satisfies the specification in Definition 4.

Theorem 3

Procedure Inline graphic is sound, i.e., satisfies the specification in Theorem 1.

We remark that, analogously to standard IC3, resolving an obligation in l. 11 may be accompanied by generalization. That is, we attempt to update the values of multiple states at once. Generalization is, however, highly non-trivial in a probabilistic setting. We discuss three possible approaches to generalization in Sect. 6.2.

Inline graphic Terminates. We now show that Inline graphic as in Algorithm 2 terminates. The only scenario in which Inline graphic may not terminate is if it keeps spawning obligations in l. 9. Let us thus look closer at how obligations are spawned: Whenever we detect that resolving an obligation Inline graphic would violate relative inductivity for some action a (l. 6), we first need to update the values of the successor states Inline graphic in frame Inline graphic, i.e., we push the obligations Inline graphic which have to be resolved first (ll. 7–9). It is noteworthy that, for a TS, a single action leads to a single successor state Inline graphic. Algorithm 2 employs a heuristic Inline graphic to determine the probabilities required for pushing obligations (l. 7). Assume for an obligation Inline graphic that the check in l. 6 yields Inline graphic. Then Inline graphic takes s, a, Inline graphic and reports some probability Inline graphic for every a-successor Inline graphic of s. However, an arbitrary heuristic of type Inline graphic may lead to non-terminating behavior: If Inline graphic, then the heuristic has no effect. It is thus natural to require that an adequate heuristic Inline graphic yields probabilities such that the check Inline graphic in l. 6 cannot succeed twice for the same obligation Inline graphic and same action a. Formally, this is guaranteed by the following:

Definition 5

Heuristic Inline graphic is adequate if the following triple is valid (for any frame F):

graphic file with name 501999_1_En_27_Equ25_HTML.gif

   Inline graphic

Details regarding our implementation of heuristic Inline graphic are found in Sect. 6.1.

For an adequate heuristic, attempting to resolve an obligation Inline graphic (ll. 3–11) either succeeds after spawning it at most Inline graphic times or Inline graphic returns Inline graphic. By a similar argument, attempting to resolve an obligation Inline graphic leads to at most Inline graphic other obligations of the form Inline graphic. Consequently, the total number of obligations spawned by Algorithm 2 is bounded. Since Algorithm 2 terminates if all obligations have been resolved (l. 12) and each of its loop iterations either returns Inline graphic, spawns obligations, or resolves an obligation, we conclude:

Lemma 5

Inline graphic terminates for every adequate heuristic Inline graphic.
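The obligation handling described above can be illustrated with a minimal Python sketch. Algorithm 2 itself is not reproduced in this text, so the control flow below is reconstructed from the prose and its line references (pop, failure check, relative-inductivity check, heuristic, push, resolve); it is not the authors' implementation. The toy MDP, the names `initial_frames`, `phi`, `pass_through`, and `strengthen`, and the dictionary-based frame representation are all assumptions made for illustration.

```python
import heapq
from fractions import Fraction

# Hypothetical toy MDP: state -> action -> list of (successor, probability).
MDP = {
    "s0": {"a": [("s1", Fraction(1, 2)), ("ok", Fraction(1, 2))]},
    "s1": {"a": [("bad", Fraction(1, 2)), ("ok", Fraction(1, 2))]},
    "ok": {"a": [("ok", Fraction(1))]},
    "bad": {},
}
BAD = {"bad"}


def initial_frames(mdp, bad, k):
    """F_0 assigns 1 to bad states and 0 elsewhere; F_1..F_k start at 1."""
    first = {s: Fraction(1 if s in bad else 0) for s in mdp}
    return [first] + [{s: Fraction(1) for s in mdp} for _ in range(k)]


def phi(mdp, bad, frame, s, a):
    """Bellman operator restricted to action a, evaluated at state s."""
    if s in bad:
        return Fraction(1)
    return sum(p * frame[t] for t, p in mdp[s][a])


def pass_through(s, a, delta):
    """Naive heuristic (an assumption): every a-successor inherits delta."""
    return [delta for _ in MDP[s][a]]


def strengthen(mdp, bad, frames, heuristic, i, s, delta):
    """Try to establish frames[i][s] <= delta; return (success, touched)."""
    queue, ticket, touched = [(i, 0, s, delta)], 1, {s}
    while queue:
        j, _, t, d = heapq.heappop(queue)  # lowest frame index first
        if j == 0 or (t in bad and d < 1):
            return False, touched  # potential counterexample
        violating = [a for a in mdp[t]
                     if phi(mdp, bad, frames[j - 1], t, a) > d]
        if not violating:
            for k in range(1, j + 1):  # resolve: lower frames 1..j at t
                frames[k][t] = min(frames[k][t], d)
            continue
        a = violating[0]
        for (succ, _), d_succ in zip(mdp[t][a], heuristic(t, a, d)):
            heapq.heappush(queue, (j - 1, ticket, succ, d_succ))
            touched.add(succ)
            ticket += 1
        heapq.heappush(queue, (j, ticket, t, d))  # retry after the successors
        ticket += 1
    return True, touched


frames = initial_frames(MDP, BAD, 2)
print(strengthen(MDP, BAD, frames, pass_through, 2, "s0", Fraction(1, 2)))
frames = initial_frames(MDP, BAD, 2)
print(strengthen(MDP, BAD, frames, pass_through, 2, "s0", Fraction(1, 4)))
```

In this toy model the first call succeeds, while the second call fails even though the maximal reachability probability from `s0` is exactly 1/4: the pass-through heuristic demands the bound 1/4 for `s1`, whose actual value is 1/2. This is an instance of a failure caused by an inappropriate heuristic rather than by an unsafe model.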

Recovery Statement 3

Let Inline graphic be adequate. Then for qualitative reachability (Inline graphic), all obligations spawned by Inline graphic as in Algorithm 2 are of the form (i, s, 0).

Inline graphic returns Inline graphic. There are two cases in which Inline graphic fails to restore the PrIC3 invariants and returns Inline graphic. The first case (the left disjunct of l. 4) is that we encounter an obligation for frame Inline graphic. Resolving such an obligation would inevitably violate initiality; analogously to standard IC3, we thus return Inline graphic.

The second case (the right disjunct of l. 4) is that we encounter an obligation Inline graphic for a bad state Inline graphic with a probability Inline graphic (though, obviously, all Inline graphic have probability Inline graphic). Resolving such an obligation would inevitably prevent us from restoring relative inductivity: If we updated Inline graphic to Inline graphic, we would have Inline graphic. Notice that, in contrast to standard IC3, this second case can occur in PrIC3:

Example 4

Assume we have to resolve an obligation Inline graphic for the MDP in Fig. 1. This involves spawning obligations Inline graphic and Inline graphic, where Inline graphic is a bad state, such that Inline graphic. Even for Inline graphic, this is only possible if Inline graphic.    Inline graphic

Inline graphic Cannot Prove Unsafety. If standard IC3 returns Inline graphic, it proves unsafety by constructing a counterexample, i.e., a single path from the initial state to a bad state. If PrIC3 returns Inline graphic, there are two possible reasons: Either the MDP is indeed unsafe, or the heuristic Inline graphic at some point selected probabilities in a way such that Inline graphic is unable to restore the Inline graphic invariants (even though the MDP might in fact be safe). Inline graphic thus only returns a potential counterexample which either proves unsafety or indicates that our heuristic was inappropriate.

Counterexamples in our case consist of subsystems rather than a single path (see Challenge 2 and Sect. 5). Inline graphic hence returns the set Inline graphic of all states that eventually appeared in the obligation queue. This set is a conservative approximation, and optimizations as in  [1] may be beneficial. Furthermore, in the qualitative case, our potential counterexample subsumes the counterexamples constructed by standard IC3:

Recovery Statement 4

Let Inline graphic be the adequate heuristic mapping every state to 0. For qualitative reachability (Inline graphic), if Inline graphic is returned by Inline graphic, then Inline graphic contains a path from the initial to a bad state.6

Dealing with Potential Counterexamples

Recall that our core algorithm Inline graphic is incomplete for a fixed heuristic Inline graphic: It cannot give a conclusive answer whenever it finds a potential counterexample for two possible reasons: Either the heuristic Inline graphic turned out to be inappropriate or the MDP is indeed unsafe. The idea to overcome the former is to call Inline graphic finitely often in an outer loop that generates new heuristics until we find an appropriate one: If Inline graphic still does not report safety of the MDP, then it is indeed unsafe. We do not blindly generate new heuristics, but use the potential counterexamples returned by Inline graphic to refine the previous one.

graphic file with name 501999_1_En_27_Figk_HTML.jpg

Let us consider the procedure Inline graphic in Algorithm 3, which wraps our core algorithm Inline graphic, in more detail: First, we create an oracle Inline graphic which (roughly) estimates the probability of reaching Inline graphic for every state. A perfect oracle would yield precise maximal reachability probabilities, i.e., Inline graphic for every state s. We construct oracles by Inline graphic (highlighted in Inline graphic ). Examples of implementations of all user-supplied methods in Algorithm 3 are discussed in Sect. 7.

Assuming the oracle is good, but not perfect, we construct an adequate heuristic Inline graphic selecting probabilities based on the oracle7 for all successors of a given state: There are various options. The simplest is to pass through the oracle values. A version that is more robust against noise in the oracle is discussed in Sect. 6. We then invoke Inline graphic. If Inline graphic reports safety, the MDP is indeed safe by the soundness of Inline graphic.

Check Refutation. If Inline graphic does not report safety, it reports a subsystem that hints to a potential counterexample. Formally, this subsystem is a subMDP of states that were ‘visited’ during the invocation of Inline graphic.

Definition 6

(subMDP). Let Inline graphic be an MDP and let Inline graphic with Inline graphic. We call Inline graphic the subMDP induced by Inline graphic and Inline graphic, where for all Inline graphic and all Inline graphic, we have Inline graphic.    Inline graphic

A subMDP Inline graphic may be substochastic where missing probability mass never reaches a bad state. Definition 1 is thus relaxed: For all states Inline graphic we require that Inline graphic. If the subsystem is unsafe, we can conclude that the original MDP Inline graphic is also unsafe.

Lemma 6

If Inline graphic is a subMDP of Inline graphic and Inline graphic is unsafe, then Inline graphic is also unsafe.
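To make Definition 6 and Lemma 6 concrete, the following sketch builds an induced subMDP by dropping all transitions that leave the chosen state set (without redistributing the missing probability mass) and compares reachability values. The toy model, the function names `induced_submdp` and `reach_lower_bound`, and the use of value iteration from below are illustrative assumptions; they are not taken from the paper's prototype.

```python
from fractions import Fraction

# Hypothetical toy MDP: state -> action -> list of (successor, probability).
MDP = {
    "s0": {"a": [("s1", Fraction(1, 2)), ("s2", Fraction(1, 2))]},
    "s1": {"a": [("bad", Fraction(1, 2)), ("sink", Fraction(1, 2))]},
    "s2": {"a": [("bad", Fraction(1, 5)), ("sink", Fraction(4, 5))]},
    "sink": {"a": [("sink", Fraction(1))]},
    "bad": {},
}
BAD = {"bad"}


def induced_submdp(mdp, keep):
    """Keep only states in `keep`; transitions leaving `keep` are dropped,
    so the result may be substochastic."""
    return {
        s: {a: [(t, p) for t, p in succ if t in keep]
            for a, succ in mdp[s].items()}
        for s in keep
    }


def reach_lower_bound(mdp, bad, start, sweeps=50):
    """Value iteration from below: each sweep yields a lower bound on the
    maximal probability to reach a bad state."""
    value = {s: Fraction(1 if s in bad else 0) for s in mdp}
    for _ in range(sweeps):
        value = {
            s: Fraction(1) if s in bad else max(
                (sum(p * value[t] for t, p in succ)
                 for succ in mdp[s].values()),
                default=Fraction(0))
            for s in mdp
        }
    return value[start]


SUB = induced_submdp(MDP, {"s0", "s1", "bad"})
print(reach_lower_bound(SUB, BAD, "s0"), reach_lower_bound(MDP, BAD, "s0"))
```

For a hypothetical threshold of 1/5, the subMDP value already exceeds the threshold, so the subsystem is unsafe and, in line with Lemma 6, so is the full toy model, whose value can only be larger.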

The role of Inline graphic is to establish whether the subsystem is indeed a true counterexample or a spurious one. Formally, Inline graphic should ensure:

graphic file with name M416.gif

Again, Inline graphic is backward compatible in the sense that a single fixed heuristic is always sufficient when reasoning about qualitative reachability (Inline graphic).

Recovery Statement 5

For qualitative reachability (Inline graphic) and the heuristic Inline graphic from Recovery Statement 4, Inline graphic invokes its core Inline graphic exactly once.

This statement is true, as Inline graphic returns either Inline graphic or a subsystem containing a path from the initial state to a bad state. In the latter case, Inline graphic detects that the subsystem is indeed a counterexample which cannot be spurious in the qualitative setting.

We remark that the procedure Inline graphic invoked in l. 5 is a classical fallback; it runs an (alternative) model checking algorithm, e.g., solving the set of Bellman equations, for the subsystem. In the worst case, i.e., for Inline graphic, we thus solve exactly our problem statement. Empirically (Table 1) we observe that for reasonable oracles the procedure Inline graphic is invoked on significantly smaller subMDPs. However, in the worst case, the subMDP must include all paths of the original MDP and thus coincides with it.

Table 1.

Empirical results. Run times are in seconds; time out = 15 min.

|S| Inline graphic Inline graphic w/o Inline graphic lin Inline graphic pol Inline graphic hyb Inline graphic StormInline graphic StormInline graphic
BRP Inline graphic 0.035 0.1 TO TO TO TO Inline graphic 0.12
graphic file with name 501999_1_En_27_Figs_HTML.gif Inline graphic 324 125.8 324 TO MO Inline graphic 0.18
graphic file with name 501999_1_En_27_Figt_HTML.gif Inline graphic 188 38.3 188 TO MO Inline graphic 0.1
ZeroConf Inline graphic 0.5 0.9 TO TO 0.4 0 Inline graphic 0 Inline graphic 296.8
0.52 TO TO 0.2 0 Inline graphic 0 Inline graphic 282.6
graphic file with name 501999_1_En_27_Figu_HTML.gif Inline graphic 1 Inline graphic 1 Inline graphic 1 Inline graphic 1 Inline graphic 300.2
Inline graphic Inline graphic 0.9 TO TO Inline graphic 0 MO MO TO
0.75 TO TO Inline graphic 0 MO MO TO
0.52 TO TO TO TO MO TO
graphic file with name 501999_1_En_27_Figv_HTML.gif Inline graphic 1 Inline graphic 1 Inline graphic 1 Inline graphic 1 MO TO
Chain Inline graphic 0.394 0.9 18.8 0 60.2 0 1.2 0 Inline graphic 0 Inline graphic Inline graphic
0.4 20.1 0 55.4 0 Inline graphic 0 TO Inline graphic Inline graphic
graphic file with name 501999_1_En_27_Figw_HTML.gif Inline graphic 431 119.5 431 TO TO Inline graphic Inline graphic
graphic file with name 501999_1_En_27_Figx_HTML.gif Inline graphic 357 64.0 357 TO TO Inline graphic Inline graphic
Inline graphic 0.394 0.9 TO TO 1.6 0 Inline graphic 0 Inline graphic 4.5
0.4 TO TO Inline graphic 0 TO Inline graphic 4.9
graphic file with name 501999_1_En_27_Figy_HTML.gif TO TO TO TO Inline graphic 4.9
Inline graphic 0.394 0.9 TO TO Inline graphic 0 MO MO TO
0.4 TO TO Inline graphic 0 MO MO TO
Double chain Inline graphic 0.215 0.9 528.1 0 828.8 0 203.3 0 Inline graphic 0 Inline graphic Inline graphic
0.3 588.4 0 TO 138.3 0 Inline graphic 0 Inline graphic Inline graphic
0.216 Inline graphic 0 TO 765.8 0 MO Inline graphic Inline graphic
graphic file with name 501999_1_En_27_Figz_HTML.gif TO TO TO TO Inline graphic Inline graphic
Inline graphic 0.22 0.3 TO TO 17.5 0 Inline graphic 0 0.2 2.6
0.24 TO TO Inline graphic 0 MO 0.2 2.7
Inline graphic Inline graphic Inline graphic TO TO TO MO TO TO
Inline graphic TO TO Inline graphic 0 MO TO TO

Refine Oracle. Whenever we have neither proven the MDP safe nor unsafe, we refine the oracle to prevent generating the same subsystem in the next invocation of Inline graphic. To ensure termination, oracles should only be refined finitely often. That is, we need some progress measure. The set Inline graphic overapproximates all counterexamples encountered in some invocation of Inline graphic and we propose to use its size as the progress measure. While there are several possibilities to update Inline graphic through the user-defined procedure Inline graphic  (l. 6), every implementation should hence satisfy Inline graphic. Consequently, after finitely many iterations, the oracle is refined with respect to all states. In this case, we may as well rely on solving the characteristic LP problem:

Lemma 7

The algorithm Inline graphic in Algorithm 3 is sound and complete if Inline graphic returns a perfect oracle Inline graphic (where S is the set of all states).

Weaker assumptions on Inline graphic are possible, but are beyond the scope of this paper. Moreover, the above lemma does not rely on the abstract concept that heuristic Inline graphic provides suitable probabilities after finitely many refinements.8
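The interplay of the user-supplied procedures in the outer loop can be summarized by the following skeleton. Algorithm 3 is only available as a figure, so the order of the steps is reconstructed from the prose and its line references; all parameter names are hypothetical, and the stub procedures in the usage part are toy stand-ins that only exercise the control flow.

```python
def outer_loop(initialize, create_heuristic, core, check_refutation,
               enlarge, refine, initial_state, max_rounds=1000):
    """Skeleton of the refinement loop; every callable is user-supplied."""
    oracle = initialize()
    touched = {initial_state}
    for _ in range(max_rounds):
        heuristic = create_heuristic(oracle)
        safe, subsystem = core(heuristic)
        if safe:
            return True   # the core algorithm is sound: the MDP is safe
        if check_refutation(subsystem):
            return False  # the subsystem is a true counterexample
        touched = enlarge(touched, subsystem)  # progress: touched grows
        oracle = refine(oracle, touched)
    raise RuntimeError("round budget exhausted")


# Toy stubs (assumptions): the "oracle" is just the set of refined states,
# and the stub core reports safety once at least three states are refined.
STATES = ["s0", "s1", "s2", "s3"]
result = outer_loop(
    initialize=lambda: frozenset(),
    create_heuristic=lambda oracle: oracle,
    core=lambda h: (len(h) >= 3, set(STATES[:len(h) + 2])),
    check_refutation=lambda subsystem: False,
    enlarge=lambda touched, subsystem: touched | subsystem,
    refine=lambda oracle, touched: frozenset(touched),
    initial_state="s0",
)
print(result)
```

The `max_rounds` guard is an addition of this sketch; in the setting described above, termination instead follows from the requirement that the set of touched states strictly grows with every refinement.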

Practical PrIC3

So far, we gave a conceptual view on PrIC3, but now take a more practical stance. We detail important features of effective implementations of PrIC3 (based on our empirical evaluation). We first describe an implementation without generalization, and then provide a prototypical extension that allows for three variants of generalization.

A Concrete PrIC3 Instance Without Generalization

Input. We describe MDPs using the Prism guarded command language9, exemplified in Fig. 3. States are described by valuations to m (integer-valued) program variables Inline graphic, and outgoing actions are described by commands of the form

graphic file with name 501999_1_En_27_Equ26_HTML.gif

If a state satisfies guard, then the corresponding action with k branches exists; probabilities are given by prob_i, and the successor states are described by update_i, see Fig. 3b.

Fig. 3.

Illustrative Prism-style probabilistic guarded command language example
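As a rough illustration of how a guarded command induces an action, the sketch below represents a command as a guard plus a list of (probability, update) branches and computes the successor distribution of a state. The `Command` class, the `successors` function, and the example command over a single variable `c` are assumptions for illustration; they do not reproduce the model of Fig. 3.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]  # valuation of the program variables


@dataclass
class Command:
    guard: Callable[[State], bool]
    # Branches (prob_i, update_i); each update maps a state to its successor.
    branches: List[Tuple[Fraction, Callable[[State], State]]]


def successors(state: State, command: Command) -> List[Tuple[Fraction, State]]:
    """Return the successor distribution, or [] if the guard is not satisfied."""
    if not command.guard(state):
        return []
    assert sum(p for p, _ in command.branches) == 1, "probabilities must sum to 1"
    return [(p, update(state)) for p, update in command.branches]


# Hypothetical command:  c < 3  ->  1/2 : (c' = c + 1)  +  1/2 : (c' = 0)
step = Command(
    guard=lambda st: st["c"] < 3,
    branches=[
        (Fraction(1, 2), lambda st: {**st, "c": st["c"] + 1}),
        (Fraction(1, 2), lambda st: {**st, "c": 0}),
    ],
)

print(successors({"c": 1}, step))
print(successors({"c": 3}, step))
```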

Encoding. We encode frames as logical formulae. Updating frames then corresponds to adding conjuncts, and checking for relative inductivity is a satisfiability call. Our encoding is as follows: States are assignments to the program variables, i.e., Inline graphic. We use various uninterpreted functions, to which we give semantics using appropriate constraints. Frames10 are represented by uninterpreted functions Inline graphic satisfying Inline graphic implies Inline graphic. Likewise, the Bellman operator is an uninterpreted function Inline graphic such that Inline graphic implies Inline graphic. Finally, we use Inline graphic with Inline graphic iff Inline graphic.

Among the appropriate constraints, we ensure that variables are within their range, bound the values for the frames, and enforce Inline graphic for Inline graphic. We encode the guarded commands as exemplified by this encoding of the first command in Fig. 3:

graphic file with name M451.gif

In our implementation, we optimize the encoding. We avoid the uninterpreted functions by applying an adapted Ackermann reduction. We avoid universal quantifiers by first observing that we always ask whether a single state is not inductive, and then unfolding the guarded commands in the constraints that describe a frame. The resulting encoding grows linearly in the maximal out-degree of the MDP, and is in the quantifier-free fragment of linear arithmetic (QFLRIA).

Heuristic. We select probabilities Inline graphic by solving the following optimization problem, with variables Inline graphic, Inline graphic, for states Inline graphic and oracle Inline graphic11.

graphic file with name M457.gif

The constraint ensures that, if the values Inline graphic correspond to the actual reachability probabilities from Inline graphic, then the reachability from state s is exactly Inline graphic. A constraint stating that Inline graphic would also be sound, but we choose equality as it preserves room between the actual probability and the threshold we want to show. Finally, the objective function aims to preserve the ratio between the suggested probabilities.
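The optimization problem itself is only available as an image, so the following sketch does not reproduce it. Instead, it shows a simple closed-form stand-in that satisfies the two properties described above whenever it applies: the weighted sum of the chosen values equals the target probability, and the ratios between the oracle's suggestions are preserved. The function name `rescale_oracle` and its fallback behavior are assumptions.

```python
from fractions import Fraction


def rescale_oracle(probs, oracle_values, delta):
    """Scale oracle values by a common factor so that
    sum_i probs[i] * x[i] == delta; return None if some x[i] would exceed 1."""
    expected = sum(p * o for p, o in zip(probs, oracle_values))
    if expected == 0:
        # Every oracle value is 0: keep x_i = 0, which only satisfies the
        # weaker constraint sum_i probs[i] * x[i] <= delta.
        return [Fraction(0) for _ in probs]
    scale = Fraction(delta) / expected
    xs = [scale * o for o in oracle_values]
    if any(x > 1 for x in xs):
        return None  # proportional scaling alone does not give probabilities
    return xs


probs = [Fraction(1, 2), Fraction(1, 2)]
oracle = [Fraction(1, 5), Fraction(3, 5)]
xs = rescale_oracle(probs, oracle, Fraction(1, 5))
print(xs)
```

When proportional scaling would push a value above 1, this sketch gives up; an optimization-based formulation such as the one described above can still distribute the remaining probability mass in that case.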

Repushing and Breaking Cycles. Repushing   [23] is an essential ingredient of both standard IC3 and PrIC3. Intuitively, we avoid opening new frames and spawning obligations that can be deduced from current information. Since repushing generates further obligations in the current frame, its implementation requires moving the detection of Zeno-behavior from Inline graphic into the Inline graphic procedure. Therefore, we track the histories of the obligations in the queue. Furthermore, once we detect a cycle we first try to adapt the heuristic Inline graphic locally to overcome this cyclic behavior instead of immediately giving up. This local adaption reduces the number of Inline graphic invocations.

Extended Queue. In contrast to standard IC3, the obligation queue might contain entries that vary only in their Inline graphic entry. In particular, if the MDP is not a tree, it may occur that the queue contains both Inline graphic and Inline graphic with Inline graphic. Then, Inline graphic can be safely pruned from the queue. Similarly, after handling Inline graphic, if some fresh obligation Inline graphic is pushed to the queue, it can be substituted with Inline graphic. To efficiently operationalize these observations, we keep an additional mapping which remains intact over multiple invocations of Inline graphic. We furthermore employed some optimizations for Inline graphic aiming to track potential counterexamples better. After refining the heuristic, one may want to reuse frames or the obligation queue, but empirically this leads to performance degradation as the values in the frames are inconsistent with behavior suggested by the heuristic.
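One way to read the additional mapping is as a table that remembers, per frame index and state, the tightest bound requested so far. The sketch below follows that reading and assumes that, of two obligations for the same frame and state, the one with the larger bound is subsumed. The function name `tighten` and the keying by (frame index, state) are assumptions, not details of the prototype.

```python
def tighten(best, i, s, delta):
    """Record the obligation (i, s, delta) and return the obligation that
    should actually be queued: the tightest bound seen so far for (i, s)."""
    key = (i, s)
    best[key] = min(delta, best.get(key, delta))
    return (i, s, best[key])


best = {}
print(tighten(best, 2, "s", 0.5))  # first obligation for (2, "s")
print(tighten(best, 2, "s", 0.8))  # weaker bound: replaced by the known one
print(tighten(best, 2, "s", 0.3))  # stronger bound: becomes the new record
```

Because resolving an obligation with a smaller bound also establishes every larger bound for the same frame and state, queuing only the returned obligation avoids redundant work in this reading.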

Concrete PrIC3 with Generalization

So far, frames are updated by changing single entries whenever we resolve obligations Inline graphic, i.e., we add conjunctions of the form Inline graphic. Equivalently, we may add a constraint Inline graphic with Inline graphic and Inline graphic for all Inline graphic.

Generalization in IC3 aims to update a set Inline graphic (including s) of states in a frame rather than a single one without invalidating relative inductivity. In our setting, we thus consider a function Inline graphic with Inline graphic that assigns (possibly different) probabilities to all states in Inline graphic. Updating a frame then amounts to adding the constraint

graphic file with name M486.gif

Standard IC3 generalizes by iteratively “dropping” a variable, say v. The set Inline graphic then consists of all states that do not differ from the fixed state s except for the value of v.12 We take the same approach by iteratively dropping program variables. Hence, Inline graphic effectively becomes a mapping from the value s[v] to a probability. We experimented with four types of functions Inline graphic that we describe for Markov chains. The ideas are briefly outlined below; details are beyond the scope of this paper.

Constant Inline graphic. Setting all Inline graphic to Inline graphic is straightforward but empirically not helpful.

Linear Interpolation. We use a linear function Inline graphic that interpolates two points. The first point Inline graphic is obtained from the obligation Inline graphic. For a second point, consider the following: Let Inline graphic be the unique13 command active at state s. Among all states in Inline graphic that are enabled in the guard of Inline graphic, we take the state Inline graphic in which Inline graphic is maximal14. The second point for interpolation is then Inline graphic. If the relative inductivity fails for Inline graphic we do not generalize with Inline graphic, but may attempt to find other functions.

Polynomial Interpolation. Rather than linearly interpolating between two points, we may interpolate using more than two points. In order to properly fit these points, we can use a higher-degree polynomial. We select these points using counterexamples to generalization (CTGs): We start as above with linear interpolation. However, if Inline graphic is not relative inductive, the SMT solver yields a model with state Inline graphic and probability Inline graphic, with Inline graphic violating relative inductivity, i.e., Inline graphic. We call Inline graphic a CTG, and Inline graphic is then a further interpolation point, and we repeat.
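The interpolation step can be sketched with exact Lagrange interpolation over pairs of a variable value and a probability. Two points give the linear case; each further point, such as one obtained from a CTG, raises the degree by one. The function name `interpolate` and the concrete points are hypothetical, and the sketch omits the relative-inductivity check that decides whether the resulting function may be used.

```python
from fractions import Fraction


def interpolate(points):
    """Return the unique polynomial of degree < len(points) through the given
    (x, y) points with pairwise distinct integer x, as a callable."""
    def poly(x):
        total = Fraction(0)
        for k, (xk, yk) in enumerate(points):
            term = Fraction(yk)
            for m, (xm, _) in enumerate(points):
                if m != k:
                    term *= Fraction(x - xm, xk - xm)  # Lagrange basis factor
            total += term
        return total
    return poly


# Hypothetical points (value of the dropped variable v, probability bound).
linear = interpolate([(0, Fraction(1, 10)), (4, Fraction(1, 2))])
print(linear(2))

# A hypothetical CTG at v = 2 with bound 2/5 becomes a third point.
quadratic = interpolate([(0, Fraction(1, 10)), (4, Fraction(1, 2)),
                         (2, Fraction(2, 5))])
print(quadratic(2))
```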

Technically, when generalizing using nonlinear constraints, we use real-valued arithmetic with a branch-and-bound-style approach to ensure integer values.

Hybrid Interpolation. In polynomial interpolation, we generate high-degree polynomials and add them to the encoding of the frame. In subsequent invocations, reasoning efficiency is drastically harmed by these high-degree polynomials. Instead, we soundly approximate Inline graphic by a piecewise linear function, and use these constraints in the frame.

Experiments

We assess how PrIC3 may contribute to the state of the art in probabilistic model checking. Our early empirical evaluation shows that PrIC3 is feasible, and we see ample room for further improvements of the prototype.

Implementation. We implemented a prototype15 of PrIC3 based on Sect. 6.1 in Python. The input is represented using efficient data structures provided by the model checker Storm. We use an incremental instance of Z3  [47] for each frame, as suggested in  [23]. A solver for each frame is important to reduce the time spent on pushing the large frame-encodings. The optimization problem in the heuristic is also solved using Z3. All previously discussed generalizations (none, linear, polynomial, hybrid) are supported.

Oracle and Refinement. We support the (pre)computation of four different types of oracles for the Inline graphic step in Algorithm 3: (1) A perfect oracle solving exactly the Bellman equations. Such an oracle is unrealistic, but interesting from a conceptual point. (2) Relative frequencies by recording all visited states during simulation. This idea is a naïve simplification of Q-learning. (3) Model checking with decision diagrams (DDs) and few value iterations. Often, a DD representation of a model can be computed fast, and the challenge is in executing sufficient value iterations. We investigate whether doing few value iterations yields a valuable oracle (and covers states close to bad states). (4) Solving a (pessimistic) LP from BFS partial exploration. States that are not expanded are assumed bad. Roughly, this yields oracles covering states close to the initial states.

To implement Inline graphic (cf. Algorithm 3, l. 7), we create an LP for the subMDP induced by the touched states. For states whose successors are not in the touched states, we add a transition to B labeled with the oracle value as probability. The solution of the resulting LP updates the entries corresponding to the touched states.

For Inline graphic (cf. Algorithm 3, l. 6), we take the union of the subsystem and the touched states. If this does not change the set of touched states, we also add its successors.
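The enlargement rule just described can be written down directly. The sketch below assumes an explicit MDP given as a dictionary from states to actions to lists of (successor, probability) pairs; the function name `enlarge` and this representation are illustrative choices, not the prototype's data structures.

```python
def enlarge(touched, subsystem, mdp):
    """Union of touched states and subsystem; if that adds nothing new,
    additionally add all direct successors of the touched states."""
    enlarged = touched | subsystem
    if enlarged == touched:
        enlarged = enlarged | {
            t for s in touched for succ in mdp[s].values() for t, _ in succ
        }
    return enlarged


# Hypothetical chain s0 -> s1 -> s2 with a self-loop on s2.
MDP = {
    "s0": {"a": [("s1", 1.0)]},
    "s1": {"a": [("s2", 1.0)]},
    "s2": {"a": [("s2", 1.0)]},
}
print(enlarge({"s0"}, {"s0", "s1"}, MDP))  # the subsystem adds s1
print(enlarge({"s0"}, {"s0"}, MDP))        # nothing new: add successors of s0
```

Note that the set can stop growing once it is closed under successors, for instance when all reachable states have been touched.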

Setup. We evaluate the run time and memory consumption of our prototype of PrIC3. We choose a combination of models from the literature (BRP  [21], ZeroConf  [18]) and some structurally straightforward variants of grids (chain, double chain; see [11, Appendix A]). Since our prototype lacks the sophisticated preprocessing applied by many state-of-the-art model checkers, it is more sensitive to the precise encoding of a model, e.g., the number of commands. To account for this, we generated new encodings for all models. All experiments were conducted on a single core of an Intel® Xeon® Platinum 8160 processor. We use a 15 min time limit and report TO if it is exceeded. Memory is limited to 8 GB; we report MO if it is exceeded. Apart from the oracle, all parameters of our prototype remain fixed over all experiments. To give an impression of the run times, we compare our prototype with both the explicit (StormInline graphic) and DD-based (StormInline graphic) engine of the model checker Storm 1.4, which compared favourably in QComp  [29].

Results. In Table 1, we present the run times for various invocations of our prototype with Oracle 4 (see footnote 16). In particular, we give the model name, the number of (non-trivial) states in the particular instance, and the (estimated) actual probability to reach B. For each model, we consider multiple thresholds Inline graphic. The next 8 columns report on the four variants of PrIC3 with varying generalization schemes. Besides the run time of each scheme, we report the number of states of the largest (last) subsystem that Inline graphic in Algorithm 3, l. 5 was invoked upon (column |sub|). The last two columns report the run times of Storm, which we provide for comparison. In each row, we mark with Inline graphic MDPs that are unsafe, i.e., PrIC3 refutes these MDPs for the given threshold Inline graphic. We highlight the best configurations of PrIC3.

Discussion. Our experiments give a mixed picture on the performance of our implementation of PrIC3. On the one hand, Storm significantly outperforms PrIC3 on most models. On the other hand, PrIC3 is capable of reasoning about huge, yet simple, models with up to Inline graphic states that Storm is unable to analyze within the time and memory limits. There is more empirical evidence that PrIC3 may complement the state-of-the-art:

First, the size of thresholds matters. Our benchmarks show that—at least without generalization—more “wiggle room” between the precise maximal reachability probability and the threshold generally leads to a better performance. PrIC3 may thus prove bounds for large models where a precise quantitative reachability analysis is out of scope.

Second, PrIC3 enjoys the benefits of bounded model checking. In some cases, e.g., ZeroConf for Inline graphic, PrIC3 refutes very fast as it does not need to build the whole model.

Third, if PrIC3 proves the safety of the system, it does so without relying on checking large subsystems in the Inline graphic step.

Fourth, generalization is crucial. Without generalization, PrIC3 is unable to prove safety for any of the considered models with more than Inline graphic states. With generalization, however, it can prove safety for very large systems and thresholds close to the exact reachability probability. For example, it proved safety of the Chain benchmark with Inline graphic states for a threshold of 0.4 which differs from the exact reachability probability by 0.006.

Fifth, there is no best generalization. There is no clear winner out of the considered generalization approaches. Linear generalization always performs worse than the other ones. In fact, it performs worse than no generalization at all. The hybrid approach, however, occasionally has the edge over the polynomial approach. This indicates that more research is required to find suitable generalizations.

In  [11, Appendix A], we also compare the additional three types of oracles (1–3). We observed that only few oracle refinements are needed to prove safety; for small models at most one refinement was sufficient. However, this does not hold if the given MDP is unsafe. For example, DoubleChain with Inline graphic and Oracle 2 requires 25 refinements.

Conclusion

We have presented PrIC3—the first truly probabilistic, yet conservative, extension of IC3 to quantitative reachability in MDPs. Our theoretical development is accompanied by a prototypical implementation and experiments. We believe there is ample space for improvements including an in-depth investigation of suitable oracles and generalizations.

Footnotes

1

Recently, (standard) IC3 for TSs was utilized in model checking Markov chains  [49] to on-the-fly compute the states that cannot reach B.

2

In  [38], tree-like counterexamples are used for non-linear predicate transformers in IC3.

3

Maximal with respect to all possible resolutions of nondeterminism in the MDP.

4

In IC3, frames are typically characterized by logical formulae. To understand IC3 ’s fundamental principle, however, we prefer to think of frames as functions in Inline graphic partially ordered by Inline graphic.

5

We assume that the set Inline graphic of relevant a-successors of state s is returned in some arbitrary, but fixed order.

6

Inline graphic might be restricted to only contain this path by some simple adaptions.

7

We thus assume that heuristic Inline graphic invokes the oracle whenever it needs to guess some probability.

8

One could of course now also create a heuristic that is trivial for a perfect oracle and invoke Inline graphic with the heuristic for the perfect oracle, but there really is no benefit in doing so.

9

Preprocessing ensures a single thread (module) and no deadlocks.

10

In each operation, we only consider a single frame.

11

If Inline graphic, we assume Inline graphic. If Inline graphic, we omit rescaling to allow Inline graphic.

12

Formally, Inline graphic.

13

Recall that we have a Markov chain consisting of a single module.

14

This implicitly assumes that v is increased. Adaptions are possible.

15

The prototype is available open-source from https://github.com/moves-rwth/PrIC3.

16

We explore Inline graphic states using BFS and Storm.

This work has been supported by the ERC Advanced Grant 787914 (FRAPPANT), NSF grants 1545126 (VeHICaL) and 1646208, the DARPA Assured Autonomy program, Berkeley Deep Drive, and by Toyota under the iCyPhy center.

Contributor Information

Shuvendu K. Lahiri, Email: shuvendu.lahiri@microsoft.com

Chao Wang, Email: wang626@usc.edu.

Kevin Batz, Email: kevin.batz@cs.rwth-aachen.de.

References

  • 1.Ábrahám E, Becker B, Dehnert C, Jansen N, Katoen J-P, Wimmer R. Counterexample generation for discrete-time Markov models: an introductory survey. In: Bernardo M, Damiani F, Hähnle R, Johnsen EB, Schaefer I, editors. Formal Methods for Executable Software Models; Cham: Springer; 2014. pp. 65–121. [Google Scholar]
  • 2.Agha G, Palmskog K. A survey of statistical model checking. ACM Trans. Model. Comput. Simul. 2018;28(1):6:1–6:39. doi: 10.1145/3158668. [DOI] [Google Scholar]
  • 3.Agrawal, S., Chatterjee, K., Novotný, P.: Lexicographic ranking supermartingales: an efficient approach to termination of probabilistic programs. In: PACMPL 2(POPL), pp. 34:1–34:32 (2018)
  • 4.de Alfaro L, Kwiatkowska M, Norman G, Parker D, Segala R. Symbolic model checking of probabilistic processes using MTBDDs and the kronecker representation. In: Graf S, Schwartzbach M, editors. Tools and Algorithms for the Construction and Analysis of Systems; Heidelberg: Springer; 2000. pp. 395–410. [Google Scholar]
  • 5.Baier C, de Alfaro L, Forejt V, Kwiatkowska M. Handbook of Model Checking. Cham: Springer; 2018. Model checking probabilistic systems; pp. 963–999. [Google Scholar]
  • 6.Baier C, Hermanns H, Katoen J-P. The 10,000 facets of MDP model checking. In: Steffen B, Woeginger G, editors. Computing and Software Science; Cham: Springer; 2019. pp. 420–451. [Google Scholar]
  • 7.Baier C, Katoen J-P. Principles of Model Checking. Cambridge: MIT Press; 2008. [Google Scholar]
  • 8.Baier C, Klein J, Leuschner L, Parker D, Wunderlich S. Ensuring the reliability of your model checker: interval iteration for markov decision processes. In: Majumdar R, Kunčak V, editors. Computer Aided Verification; Cham: Springer; 2017. pp. 160–180. [Google Scholar]
  • 9.Barthe G, Espitau T, Ferrer Fioriti LM, Hsu J. Synthesizing probabilistic invariants via Doob’s decomposition. In: Chaudhuri S, Farzan A, editors. Computer Aided Verification; Cham: Springer; 2016. pp. 43–61. [Google Scholar]
  • 10.Bartocci E, Kovács L, Stankovič M. Automatic generation of moment-based invariants for prob-solvable loops. In: Chen Y-F, Cheng C-H, Esparza J, editors. Automated Technology for Verification and Analysis; Cham: Springer; 2019. pp. 255–276. [Google Scholar]
  • 11.Batz, K., Junges, S., Kaminski, B.L., Katoen, J.-P., Matheja, C., Schröer, P.: PrIC3: Property directed reachability for MDPs. ArXiv e-prints (2020). https://arxiv.org/abs/2004.14835
  • 12.Biere, A.: Bounded model checking, Handbook of Satisfiability. Frontiers in Artificial Intelligence and Applications, vol. 185, pp. 457–481. IOS Press (2009)
  • 13.Bradley AR. SAT-based model checking without unrolling. In: Jhala R, Schmidt D, editors. Verification, Model Checking, and Abstract Interpretation; Heidelberg: Springer; 2011. pp. 70–87.
  • 14.Brázdil T, Chatterjee K, Chmelík M, Forejt V, Křetínský J, Kwiatkowska M, Parker D, Ujma M. Verification of Markov decision processes using learning algorithms. In: Cassez F, Raskin J-F, editors. Automated Technology for Verification and Analysis; Cham: Springer; 2014. pp. 98–114.
  • 15.Chadha R, Viswanathan M. A counterexample-guided abstraction-refinement framework for Markov decision processes. ACM Trans. Comput. Log. 2010;12(1):1:1–1:49.
  • 16.Chakarov A, Sankaranarayanan S. Probabilistic program analysis with martingales. In: Sharygina N, Veith H, editors. Computer Aided Verification; Heidelberg: Springer; 2013. pp. 511–526.
  • 17.Chakraborty, S., Fried, D., Meel, K.S., Vardi, M.Y.: From weighted to unweighted model counting. In: IJCAI, pp. 689–695. AAAI Press (2015)
  • 18.Cheshire S, Aboba B, Guttman E. Dynamic configuration of IPv4 link-local addresses. RFC. 2005;3927:1–33.
  • 19.Cimatti A, Griggio A, Mover S, Tonetta S. Infinite-state invariant checking with IC3 and predicate abstraction. FMSD. 2016;49(3):190–218.
  • 20.D’Argenio, P.R., Hartmanns, A., Sedwards, S.: Lightweight statistical model checking in nondeterministic continuous time. In: Margaria, T., Steffen, B. (eds.) ISoLA 2018. LNCS, vol. 11245, pp. 336–353. Springer, Cham (2018). 10.1007/978-3-030-03421-4_22
  • 21.D’Argenio, P.R., Jeannet, B., Jensen, H.E., Larsen, K.G.: Reachability analysis of probabilistic systems by successive refinements. In: de Alfaro, L., Gilmore, S. (eds.) PAPM-PROBMIV 2001. LNCS, vol. 2165, pp. 39–56. Springer, Heidelberg (2001). 10.1007/3-540-44804-7_3
  • 22.Dehnert C, Junges S, Katoen J-P, Volk M. A Storm is coming: a modern probabilistic model checker. In: Majumdar R, Kunčak V, editors. Computer Aided Verification; Cham: Springer; 2017. pp. 592–600.
  • 23.Eén, N., Mishchenko, A., Brayton, R.K.: Efficient implementation of property directed reachability. In: FMCAD, pp. 125–134. FMCAD Inc. (2011)
  • 24.Fränzle M, Hermanns H, Teige T. Stochastic satisfiability modulo theory: a novel technique for the analysis of probabilistic hybrid systems. In: Egerstedt M, Mishra B, editors. Hybrid Systems: Computation and Control; Heidelberg: Springer; 2008. pp. 172–186.
  • 25.Gretz F, Katoen J-P, McIver A. Prinsys—On a Quest for Probabilistic Loop Invariants. In: Joshi K, Siegle M, Stoelinga M, D’Argenio PR, editors. Quantitative Evaluation of Systems; Heidelberg: Springer; 2013. pp. 193–208.
  • 26.Gretz F, Katoen J-P, McIver A. Operational versus weakest pre-expectation semantics for the probabilistic guarded command language. Perform. Eval. 2014;73:110–132. doi: 10.1016/j.peva.2013.11.004.
  • 27.Gurfinkel, A., Ivrii, A.: Pushing to the top. In: FMCAD, pp. 65–72. IEEE (2015)
  • 28.Haddad S, Monmege B. Interval iteration algorithm for MDPs and IMDPs. Theor. Comput. Sci. 2018;735:111–131. doi: 10.1016/j.tcs.2016.12.003.
  • 29.Hahn EM, Hartmanns A, Hensel C, Klauck M, Klein J, Křetínský J, Parker D, Quatmann T, Ruijters E, Steinmetz M. The 2019 comparison of tools for the analysis of quantitative formal models. In: Beyer D, Huisman M, Kordon F, Steffen B, editors. Tools and Algorithms for the Construction and Analysis of Systems; Cham: Springer; 2019. pp. 69–92.
  • 30.Hahn EM, Hermanns H, Wachter B, Zhang L. PASS: abstraction refinement for infinite probabilistic models. In: Esparza J, Majumdar R, editors. Tools and Algorithms for the Construction and Analysis of Systems; Heidelberg: Springer; 2010. pp. 353–357.
  • 31.Hahn EM, Li Y, Schewe S, Turrini A, Zhang L. iscasMC: a web-based probabilistic model checker. In: Jones C, Pihlajasaari P, Sun J, editors. FM 2014: Formal Methods; Cham: Springer; 2014. pp. 312–317.
  • 32.Han T, Katoen J-P, Damman B. Counterexample generation in probabilistic model checking. IEEE Trans. Software Eng. 2009;35(2):241–257. doi: 10.1109/TSE.2009.5.
  • 33.Hark, M., Kaminski, B.L., Giesl, J., Katoen, J.-P.: Aiming low is harder: Induction for lower bounds in probabilistic program verification. In: PACMPL 4(POPL), 37:1–37:28 (2020)
  • 34.Hartmanns A, Hermanns H. The Modest Toolset: an integrated environment for quantitative modelling and verification. In: Ábrahám E, Havelund K, editors. Tools and Algorithms for the Construction and Analysis of Systems; Heidelberg: Springer; 2014. pp. 593–598.
  • 35.Hartmanns, A., Kaminski, B.L.: Optimistic value iteration. CAV. LNCS, Springer (2020). [to appear]
  • 36.Hassan, Z., Bradley, A.R., Somenzi, F.: Better generalization in IC3. In: FMCAD, pp. 157–164. IEEE (2013)
  • 37.Hermanns H, Wachter B, Zhang L. Probabilistic CEGAR. In: Gupta A, Malik S, editors. Computer Aided Verification; Heidelberg: Springer; 2008. pp. 162–175.
  • 38.Hoder K, Bjørner N. Generalized property directed reachability. In: Cimatti A, Sebastiani R, editors. Theory and Applications of Satisfiability Testing – SAT 2012; Heidelberg: Springer; 2012. pp. 157–171.
  • 39.Kaminski, B.L.: Advanced Weakest Precondition Calculi for Probabilistic Programs. Ph.D. thesis, RWTH Aachen University, Germany (2019). http://publications.rwth-aachen.de/record/755408/files/755408.pdf
  • 40.Kaminski BL, Katoen J-P, Matheja C, Olmedo F. Weakest precondition reasoning for expected runtimes of randomized algorithms. J. ACM. 2018;65(5):30:1–30:68. doi: 10.1145/3208102.
  • 41.Kattenbelt M, Kwiatkowska MZ, Norman G, Parker D. A game-based abstraction-refinement framework for Markov decision processes. FMSD. 2010;36(3):246–280.
  • 42.Kozen, D.: A probabilistic PDL. In: STOC, pp. 291–297. ACM (1983)
  • 43.Kwiatkowska M, Norman G, Parker D. PRISM 4.0: verification of probabilistic real-time systems. In: Gopalakrishnan G, Qadeer S, editors. Computer Aided Verification; Heidelberg: Springer; 2011. pp. 585–591.
  • 44.Lange, T., Neuhäußer, M.R., Noll, T., Katoen, J.-P.: IC3 software model checking. In: STTT, vol. 22, pp. 135–161 (2020)
  • 45.Lassez JL, Nguyen VL, Sonenberg L. Fixed point theorems and semantics: a folk tale. Inf. Process. Lett. 1982;14(3):112–116. doi: 10.1016/0020-0190(82)90065-5.
  • 46.McIver A, Morgan C. Abstraction, Refinement and Proof for Probabilistic Systems. New York: Springer; 2005.
  • 47.de Moura L, Bjørner N. Z3: an efficient SMT solver. In: Ramakrishnan CR, Rehof J, editors. Tools and Algorithms for the Construction and Analysis of Systems; Heidelberg: Springer; 2008. pp. 337–340.
  • 48.Park D. Fixpoint induction and proofs of program properties. Machine Intelligence. 1969;5:59–78.
  • 49.Polgreen, E., Brain, M., Fränzle, M., Abate, A.: Verifying reachability properties in Markov chains via incremental induction. CoRR abs/1909.08017 (2019)
  • 50.Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley Series in Probability and Statistics, Wiley, Hoboken (1994)
  • 51.Quatmann T, Katoen J-P. Sound value iteration. In: Chockler H, Weissenbacher G, editors. Computer Aided Verification; Cham: Springer; 2018. pp. 643–661.
  • 52.Rabe MN, Wintersteiger CM, Kugler H, Yordanov B, Hamadi Y. Symbolic approximation of the bounded reachability probability in large Markov chains. In: Norman G, Sanders W, editors. Quantitative Evaluation of Systems; Cham: Springer; 2014. pp. 388–403.
  • 53.Seufert, T., Scholl, C.: Sequential verification using reverse PDR. MBMV. pp. 79–90. Shaker Verlag (2017)
  • 54.Suenaga K, Ishizawa T. Generalized property-directed reachability for hybrid systems. In: Beyer D, Zufferey D, editors. Verification, Model Checking, and Abstract Interpretation; Cham: Springer; 2020. pp. 293–313.
  • 55.Takisaka T, Oyabu Y, Urabe N, Hasuo I. Ranking and repulsing supermartingales for reachability in probabilistic programs. In: Lahiri SK, Wang C, editors. Automated Technology for Verification and Analysis; Cham: Springer; 2018. pp. 476–493.
  • 56.Vazquez-Chanlatte, M., Rabe, M.N., Seshia, S.A.: A model counter’s guide to probabilistic systems. CoRR abs/1903.09354 (2019)
  • 57.Wimmer R, Braitling B, Becker B. Counterexample generation for discrete-time Markov chains using bounded model checking. In: Jones ND, Müller-Olm M, editors. Verification, Model Checking, and Abstract Interpretation; Heidelberg: Springer; 2008. pp. 366–380.

Articles from Computer Aided Verification are provided here courtesy of Nature Publishing Group
