Principles of Symbolic Model Checking with Apalache

In order to take advantage of Apalache's symbolic model checking, there are a few principles one must bear in mind when writing TLA.

Assignments and symbolic transitions

Let us go back to the example test/tla/y2k.tla and run apalache against test/tla/y2k_override.tla:

$ apalache check y2k_override.tla

We can check the detailed output of the TransitionFinderPass in the file x/<timestamp>/out-transition.tla, where <timestamp> looks like 09.03-10.03.2020-508266549191958257:

---------------------------- MODULE y2k_override ----------------------------
(*
 * One way to instantiate constants for apalache is to use the OVERRIDE prefix. 
 *)
 
EXTENDS y2k

OVERRIDE_BIRTH_YEAR == 80
OVERRIDE_LICENSE_AGE == 18

=============================================================================
\* Modification History
\* Last modified Tue Jan 07 11:24:55 CET 2020 by igor
\* Created Tue Jan 07 11:16:18 CET 2020 by igor

As you can see, the model checker did two things:

  1. It has translated several expressions that look like x' = e into x' := e. For instance, you can see year' := 80 and hasLicense' := FALSE in Init$0. We call these expressions assignments.
  2. It has factored the operator Next into two operators Next$0 and Next$1. We call these operators symbolic transitions.

Pure TLA+ does not have the notions of assignments and symbolic transitions. However, TLC sometimes treats expressions x' = e and x' \in S as if they were assigning a value to the variable x'. TLC does so dynamically, during the breadth-first search. Apalache looks statically for assignments among the expressions x' = e and x' \in S.

When factoring out operators into symbolic transitions, Apalache splits the action operators Init and Next into disjunctions (e.g., A_0 \/ ... \/ A_n), represented in the concrete syntax as a sequence of operator definitions of the form

A$0 == ...
...
A$n == ...

The main contract between the assignments and symbolic transitions is as follows:

For every variable x declared with VARIABLE, there is exactly one assignment of the form x' := e in every symbolic transition A_n.

If Apalache cannot find expressions with the above properties, it fails. Consider the example test/tla/Assignments20200309.tla:

----- MODULE Assignments20200309 -----
VARIABLE
    \* @type: Int;
    a

\* this specification fails, as it has no expression
\* that can be treated as an assignment
Init == TRUE
Next == a' = a
Inv == FALSE
===============

Running the checker with

apalache check Assignments20200309.tla

Apalache reports an error as follows:

...
PASS #9: TransitionFinderPass
To understand the error, check the manual:
[https://apalache.informal.systems/docs/apalache/assignments.html]
Assignment error: No assignments found for: a
It took me 0 days  0 hours  0 min  1 sec
Total time: 1.88 sec
EXITCODE: ERROR (99)

Type annotations

Check the Snowcat tutorial and HOWTO on annotations.

Naturals

If you look carefully at the HOWTO on annotations, you will find that there is no designated type for naturals. Indeed, one can just use the type Int, whenever a natural number is required. If we introduced a special type for naturals, that would cause a lot of confusion for the type checker. What would be the type of the literal 42? That depends on, whether you extend Naturals or Integers. And if you extend Naturals and later somebody else extends your module and also Integers, should be the type of 42 be an integer?

Apalache still allows you to extend Naturals. However, it will treat all number-like literals as integers. This is consistent with the view that the naturals are a subset of the integers, and the integers are a subset of the reals. Classically, one would not define subtraction for naturals. However, the module Naturals defines binary minus, which can easily drive a variable outside of Nat. For instance, see the following example:

----------------------------- MODULE NatCounter ------------------------
EXTENDS Naturals

VARIABLE
    \* @type: Int;
    x

Init == x = 3

\* a natural counter can go below zero, and this is expected behavior
Next == x' = x - 1

Inv == x >= 0
========================================================================

Given that you will need the value Int for a type annotation, it probably does not make a lot of sense to extend Naturals in your own specifications, as you will have to extend Integers for the type annotation too. We are currently working on a different kind of type annotations, which would not require Int.

Recursive operators and functions

WARNING: In our experience, it is hard to check recursive operators and functions in Apalache:

  • Recursive operators need additional annotations.
  • Resursive operators and functions usually apply CHOOSE, which is hard to implement faithfully without blow up the SMT encoding.

Hence, we recommend using the operators FoldSet and FoldSeq:

(*****************************************************************************)
(* The folding operator, used to implement computation over a set.           *)
(* Apalache implements a more efficient encoding than the one below.         *)
(* (from the community modules).                                             *)
(*****************************************************************************)
RECURSIVE FoldSet(_,_,_)
FoldSet( Op(_,_), v, S ) == IF S = {}
                            THEN v
                            ELSE LET w == CHOOSE x \in S: TRUE
                                  IN LET T == S \ {w}
                                      IN FoldSet( Op, Op(v,w), T )

(*****************************************************************************)
(* The folding operator, used to implement computation over a sequence.      *)
(* Apalache implements a more efficient encoding than the one below.         *)
(* (from the community modules).                                             *)
(*****************************************************************************)
RECURSIVE FoldSeq(_,_,_)
FoldSeq( Op(_,_), v, seq ) == IF seq = <<>>
                              THEN v
                              ELSE FoldSeq( Op, Op(v,Head(seq)), Tail(seq) )

These operators are treated by Apalache in a more efficient manner than recursive operators. They are always terminating and thus do not require an annotation that specifies an upper bound on the number of operator iterations. We are preparing a tutorial on this topic.

Recursive operators

In the preprocessing phase, Apalache replaces every application of a user operator with its body. We call this process "operator inlining". This cannot be done for recursive operators, for two reasons:

  1. A recursive operator may be non-terminating (although a non-terminating operator is useless in TLA+);

  2. A terminating call to an operator may take an unpredicted number of iterations.

However, in practice, when one fixes specification parameters (that is, CONSTANTS), it is usually easy to find a bound on the number of operator iterations. For instance, consider the following specification:

--------- MODULE Rec6 -----------------
EXTENDS Integers

N == 5

VARIABLES
    \* @type: Set(Int);
    set,
    \* @type: Int;
    count

RECURSIVE Sum(_)

Sum(S) ==
  IF S = {}
  THEN 0
  ELSE LET x == CHOOSE y \in S: TRUE IN
    x + Sum(S \ {x})

UNROLL_DEFAULT_Sum == 0
UNROLL_TIMES_Sum == N

Init ==
  /\ set = {}
  /\ count = 0

Next ==
  \E x \in (1..N) \ set:
    /\ count' = count + x
    /\ set' = set \union {x}

Inv == count = Sum(set)
=======================================    

It is clear that the expression Sum(S) requires the number of iterations that is equal to Cardinality(S) + 1. Moreover, the expression set \subseteq 1..N is an invariant, and thus every call Sum(set) requires up to N+1 iterations.

When we can find an upper bound on the number of iterations, Apalache can unroll the recursive operator up to this bound. To this end, we define two additional operators. For instance:

--------- MODULE MC_Rec6 --------------
VARIABLE count, set

INSTANCE Rec6 WITH N <- 5

UNFOLD_TIMES_Sum == 6
UNFOLD_DEFAULT_Sum == 0

=======================================

In this case, Apalache unrolls every call to Sum exactly UNROLL_TIMES_Sum times, that is, four times. On the default branch, Apalache places UNROLL_DEFAULT_Sum, that is, 0.

All recursively defined operators should follow this convention where, for every such operator Oper, the user defines both UNROLL_TIMES_Oper, which expands to a positive integer value, and UNROLL_DEFAULT_Oper, which expands to some default value Oper(args*) should take, if the computation would require more than UNROLL_TIMES_Oper recursive calls. At present, we only support literals (e.g. 4) or primitive arithmetic expressions (e.g. 2 + 2) in the body of UNROLL_TIMES_Oper.

Recursive functions

Apalache offers limited support for recursive functions. However, read the warning below on why you should not use recursive functions. The restrictions are as follows:

  1. Apalache supports recursive functions that return an integer or a Boolean.

  2. As Apalache's simple type checker is not able to find the type of a recursive function, all uses of a recursive function should come with a type annotation.

  3. As in TLC, the function domain must be a finite set.

The example below shows a recursive function that computes the factorial of n.

------------------------------ MODULE Rec8 ------------------------------------
EXTENDS Integers

VARIABLES
    \* @type: Int;
    n,
    \* @type: Int;
    factSpec,
    \* @type: Int;
    factComp

(*
 Defining a recursive function on a finite domain. Although it is rather
 unnatural to define factorial on a finite set, both Apalache and TLC
 require finite domains. As is usual for function application, the result
 of the application is not defined on the elements outside of the function
 domain.
 *)
Fact[k \in 1..20] ==
    IF k <= 1
    THEN 1
    ELSE k * Fact[k - 1]

Init ==
    /\ n = 1
    /\ factSpec = Fact[n]
    /\ factComp = 1

Next ==     
    /\ n' = n + 1
    /\ factSpec' = Fact[n']
    /\ factComp' = n' * factComp

Inv ==
    factComp = factSpec

===============================================================================

Check other examples in test/tla that start with the prefix Rec.

Why you should avoid recursive functions. Sometimes, recursive functions concisely describe the function that you need. The nice examples are the factorial function (see above) and Fibonacci numbers (see Rec3). However, when you define a recursive function over sets, the complexity gets ugly really fast.

Consider the example Rec9, which computes set cardinality. Here is a fragment of the spec:

 The set cardinality function. It needs an upper bound on the set size.
 Although this function looks nice, be warned that this definition requires us
 to construct the powerset SUBSET NUMS and then write down the constraints
 for the function Card. This encoding is (at least) double-exponential.
 *)
Card[S \in SUBSET NUMS] ==
    IF S = {}
    THEN 0
    ELSE LET i == CHOOSE j \in S: TRUE IN
        1 + Card[S \ {i}]

Init ==
    /\ set = {}
    /\ size = 0

Since we cannot fix the order, in which the set elements are evaluated, we define function Card over SUBSET NUMS, that is, all possible subsets of NUMS. Apalache translates the function in a quantifier-free theory of SMT. Hence, in this case, Apalache expands SUBSET NUMS, so it introduces 2^|NUMS| sets! Further, Apalache writes down the SMT constraints for the domain of Card. As a result, it produces NUMS * 2^|NUMS| constraints. As you can see, recursive functions over sets explode quite fast.

It is usually a good idea to use recursive operators over sets rather than recursive functions. The downside is that you have to provide an upper bound on the number of the operator iterations. The upside is that recursive operators are usually unrolled more efficiently. (If they contain only constant expressions, they are even computed by the translator!) For instance, set cardinality does not require 2^|NUMS| constraints, when using a recursive operator.