How to optimize Concuerror's search?

Table of contents

  1. The bound options
    1. Depth bound
    2. Interleaving bound
    3. Scheduling bound
      1. “Simpler” schedulings
      2. Scheduling bound
      3. Scheduling bound types
  2. Using the bounds and optimizing your tests
    1. See why Concuerror is exploring interleavings
    2. “Shutdown” sequences of OTP behaviours
    3. Start simple
    4. Use a scheduling bound
    5. Read the hints
  3. Final note

This post contains tips for optimizing Concuerror’s search.

The bound options

Concuerror has a number of different bound-related options. These can be used to reduce the search space is different ways.

Depth bound

The depth bound (-d, --depth_bound) is a limit on how many events1 there can be in a single execution of the given test. Concuerror requires test executions to have finite length, and the depth limit is a simple way to guarantee that every execution has a limited number of events. However, if an execution reaches this limit Concuerror cannot prove correctness of the test (and it will report so but you can make it keep exploring executions anyway).

Interleaving bound

The interleaving bound (-i, --interleaving_bound) is a hard limit on how many interleavings of the given test will be explored. Concuerror is currently always deterministic, so, unless the test or other options are modified, the same interleavings will be explored, in the same order, and running a test repeatedly will not return different results.

Scheduling bound

Exploring all the interleavings of a concurrent program (and therefore verifying it), even with Concuerror’s specialized techniques, can lead to exploring a number of interleavings that grows exponentially with the length of the execution. If one is interested in finding bugs, on the other hand, it might be fruitful to explore “simpler” schedulings before trying more elaborate ones.

Concuerror can help achieve that goal by using a scheduling bound. As explained in the Bounding chapter in “Effective Techniques for Stateless Model Checking”:

This [goal] can be achieved using bounding techniques, that impose constraints on how/when processes can be scheduled by a stateless model checking algorithm and force the algorithm to focus the exploration on schedulings that satisfy those constraints. In this way, bugs in ‘simpler’ schedulings can be detected faster than when using exhaustive exploration. Schedulings that violate the constraints can also be explored, but each exploration begins with a budget (also called a bound), which is spent whenever the algorithm schedules processes in a way that violates the constraints. When no budget remains, the SMC algorithm can only explore schedulings that satisfy the constraints.

“Simpler” schedulings

When schedule bounding techniques are used, the “simplest” scheduling is chosen by the combination of scheduling options, currently --scheduling and --strict_scheduling.

Scheduling bound

The -b, --scheduling_bound option defines the value of the scheduling bound.

Scheduling bound types

The possible options for scheduling bounding are

  • none (default), if no scheduling bounding should be used

  • delay (recommended, when using bounding, and selected by default if a value is chosen for --scheduling_bound and no type is specified), which in Concuerror does not correspond to the technique found in other bibliography, but at a variant which we have named “exploration tree bounding” in the same publication2. A benefit of this technique is that it is compatible with any DPOR algorithm and leads to fairly smooth scaling of the explored search space with the increase of the bound.

  • bpor, as described by [16] in the same publication3, which is based on limiting the number of times a process can be preempted. A practical weakness of this technique is that when a process becomes blocked any other process can be scheduled without consuming bounding budget, possibly leading to unpredictable scaling behaviours.

  • ubpor, which is a mostly abandoned experimental variant of bpor and is not recommended.

Using the bounds and optimizing your tests

This section provides some general guidelines on how to use Concuerror effectively.

See why Concuerror is exploring interleavings

Concuerror explores interleavings based on racing operations it detects. Using the log_all option together with show_races (and a small interleaving bound, to just sample the space of the test) can let you see the racing operations in a test, and can be a useful aid in simplifying.

“Shutdown” sequences of OTP behaviours

If you are spawning processes using supervisors or OTP behaviours, it is often the case that when your main test process finishes a cascade of shutdowns will be triggered, which may or may not lead to more races. If you are not interested in the behaviour of your processes shutting down, blocking the main test process with a receive after infinity -> ok end (and ignoring the deadlocks with --ignore-error deadlock) might simplify your test.

Start simple

Two or three processes and relatively simple test scenarios are enough to trigger most race conditions. Even a simple “sequential concatenation” of two complex racing scenarios will lead to a state space that has the size of the product of the two individual scenarios (Concuerror is dynamic and cannot know e.g. that the second scenario always, independently follows the first).

Use a scheduling bound

Using a scheduling bound can quickly and effectively lead to a better search for bugs in your tests. A good way to summarize this is: “if you suspect a bug is there, a small scheduling bound will find it”.

Read the hints

Concuerror emits hints upon detecting some patterns that may lead to unexpected explosion of the search space. Reading and understanding them is recommended.

Final note

Concuerror is a fairly expert tool and mastering its use requires practise. If you have further questions consider communicating over the suggested channels!

Footnotes

  1. As events, Concuerror considers any kind of Erlang built-in operation that can be potentially interfering with operations in other processes, such as the delivery of a message, the expiration of a timeout, or writing and reading values in an ETS table. If you are interested in a snapshot of what kind of operations in Erlang programs can interfere, read the paper “The shared-memory interferences of Erlang/OTP built-ins”

  2. “Effective Techniques for Stateless Model Checking” 

  3. Same as above.