People often think of formal methods and theorem provers as forbidding tools, cool in theory but with a steep learning curve that makes them hard to use in real life. In this post, we’re going to describe a case we ran into recently where we were able to leverage theorem proving technology, Z3 in particular, to validate some real-world engineering we were doing on the OCaml compiler. This post is aimed at readers interested in compilers, but assumes no familiarity with actual compiler development.

We’ll start by discussing the kind of compiler optimizations we’re interested in and why they might be difficult to get right, in particular when applied to floating-point computations. We’ll then show how we used Z3 to review some optimizations we were considering adding to the OCaml compiler, and how doing so uncovered a subtle bug in one of them.

## Constant Folding

Compilers usually perform a lot of optimizations to ensure that the programs written by developers run as fast as possible. Among these optimizations, a common one is constant folding, which allows the compiler to rewrite an expression such as `2 * 3` to `6`, performing the computation at compile time rather than at run time. Of course, developers don’t usually write expressions like `2 * 3`, but such examples show up anyway because other compiler optimizations can create them, e.g. by replacing a symbolic constant with its actual value. Similarly, constant folding can in turn trigger other optimizations, like determining that a code fragment is unreachable, which leads to it being deleted.

A natural question is, of course, when such optimizations can be applied. In the previous example, a multiplication can be computed at compile time not only if we know the values of both operands, but also by applying basic mathematical knowledge such as:

- the presence of a *neutral element*: `1 * expr` can be rewritten to `expr`;
- the presence of an *absorbing element*: `0 * expr` can be rewritten to `0`;
- the equivalence with another operation: `-1 * expr` can be rewritten to `-expr` (the latter being both shorter and faster on most CPUs).
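To make this concrete, here is a minimal sketch in OCaml of how such rewrites might be expressed over a toy expression type. The type and constructor names are made up for illustration; they are not taken from the actual OCaml compiler sources:

```ocaml
(* A toy expression language with integer constants. *)
type expr =
  | Const of int
  | Var of string
  | Mul of expr * expr
  | Neg of expr

(* One simplification pass over multiplication, applying the
   rewrites discussed above. Note that the absorbing-element rule
   discards the other operand, which is only safe because
   expressions in this toy language have no side effects. *)
let rec simplify = function
  | Mul (Const a, Const b) -> Const (a * b)             (* constant folding  *)
  | Mul (Const 1, e) | Mul (e, Const 1) -> simplify e   (* neutral element   *)
  | Mul (Const 0, _) | Mul (_, Const 0) -> Const 0      (* absorbing element *)
  | Mul (Const (-1), e) | Mul (e, Const (-1)) ->
    Neg (simplify e)                                    (* negation          *)
  | Mul (a, b) -> Mul (simplify a, simplify b)
  | Neg e -> Neg (simplify e)
  | (Const _ | Var _) as e -> e
```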

A common pitfall for compiler developers is to apply mathematical knowledge to objects having slightly different properties. Indeed, while the rewrites above are trivially correct for mathematical integers, one has to keep in mind that what we call integers on computers is typically quite different from what we call integers in mathematics. For instance, in most programming languages, integer values are represented using a fixed number of bits, which leads to a bounded range of possible integers.

As an example, if a signed integer is represented using 8 bits, then the range of possible values is usually `-128`..`127`. In this setting, is it correct to rewrite `-1 * expr` to `-expr`? It turns out that it is, but only because the usual way overflow is handled (at least when integer values are encoded using the two’s complement representation) is such that `-(-128) = -128`. The mathematical intuition was correct, but for not-so-obvious reasons. Floating-point numbers are unfortunately even trickier to manipulate.
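OCaml has no 8-bit integer type, but the same two’s-complement wraparound can be observed at 32 bits in an OCaml toplevel, where negating the smallest representable value yields that value back:

```ocaml
(* Negating the smallest representable 32-bit integer overflows
   and wraps around to itself under two's complement. *)
Int32.neg Int32.min_int = Int32.min_int;;  (* - : bool = true *)
```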

## Floating-point Computations

Floating-point numbers are an approximate representation of real numbers. Floating-point computations are notoriously error-prone, mostly because some basic mathematical properties over real numbers do not hold for floating-point values. For instance, floating-point addition is not associative, *i.e.* given three floating-point values `x`, `y`, and `z`, the following computations do not always yield the same result:

- `x + (y + z)`;
- `(x + y) + z`.
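This is easy to witness in an OCaml toplevel with the classic `0.1`, `0.2`, `0.3` example:

```ocaml
(* The two groupings round differently, so the results differ. *)
(0.1 +. 0.2) +. 0.3 = 0.1 +. (0.2 +. 0.3);;  (* - : bool = false *)
```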

To complicate things even further, some floating-point subtleties have no direct equivalent in mathematics, such as the fact that there are two different values for zero: a positive one and a negative one. The two values are considered equal when compared but may yield different results when used in computations. For instance, `1.0 / +0.0` is equal to positive infinity while `1.0 / -0.0` is equal to negative infinity. (Which, incidentally, means it is not safe to replace a floating-point value with another one that is merely equal according to floating-point comparison.)
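An OCaml toplevel makes the asymmetry concrete:

```ocaml
1.0 /. 0.0;;      (* - : float = infinity *)
1.0 /. (-0.0);;   (* - : float = neg_infinity *)
0.0 = -0.0;;      (* - : bool = true, yet the two behave differently *)
```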

Compiler optimizations are expected to transform programs while retaining the same behavior. As a consequence, a number of compilers either do not perform any optimization on floating-point computations, or provide means to specifically disable such optimizations. The solution we discuss in this post is to restrict ourselves to optimizations that always rewrite to computations producing the same result.

While the previous sentence seems to be perfectly reasonable, it is
missing practical details: what do we mean by “always” and “same”? The
latter is easy to *define*: we regard two floating-point values as
identical if and only if they have the very same bit pattern according
to the
IEEE 754
specification (this ensures that they could not be discriminated by
any operation). The former is considerably trickier to *enforce*
because all possible combinations of values have to be considered, and
ideally all possible rounding modes. A rounding mode determines which
value should be chosen when the exact result of an operation cannot be
represented (because floating-point values are only approximations).

## Automated Theorem Proving to the Rescue

The traditional way to demonstrate that a property always holds is to write a mathematical proof. However, writing a formal proof is hard work, so we are going to take an alternative approach: using a tool to automate the proving process.

A SAT solver is a tool able to solve a boolean *SAT*isfiability problem, *i.e.* to determine whether, given a logical formula such as `a` ∧ (`b` ∨ `c`) ∧ ¬`c` ∧ ¬(`b` ∧ `d`), there exists a mapping from the formula variables to truth values that evaluates the formula to *true*.
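For instance, the formula above can be checked with Z3 itself by encoding the propositional variables in SMT-LIB syntax (a sketch; any SAT or SMT solver would do):

```
(declare-const a Bool)
(declare-const b Bool)
(declare-const c Bool)
(declare-const d Bool)
(assert (and a (or b c) (not c) (not (and b d))))
(check-sat)
(get-model)
```

Z3 answers `sat` and produces the assignment `a` = true, `b` = true, `c` = false, `d` = false.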

SMT stands for *Satisfiability Modulo Theories*, where the *theories* act as built-in knowledge about various kinds of structures such as integers or arrays, thus allowing formulas such as (`a` > 10) ∧ (`b` ∨ (`c` = 2)) ∧ ¬(`b` ∧ (`d[c]` < 0)). The built-in knowledge of SMT solvers makes them much more efficient than SAT solvers because in a number of situations they can leverage properties from a given *theory* in order to reduce the search space.
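The second formula can likewise be written in SMT-LIB, using the built-in integer and array theories (again, a sketch):

```
(declare-const a Int)
(declare-const b Bool)
(declare-const c Int)
(declare-const d (Array Int Int))
(assert (and (> a 10)
             (or b (= c 2))
             (not (and b (< (select d c) 0)))))
(check-sat)
(get-model)
```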

The basic use of an SMT solver consists of declaring various entities and expressing constraints over those entities, before asking the solver whether the constraints can be *satisfied* (*i.e.* can hold simultaneously). If they can, the solver can usually produce a model, that is, a mapping from entities to values that satisfies the constraints.

Z3 is an SMT solver that comes with support for both integer values (through bit vectors of custom sizes) and floating-point values (using the IEEE 754 representation). Assuming we want to check whether the positive zero is a right neutral element for addition, we would submit the following query to Z3:

```
(set-logic QF_FP)
(declare-const x (_ FloatingPoint 11 53))
(define-fun z () (_ FloatingPoint 11 53) (_ +zero 11 53))
(declare-const r RoundingMode)
(assert (not (= (fp.add r x z) x)))
(check-sat)
(get-model)
```

Besides the first line that sets the theory to be used, the file is relatively straightforward:

- the `declare-const`/`define-fun` lines declare the various entities to be used later on: an unknown floating-point value, an alias for the positive zero, and an unknown rounding mode (the values `11` and `53` are simply the numbers of bits used for respectively the exponent and the mantissa of a floating-point value);
- the `assert` line encodes the formula we want to check, *i.e.* `x + +0 = x`;
- the last two lines ask Z3 to check the satisfiability of the formula and to output the model if any.

It is noteworthy that we actually encode the negation of the property we want to check. As a consequence, satisfiability means that the property does not hold, and the model is actually a counterexample. If we get a counterexample, then we can double-check that Z3 is right. Otherwise, we basically have to trust both Z3 itself and our use of the tool.

Z3 answers the previous query with the following:

```
sat
(model
(define-fun x () (_ FloatingPoint 11 53)
(_ -zero 11 53))
(define-fun r () RoundingMode
roundNearestTiesToEven)
)
```

On the first line, `sat` means that the assertion is satisfiable. The other lines simply present a model (*i.e.* a counterexample for us) where `x` is bound to negative zero. The suspicious Z3 user can then feed the values to an OCaml toplevel:

```
let x = -0.
and z = +0.
and bits = Int64.bits_of_float in
bits (x +. z), bits x;;
```

whose output indeed shows two different values (`0` and `-9223372036854775808`). We should hence not rewrite `x + +0` to `x`.

Should Z3 have answered that there is no model for a given request, we would have been able to ask it to produce a proof. However, the proof establishing that there is no model for `(assert (not (= (fp.mul r x one) x)))` (*i.e.* that the `x * 1` ⇝ `x` rewrite is correct) has more than 100,000 lines. This means that, in practice, such a proof could only be passed to another tool, which would merely move the trust issue to another piece of code…
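For completeness, the query behind that `x * 1` check mirrors the addition query from earlier, with `fp.add` replaced by `fp.mul` and the constant defined as `1.0`:

```
(set-logic QF_FP)
(declare-const x (_ FloatingPoint 11 53))
(define-fun one () (_ FloatingPoint 11 53) ((_ to_fp 11 53) RNE 1.0))
(declare-const r RoundingMode)
(assert (not (= (fp.mul r x one) x)))
(check-sat)
```

This time Z3 answers `unsat`: no counterexample exists, so the rewrite is safe.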

## Upcoming flambda 2.0

The previous example was not chosen at random. The `x + +0` ⇝ `x` rewrite was for a (very) short time part of the development version of the compiler, and was rapidly removed once Z3 proved its invalidity, possibly saving us a painful debugging session later on.

Using Z3, we have checked that all the simplifications over integer and floating-point values (involving arithmetic, bitwise, and shift operators) in the upcoming flambda 2.0 variant of the OCaml compiler are correct. Moreover, these simplifications are correct for all integer sizes (OCaml defines 4 kinds of integers) and all rounding modes (OCaml offers no means of changing the rounding mode, but it could be modified through an external function call).

Eventually, we plan to use a ppx extension combined with a simple DSL to express the properties underlying the various rewrites as mere attributes of the OCaml code actually performing the rewrites (thus ensuring that the checks and the code are kept synchronized). The simplifications for floating-point multiplication when only one operand is known would look like:

```
...
| Mul ->
  if F.equal this_side F.one then
    The_other_side
      [@z3 check_float_binary_neutral `Mul (+1.0) `Right]
      [@z3 check_float_binary_neutral `Mul (+1.0) `Left]
  else if F.equal this_side F.minus_one then
    Float_negation_of_the_other_side
      [@z3 check_float_binary_opposite `Mul (-1.0) `Left]
      [@z3 check_float_binary_opposite `Mul (-1.0) `Right]
  else
    Cannot_simplify
...
```

Relying on a tool such as Z3 to check whether candidate rewrites are correct is invaluable. Not only does it ensure that we do not introduce (possibly subtle and hard-to-reproduce) bugs, it also paves the way to more complex rewrites. Indeed, the safety net acts as an encouragement to experiment with ideas that could otherwise be deemed too risky.

##### Further reading

*Verifying Optimizations using SMT Solvers*
(by Nuno Lopes) shows how Z3 can be used to ensure that transforms from LLVM IR
to assembly code are correct.

*What Every Computer Scientist Should Know About Floating-Point Arithmetic*
(by David Goldberg) is an in-depth lecture about floating-point representation
and operations, addressing a number of common misconceptions.