Where the Coq lives


Writing reflective tactics

One important aspect of Coq's logic is the special status given to computation: while some systems require one to apply explicit deductive steps to show that two given terms are equal, Coq's logic considers any two terms that evaluate to the same result to be equal automatically, without the need for additional reasoning.
Without getting into too much detail, we can illustrate this idea with some simple examples. Russell and Whitehead's seminal Principia Mathematica had to develop hundreds of pages of foundational mathematics before being able to prove that 1 + 1 = 2. In contrast, here's what this proof looks like in Coq:

Definition one_plus_one : 1 + 1 = 2 := erefl.

erefl is the only constructor of the eq type; its type, forall A (a : A), a = a, tells us that we can use it to prove that a given term is equal to itself. Coq accepts one_plus_one as a valid proof because, even though the two sides of the equation are not syntactically the same, it is able to use the definition of + to compute the left-hand side and check that the result is the same as the right-hand side. This also works for some statements with variables in them, for instance

Definition zero_plus_n n : 0 + n = n := erefl.

The same principle applies here: + is defined by case analysis on its first argument, and doesn't even need to inspect the second one. Since the first argument on the left-hand side is a constructor (0), Coq can reduce the expression and conclude that both sides are equal.
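The successor case reduces by computation as well, for the same reason: + only inspects its first argument. The following small check (assuming the same MathComp setting as the rest of this post) is likewise accepted:

Definition succ_plus n m : n.+1 + m = (n + m).+1 := erefl.
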
Unfortunately, not every equality is a direct consequence of computation. For example, this proof attempt is rejected:

Fail Definition n_plus_zero n : n + 0 = n := erefl.

What happened here? As mentioned before, + is defined by case analysis on the first argument; since the first argument of the left-hand side doesn't start with a constructor, Coq doesn't know how to compute there. As it turns out, one actually needs an inductive argument to prove this result, which might end up looking like this, if we were to check the proof term that Coq produces:

Fixpoint n_plus_zero n : n + 0 = n :=
  match n with
  | 0 => erefl
  | n.+1 => let: erefl := n_plus_zero n in erefl
  end.

It seems that, although interesting, computation inside Coq isn't of much use when proving something. Or is it?
In this post, I will show how computation in Coq can be used to write certified automation tactics with a technique known as proof by reflection. Reflection is extensively used in Coq and in other proof assistants as well; it is at the core of powerful automation tactics such as ring, and played an important role in the formalization of the Four-color theorem. As a matter of fact, the name Ssreflect stands for small-scale reflection, due to the library's pervasive use of reflection and computation.
Let's see how reflection works by means of a basic (but interesting) example: a tactic for checking equalities between simple expressions involving natural numbers.

Arithmetic with reflection

Imagine that we were in the middle of a proof and needed to show that two natural numbers are equal:

Lemma lem n m p : (n + m) * p = p * m + p * n.

ring is powerful enough to solve this goal by itself, but just for the sake of the example, suppose that we had to prove it by hand. We could write something like

Proof. by rewrite mulnDl (mulnC n) (mulnC m) addnC. Qed.

This was not terribly complicated, but there's certainly room for improvement. In a paper proof, a mathematician would probably assume that the reader is capable of verifying this result on their own, without any additional detail. But how exactly would the reader proceed?
In the case of the simple arithmetic expression above, it suffices to apply the distributivity law as long as possible, until both expressions become a sum of monomials. Then, thanks to associativity and commutativity, we just have to reorder the factors and terms and check that both sides of the equation match.
The idea of proof by reflection is to reduce the validity of a logical statement to a symbolic computation, usually by proving a theorem of the form thm : b = true -> P with b : bool. If b can be computed explicitly and reduces to true, then Coq recognizes erefl as a proof of b = true, which means that thm erefl becomes a proof of P.
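
As a warm-up instance of this pattern, recall that MathComp's <= on nat is a boolean function, so an inequality between concrete numbers is settled by computation alone, with erefl playing the role of thm's premise (a tiny illustrative lemma, not part of the development below):

Lemma five_le_ten : 5 <= 10.
Proof. exact: erefl. Qed.
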
To make things concrete, let's go back to our example. The idea that we described above for checking whether two numbers are equal can be used whenever we have expressions involving addition, multiplication, and variables. We will define a Coq data type for representing such expressions, as we will need to compute with them:

Inductive expr :=
| Var of nat
| Add of expr & expr
| Mul of expr & expr.

Variables are represented by natural numbers using the Var constructor, and Add and Mul can be used to combine expressions. The following term, for instance, represents the expression n * (m + n):

Example expr_ex :=
  Mul (Var 0) (Add (Var 1) (Var 0)).

where Var 0 and Var 1 denote n and m, respectively.
If we are given a function vals assigning variables to numbers, we can compute the value of an expression with a simple recursive function:

Fixpoint nat_of_expr vals e :=
  match e with
  | Var v => vals v
  | Add e1 e2 => nat_of_expr vals e1 + nat_of_expr vals e2
  | Mul e1 e2 => nat_of_expr vals e1 * nat_of_expr vals e2
  end.
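
We can check nat_of_expr against the earlier example: with the valuation below, Var 0 and Var 1 stand for n and m, and both sides agree by computation (a quick sanity check, not part of the original development):

Example nat_of_expr_ex n m :
  nat_of_expr (nth 0 [:: n; m]) expr_ex = n * (m + n).
Proof. by []. Qed.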

Now, since every expression of that form can be written as a sum of monomials, we can define a function for converting an expr to that form:

Fixpoint monoms e :=
  match e with
  | Var v => [:: [:: v] ]
  | Add e1 e2 => monoms e1 ++ monoms e2
  | Mul e1 e2 => [seq m1 ++ m2 | m1 <- monoms e1, m2 <- monoms e2]
  end.

Here, each monomial is represented by a list enumerating all variables that occur in it, counting their multiplicities. Hence, a sum of monomials is represented as a list of lists. For example, here's the result of normalizing expr_ex:

Example monoms_expr_ex :
  monoms expr_ex = [:: [:: 0; 1]; [:: 0; 0]].
Proof. by []. Qed.

To prove that monoms has the intended behavior, we show that the value of an expression is preserved by it. By using the big operations \sum and \prod from the MathComp library, we can compute the value of a sum of monomials very easily:

Lemma monomsE vals e :
  nat_of_expr vals e = \sum_(m <- monoms e) \prod_(v <- m) vals v.
Proof.
elim: e=> [v|e1 IH1 e2 IH2|e1 IH1 e2 IH2] /=.
- by rewrite 2!big_seq1.
- by rewrite big_cat IH1 IH2.
rewrite {}IH1 {}IH2 big_distrlr /=.
elim: (monoms e1) (monoms e2)=> [|v m1 IH] m2 /=; first by rewrite 2!big_nil.
rewrite big_cons big_cat /= IH; congr addn.
by rewrite big_map; apply/eq_big=> //= m3 _; rewrite big_cat.
Qed.

Hence, to check that two expressions are equivalent, it suffices to compare the results of monoms, modulo the ordering. We can do this by sorting the variable names on each monomial and then testing whether one list of monomials is a permutation of the other:

Definition normalize := map (sort leq) \o monoms.

Lemma normalizeE vals e :
  nat_of_expr vals e = \sum_(m <- normalize e) \prod_(v <- m) vals v.
Proof.
rewrite monomsE /normalize /=; elim: (monoms e)=> [|m ms IH] /=.
  by rewrite big_nil.
rewrite 2!big_cons IH; congr addn.
by apply/eq_big_perm; rewrite perm_eq_sym perm_sort.
Qed.

Definition expr_eq e1 e2 := perm_eq (normalize e1) (normalize e2).
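
For instance, expr_eq validates a commutativity instance directly; since the test is a boolean computation, the proof is immediate (a small sanity check, assuming the definitions above):

Example expr_eq_comm :
  expr_eq (Mul (Var 0) (Var 1)) (Mul (Var 1) (Var 0)).
Proof. by []. Qed.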

Lemma expr_eqP vals e1 e2 :
  expr_eq e1 e2 ->
  nat_of_expr vals e1 = nat_of_expr vals e2.
Proof. rewrite 2!normalizeE; exact/eq_big_perm. Qed.

To see how this lemma works, let's revisit our original example. Here's a new proof that uses expr_eqP:

Lemma lem' n m p : (n + m) * p = p * m + p * n.
Proof.
exact: (@expr_eqP (nth 0 [:: n; m; p])
                  (Mul (Add (Var 0) (Var 1)) (Var 2))
                  (Add (Mul (Var 2) (Var 1)) (Mul (Var 2) (Var 0)))
                  erefl).
Qed.

The first argument to our lemma assigns actual variables to variable numbers: 0 corresponds to n (the first element of the list), 1 to m, and 2 to p. The second and third arguments are symbolic representations of the left- and right-hand sides of our equation. The fourth argument is the most interesting one: expr_eq was defined as a boolean function that returns true when its two arguments are equivalent expressions. As we've seen above, this means that whenever expr_eq e1 e2 computes to true, erefl is a valid proof of it. Finally, when Coq checks whether the conclusion of expr_eqP applies to our goal, it computes nat_of_expr on both sides and sees that the conclusion and the goal are exactly the same. For instance:

Lemma expr_eval n m p :
  nat_of_expr (nth 0 [:: n; m; p]) (Mul (Add (Var 0) (Var 1)) (Var 2))
  = (n + m) * p.
Proof. reflexivity. Qed.

Of course, expr_eqP doesn't force its first argument to always return actual Coq variables, so it can be applied even in some cases where the expressions contain other operators besides + and *:

Lemma lem'' n m : 2 ^ n * m = m * 2 ^ n.
Proof.
exact: (@expr_eqP (nth 0 [:: 2 ^ n; m])
                  (Mul (Var 0) (Var 1)) (Mul (Var 1) (Var 0))
                  erefl).
Qed.

At this point, it may seem that we haven't gained much from using expr_eqP, since the second proof of our example was much bigger than the first one. This is just an illusion, however, as the proof term produced on the first case is actually quite big:

lem =
fun n m p : nat =>
(fun _evar_0_ : n * p + m * p = p * m + p * n =>
 eq_ind_r (eq^~ (p * m + p * n)) _evar_0_ (mulnDl n m p))
  ((fun _evar_0_ : p * n + m * p = p * m + p * n =>
      (fun _pattern_value_ : nat => _pattern_value_ + m * p = p * m + p * n)
      _evar_0_ (mulnC n p))
     ((fun _evar_0_ : p * n + p * m = p * m + p * n =>
         (fun _pattern_value_ : nat =>
          p * n + _pattern_value_ = p * m + p * n) _evar_0_
         (mulnC m p))
        ((fun _evar_0_ : p * m + p * n = p * m + p * n =>
          eq_ind_r (eq^~ (p * m + p * n)) _evar_0_ (addnC (p * n) (p * m)))
           (erefl (p * m + p * n)))))
     : forall n m p : nat, (n + m) * p = p * m + p * n
By using reflection, we were able to transform the explicit reasoning steps of the first proof into implicit computation carried out by the proof assistant. And since proof terms have to be stored in memory or included in the compiled .vo file, it is good to make them smaller when we can.
Nevertheless, even if the resulting term is smaller, having to type it in manually is not very convenient. The problem is that Coq's unification engine is not smart enough to infer the symbolic form of an expression, forcing us to provide it ourselves. Fortunately, we can use some code to fill in the missing bits.

Reification

To reify something means to produce a representation of that object that can be directly manipulated in computation. In our case, that object is a Gallina expression of type nat, and the representation we are producing is a term of type expr.
Reification is ubiquitous in proofs by reflection. The Coq standard library comes with a plugin for reifying formulas, but it is not general enough to accommodate our use case. Therefore, we will program our own reification tactic in ltac.
We will begin by writing a function that looks for a variable on a list and returns its position. If the variable is not present, we add it to the end of the list and return the updated list as well:

Ltac intern vars e :=
  let rec loop n vars' :=
    match vars' with
    | [::] =>
      let vars'' := eval simpl in (rcons vars e) in
      constr:(n, vars'')
    | e :: _ => constr:(n, vars)
    | _ :: ?vars'' => loop (S n) vars''
    end in
  loop 0 vars.

Notice the call to eval simpl on the first branch of loop. Remember that in ltac everything is matched almost purely syntactically, so we have to explicitly evaluate a term when we are just interested in its value, and not in how it is written.
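
We can watch intern at work inside a dummy goal; idtac prints the pair that the tactic builds (an illustrative snippet, not needed for the development):

Goal True.
(* Looking up 5 in an empty list appends it and returns position 0,
   so idtac should display the pair (0, [:: 5]). *)
let r := intern (Nil nat) 5 in idtac r.
Abort.
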
We can now write a tactic for reifying an expression. reify_expr takes two arguments: a list vars to be used with intern for reifying variables, plus the expression e to be reified. It returns a pair (e', vars') containing the reified expression e' and an updated variable list vars'.

Ltac reify_expr vars e :=
  match e with
  | ?e1 + ?e2 =>
    let r1 := reify_expr vars e1 in
    match r1 with
    | (?qe1, ?vars') =>
      let r2 := reify_expr vars' e2 in
      match r2 with
      | (?qe2, ?vars'') => constr:(Add qe1 qe2, vars'')
      end
    end
  | ?e1 * ?e2 =>
    let r1 := reify_expr vars e1 in
    match r1 with
    | (?qe1, ?vars') =>
      let r2 := reify_expr vars' e2 in
      match r2 with
      | (?qe2, ?vars'') => constr:(Mul qe1 qe2, vars'')
      end
    end
  | _ =>
    let r := intern vars e in
    match r with
    | (?n, ?vars') => constr:(Var n, vars')
    end
  end.

Again, because this is an ltac function, we can traverse our Gallina expression syntactically, as if it were a data structure. Notice how we thread through the updated variable lists after each call; this is done to ensure that variables are named consistently.
Finally, using reify_expr, we can write solve_nat_eq, which reifies both sides of the equation on the goal and applies expr_eqP with the appropriate arguments.

Ltac solve_nat_eq :=
  match goal with
  | |- ?e1 = ?e2 =>
    let r1 := reify_expr (Nil nat) e1 in
    match r1 with
    | (?qe1, ?vm') =>
      let r2 := reify_expr vm' e2 in
      match r2 with
      | (?qe2, ?vm'') => exact: (@expr_eqP (nth 0 vm'') qe1 qe2 erefl)
      end
    end
  end.

We can check that our tactic works on our original example:

Lemma lem''' n m p : (n + m) * p = p * m + p * n.
Proof. solve_nat_eq. Qed.

With solve_nat_eq, every equation of that form becomes very easy to solve, including cases where a human prover might have trouble at first sight!

Lemma complicated n m p r t :
  (n + 2 ^ r * m) * (p + t) * (n + p)
  = n * n * p + m * 2 ^ r * (p * n + p * t + t * n + p * p)
  + n * (p * p + t * p + t * n).
Proof. solve_nat_eq. Qed.


We have seen how we can use internal computation in Coq to write powerful tactics. Besides generating small proof terms, tactics that use reflection have another important benefit: they are mostly written in Gallina, a typed language, and come with correctness proofs. This contrasts with most custom tactics written in ltac, which tend to break quite often due to the lack of static guarantees (and to how unstructured the tactic language is). For solve_nat_eq, we only had to write the reification engine in ltac, which results in a more manageable code base.
If you want to learn more about reflection, Adam Chlipala's CPDT book has an entire chapter devoted to the subject, which I highly recommend.


Two nice Coq-related blogs

Here are two blogs with nice Coq-related content:

  • Guillaume Claret, a Ph.D. student in Paris, also has a blog devoted to Coq. You can find some good articles there, such as this one, explaining how to use OPAM as a package manager for Coq.

  • Ömer’s blog has a lot of general programming-languages and functional-programming material. This post, for instance, presents a simple exercise in dependently typed programming.


Analyzing sorting algorithms

In this post, we will formalize one of the most well-known results of algorithm analysis: no comparison sort can run in asymptotically less than n * log n steps, where n is the size of its input.
Before starting, I should point out that this is the first post in this blog to use the Ssreflect and the Mathematical Components (MathComp) libraries. Ssreflect is an amazing Coq extension that brings several improvements, including a nicer set of base tactics. Both libraries cover a wide range of theories, including fairly sophisticated Mathematics - as a matter of fact, they are featured in the Coq formalization of the Feit-Thompson theorem, known for its extremely complex and detailed proof.
As we will see, having good library support can help a lot when doing mechanized proofs, even for results as simple as this one. Two things that come in handy here, in particular, are the theories of permutations and sets over finite types available in MathComp. Indeed, the MathComp definitions enable many useful, higher-level reasoning principles that don't come for free in Coq, such as extensional and decidable equality. Furthermore, many lemmas in the library would require a fair amount of machinery to develop on one's own - for example, showing that there are exactly n! permutations over a set of n elements. Previous versions of this post (which you can still find in the repository) tried to avoid external libraries, but were much longer and more complicated, prompting me to bite the bullet and port everything to Ssreflect/MathComp.


The informal proof of this result is fairly simple:
  • If a comparison sort is correct, then it must be capable of shuffling an input vector of size n according to any of the n! permutations of its elements.
  • On the other hand, any such algorithm can recognize at most 2 ^ k distinct permutations, where k is the maximum number of comparisons performed. Hence, n! <= 2 ^ k or, equivalently, log2 n! <= k.
  • To conclude, Stirling's approximation tells us that n * log2 n = O(log2 n!), which yields our result.
We'll begin our formalization by defining a convenient datatype for representing the execution of a comparison sort. For our purposes, a comparison sort can be seen as a binary tree: internal nodes indicate when a comparison between two elements occurs, with the right and left branches telling how to proceed depending on its result. The leaves of the tree mark when the algorithm ends and yields back a result. Thus, we obtain the following type:

Inductive sort_alg (n : nat) : Type :=
| Compare (i j : 'I_n) (l r : sort_alg n)
| Done of 'S_n.

Let's analyze this declaration in detail. The n parameter will be used later to track the length of the array that we are sorting. 'I_n is the type of natural numbers bounded by n, whereas 'S_n represents permutations of elements of that type. The idea is that, when running our algorithm, the i and j arguments of Compare tell which elements of our array to compare. The permutation in Done specifies how to rearrange the elements of the input array to produce an answer. We can formalize the previous description in the following function:

Fixpoint execute T n c (a : sort_alg n) (xs : n.-tuple T) :=
  match a with
  | Compare i j l r =>
    let a' := if c (tnth xs i) (tnth xs j) then l else r in
    execute c a' xs
  | Done p => [tuple tnth xs (p i) | i < n]
  end.

execute is a polymorphic function that works for any type T with a comparison operator c. Given an algorithm a and an input array xs of length n, the function compares the elements of xs, following the appropriate branches along the way, until finding out how to rearrange xs's elements.
With execute, we can define what it means for a sorting algorithm to be correct, by relating its results to the MathComp sort function:

Definition sort_alg_ok n (a : sort_alg n) :=
  forall (T : eqType) (le : rel T),
  forall xs, execute le a xs = sort le xs :> seq T.

Finally, to translate the above informal argument, we will need some more definitions. Let's first write a function for computing how many comparisons an algorithm performs in the worst case:

Fixpoint comparisons n (a : sort_alg n) : nat :=
  match a with
  | Compare _ _ l r => (maxn (comparisons l) (comparisons r)).+1
  | Done _ => 0
  end.

And here's a function for computing the set of permutations that an algorithm can perform (notice the use of the set library of MathComp; here, :|: denotes set union):

Fixpoint perms n (a : sort_alg n) : {set 'S_n} :=
  match a with
  | Compare _ _ l r => perms l :|: perms r
  | Done p => [set p]
  end.

(Strictly speaking, both comparisons and perms give upper bounds on the values they should compute, but this does not affect us in any crucial way.)
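
As a quick sanity check on a leaf: a Done node performs no comparisons and contributes a single permutation, which already matches the counting bound, since #|[set p]| = 1 <= 2 ^ 0 (a small example lemma, assuming the definitions above):

Example card_perms_done n (p : 'S_n) :
  #|perms (Done p)| <= 2 ^ comparisons (Done p).
Proof. by rewrite /= cards1. Qed.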

Show me the proofs

To show that a correct algorithm must be capable of performing arbitrary permutations, notice that, if xs is a sorted array with distinct elements, then permuting its elements is an injective operation. That is, different permutations produce different arrays.

Lemma permsT n (a : sort_alg n) : sort_alg_ok a -> perms a = [set: 'S_n].
Proof.
move=> a_ok.
apply/eqP; rewrite -subTset; apply/subsetP=> /= p _.
move: {a_ok} (a_ok _ leq [tuple val (p^-1 i) | i < n]).
rewrite (_ : sort _ _ = [tuple val i | i < n]); last first.
  apply: (eq_sorted leq_trans anti_leq (sort_sorted leq_total _)).
    by rewrite /= val_enum_ord iota_sorted.
  rewrite (perm_eq_trans (introT perm_eqlP (perm_sort _ _))) //.
  apply/tuple_perm_eqP; exists p^-1; congr val; apply/eq_from_tnth=> i.
  by rewrite 3!tnth_map 2!tnth_ord_tuple.
elim: a=> [/= i j l IHl r IHr|p'].
  by case: ifP=> [_ /IHl|_ /IHr]; rewrite in_setU => -> //; rewrite orbT.
move/val_inj=> /=.
rewrite in_set1=> e; apply/eqP/permP=> i; apply/esym/(canRL (permKV p)).
rewrite (_ : val i = tnth [tuple val i | i < n] i); last first.
  by rewrite tnth_map tnth_ord_tuple.
by rewrite -{}e 2!tnth_map !tnth_ord_tuple.
Qed.

Bounding the number of permutations performed by an algorithm is simple, and amounts to invoking basic lemmas about arithmetic and sets.

Lemma card_perms n (a : sort_alg n) : #|perms a| <= 2 ^ comparisons a.
Proof.
elim: a=> [i j l IHl r IHr|p] /=; last by rewrite cards1.
rewrite (leq_trans (leq_of_leqif (leq_card_setU (perms l) (perms r)))) //.
by rewrite expnS mul2n -addnn leq_add // ?(leq_trans IHl, leq_trans IHr) //
   leq_exp2l // ?(leq_maxl, leq_maxr).
Qed.

Doing the last step is a bit trickier, as we don't have a proof of Stirling's approximation we can use. Instead, we take a more direct route, showing the following lemma by induction on n (trunc_log, as its name implies, is the truncated logarithm):

Local Notation log2 := (trunc_log 2).

Lemma log2_fact n : (n * log2 n)./2 <= log2 n`!.

In order to get our proof to go through, we must strengthen our induction hypothesis a little bit:

suff: n * (log2 n).+2 <= (log2 n`! + 2 ^ (log2 n)).*2.+1.

We can then proceed with a straightforward (although not completely trivial) inductive argument.

elim: n=> [|n IH] //=.
(* ... *)

Our main result follows almost immediately from these three intermediate lemmas:

Lemma sort_alg_ok_leq n (a : sort_alg n) :
  sort_alg_ok a -> (n * log2 n)./2 <= comparisons a.
Proof.
move=> a_ok; suff lb: n`! <= 2 ^ comparisons a.
  rewrite (leq_trans (log2_fact n)) // -(@leq_exp2l 2) //.
  by rewrite (leq_trans (trunc_logP (leqnn 2) (fact_gt0 n))).
rewrite (leq_trans _ (card_perms a)) // -{1}(card_ord n) -cardsT -card_perm.
by rewrite -(cardsE (perm_on [set: 'I_n])) subset_leq_card // permsT //.
Qed.


We've seen how to formalize a result of algorithm analysis in an abstract setting: although it is fair to say that our model of a comparison sort is detailed enough for our purposes, we haven't connected it yet to a more traditional computation model, something I plan to discuss in a future post.
Aside from that, we've seen how a rich set of theories, such as the ones in MathComp, allow us to express higher-level concepts in our definitions and proofs, leading to much shorter formalizations.