Coq views truth through the lens of provability. The hypotheses it
manipulates are not mere assertions of truth, but *formal proofs* of the
corresponding statements ─ data structures that can be inspected to build
other proofs. It is not a coincidence that function types and logical
implication use the same notation, A -> B, because proofs of implication
in Coq *are* functions: they take proofs of the precondition as inputs and
return a proof of the consequent as the output. Such proofs are written
with the same language we use for programming in Coq; tactics are but
scripts that build such programs for us. A proof that implication is
transitive, for example, amounts to function composition.

Definition implication_is_transitive (A B C : Prop) :

(A -> B) -> (B -> C) -> (A -> C) :=

fun HAB HBC HA => HBC (HAB HA).

Similarly, inductive propositions in Coq behave just like algebraic data
types in typical functional programming languages. With pattern matching,
we can check which constructor was used in a proof and act accordingly.

Definition or_false_r (A : Prop) : A \/ False -> A :=

fun (H : A \/ False) =>

match H with

| or_introl HA => HA

| or_intror contra => match contra with end

end.

Disjunction \/ is an inductive proposition with two constructors,
or_introl and or_intror, whose arguments are proofs of its left and
right sides. In other words, a proof of A \/ B is either a proof of A
or a proof of B. Falsehood, on the other hand, is an inductive
proposition with no constructors. Matching on a proof of False does not
require us to consider any cases, thus allowing the expression to have any
type we please. This corresponds to the so-called principle of
explosion, which asserts that from a contradiction, anything follows.
The idea of viewing proofs as programs is known as the Curry-Howard
correspondence. It has been a fruitful source of inspiration for the
design of many other logics and programming languages beyond Coq, other
noteworthy examples including Agda
and Nuprl. I will
discuss a particular facet of this correspondence in Coq: the meaning of a
proof of equality.
## Defining equality

The Coq standard library defines equality as an indexed inductive
proposition. (The familiar x = y syntax is provided by the standard
library using Coq's notation mechanism.)

Inductive eq (T : Type) (x : T) : T -> Prop :=

| eq_refl : eq T x x.

This declaration says that the most basic way of showing x = y is when x
and y are the "same" term ─ not in the strict sense of syntactic
equality, but in the more lenient sense of equality "up to computation" used
in Coq's theory. For instance, we can use eq_refl to show that 1 + 1 =
2, because Coq can simplify the left-hand side using the definition of +
and arrive at the right-hand side.
To prove interesting facts about equality, we generally use the rewrite
tactic, which in turn is implemented by pattern matching. Matching on
proofs of equality is more complicated than for typical data types because
it is a *non-uniform* indexed proposition ─ that is, the value of its last
argument is not fixed for the whole declaration, but depends on the
constructor used. (This non-uniformity is what allows us to put two
occurrences of x on the type of eq_refl.)
Concretely, suppose that we have elements x and y of a type T, and a
predicate P : T -> Prop. We want to prove that P y holds assuming that
x = y and P x hold. This can be done with the following program:

Definition rewriting

(T : Type) (P : T -> Prop) (x y : T) (p : x = y) (H : P x) : P y :=

match p in _ = z return P z with

| eq_refl => H

end.

Compared to common match expressions, this one has two extra clauses. The
first, in _ = z, allows us to provide a name to the non-uniform argument
of the type of p. The second, return P z, allows us to declare what the
return type of the match expression is as a function of z. At the top
level, z corresponds to y, which means that the whole match has type
P y. When checking each individual branch, however, Coq requires proofs
of P z using values of z that correspond to the constructors of that
branch. Inside the eq_refl branch, z corresponds to x, which means
that we have to provide a proof of P x. This is why the use of H there
is valid.
To illustrate, here are proofs of two basic facts about equality:
transitivity and symmetry.

Definition etrans {T} {x y z : T} (p : x = y) (q : y = z) : x = z :=

match p in _ = w return w = z -> x = z with

| eq_refl => fun q' : x = z => q'

end q.

Definition esym {T} {x y : T} (p : x = y) : y = x :=

match p in _ = z return z = x with

| eq_refl => eq_refl

end.

Notice the return clause in the first proof, which uses a function type. We
cannot use w = z alone, as the final type of the expression would be y =
z. The other reasonable guess, x = z, wouldn't work either, since we
would have nothing of that type to return in the branch ─ q has type y =
z, and Coq does not automatically change it to x = z just because we know
that x and y ought to be equal inside the branch. The practice of
returning a function is so common when matching on dependent types that it
even has its own name: the *convoy pattern*, a term coined by Adam Chlipala
in his CDPT
book.
In addition to functions, pretty much any type expression can go in the
return clause of a match. This flexibility allows us to derive many basic
reasoning principles ─ for instance, the fact that constructors are
disjoint and injective.

Definition eq_Sn_m (n m : nat) (p : S n = m) :=

match p in _ = k return match k with

| 0 => False

| S m' => n = m'

end with

| eq_refl => eq_refl

end.

Definition succ_not_zero n : S n <> 0 :=

eq_Sn_m n 0.

Definition succ_inj n m : S n = S m -> n = m :=

eq_Sn_m n (S m).

In the eq_refl branch, we know that k is of the form S n. By
substituting this value in the return type, we find that the result of the
branch must have type n = n, which is why eq_refl is accepted. Since
this is only value of k we have to handle, it doesn't matter that False
appears in the return type of the match: that branch will never be used.
The more familiar lemmas succ_not_zero and succ_inj simply correspond to
special cases of eq_Sn_m. A similar trick can be used for many other
inductive types, such as booleans, lists, and so on.
## Mixing proofs and computation

Proofs can be used not only to build other proofs, but also in more
conventional programs. If we know that a list is not empty, for example, we
can write a function that extracts its first element.

From mathcomp Require Import seq.

Definition first {T} (s : seq T) (Hs : s <> [::]) : T :=

match s return s <> [::] -> T with

| [::] => fun Hs : [::] <> [::] => match Hs eq_refl with end

| x :: _ => fun _ => x

end Hs.

Here we see a slightly different use of dependent pattern matching: the
return type depends on the analyzed value s, not just on the indices of
its type. The rules for checking that this expression is valid are the
same: we substitute the pattern of each branch for s in the return type,
and ensure that it is compatible with the result it produces. On the first
branch, this gives a contradictory hypothesis [::] <> [::], which we can
discard by pattern matching as we did earlier. On the second branch, we can
just return the first element of the list.
Proofs can also be stored in regular data structures. Consider for instance
the subset type {x : T | P x}, which restricts the elements of the type
T to those that satisfy the predicate P. Elements of this type are of
the form exist x H, where x is an element of T and H is a proof of
P x. Here is an alternative version of first, which expects the
arguments s and Hs packed as an element of a subset type.

Definition first' {T} (s : {s : seq T | s <> [::]}) : T :=

match s with

| exist s Hs => first s Hs

end.

While powerful, this idiom comes with a price: when reasoning about a term
that mentions proofs, the proofs must be explicitly taken into account. For
instance, we cannot show that two elements exist x H1 and exist x H2 are
equal just by reflexivity; we must explicitly argue that the proofs H1 and
H2 are equal. Unfortunately, there are many cases in which this is not
possible ─ for example, two proofs of a disjunction A \/ B need to use
the same constructor to be considered equal.
The situation is not as bad as it might sound, because Coq was designed to
allow a *proof irrelevance* axiom without compromising its soundness. This
axiom says that any two proofs of the same proposition are equal.

Axiom proof_irrelevance : forall (P : Prop) (p q : P), p = q.

If we are willing to extend the theory with this axiom, much of the pain of
mixing proofs and computation goes away; nevertheless, it is a bit upsetting
that we need an extra axiom to make the use of proofs in computation
practical. Fortunately, much of this convenience is already built into
Coq's theory, thanks to the structure of proofs of equality.
## Proof irrelevance and equality

A classical result of type theory says that equalities between elements of a
type T are proof irrelevant *provided that* T has decidable equality.
Many useful properties can be expressed in this way; in particular, any
boolean function f : S -> bool can be seen as a predicate S -> Prop
defined as fun x : S => f x = true. Thus, if we restrict subset types to
*computable* predicates, we do not need to worry about the proofs that
appear in its elements.
You might wonder why any assumptions are needed in this result ─ after all,
the definition of equality only had a single constructor; how could two
proofs be different? Let us begin by trying to show the result without
relying on any extra assumptions. We can show that general proof irrelevance
can be reduced to irrelevance of "reflexive equality": all proofs of x = x
are equal to eq_refl x.

Section Irrelevance.

Variable T : Type.

Implicit Types x y : T.

Definition almost_irrelevance :

(forall x (p : x = x), p = eq_refl x) ->

(forall x y (p q : x = y), p = q) :=

fun H x y p q =>

match q in _ = z return forall p' : x = z, p' = q with

| eq_refl => fun p' => H x p'

end p.

This proof uses the extended form of dependent pattern matching we have seen
in the definition of first: the return type mentions q, the very element
we are matching on. It also uses the convoy pattern to "update" the type of
p with the extra information gained by matching on q.
The almost_irrelevance lemma may look like progress, but it does not
actually get us anywhere, because there is no way of proving its premise
without assumptions. Here is a failed attempt.

Fail Definition irrelevance x (p : x = x) : p = eq_refl x :=

match p in _ = y return p = eq_refl x with

| eq_refl => eq_refl

end.

Coq complains that the return clause is ill-typed: its right-hand side has
type x = x, but its left-hand side has type x = y. That is because when
checking the return type, Coq does not use the original type of p, but the
one obtained by generalizing the index of its type according to the in
clause.
It took many years to understand that, even though the inductive definition
of equality only mentions one constructor, it is possible to extend the type
theory to allow for provably different proofs of equality between two
elements. Homotopy type
theory, for example, introduced a *univalence principle* that says that
proofs of equality between two types correspond to isomorphisms between
them. Since there are often many different isomorphisms between two types,
irrelevance cannot hold in full generality.
To obtain an irrelevance result, we must assume that T has decidable
equality.

Hypothesis eq_dec : forall x y, x = y \/ x <> y.

The argument roughly proceeds as follows. We use decidable equality to
define a normalization procedure that takes a proof of equality as input and
produces a canonical proof of equality of the same type as output.
Crucially, the output of the procedure does not depend on its input. We
then show that the normalization procedure has an inverse, allowing us to
conclude ─ all proofs must be equal to the canonical one.
Here is the normalization procedure.

Definition normalize {x y} (p : x = y) : x = y :=

match eq_dec x y with

| or_introl e => e

| or_intror _ => p

end.

If x = y holds, eq_dec x y must return something of the form or_introl
e, the other branch being contradictory. This implies that normalize is
constant.

Lemma normalize_const {x y} (p q : x = y) : normalize p = normalize q.

Proof. by rewrite /normalize; case: (eq_dec x y). Qed.

The inverse of normalize is defined by combining transitivity and symmetry
of equality.

Notation "p * q" := (etrans p q).

Notation "p ^-1" := (esym p)

(at level 3, left associativity, format "p ^-1").

Definition denormalize {x y} (p : x = y) := p * (normalize (eq_refl y))^-1.

As the above notation suggests, we can show that esym is the inverse of
etrans, in the following sense.

Definition etransK x y (p : x = y) : p * p^-1 = eq_refl x :=

match p in _ = y return p * p^-1 = eq_refl x with

| eq_refl => eq_refl (eq_refl x)

end.

This proof avoids the problem that we encountered in our failed proof of
irrelevance, resulting from generalizing the right-hand side of p. In
this return type, p * p^-1 has type x = x, which matches the one of
eq_refl x. Notice why the result of the eq_refl branch is valid: we
must produce something of type eq_refl x * (eq_refl x)^-1 = eq_refl x, but
by the definitions of etrans and esym, the left-hand side computes
precisely to eq_refl x.
Armed with etransK, we can now relate normalize to its inverse, and
conclude the proof of irrelevance.

Definition normalizeK x y (p : x = y) :

normalize p * (normalize (eq_refl y))^-1 = p :=

match p in _ = y return normalize p * (normalize (eq_refl y))^-1 = p with

| eq_refl => etransK x x (normalize (eq_refl x))

end.

Lemma irrelevance x y (p q : x = y) : p = q.

Proof.

by rewrite -[LHS]normalizeK -[RHS]normalizeK (normalize_const p q).

Qed.

End Irrelevance.

Besides all these events, this year also featured the second edition of the CoqPL workshop. Its main attraction may have been the release of the long-awaited Coq 8.5. Matthieu Sozeau and Maxime Dénès gave a detailed presentation of its main new features, which include asynchronous proof checking and editing, universe polymorphism, and a new tactic engine. Congratulations to the Coq team for the great work!

Another fun talk was by Clément Pit-Claudel, where he announced company-coq, a set of Proof General extensions brings many nice editing features for Coq code in Emacs. These include: automatically prettifying code (for instance, replacing `forall`

by `∀`

), auto-completion, code folding, and improved error messages, among many others. If you work with Coq under Emacs, you should definitely give it a try!

One important aspect of Coq's logic is the special status given to
*computation*: while some systems require one to apply explicit
deductive steps to show that two given terms are equal, Coq's logic
considers any two terms that *evaluate* to the same result to be equal
automatically, without the need for additional reasoning.
Without getting into too much detail, we can illustrate this idea with
some simple examples. Russell and Whitehead's seminal *Principia
Mathematica* had to develop hundreds
of pages of foundational mathematics before being able to prove
that 1 + 1 = 2. In contrast, here's what this proof looks like in
Coq:

Definition one_plus_one : 1 + 1 = 2 := erefl.

erefl is the only constructor of the eq type; its type,
forall A (a : A), a = a, tells us that we can use it to prove that
given term a is equal to itself. Coq accepts one_plus_one as a
valid proof because, even though the two sides of the equation are not
syntactically the same, it is able to use the definition of + to
compute the left-hand side and check that the result is the same as
the right-hand side. This also works for some statements with
variables in them, for instance

Definition zero_plus_n n : 0 + n = n := erefl.

The same principle applies here: + is defined by case analysis
on its first argument, and doesn't even need to inspect the second
one. Since the first argument on the left-hand side is a constructor
(0), Coq can reduce the expression and conclude that both sides are
equal.
Unfortunately, not every equality is a direct consequence of
computation. For example, this proof attempt is rejected:

Fail Definition n_plus_zero n : n + 0 = n := erefl.

What happened here? As mentioned before, + is defined by case
analysis on the first argument; since the first argument of the
left-hand side doesn't start with a constructor, Coq doesn't know how
to compute there. As it turns out, one actually needs an inductive
argument to prove this result, which might end up looking like this,
if we were to check the proof term that Coq produces:

Fixpoint n_plus_zero n : n + 0 = n :=

match n with

| 0 => erefl

| n.+1 => let: erefl := n_plus_zero n in erefl

end.

It seems that, although interesting, computation inside Coq isn't
of much use when proving something. Or is it?
In this post, I will show how computation in Coq can be used to write
certified automation tactics with a technique known as *proof by
reflection*. Reflection is extensively used in Coq and in other proof
assistants as well; it is at the core of powerful automation tactics
such as ring, and played an important role in the formalization of
the Four-color
theorem. As a matter of fact, the name Ssreflect stands for
*small-scale reflection*, due to the library's pervasive use of
reflection and computation.
Let's see how reflection works by means of a basic example: a tactic
for checking equalities between simple expressions involving natural
numbers.
## Arithmetic with reflection

Imagine that we were in the middle of a proof and needed to show that
two natural numbers are equal:

Lemma lem n m p : (n + m) * p = p * m + p * n.

ring is powerful enough to solve this goal by itself, but just
for the sake of the example, suppose that we had to prove it by
hand. We could write something like

Proof. by rewrite mulnDl (mulnC n) (mulnC m) addnC. Qed.

This was not terribly complicated, but there's certainly room for
improvement. In a paper proof, a mathematician would probably assume
that the reader is capable of verifying this result on their own,
without any additional detail. But how exactly would the reader
proceed?
In the case of the simple arithmetic expression above, it suffices to
apply the distributivity law as long as possible, until both
expressions become a sum of monomials. Then, thanks to associativity
and commutativity, we just have to reorder the factors and terms and
check that both sides of the equation match.
The idea of proof by reflection is to reduce a the validity of a
logical statement to a *symbolic computation*, usually by proving a
theorem of the form thm : b = true -> P with b : bool. If b can
be computed explicitly and reduces to true, then Coq recognizes
erefl as a proof of b = true, which means that thm erefl becomes
a proof of P.
To make things concrete, let's go back to our example. The idea that
we described above for checking whether two numbers are equal can be
used whenever we have expressions involving addition, multiplication,
and variables. We will define a Coq data type for representing such
expressions, as we will need to compute with them:

Inductive expr :=

| Var of nat

| Add of expr & expr

| Mul of expr & expr.

Variables are represented by natural numbers using the Var
constructor, and Add and Mul can be used to combine
expressions. The following term, for instance, represents the
expression n * (m + n):

Example expr_ex :=

Mul (Var 0) (Add (Var 1) (Var 0)).

where Var 0 and Var 1 denote n and m, respectively.
If we are given a function vals assigning variables to numbers, we
can compute the value of an expression with a simple recursive
function:

Fixpoint nat_of_expr vals e :=

match e with

| Var v => vals v

| Add e1 e2 => nat_of_expr vals e1 + nat_of_expr vals e2

| Mul e1 e2 => nat_of_expr vals e1 * nat_of_expr vals e2

end.

Now, since every expression of that form can be written as a sum
of monomials, we can define a function for converting an expr to
that form:

Fixpoint monoms e :=

match e with

| Var v => [:: [:: v] ]

| Add e1 e2 => monoms e1 ++ monoms e2

| Mul e1 e2 => [seq m1 ++ m2 | m1 <- monoms e1, m2 <- monoms e2]

end.

Here, each monomial is represented by a list enumerating all
variables that occur in it, counting their multiplicities. Hence, a
sum of monomials is represented as a list of lists. For example,
here's the result of normalizing expr_ex:

Example monoms_expr_ex :

monoms expr_ex = [:: [:: 0; 1]; [:: 0; 0]].

Proof. by []. Qed.

To prove that monoms has the intended behavior, we show that the
value of an expression is preserved by it. By using the big operations
\sum and \prod from the MathComp library, we can compute the value
of a sum of monomials very easily:

Lemma monomsE vals e :

nat_of_expr vals e = \sum_(m <- monoms e) \prod_(v <- m) vals v.

Proof.

elim: e=> [v|e1 IH1 e2 IH2|e1 IH1 e2 IH2] /=.

- by rewrite 2!big_seq1.

- by rewrite big_cat IH1 IH2.

rewrite {}IH1 {}IH2 big_distrlr /=.

elim: (monoms e1) (monoms e2)=> [|v m1 IH] m2 /=; first by rewrite 2!big_nil.

rewrite big_cons big_cat /= IH; congr addn.

by rewrite big_map; apply/eq_big=> //= m3 _; rewrite big_cat.

Qed.

Hence, to check that two expressions are equivalent, it suffices
to compare the results of monoms, modulo the ordering. We can do
this by sorting the variable names on each monomial and then testing
whether one list of monomials is a permutation of the other:

Definition normalize := map (sort leq) \o monoms.

Lemma normalizeE vals e :

nat_of_expr vals e = \sum_(m <- normalize e) \prod_(v <- m) vals v.

Proof.

rewrite monomsE /normalize /=; elim: (monoms e)=> [|m ms IH] /=.

by rewrite big_nil.

rewrite 2!big_cons IH; congr addn.

by apply/eq_big_perm; rewrite perm_eq_sym perm_sort.

Qed.

Definition expr_eq e1 e2 := perm_eq (normalize e1) (normalize e2).

Lemma expr_eqP vals e1 e2 :

expr_eq e1 e2 ->

nat_of_expr vals e1 = nat_of_expr vals e2.

Proof. rewrite 2!normalizeE; exact/eq_big_perm. Qed.

To see how this lemma works, let's revisit our original
example. Here's a new proof that uses expr_eqP:

Lemma lem' n m p : (n + m) * p = p * m + p * n.

Proof.

exact: (@expr_eqP (nth 0 [:: n; m; p])

(Mul (Add (Var 0) (Var 1)) (Var 2))

(Add (Mul (Var 2) (Var 1)) (Mul (Var 2) (Var 0)))

erefl).

Qed.

The first argument to our lemma assigns "real" variables to
variable numbers: 0 corresponds to n (the first element of the
list), 1 to m, and 2 to p. The second and third argument are
symbolic representations of the left and right-hand sides of our
equation. The fourth argument is the most interesting one: the
expr_eq was defined as a *boolean* function that returns true when
its two arguments are equivalent expressions. As we've seen above,
this means that whenever expr_eq e1 e2 computes to true, erefl
is a valid proof of it. Finally, when Coq tries to check whether the
conclusion of expr_eqP can be used on our goal, it computes
nat_of_expr on both sides, realizing that the conclusion and the
goal are exactly the same. For instance:

Lemma expr_eval n m p :

nat_of_expr (nth 0 [:: n; m; p]) (Mul (Add (Var 0) (Var 1)) (Var 2))

= (n + m) * p.

Proof. reflexivity. Qed.

Of course, expr_eqP doesn't force its first argument to always
return actual Coq variables, so it can be applied even in some cases
where the expressions contain other operators besides + and *:

Lemma lem'' n m : 2 ^ n * m = m * 2 ^ n.

Proof.

exact: (@expr_eqP (nth 0 [:: 2 ^ n; m])

(Mul (Var 0) (Var 1)) (Mul (Var 1) (Var 0))

erefl).

Qed.

At this point, it may seem that we haven't gained much from using
expr_eqP, since the second proof of our example was much bigger than
the first one. This is just an illusion, however, as the proof term
produced on the first case is actually quite big:

lem =

fun n m p : nat =>

(fun _evar_0_ : n * p + m * p = p * m + p * n =>

eq_ind_r (eq^~ (p * m + p * n)) _evar_0_ (mulnDl n m p))

((fun _evar_0_ : p * n + m * p = p * m + p * n =>

eq_ind_r

(fun _pattern_value_ : nat => _pattern_value_ + m * p = p * m + p * n)

_evar_0_ (mulnC n p))

((fun _evar_0_ : p * n + p * m = p * m + p * n =>

eq_ind_r

(fun _pattern_value_ : nat =>

p * n + _pattern_value_ = p * m + p * n) _evar_0_

(mulnC m p))

((fun _evar_0_ : p * m + p * n = p * m + p * n =>

eq_ind_r (eq^~ (p * m + p * n)) _evar_0_ (addnC (p * n) (p * m)))

(erefl (p * m + p * n)))))

: forall n m p : nat, (n + m) * p = p * m + p * n
By using reflection, we were able to transform the explicit reasoning
steps of the first proof into implicit computation that is carried out
by the proof assistant. And since proof terms have to be stored in
memory or included into the compiled vo file, it is good to make
them smaller if we can.
Nevertheless, even with a smaller proof term, having to manually type
in that proof term is not very convenient. The problem is that Coq's
unification engine is not smart enough to infer the symbolic form of
an expression, forcing us to provide it ourselves. Fortunately, we can
use some code to fill in the missing bits.
## Reification

To *reify* something means to produce a representation of that object
that can be directly manipulated in computation. In our case, that
object is a Gallina expression of type nat, and the representation
we are producing is a term of type expr.
Reification is ubiquitous in proofs by reflection. The Coq standard
library comes with a plugin
for reifying formulas, but it is not general enough to accommodate our
use case. Therefore, we will program our own reification tactic in
ltac.
We will begin by writing a function that looks for a variable on a
list and returns its position. If the variable is not present, we add
it to the end of the list and return the updated list as well:

lem =

fun n m p : nat =>

(fun _evar_0_ : n * p + m * p = p * m + p * n =>

eq_ind_r (eq^~ (p * m + p * n)) _evar_0_ (mulnDl n m p))

((fun _evar_0_ : p * n + m * p = p * m + p * n =>

eq_ind_r

(fun _pattern_value_ : nat => _pattern_value_ + m * p = p * m + p * n)

_evar_0_ (mulnC n p))

((fun _evar_0_ : p * n + p * m = p * m + p * n =>

eq_ind_r

(fun _pattern_value_ : nat =>

p * n + _pattern_value_ = p * m + p * n) _evar_0_

(mulnC m p))

((fun _evar_0_ : p * m + p * n = p * m + p * n =>

eq_ind_r (eq^~ (p * m + p * n)) _evar_0_ (addnC (p * n) (p * m)))

(erefl (p * m + p * n)))))

: forall n m p : nat, (n + m) * p = p * m + p * n

Ltac intern vars e :=

let rec loop n vars' :=

match vars' with

| [::] =>

let vars'' := eval simpl in (rcons vars e) in

constr:((n, vars''))

| e :: ?vars'' => constr:((n, vars))

| _ :: ?vars'' => loop (S n) vars''

end in

loop 0 vars.

Notice the call to eval simpl on the first branch of
loop. Remember that in ltac everything is matched almost purely
syntactically, so we have to explicitly evaluate a term when we are
just interested on its value, and not on how it is written.
We can now write a tactic for reifying an expression. reify_expr
takes two arguments: a list vars to be used with intern for
reifying variables, plus the expression e to be reified. It returns
a pair (e',vars') contained the reified expression e' and an
updated variable list vars'.

Ltac reify_expr vars e :=

match e with

| ?e1 + ?e2 =>

let r1 := reify_expr vars e1 in

match r1 with

| (?qe1, ?vars') =>

let r2 := reify_expr vars' e2 in

match r2 with

| (?qe2, ?vars'') => constr:((Add qe1 qe2, vars''))

end

end

| ?e1 * ?e2 =>

let r1 := reify_expr vars e1 in

match r1 with

| (?qe1, ?vars') =>

let r2 := reify_expr vars' e2 in

match r2 with

| (?qe2, ?vars'') => constr:((Mul qe1 qe2, vars''))

end

end

| _ =>

let r := intern vars e in

match r with

| (?n, ?vars') => constr:((Var n, vars'))

end

end.

Again, because this is an ltac function, we can traverse our
Gallina expression syntactically, as if it were a data
structure. Notice how we thread though the updated variable lists
after each call; this is done to ensure that variables are named
consistently.
Finally, using reify_expr, we can write solve_nat_eq, which
reifies both sides of the equation on the goal and applies expr_eqP
with the appropriate arguments.

Ltac solve_nat_eq :=

match goal with

| |- ?e1 = ?e2 =>

let r1 := reify_expr (Nil nat) e1 in

match r1 with

| (?qe1, ?vm') =>

let r2 := reify_expr vm' e2 in

match r2 with

| (?qe2, ?vm'') => exact: (@expr_eqP (nth 0 vm'') qe1 qe2 erefl)

end

end

end.

We can check that our tactic works on our original example:

Lemma lem''' n m p : (n + m) * p = p * m + p * n.

Proof. solve_nat_eq. Qed.

With solve_nat_eq, every equation of that form becomes very easy
to solve, including cases where a human prover might have trouble at
first sight!

Lemma complicated n m p r t :

(n + 2 ^ r * m) * (p + t) * (n + p)

= n * n * p + m * 2 ^ r * (p * n + p * t + t * n + p * p)

+ n * (p * p + t * p + t * n).

Proof. solve_nat_eq. Qed.

Guillaume Claret, a Ph.D. student in Paris, has also a blog devoted to Coq. You can find some good articles in there, such as this one, explaining how to use OPAM as a package manager for Coq.

Ömer’s blog has a lot of general programming-languages and functional-programming material. This post, for instance, presents a simple exercise in dependently typed programming.

In this post, we will formalize one of the most well-known results
of algorithm analysis: no comparison sort can run in asymptotically
less than n * log n steps, where n is the size of its input.
Before starting, I should point out that this is the first post in
this blog to use the Ssreflect and the Mathematical Components
(MathComp) libraries. Ssreflect is an
amazing Coq extension that brings several improvements, including a
nicer set of base tactics. Both libraries cover a wide range of
theories, including fairly sophisticated Mathematics - as a matter of
fact, they are featured in the Coq formalization of the Feit-Thompson
theorem, known for its extremely complex and detailed proof.
As we will see, having good library support can help a lot when doing
mechanized proofs, even for such simple results as this one. Two
things that come in handy here, in particular, are the theories of
*permutations* and *sets* over finite types that are available in
MathComp. Indeed, the MathComp definitions enable many useful,
higher-level reasoning principles that don't come for free in Coq,
such as extensional and decidable equality. Furthermore, many lemmas
on the library require a fair amount of machinery to be developed on
their own - for example, showing that there are exactly n!
permutations over a set of n elements. Previous versions of this
post (which you can still find on the repository) tried to avoid
external libraries, but were much longer and more complicated,
prompting me to bite the bullet and port everything to
Ssreflect/MathComp.
## Basics

The informal proof of this result is fairly simple:
We'll begin our formalization by defining a convenient datatype for
representing the execution of a comparison sort. For our purposes, a
comparison sort can be seen as a binary tree: internal nodes indicate
when a comparison between two elements occurs, with the right and left
branches telling how to proceed depending on its result. The leaves of
the tree mark when the algorithm ends and yields back a result. Thus,
we obtain the following type:

- If a comparison sort is correct, then it must be capable of
shuffling an input vector of size n according to
*any*of the n! permutations of its elements. - On the other hand, any such algorithm can recognize at most 2 ^
k distinct permutations, where k is the maximum number of
comparisons performed. Hence, n! <= 2 ^ k or, equivalently,
log2 n! <= k.
- To conclude, Stirling's approximation tells us that n * log2 n = O(log2 n!), which yields our result.

Inductive sort_alg (n : nat) : Type :=

| Compare (i j : 'I_n) (l r : sort_alg n)

| Done of 'S_n.

Let's analyze this declaration in detail. The n parameter will
be used later to track the length of the array that we are
sorting. 'I_n is the type of natural numbers bounded by n, whereas
'S_n represents permutations of elements of that type. The idea is
that, when running our algorithm, the i and j arguments of
Compare tell which elements of our array to compare. The permutation
in Done specifies how to rearrange the elements of the input array
to produce an answer. We can formalize the previous description in the
following function:

Fixpoint execute T n c (a : sort_alg n) (xs : n.-tuple T) :=

match a with

| Compare i j l r =>

let a' := if c (tnth xs i) (tnth xs j) then l else r in

execute c a' xs

| Done p => [tuple tnth xs (p i) | i < n]

end.

execute is a polymorphic function that works for any type T
with a comparison operator c. Given an algorithm a and an input
array xs of length n, the function compares the elements of a,
following the appropriate branches along the way, until finding out
how to rearrange a's elements.
With execute, we can define what it means for a sorting algorithm to
be correct, by relating its results to the MathComp sort function:

Definition sort_alg_ok n (a : sort_alg n) :=

forall (T : eqType) (le : rel T),

forall xs, execute le a xs = sort le xs :> seq T.

Finally, to translate the above informal argument, we will need
some more definitions. Let's first write a function for computing how
many comparisons an algorithm performs in the worst case:

Fixpoint comparisons n (a : sort_alg n) : nat :=

match a with

| Compare _ _ l r => (maxn (comparisons l) (comparisons r)).+1

| Done _ => 0

end.

And here's a function for computing the set of permutations that
an algorithm can perform (notice the use of the set library of
MathComp; here, :|: denotes set union):

Fixpoint perms n (a : sort_alg n) : {set 'S_n} :=

match a with

| Compare _ _ l r => perms l :|: perms r

| Done p => [set p]

end.

(Strictly speaking, both comparisons and perms give upper
bounds on the values they should compute, but this does not affect us
in any crucial way.)
## Show me the proofs

To show that a correct algorithm must be able of performing arbitrary
permutations, notice that, if xs is a sorted array with distinct
elements, then permuting its elements is an *injective*
operation. That is, different permutations produce different
arrays.

Lemma permsT n (a : sort_alg n) : sort_alg_ok a -> perms a = [set: 'S_n].

Proof.

move=> a_ok.

apply/eqP; rewrite -subTset; apply/subsetP=> /= p _.

move: {a_ok} (a_ok _ leq [tuple val (p^-1 i) | i < n]).

rewrite (_ : sort _ _ = [tuple val i | i < n]); last first.

apply: (eq_sorted leq_trans anti_leq (sort_sorted leq_total _)).

by rewrite /= val_enum_ord iota_sorted.

rewrite (perm_eq_trans (introT perm_eqlP (perm_sort _ _))) //.

apply/tuple_perm_eqP; exists p^-1; congr val; apply/eq_from_tnth=> i.

by rewrite 3!tnth_map 2!tnth_ord_tuple.

elim: a=> [/= i j l IHl r IHr|p'].

by case: ifP=> [_ /IHl|_ /IHr]; rewrite in_setU => -> //; rewrite orbT.

move/val_inj=> /=.

rewrite in_set1=> e; apply/eqP/permP=> i; apply/esym/(canRL (permKV p)).

apply/val_inj.

rewrite (_ : val i = tnth [tuple val i | i < n] i); last first.

by rewrite tnth_map tnth_ord_tuple.

by rewrite -{}e 2!tnth_map !tnth_ord_tuple.

Qed.

Proof.

move=> a_ok.

apply/eqP; rewrite -subTset; apply/subsetP=> /= p _.

move: {a_ok} (a_ok _ leq [tuple val (p^-1 i) | i < n]).

rewrite (_ : sort _ _ = [tuple val i | i < n]); last first.

apply: (eq_sorted leq_trans anti_leq (sort_sorted leq_total _)).

by rewrite /= val_enum_ord iota_sorted.

rewrite (perm_eq_trans (introT perm_eqlP (perm_sort _ _))) //.

apply/tuple_perm_eqP; exists p^-1; congr val; apply/eq_from_tnth=> i.

by rewrite 3!tnth_map 2!tnth_ord_tuple.

elim: a=> [/= i j l IHl r IHr|p'].

by case: ifP=> [_ /IHl|_ /IHr]; rewrite in_setU => -> //; rewrite orbT.

move/val_inj=> /=.

rewrite in_set1=> e; apply/eqP/permP=> i; apply/esym/(canRL (permKV p)).

apply/val_inj.

rewrite (_ : val i = tnth [tuple val i | i < n] i); last first.

by rewrite tnth_map tnth_ord_tuple.

by rewrite -{}e 2!tnth_map !tnth_ord_tuple.

Qed.

Bounding the number of permutations performed by an algorithm is
simple, and amounts to invoking basic lemmas about arithmetic and
sets.

Lemma card_perms n (a : sort_alg n) : #|perms a| <= 2 ^ comparisons a.

Proof.

elim: a=> [i j l IHl r IHr|p] /=; last by rewrite cards1.

rewrite (leq_trans (leq_of_leqif (leq_card_setU (perms l) (perms r)))) //.

by rewrite expnS mul2n -addnn leq_add // ?(leq_trans IHl, leq_trans IHr) //

leq_exp2l // ?(leq_maxl, leq_maxr).

Qed.

Doing the last step is a bit trickier, as we don't have a proof of
Stirling's approximation we can use. Instead, we take a more direct
route, showing the following lemma by induction on n (trunc_log,
as its name implies, is the truncated logarithm):

Local Notation log2 := (trunc_log 2).

Lemma log2_fact n : (n * log2 n)./2 <= log2 n`!.

Proof.

In order to get our proof to go through, we must strengthen our
induction hypothesis a little bit:

suff: n * (log2 n).+2 <= (log2 n`! + 2 ^ (log2 n)).*2.+1.

We can then proceed with a straightforward (although not
completlely trivial) inductive argument.

elim: n=> [|n IH] //=.

(* ... *)

Qed.

(* ... *)

Qed.

Our main result follows almost immediately from these three
intermediate lemmas:

Lemma sort_alg_ok_leq n (a : sort_alg n) :

sort_alg_ok a -> (n * log2 n)./2 <= comparisons a.

Proof.

move=> a_ok; suff lb: n`! <= 2 ^ comparisons a.

rewrite (leq_trans (log2_fact n)) // -(@leq_exp2l 2) //.

by rewrite (leq_trans (trunc_logP (leqnn 2) (fact_gt0 n))).

rewrite (leq_trans _ (card_perms a)) // -{1}(card_ord n) -cardsT -card_perm.

by rewrite -(cardsE (perm_on [set: 'I_n])) subset_leq_card // permsT //.

Qed.

Coq's built-in extensible parser, although quite convenient in
many cases, does have some limitations. On one hand, some syntax
extensions that one might like to write cannot be expressed in
it. Furthermore, the extensions are not first-class, having to be
defined outside of the core language. We will see how to implement
some interesting syntax extensions in Coq using just coercions and
regular Coq functions. Besides being first-class, we will see that the
mechanism can be used for defining extensions that cannot be expressed
just with Coq's extensible parser.
We want to describe each syntax extension with a few Coq types and
functions that determine how it should be parsed. Our first attempt
might look like this:

Module Simple.

Record parser := Parser {

token : Type;

result : Type;

initial_result : result;

read_token : result -> token -> result

}.

Our parser works by reading tokens and returning a value of type
result at the end of the process. It does this by starting with some
value initial_result and updating that value with each token read,
using function read_token. For instance, here's a function that
returns the result of parsing three tokens:

Section WithParser.

Variable p : parser.

Definition read_three_tokens t1 t2 t3 :=

read_token p (

read_token p (

read_token p (initial_result p) t1

) t2

) t3.

We can use this definition as a syntax extension by declaring
read_token as a coercion from result to Funclass, allowing us to
use each result as if it were a function. Then, applying
initial_result p to a sequence of tokens will correspond to calling
our parser on that sequence.

Coercion read_token : result >-> Funclass.

Definition read_three_tokens' t1 t2 t3 :=

(initial_result p) t1 t2 t3.

This works because each application of read_token returns a
result, so trying to apply that updated result to another token
triggers yet another coercion, allowing the process to continue
indefinitely. We can check that both definitions of
read_three_tokens yield the same function, meaning that the coercion
is behaving as expected:

Lemma read_three_tokens_same : read_three_tokens = read_three_tokens'.

Proof. reflexivity. Qed.

To make the mechanism more robust, we wrap our parser in a new
type, ensuring that the Coq type checker will not fail to perform some
coercion by reducing result more than it should.

Record parser_wrapper : Type := ParserWrapper {

get_result : result p

}.

Definition read_token' (w : parser_wrapper) t :=

ParserWrapper (read_token p (get_result w) t).

Coercion read_token' : parser_wrapper >-> Funclass.

End WithParser.

As a last tweak, we declare another coercion and a notation to
make using our embedded parsers more similar to other systems, like in
Template Haskell:

Definition init_parser p := ParserWrapper _ (initial_result p).

Coercion init_parser : parser >-> parser_wrapper.

Notation "[ x ]" := (get_result _ x) (at level 0).

Now, we can invoke a parser simply by writing [name_of_parser
<list of tokens>]:

Definition read_three_tokens'' (p : parser) t1 t2 t3 :=

[p t1 t2 t3].

As a first example, we can define an alternative syntax for Coq
lists that doesn't need separators between the elements.

Definition listp (X : Type) := {|

token := X;

result := list X;

initial_result := nil;

read_token l x := app l (cons x nil)

|}.

The listp parser is parameterized by X, the type of the
elements on the list. We initialize the parser with an empty list, and
each token that we read is an element of X, which will be
progressively added at the end of our list.

Definition list_exp : list nat := [listp nat 0 1 2 3 4 5 6 7 8 9 10].

End Simple.

Module State.

Record parser := Parser {

state : Type;

initial_state : state;

token : Type;

next : state -> token -> state;

result : Type;

initial_result : result;

read_token : state -> result -> token -> result

}.

End State.

The state field represents the internal state of our parser at a
given point. It is set initially set to initial_state, and is
updated using function next. We also change read_token to pass it
the current state as an additional argument.
While more general, this version isn't quite good yet. We require our
parsers to carry around a complete result value that it can return
after having read *any* sequence of tokens. Usually, however, parsing
can result in errors, and there is no meaningful value that can be
returned by the parser until it finishes its job. To solve this
problem, we introduce one last change to our definition: *dependent
types*.

Record parser := Parser {

state : Type;

initial_state : state;

token : state -> Type;

next : forall s, token s -> state;

partial_result : state -> Type;

initial_partial_result : partial_result initial_state;

read_token : forall s, partial_result s -> forall t, partial_result (next s t)

}.

Now, the type of tokens expected by the parser, as well as its
type of partial results, can depend on the current parsing state, and
the parsing functions have been updated to take this dependency into
account.
With dependent types, the type of the value being built,
partial_result, can change during the parsing process, allowing us
to distinguish a complete, successfully parsed result, from one that
still needs more tokens, or even from a message describing a parse
error. By making token depend on the current state, we can constrain
which tokens can be read at each parsing state, allowing us to expose
parse errors as type errors.
For listp, there is no need to use anything interesting for the
state type. For more complicated parsers, however, state comes in
handy. To see how, we will define parsers for prefix and postfix
arithmetic expressions. This is also a nice example because prefix and
postfix expressions cannot be handled by Coq's extensible parser
alone.
Expressions can involve three operations: addition, subtraction and
multiplication.

Inductive op := Add | Sub | Mul.

The parsers will read tokens that can be natural numbers or one of
the above operations. We group those in a common type exp_token.

Inductive exp_token :=

| Op (o : op)

| Const (n : nat).

Notation "''+'" := Add (at level 0).

Notation "''-'" := Sub (at level 0).

Notation "''*'" := Mul (at level 0).

Coercion Op : op >-> exp_token.

Coercion Const : nat >-> exp_token.

Let's consider how to deal with prefix expressions first. At any
point during parsing, the parser is waiting for some number of
additional expressions it must read in order to complete the top-level
expression. We will use this number as the parser state. Initially,
the parser wants to read just one expression, so we will use that as
the initial state. Once the state reaches zero, there are no more
expressions to read, so we know that the parser is done.

Module Pre.

Definition state := nat.

Definition initial_state : state := 1.

If our current state is zero, any additional tokens fed to the
parser should result in a parse error. To implement this behavior, we
define the type of tokens for that state to be Empty_set. Trying to
feed an extra token to parser at that point will result in a type
error, signaling that a parse error just occurred. If the parser still
expects expressions (i.e., if the current state is greater than zero),
we just use exp_token for the token type.

Definition token (s : state) : Type :=

match s with

| S _ => exp_token

| 0 => Empty_set

end.

The value built by the parser will be a continuation that expects
n numbers, one for each expression that the parser still needs to
read.

Fixpoint partial_result (n : nat) : Type :=

match n with

| 0 => nat

| S n => nat -> partial_result n

end.

We must define how the parser actually interprets the tokens it
reads. If the parser expects n expressions, reading a constant will
make this number go down by one, since a constant is a complete
expression. If it reads an operation, on the other hand, that number
is increased by one. If n = 0, there is no token to be read, hence
no next state, so we perform an empty pattern match to show that can't
happen.

Definition next (s : state) : token s -> state :=

match s with

| S n' => fun t =>

match t with

| Op _ => S (S n')

| Const _ => n'

end

| 0 => fun t => match t with end

end.

How do we update the result? If we read a constant, we just feed
it to the continuation. If we read an operation, we compose that
operation with the continuation, which has the net effect of adding
one argument to it. Here, ap_op is a function that maps each op to
the corresponding Coq function.

Definition read_token s : partial_result s -> forall t, partial_result (next s t) :=

match s with

| S n' =>

fun res t =>

match t with

| Op o => fun n1 n2 => res (ap_op o n1 n2)

| Const n => res n

end

| _ => fun _ t => match t with end

end.

End Pre.

We can now package our definitions as a complete parser and try it
on some examples:

Definition pre := {|

state := Pre.state;

initial_state := 1;

token := Pre.token;

partial_result := Pre.partial_result;

next := Pre.next;

initial_partial_result := fun t => t;

read_token := Pre.read_token

|}.

Definition pre_exp1 : nat :=

[pre '+ '- 1 2 '+ 4 4].

Definition pre_exp2 : nat :=

[pre '+ '* 12 '* 12 12 '* 1 '* 1 1].

We can also see that invalid expressions are rejected, as expected.

Fail Definition pre_exp_wrong : nat :=

[pre '+ 1 1 1].

Error: The term "1" has type "nat" while it is expected to have type

"token pre (next pre (next pre (next pre (initial_state pre) '+) 1) 1)".

Module Post.

Definition state := nat.

Definition initial_state : state := 0.

Since our operations need at least two numbers on the stack, we
restrict our parser to accept only numbers if the stack doesn't have
the appropriate size:

Definition token (s : state) : Type :=

match s with

| 0 | 1 => nat

| _ => exp_token

end.

The partial_result type is a length-indexed list of natural
numbers with one small twist: we ensure that partial_result 1 = nat
definitionally, so that we can use postfix expressions as having type
nat without the need for any projections.

Fixpoint partial_result' (n : nat) : Type :=

match n with

| 0 => nat

| S n => (partial_result' n * nat)%type

end.

Definition partial_result (s : state) : Type :=

match s with

| 0 => unit

| S n => partial_result' n

end.

next and read_token are dual to the definitions of the
previous parser: reading a constant increases the stack size by one,
while reading an operation decreases the stack size by one.

Definition next s : token s -> state :=

match s with

| 0 => fun _ => 1

| 1 => fun _ => 2

| S (S n) => fun t =>

match t with

| Op _ => 1 + n

| Const _ => 3 + n

end

end.

Definition read_token s : partial_result s -> forall t, partial_result (next s t) :=

match s with

| 0 => fun _ t => t

| 1 => fun res t => (res, t)

| 2 => fun res t =>

match res with

| (n1, n2) =>

match t with

| Op o => ap_op o n1 n2

| Const n => (n1, n2, n)

end

end

| S (S (S n)) => fun res t =>

match res with

| (res, n1, n2) =>

match t with

| Op o => (res, ap_op o n1 n2)

| Const n => (res, n1, n2, n)

end

end

end.

End Post.

We now have a full parser for postfix expressions, which we can
test on some examples.

Definition post := {|

state := Post.state;

initial_state := 0;

token := Post.token;

next := Post.next;

partial_result := Post.partial_result;

initial_partial_result := tt;

read_token := Post.read_token

|}.

Definition post_exp1 : nat := [post 4 4 '+ 2 1 '- '+].

Definition post_exp2 : nat := [post 9 9 '* 9 '* 10 10 '* 10 '* '+].

Consider the following problem. Suppose we are in a 100-story
building. We know that, when dropping an egg from the window, the egg
will stay intact if we are below a certain floor. However, if we
repeat the same exercise above that critical floor, the egg will
break. How can we find this floor and minimize the number of egg drops
performed in the worst case if we have only two eggs? We suppose that
we are allowed to reuse eggs that fall without breaking.
We will see how we can model this problem in Coq and find a correct
solution. We model a playing strategy as a decision tree:

Inductive strategy : Type :=

| Guess (floor : nat)

| Drop (floor : nat) (broken intact : strategy).

In the above definition, Guess floor represents the end of the
algorithm, when we try to guess the target floor. If floor is equal
to the target, we win the game; otherwise, we lose. Drop floor broken
intact represents an egg drop at floor. If the egg breaks, we
continue playing with strategy broken; otherwise, we continue with
intact.
Simulating an egg-drop game is just a matter of performing a tree
search. play target s returns true if and only if strategy s
succeeds in guessing floor target.

Fixpoint play (target : nat) (s : strategy) : bool :=

match s with

| Guess floor =>

beq_nat floor target

| Drop floor broken intact =>

play target (if leb target floor then broken

else intact)

end.

We can also find how many eggs a strategy needs and how many drops
are performed in the worst case. drops just computes the strategy
tree height, whereas eggs computes a "skewed" height, where right
branches do not add to the final value.

Fixpoint eggs (s : strategy) : nat :=

match s with

| Guess _ => 0

| Drop _ broken intact => max (S (eggs broken)) (eggs intact)

end.

Fixpoint drops (s : strategy) : nat :=

match s with

| Guess _ => 0

| Drop _ broken intact => S (max (drops broken) (drops intact))

end.

Finally, using these concepts, we can describe what the solution
for our problem is. winning lower range s says that strategy s is
able to find range target floors starting at lower, while
is_optimal range e d states that there is a winning strategy for
guessing range target floors, uses at most e eggs and performing
at most d drops, such that d is the smallest possible.

Definition winning (lower range : nat) (s : strategy) : Prop :=

forall target, lower <= target < lower + range ->

play target s = true.

Definition is_optimal (range e d : nat) : Prop :=

exists s : strategy,

eggs s <= e /\

drops s = d /\

winning 0 range s /\

forall s', eggs s' <= e ->

winning 0 range s' ->

d <= drops s'.

Fixpoint linear (lower range : nat) : strategy :=

match range with

| 0 => Guess lower

| S range' => Drop lower (Guess lower) (linear (S lower) range')

end.

linear lower range works on a range of S range floors, using
at most one egg. Because of this, it is not very efficient, performing
n drops in the worst case.

Lemma linear_winning lower range :

winning lower (S range) (linear lower range).

Lemma linear_eggs lower range :

eggs (linear lower range) = min 1 range.

Lemma linear_drops lower range :

drops (linear lower range) = range.

Definition solution_take_1 : strategy :=

Drop 49 (linear 0 49) (linear 50 49).

Lemma solution_take_1_winning :

winning 0 100 solution_take_1.

Lemma solution_take_1_eggs :

eggs solution_take_1 = 2.

Lemma solution_take_1_drops :

drops solution_take_1 = 50.

Although much better than a pure linear strategy, this is still
far from being optimal: if our egg doesn't break on our first drop, we
will still be using at most only one egg on that upper range. If we
allowed ourselves to break one of them, presumably, we would be able
to solve the upper range in less than 50 drops, which would in turn
allow us to perform our first drop at a lower floor, e.g.

Definition solution_take_2 : strategy :=

Drop 33 (linear 0 33)

(Drop 66 (linear 34 32) (linear 67 32)).

Lemma solution_take_2_winning :

winning 0 100 solution_take_2.

Lemma solution_take_2_eggs :

eggs solution_take_2 = 2.

Lemma solution_take_2_drops :

drops solution_take_2 = 34.

Our new solution performs much better, but there is still room for
improvement. Once again, if our first two drops are below the target
floor, we will be using only one egg to search through 33 floors,
which could be done better if we had used both of them. This suggests
that the optimal strategy should be some form of skewed binary search,
where the upper range that is produced by an egg drop should use its
available eggs in an optimal way.
## Finding the optimum

How can we formalize the intuition that we just developed? The key
insight is to reason by *duality* and, instead, ask "what is the
largest range of floors we can scan using at most some number of eggs
and drops?" When looking at the problem this way, it becomes clear
that optimality has a recursive structure that is easy to describe: to
find a floor using at most e eggs and d drops, we need to combine
two optimal strategies: one using at most e-1 eggs and d-1 drops,
for the case where our first drop causes the egg to break, and another
using at most e eggs and d-1 drops, for the case where our egg
does not break at first. We can readily express this idea as
code. optimal_range e d computes the maximal range size that can be
solved using e eggs and d drops at most, while optimal lower e d
computes a strategy for doing so starting from floor lower.

Fixpoint optimal_range_minus_1 (e d : nat) {struct d} : nat :=

match d, e with

| S d', S e' => S (optimal_range_minus_1 e' d' +

optimal_range_minus_1 (S e') d')

| _, _ => 0

end.

Definition optimal_range e d := S (optimal_range_minus_1 e d).

Fixpoint optimal (lower e d : nat) {struct d} : strategy :=

match d, e with

| S d', S e' =>

let floor := lower + optimal_range_minus_1 e' d' in

Drop floor

(optimal lower e' d')

(optimal (S floor) (S e') d')

| _, _ => Guess lower

end.

It is easy to show that optimal lower e d scans optimal_range e
d floors, as well that it uses the resources that we expect.

Lemma optimal_winning lower e d :

winning lower (optimal_range e d) (optimal lower e d).

Lemma optimal_eggs lower e d :

eggs (optimal lower e d) = min e d.

Lemma optimal_drops lower e d :

drops (optimal lower e d) = min 1 e * d.

To actually show optimality, we need to show that optimal_range
indeed computes what it is supposed to. We start by showing two
inversion lemmas, relating the range that is scanned by a winning
strategy to the range scanned by its sub-strategies.

Lemma winning_inv_guess lower range floor :

winning lower range (Guess floor) -> range <= 1.

Lemma winning_inv_drop lower range floor broken intact :

winning lower range (Drop floor broken intact) ->

exists r1 r2 lower',

range = r1 + r2 /\

winning lower r1 broken /\

winning lower' r2 intact.

Lemma optimal_range_correct :

forall lower e d s range,

eggs s <= e ->

drops s <= d ->

winning lower range s ->

range <= optimal_range e d.

Combining this lemma with the ranges we had derived for linear
and optimal, we can prove useful results about optimal_range.

Lemma optimal_range_monotone :

forall e e' d d',

e <= e' ->

d <= d' ->

optimal_range e d <= optimal_range e' d'.

Proof.

intros e e' d d' He Hd.

apply (optimal_range_correct 0 e' d' (optimal 0 e d));

[ rewrite optimal_eggs; lia

| rewrite optimal_drops; destruct e; simpl; lia

| apply optimal_winning ].

Qed.

Lemma optimal_range_lower_bound :

forall e d, d <= (optimal_range (S e) d).

Proof.

intros e d.

cut (S d <= optimal_range (S e) d); try lia.

apply (optimal_range_correct 0 (S e) d (linear 0 d));

[ rewrite linear_eggs

| rewrite linear_drops

| apply linear_winning ]; lia.

Qed.

Given that optimal_range is monotone, we can find what the
optimal number of drops for a given range is by picking the smallest
value of t such that range <= optimal_range e t. We formalize this
idea by writing a generic function find_root that can find such
values for any monotone function f, given a suitable initial
guess.

Fixpoint find_root (f : nat -> nat) (target n : nat) : nat :=

match n with

| 0 => 0

| S n' =>

if leb target (f n') then

find_root f target n'

else

S n'

end.

Lemma find_root_correct :

forall f target n,

(forall x y, x <= y -> f x <= f y) ->

target <= f n ->

let x := find_root f target n in

target <= f x /\

forall y, y < x -> f y < target.

By instantiating this theorem with optimal_range and applying
the appropriate theorems, we obtain our final result. The proof of
optimality goes by contradiction. Let t = find_optimum (S e)
range. if we find another strategy s such that eggs s <= S e and
drops s < t, we know that range <= optimal_range (S e) t by
optimal_range_correct, but we must also have optimal_range (S e) t
< range by the correctness of find_root.

Definition find_optimum e target :=

find_root (optimal_range e) target target.

Lemma find_optimum_correct :

forall e range,

let d := find_optimum (S e) range in

is_optimal range (S e) d.

Proof.

intros e range d.

assert (H : range <= optimal_range (S e) d /\

forall d', d' < d -> optimal_range (S e) d' < range).

(* By correctness of find_root *)

destruct H as [H1 H2].

exists (optimal 0 (S e) d).

rewrite optimal_drops, optimal_eggs.

repeat split; try lia; simpl; try lia.

- intros x Hx.

apply optimal_winning. lia.

- intros s Hs WIN.

destruct (le_lt_dec d (drops s)) as [LE | LT]; trivial.

assert (Hd : drops s <= drops s) by lia.

assert (He : eggs s <= S e) by lia.

(* optimal_range < range *)

pose proof (H2 _ LT).

(* range <= optimal_range *)

pose proof (optimal_range_correct _ _ _ _ _ He Hd WIN).

lia.

Qed.

Using this result, we can find the answer for our original problem.

Lemma solution :

is_optimal 100 2 14.

Proof. apply find_optimum_correct. Qed.

Whenever a user doesn't name a variable in an Ltac script, Coq
will automatically generate a fresh name for it. After doing some
proofs, however, you start to realize that this behavior is more of a
nuisance than a feature. The heuristics used by Coq when choosing new
names are hard to predict and depend on the current proof state,
making proofs likely to break whenever a small change is
introduced. As every experienced Coq user knows, trying to figure out
whether hypothesis H22 now corresponds to H9 or H24 when fixing
a complicated proof script cannot exactly be described as a pleasant
experience, specially when the proof script is 2000 lines long and
you've spent hours dealing with similar horrors.
When writing robust proof scripts, one must ensure that the code
doesn't depend on weird automatically chosen names. Here are some tips
for making the naming problem disappear.
### Name variables explicitly

Most of the times, you can get rid of this problem by not being lazy
and naming everything explicitly. Avoid using tactics like intros or
destruct in their unnamed form; instead, use intros v1 v2 v3 ...
or destruct v as [a1 a2|a1 a2 a3|...]. You can also name hypothesis
in the statement of your lemma, using a forall instead of a plain
implication arrow (remember, in Coq, these are actually the same
thing). If you don't pass any names to intros, it will try to keep
the same names that appear in the current goal:

Lemma l1 : forall (A : Prop) (HYP : A), A.

Proof. intros. exact HYP. Qed.

This also works when reintroducing a variable from the context
into the goal, for instance with generalize dependent.
### Name constructor arguments

When working with complicated inductive predicates, naming all new
variables explicitly on every destruct or inversion can be
cumbersome. A reasonable compromise is to name all the arguments to
your constructors in inductive definitions. This makes the names that
are generated by those tactics meaningful, more stable and
predictable. Here's an example of a simple inductively defined
relation with explicit argument names:

Inductive collatz : nat -> nat -> Prop :=

| collatz_even (n m : nat)

(EVEN : 2 * m = n) : collatz n m

| collatz_odd (n m : nat)

(ODD : 2 * m + 1 = n) : collatz n (3 * n + 1)

| collatz_trans (n m p : nat)

(COLLATZnm : collatz n m)

(COLLATZmp : collatz m p) : collatz n p.

Lemma collatz_cycle :

forall n m,

(n = 1 \/ n = 4 \/ n = 2) ->

collatz n m ->

m = 1 \/ m = 4 \/ m = 2.

Proof.

intros.

match goal with

| H : collatz _ _ |- _ => induction H

end.

- omega.

- repeat match goal with

| H : _ \/ _ |- _ => destruct H

end; omega.

- match goal with

| H1 : ?P,

H2 : ?P -> ?Q,

H3 : ?Q -> ?R |- ?R =>

apply H3; apply H2; apply H1

end.

Qed.

In the above (admittedly artificial) example, we were able to
refer exactly to what we wanted in the context without having to
resort to the names that Coq chose. Notice how in the third case we
were able to specify how premises and conclusions should match when
picking the hypotheses. In most cases, a simple match suffices to
remove explicit names from proofs.

In the previous
post, we've introduced the concept of *combinatorial game* and
saw how we can employ a simple formalism to model all such games
uniformly, the game type. To see what this point of view can bring
us, we'll discuss the idea of *summing* combinatorial games, which can
be used to analyze games by decomposing them into smaller, simpler
ones. As we'll see, this notion has a straightforward computational
interpretation for the game type, making it convenient to
manipulate and reason about.
The intuition behind summing games is that they often evolve to a
point where they can be seen as several smaller ones being played
"simultaneously". In go, for instance, the board is progressively
partioned into different regions, and each player must choose in which
of those to play during their turn. This suggests a representation
where the state of a game is given by the product of the states of the
subgames. To make a move, a player modifies the state of one of the
subgames according to its rules, leaving the rest of the state
untouched.
To define the sum of two combinatorial games (as given by the
combinatorial_game type), we can simply combine the lists of moves
available for each game component. We must still show that this
results in a terminating game, but it suffices to observe that the
valid_move relation in this case is equivalent to the product of two
well-founded relations, and the Coq standard library provides lemmas
for dealing with this case.

Definition cg_pair_order {cg1 cg2} :=

symprod _ _ (valid_move cg1) (valid_move cg2).

Definition cg_pair_order_wf {cg1 cg2} : well_founded cg_pair_order :=

wf_symprod _ _ _ _ (finite_game cg1) (finite_game cg2).

Program Definition sum_cg (cg1 cg2 : combinatorial_game)

: combinatorial_game :=

{| position := position cg1 * position cg2;

moves p pos := map (fun pos1' => (pos1', snd pos))

(moves cg1 p (fst pos)) ++

map (fun pos2' => (fst pos, pos2'))

(moves cg2 p (snd pos)) |}.

Next Obligation.

match goal with

| |- well_founded ?R =>

assert (EQ : RelationClasses.relation_equivalence R cg_pair_order)

end.

(* ... *)

rewrite EQ.

apply cg_pair_order_wf.

Qed.

A problem with this definition is that it is not directly amenable
to computation. We can overcome this problem by defining game sums
directly over the game type. Since game is universal, we can hope
this should be enough to define what a sum of games is generically. A
naive adaptation doesn't work, as recursive calls to sum don't have
a single decreasing argument, e.g.

Fail Fixpoint sum (g1 g2 : game) : game :=

Game (map_game game_as_cg g1 Left (fun g1' P => sum g1' g2) ++

map_game game_as_cg g2 Left (fun g2' P => sum g1 g2'))

(map_game game_as_cg g1 Right (fun g1' P => sum g1' g2) ++

map_game game_as_cg g2 Right (fun g2' P => sum g1 g2')).

(* Error: Cannot guess decreasing argument of fix. *)

One solution is again to pair both arguments and use the Fix combinator
with cg_pair_order_wf. Manipulating proof terms in the recursive calls can be
made less awkward by using the simple refine tactic:

Definition sum (g1 g2 : game) : game.

simple refine (

Fix cg_pair_order_wf (fun _ => game)

(fun gs =>

match gs with

| (g1, g2) => fun sum =>

let sum_fst g1' P := sum (g1', g2) _ in

let sum_snd g2' P := sum (g1, g2') _ in

Game (map_game game_as_cg g1 Left sum_fst ++

map_game game_as_cg g2 Left sum_snd)

(map_game game_as_cg g1 Right sum_fst ++

map_game game_as_cg g2 Right sum_snd)

end) (g1, g2));

clear - P; constructor; trivial.

Defined.

As with all definitions involving Fix, we must now prove a lemma
that shows how sum unfolds. The proof is very similar to the one of
embed_in_game_eq, and is thus omitted.

Lemma sum_eq (g1 g2 : game) :

sum g1 g2 =

Game (map (fun g1' => sum g1' g2) (left_moves g1) ++

map (fun g2' => sum g1 g2') (left_moves g2))

(map (fun g1' => sum g1' g2) (right_moves g1) ++

map (fun g2' => sum g1 g2') (right_moves g2)).

The name sum suggests that combinatorial games could behave like
numbers. We won't discuss this correspondence in much detail for now,
but some interesting identities do show up already:

Lemma zero_plus_zero : sum zero zero = zero.

Lemma one_plus_zero : sum one zero = one.

Showing that sum correctly computes what it should is not
difficult: we proceed by well-founded induction combining simple
lemmas about the behavior of sum, embed_in_game, and map. We
need an auxiliary result that allows us to apply our inductive
hypothesis:

Lemma map_ext_strong :

forall {A B}

(l : list A)

(f g : A -> B)

(EXT : forall x, In x l -> f x = g x),

map f l = map g l.

The statement of lemma just says that embed_in_game exchanges
sum and sum_cg.

Lemma sum_is_sum (cg1 cg2 : combinatorial_game)

(pos1 : position cg1) (pos2 : position cg2) :

embed_in_game (sum_cg cg1 cg2) (pos1, pos2) =

sum (embed_in_game cg1 pos1) (embed_in_game cg2 pos2).

Proof.

remember (pos1, pos2) as pos.

replace pos1 with (fst pos) by (destruct pos; simpl; congruence).

replace pos2 with (snd pos) by (destruct pos; simpl; congruence).

clear.

induction pos as [[pos1 pos2] IH]

using (well_founded_ind cg_pair_order_wf).

rewrite embed_in_game_eq, sum_eq. simpl.

repeat rewrite (embed_in_game_moves _ _ Left).

repeat rewrite (embed_in_game_moves _ _ Right).

repeat rewrite map_app. repeat rewrite map_map.

do 2 f_equal;

apply map_ext_strong; intros pos IN; apply IH; constructor; eexists; eauto.

Qed.

In the next posts, we will see how to use this machinery when
decomposing games as sums and comparing subgames.

In this post, I will begin to formalize a small part of
combinatorial game theory using Coq. Combinatorial game theory
attempts to model sequential, deterministic games between two players,
both of which take turns causing the game state to change. It
restricts itself to *perfect information* games, where the current
configuration of the game is known to both players. Thus, it can be
used to study games such as chess, tic-tac-toe, and go, but not games
such as poker or blackjack.
The power of combinatorial game theory comes from abstracting away
details that are too specific to each game, such as whether it is
played moving pieces on a board, how the pieces can move, etc. It
defines a single mathematical object that can represent all games
uniformily, allowing us to study general situations that could occur
in many kinds of games. In this post, I will present this
representation and discuss why it makes sense. I will start with a
more intuitive formalization of combinatorial games, and then show how
combinatorial game theory applies to all of those.
## Defining combinatorial games

In combinatorial games, players are traditionally called *left* and
*right*.

Inductive player : Type := Left | Right.

Each combinatorial game has a set of *positions*. In chess, for
instance, a position is a chess board with black and white pieces on
it. The rules of the game determine which *moves* are available to
each player at a given position. They also describe how the game ends,
and which player wins in that case. To simplify our analysis, we will
assume that games end when some player must play but has no moves
left, in which case the other player wins. This frees us from having
to model how matches end separately for each game. We will also
consider only *finite* games, i.e. ones that can't be played
indefinitely. Notice that these two assumptions taken together,
strictly speaking, rule out many interesting games: chess can end in a
draw, something that can't happen in our model.
Translating the above requirements into code results in the following
definition:

Inductive combinatorial_game := CombinatorialGame {

position : Type;

moves : player -> position -> list position;

valid_move next current := exists s, In next (moves s current);

finite_game : well_founded valid_move

}.

To formalize how games are played, we define a predicate Match cg
first winner m that represents a match of game cg where player
first starts and player winner wins. m is the sequence of
positions traversed during the match, from first to last.

Definition other (s : player) : player :=

match s with

| Left => Right

| Right => Left

end.

Inductive Match cg : forall (first winner : player), list (position cg) -> Prop :=

| Match_end : forall pl pos,

moves cg pl pos = [] ->

Match cg pl (other pl) [pos]

| Match_move : forall pl winner pos next m,

In next (moves cg pl pos) ->

Match cg (other pl) winner (next :: m) ->

Match cg pl winner (pos :: next :: m).

In the Match_end clause, we check that the current player has no
moves left. In Match_move, we check that the current player can make
a move and that the match can then proceed with the other positions.
## A universal game

We will now define a combinatorial game that is, in a precise sense to
be explained later, the most general one. This is what combinatorial
game theory uses to study combinatorial games.
The crucial observation is that in the definition of Match we only
care about the moves that can be made from a given position, but not
about what the positions themselves *are*. This suggests a definition
where each position is just a pair of sets, one for each player's
moves. This forms the type game of games.

Inductive game := Game {

left_moves : list game;

right_moves : list game

}.

Each position in this game can be pictured as an
arbitrarily-branching tree with two sets of children. On each player's
turn, they choose one child in their set of moves to be the new
position, and lose if they can't choose anything.
The simplest game is zero, where both players have no moves
available. It is a game where the first player always loses.

Definition zero : game := Game [] [].

Using zero we can define star, where the first player *always*
wins by making a move to zero.

Definition star : game := Game [zero] [zero].

A small variation gives us one and minus_one, where Left and
Right always win, respectively.

Definition one : game := Game [zero] [].

Definition minus_one : game := Game [] [zero].

It should be possible to encapsulate game in a
combinatorial_game record. Defining moves is simple, but proving
that games are finite requires some additional work. The nested
inductive in the definition of game makes Coq generate an induction
principle that is too weak to be useful:

game_ind

: forall P : game -> Prop,

(forall left_moves right_moves : list game,

P {| left_moves := left_moves; right_moves := right_moves |}) ->

forall g : game, P g
The usual solution is to define a new one by hand that gives us
induction hypotheses to use:

game_ind

: forall P : game -> Prop,

(forall left_moves right_moves : list game,

P {| left_moves := left_moves; right_moves := right_moves |}) ->

forall g : game, P g

Lemma lift_forall :

forall T (P : T -> Prop),

(forall t, P t) ->

forall l, Forall P l.

Proof. induction l; auto. Defined.

Definition game_ind' (P : game -> Prop)

(H : forall l r, Forall P l -> Forall P r -> P (Game l r)) :

forall g : game, P g :=

fix F (g : game) : P g :=

match g with

| Game l r =>

H l r (lift_forall _ P F l) (lift_forall _ P F r)

end.

Using this principle, we can now prove that games always
terminate and define a combinatorial_game for game.

Definition game_as_cg : combinatorial_game.

refine ({| position := game;

moves s := if s then left_moves else right_moves |}).

intros p1.

induction p1 as [l r IHl IHr] using game_ind'.

(* ... *)

Defined.

Definition game_embedding (cg1 cg2 : combinatorial_game)

(embedding : position cg1 -> position cg2) : Prop :=

forall first winner (m : list (position cg1)),

Match cg1 first winner m ->

Match cg2 first winner (map embedding m).

With this notion of game embedding, combinatorial games form a
category. I will now show that every combinatorial game can be
embedded in game, making game a terminal object in this category
and the most general combinatorial game. In this formulation, it is
only a *weakly* terminal object (i.e., embeddings are not unique), as
we are using Coq lists to represent sets.
To embed an arbitrary combinatorial game into game, we can define a
function by well-founded recursion over the proof that games are
finite. In order to do this, we need a higher-order function
map_game that allows us to perform a well-founded recursive call on
a list of next moves. map_game acts like map, but passes to its
argument function a proof that the element is a valid_move.

Fixpoint map_In {A B} (l : list A) : (forall x, In x l -> B) -> list B :=

match l with

| [] => fun _ => []

| x :: l' => fun f =>

f x (or_introl _ eq_refl)

:: map_In l' (fun x P => f x (or_intror _ P))

end.

Definition map_game {A} (cg : combinatorial_game)

(pos : position cg) (p : player)

(f : forall pos', valid_move cg pos' pos -> A) : list A :=

map_In (moves cg p pos) (fun pos' P => f pos' (ex_intro _ p P)).

Using this function and the Fix combinator in the standard
library, we write a generic embedding function embed_in_game. Like a
regular fixpoint combinator, Fix takes a function that does a
recursive call by applying its argument (here, F). The difference is
that this argument must take a *proof* that shows that the recursive
call is valid.
The behavior of embed_in_game is simple: it calls itself recursively
for each possible next position, and includes that position in the set
of moves of the appropriate player.

Definition embed_in_game cg (pos : position cg) : game :=

Fix (finite_game cg)

(fun _ => position game_as_cg)

(fun pos F =>

Game (map_game cg pos Left F)

(map_game cg pos Right F))

pos.

Definitions that use Fix can be hard to manipulate directly, so
we need to prove some equations that describe the reduction behavior
of the function. I've hidden some of the auxiliary lemmas and proofs
for clarity; as usual, you can find them in the original .v file.
The proof that we can unfold embed_in_game once uses the Fix_eq
lemma in the standard library.

Lemma embed_in_game_eq cg (pos : position cg) :

embed_in_game cg pos =

Game (map (embed_in_game cg) (moves cg Left pos))

(map (embed_in_game cg) (moves cg Right pos)).

Proof.

unfold embed_in_game in *.

rewrite Fix_eq.

(* ... *)

Qed.

With this lemma, we can show that moves and embed_in_game commute.

Lemma embed_in_game_moves cg (p : position cg) :

forall s, moves game_as_cg s (embed_in_game cg p) =

map (embed_in_game cg) (moves cg s p).

Proof.

intros.

rewrite embed_in_game_eq.

destruct s; reflexivity.

Qed.

We are now ready to state and prove our theorem: every
combinatorial game can be embedded in game.

Theorem embed_in_game_correct cg :

game_embedding cg game_as_cg (embed_in_game cg).

Proof.

unfold game_embedding.

intros first winner m MATCH.

induction MATCH as [winner p H|s winner p p' m IN MATCH IH];

simpl; constructor; eauto.

- rewrite embed_in_game_moves, H. reflexivity.

- rewrite embed_in_game_moves. auto using in_map.

Qed.