Where the Coq lives

| Comments GitHub

Automatic naming considered harmful

Whenever a user doesn't name a variable in an Ltac script, Coq will automatically generate a fresh name for it. After doing some proofs, however, you start to realize that this behavior is more of a nuisance than a feature. The heuristics used by Coq when choosing new names are hard to predict and depend on the current proof state, making proofs likely to break whenever a small change is introduced. As every experienced Coq user knows, trying to figure out whether hypothesis H22 now corresponds to H9 or H24 when fixing a complicated proof script cannot exactly be described as a pleasant experience, specially when the proof script is 2000 lines long and you've spent hours dealing with similar horrors.
When writing robust proof scripts, one must ensure that the code doesn't depend on weird automatically chosen names. Here are some tips for making the naming problem disappear.

Name variables explicitly

Most of the times, you can get rid of this problem by not being lazy and naming everything explicitly. Avoid using tactics like intros or destruct in their unnamed form; instead, use intros v1 v2 v3 ... or destruct v as [a1 a2|a1 a2 a3|...]. You can also name hypothesis in the statement of your lemma, using a forall instead of a plain implication arrow (remember, in Coq, these are actually the same thing). If you don't pass any names to intros, it will try to keep the same names that appear in the current goal:

Lemma l1 : forall (A : Prop) (HYP : A), A.
Proof. intros. exact HYP. Qed.

This also works when reintroducing a variable from the context into the goal, for instance with generalize dependent.

Name constructor arguments

When working with complicated inductive predicates, naming all new variables explicitly on every destruct or inversion can be cumbersome. A reasonable compromise is to name all the arguments to your constructors in inductive definitions. This makes the names that are generated by those tactics meaningful, more stable and predictable. Here's an example of a simple inductively defined relation with explicit argument names:

Inductive collatz : nat -> nat -> Prop :=
| collatz_even (n m : nat)
               (EVEN : 2 * m = n) : collatz n m
| collatz_odd (n m : nat)
              (ODD : 2 * m + 1 = n) : collatz n (3 * n + 1)
| collatz_trans (n m p : nat)
                (COLLATZnm : collatz n m)
                (COLLATZmp : collatz m p) : collatz n p.

Don't refer to names in proofs

Weird automatically generated names are not a problem if they do not appear in Ltac scripts. Automated proofs can obviously help here, but there are small things you can do even if you're not an automation expert. By using match, you can avoid mentioning automatically generated names without having to convert an unnamed inversion or destruct into a named one:

Lemma collatz_cycle :
  forall n m,
    (n = 1 \/ n = 4 \/ n = 2) ->
    collatz n m ->
    m = 1 \/ m = 4 \/ m = 2.
  match goal with
    | H : collatz _ _ |- _ => induction H
  - omega.
  - repeat match goal with
             | H : _ \/ _ |- _ => destruct H
           end; omega.
  - match goal with
      | H1 : ?P,
        H2 : ?P -> ?Q,
        H3 : ?Q -> ?R |- ?R =>
        apply H3; apply H2; apply H1

In the above (admittedly artificial) example, we were able to refer exactly to what we wanted in the context without having to resort to the names that Coq chose. Notice how in the third case we were able to specify how premises and conclusions should match when picking the hypotheses. In most cases, a simple match suffices to remove explicit names from proofs.

| Comments GitHub

Summing combinatorial games

In the previous post, we've introduced the concept of combinatorial game and saw how we can employ a simple formalism to model all such games uniformly, the game type. To see what this point of view can bring us, we'll discuss the idea of summing combinatorial games, which can be used to analyze games by decomposing them into smaller, simpler ones. As we'll see, this notion has a straightforward computational interpretation for the game type, making it convenient to manipulate and reason about.
The intuition behind summing games is that they often evolve to a point where they can be seen as several smaller ones being played "simultaneously". In go, for instance, the board is progressively partioned into different regions, and each player must choose in which of those to play during their turn. This suggests a representation where the state of a game is given by the product of the states of the subgames. To make a move, a player modifies the state of one of the subgames according to its rules, leaving the rest of the state untouched.
To define the sum of two combinatorial games (as given by the combinatorial_game type), we can simply combine the lists of moves available for each game component. We must still show that this results in a terminating game, but it suffices to observe that the valid_move relation in this case is equivalent to the product of two well-founded relations, and the Coq standard library provides lemmas for dealing with this case.

Definition cg_pair_order {cg1 cg2} :=
  symprod _ _ (valid_move cg1) (valid_move cg2).
Definition cg_pair_order_wf {cg1 cg2} : well_founded cg_pair_order :=
  wf_symprod _ _ _ _ (finite_game cg1) (finite_game cg2).

Program Definition sum_cg (cg1 cg2 : combinatorial_game)
                          : combinatorial_game :=
  {| position := position cg1 * position cg2;
     moves p pos := map (fun pos1' => (pos1', snd pos))
                        (moves cg1 p (fst pos)) ++
                    map (fun pos2' => (fst pos, pos2'))
                        (moves cg2 p (snd pos)) |}.
Next Obligation.
  match goal with
    | |- well_founded ?R =>
      assert (EQ : RelationClasses.relation_equivalence R cg_pair_order)
  (* ... *)
  rewrite EQ.
  apply cg_pair_order_wf.

A problem with this definition is that it is not directly amenable to computation. We can overcome this problem by defining game sums directly over the game type. Since game is universal, we can hope this should be enough to define what a sum of games is generically. A naive adaptation doesn't work, as recursive calls to sum don't have a single decreasing argument, e.g.

Fail Fixpoint sum (g1 g2 : game) : game :=
  Game (map_game game_as_cg g1 Left (fun g1' P => sum g1' g2) ++
        map_game game_as_cg g2 Left (fun g2' P => sum g1 g2'))
       (map_game game_as_cg g1 Right (fun g1' P => sum g1' g2) ++
        map_game game_as_cg g2 Right (fun g2' P => sum g1 g2')).
(* Error: Cannot guess decreasing argument of fix. *)

One solution is again to pair both arguments and use the Fix combinator with cg_pair_order_wf. Manipulating proof terms in the recursive calls can be made less awkward by using the refine tactic:

Definition sum (g1 g2 : game) : game.
  refine (
    Fix cg_pair_order_wf (fun _ => game)
        (fun gs =>
           match gs with
           | (g1, g2) => fun sum =>
             let sum_fst g1' P := sum (g1', g2) _ in
             let sum_snd g2' P := sum (g1, g2') _ in
             Game (map_game game_as_cg g1 Left sum_fst ++
                   map_game game_as_cg g2 Left sum_snd)
                  (map_game game_as_cg g1 Right sum_fst ++
                   map_game game_as_cg g2 Right sum_snd)
           end) (g1, g2));
  clear - P; constructor; eauto.

As with all definitions involving Fix, we must now prove a lemma that shows how sum unfolds. The proof is very similar to the one of embed_in_game_eq, and is thus omitted.

Lemma sum_eq (g1 g2 : game) :
  sum g1 g2 =
  Game (map (fun g1' => sum g1' g2) (left_moves g1) ++
        map (fun g2' => sum g1 g2') (left_moves g2))
       (map (fun g1' => sum g1' g2) (right_moves g1) ++
        map (fun g2' => sum g1 g2') (right_moves g2)).

The name sum suggests that combinatorial games could behave like numbers. We won't discuss this correspondence in much detail for now, but some interesting identities do show up already:

Lemma zero_plus_zero : sum zero zero = zero.

Lemma one_plus_zero : sum one zero = one.

Showing that sum correctly computes what it should is not difficult: we proceed by well-founded induction combining simple lemmas about the behavior of sum, embed_in_game, and map. We need an auxiliary result that allows us to apply our inductive hypothesis:

Lemma map_ext_strong :
  forall {A B}
         (l : list A)
         (f g : A -> B)
         (EXT : forall x, In x l -> f x = g x),
    map f l = map g l.

The statement of lemma just says that embed_in_game exchanges sum and sum_cg.

Lemma sum_is_sum (cg1 cg2 : combinatorial_game)
                 (pos1 : position cg1) (pos2 : position cg2) :
  embed_in_game (sum_cg cg1 cg2) (pos1, pos2) =
  sum (embed_in_game cg1 pos1) (embed_in_game cg2 pos2).
  remember (pos1, pos2) as pos.
  replace pos1 with (fst pos) by (destruct pos; simpl; congruence).
  replace pos2 with (snd pos) by (destruct pos; simpl; congruence).
  induction pos as [[pos1 pos2] IH]
                using (well_founded_ind cg_pair_order_wf).
  rewrite embed_in_game_eq, sum_eq. simpl.
  repeat rewrite (embed_in_game_moves _ _ Left).
  repeat rewrite (embed_in_game_moves _ _ Right).
  repeat rewrite map_app. repeat rewrite map_map.
  do 2 f_equal;
  apply map_ext_strong; intros pos IN; apply IH; constructor; eexists; eauto.

In the next posts, we will see how to use this machinery when decomposing games as sums and comparing subgames.

| Comments GitHub

An introduction to combinatorial game theory

In this post, I will begin to formalize a small part of combinatorial game theory using Coq. Combinatorial game theory attempts to model sequential, deterministic games between two players, both of which take turns causing the game state to change. It restricts itself to perfect information games, where the current configuration of the game is known to both players. Thus, it can be used to study games such as chess, tic-tac-toe, and go, but not games such as poker or blackjack.
The power of combinatorial game theory comes from abstracting away details that are too specific to each game, such as whether it is played moving pieces on a board, how the pieces can move, etc. It defines a single mathematical object that can represent all games uniformily, allowing us to study general situations that could occur in many kinds of games. In this post, I will present this representation and discuss why it makes sense. I will start with a more intuitive formalization of combinatorial games, and then show how combinatorial game theory applies to all of those.

Defining combinatorial games

In combinatorial games, players are traditionally called left and right.

Inductive player : Type := Left | Right.

Each combinatorial game has a set of positions. In chess, for instance, a position is a chess board with black and white pieces on it. The rules of the game determine which moves are available to each player at a given position. They also describe how the game ends, and which player wins in that case. To simplify our analysis, we will assume that games end when some player must play but has no moves left, in which case the other player wins. This frees us from having to model how matches end separately for each game. We will also consider only finite games, i.e. ones that can't be played indefinitely. Notice that these two assumptions taken together, strictly speaking, rule out many interesting games: chess can end in a draw, something that can't happen in our model.
Translating the above requirements into code results in the following definition:

Inductive combinatorial_game := CombinatorialGame {
  position : Type;
  moves : player -> position -> list position;
  valid_move next current := exists s, In next (moves s current);
  finite_game : well_founded valid_move

To formalize how games are played, we define a predicate Match cg first winner m that represents a match of game cg where player first starts and player winner wins. m is the sequence of positions traversed during the match, from first to last.

Definition other (s : player) : player :=
  match s with
    | Left => Right
    | Right => Left

Inductive Match cg : forall (first winner : player), list (position cg) -> Prop :=
| Match_end : forall pl pos,
                moves cg pl pos = [] ->
                Match cg pl (other pl) [pos]
| Match_move : forall pl winner pos next m,
                 In next (moves cg pl pos) ->
                 Match cg (other pl) winner (next :: m) ->
                 Match cg pl winner (pos :: next :: m).

In the Match_end clause, we check that the current player has no moves left. In Match_move, we check that the current player can make a move and that the match can then proceed with the other positions.

A universal game

We will now define a combinatorial game that is, in a precise sense to be explained later, the most general one. This is what combinatorial game theory uses to study combinatorial games.
The crucial observation is that in the definition of Match we only care about the moves that can be made from a given position, but not about what the positions themselves are. This suggests a definition where each position is just a pair of sets, one for each player's moves. This forms the type game of games.

Inductive game := Game {
  left_moves : list game;
  right_moves : list game

Each position in this game can be pictured as an arbitrarily-branching tree with two sets of children. On each player's turn, they choose one child in their set of moves to be the new position, and lose if they can't choose anything.
The simplest game is zero, where both players have no moves available. It is a game where the first player always loses.

Definition zero : game := Game [] [].

Using zero we can define star, where the first player always wins by making a move to zero.

Definition star : game := Game [zero] [zero].

A small variation gives us one and minus_one, where Left and Right always win, respectively.

Definition one : game := Game [zero] [].
Definition minus_one : game := Game [] [zero].

It should be possible to encapsulate game in a combinatorial_game record. Defining moves is simple, but proving that games are finite requires some additional work. The nested inductive in the definition of game makes Coq generate an induction principle that is too weak to be useful:

     : forall P : game -> Prop,
       (forall left_moves right_moves : list game,
        P {| left_moves := left_moves; right_moves := right_moves |}) ->
       forall g : game, P g
The usual solution is to define a new one by hand that gives us induction hypotheses to use:

Lemma lift_forall :
  forall T (P : T -> Prop),
    (forall t, P t) ->
    forall l, Forall P l.
Proof. induction l; auto. Defined.

Definition game_ind' (P : game -> Prop)
                     (H : forall l r, Forall P l -> Forall P r -> P (Game l r)) :
  forall g : game, P g :=
  fix F (g : game) : P g :=
  match g with
    | Game l r =>
      H l r (lift_forall _ P F l) (lift_forall _ P F r)

Using this principle, we can now prove that games always terminate and define a combinatorial_game for game.

Definition game_as_cg : combinatorial_game.
  refine ({| position := game;
             moves s := if s then left_moves else right_moves |}).
  intros p1.
  induction p1 as [l r IHl IHr] using game_ind'.
  (* ... *)

Game embeddings

I claimed that game is the most general combinatorial game. One way of seeing this is that we lose no information by representing each position in a combinatorial game as the tree of all possible moves, and such a tree can always be encoded as a game. To make this intuition formal, we can define a notion of game embedding between two combinatorial games. This will be a mapping between the positions of each combinatorial game that preserves matches. Thus, if we have an embedding of cg1 into cg2, then we can study cg1 matches by regarding them as cg2 matches.

Definition game_embedding (cg1 cg2 : combinatorial_game)
           (embedding : position cg1 -> position cg2) : Prop :=
  forall first winner (m : list (position cg1)),
    Match cg1 first winner m ->
    Match cg2 first winner (map embedding m).

With this notion of game embedding, combinatorial games form a category. I will now show that every combinatorial game can be embedded in game, making game a terminal object in this category and the most general combinatorial game. In this formulation, it is only a weakly terminal object (i.e., embeddings are not unique), as we are using Coq lists to represent sets.
To embed an arbitrary combinatorial game into game, we can define a function by well-founded recursion over the proof that games are finite. In order to do this, we need a higher-order function map_game that allows us to perform a well-founded recursive call on a list of next moves. map_game acts like map, but passes to its argument function a proof that the element is a valid_move.

Fixpoint map_In {A B} (l : list A) : (forall x, In x l -> B) -> list B :=
  match l with
    | [] => fun _ => []
    | x :: l' => fun f =>
                   f x (or_introl _ eq_refl)
                     :: map_In l' (fun x P => f x (or_intror _ P))

Definition map_game {A} (cg : combinatorial_game)
                    (pos : position cg) (p : player)
                    (f : forall pos', valid_move cg pos' pos -> A) : list A :=
  map_In (moves cg p pos) (fun pos' P => f pos' (ex_intro _ p P)).

Using this function and the Fix combinator in the standard library, we write a generic embedding function embed_in_game. Like a regular fixpoint combinator, Fix takes a function that does a recursive call by applying its argument (here, F). The difference is that this argument must take a proof that shows that the recursive call is valid.
The behavior of embed_in_game is simple: it calls itself recursively for each possible next position, and includes that position in the set of moves of the appropriate player.

Definition embed_in_game cg (pos : position cg) : game :=
  Fix (finite_game cg)
      (fun _ => position game_as_cg)
      (fun pos F =>
         Game (map_game cg pos Left F)
              (map_game cg pos Right F))
Definitions that use Fix can be hard to manipulate directly, so we need to prove some equations that describe the reduction behavior of the function. I've hidden some of the auxiliary lemmas and proofs for clarity; as usual, you can find them in the original .v file.
The proof that we can unfold embed_in_game once uses the Fix_eq lemma in the standard library.

Lemma embed_in_game_eq cg (pos : position cg) :
  embed_in_game cg pos =
  Game (map (embed_in_game cg) (moves cg Left pos))
       (map (embed_in_game cg) (moves cg Right pos)).
  unfold embed_in_game in *.
  rewrite Fix_eq.
  (* ... *)

With this lemma, we can show that moves and embed_in_game commute.

Lemma embed_in_game_moves cg (p : position cg) :
  forall s, moves game_as_cg s (embed_in_game cg p) =
            map (embed_in_game cg) (moves cg s p).
  rewrite embed_in_game_eq.
  destruct s; reflexivity.

We are now ready to state and prove our theorem: every combinatorial game can be embedded in game.

Theorem embed_in_game_correct cg :
  game_embedding cg game_as_cg (embed_in_game cg).
  unfold game_embedding.
  intros first winner m MATCH.
  induction MATCH as [winner p H|s winner p p' m IN MATCH IH];
  simpl; constructor; eauto.
  - rewrite embed_in_game_moves, H. reflexivity.
  - rewrite embed_in_game_moves. auto using in_map.


We've developed the foundations of combinatorial game theory, showing how it can model combinatorial games in a simple yet general way. We haven't explored yet how to use this representation in practice to study games, something I plan to do on future posts.
Update: I'll leave the list of posts in this series here.