Where the Coq lives

| Comments GitHub

Searching with eggs

Consider the following problem. Suppose we are in a 100-story building. We know that, when dropping an egg from the window, the egg will stay intact if we are below a certain floor. However, if we repeat the same exercise above that critical floor, the egg will break. How can we find this floor and minimize the number of egg drops performed in the worst case if we have only two eggs? We suppose that we are allowed to reuse eggs that fall without breaking.
We will see how we can model this problem in Coq and find a correct solution. We model a playing strategy as a decision tree:

Inductive strategy : Type :=
| Guess (floor : nat)
| Drop (floor : nat) (broken intact : strategy).

In the above definition, Guess floor represents the end of the algorithm, when we try to guess the target floor. If floor is equal to the target, we win the game; otherwise, we lose. Drop floor broken intact represents an egg drop at floor. If the egg breaks, we continue playing with strategy broken; otherwise, we continue with intact.
Simulating an egg-drop game is just a matter of performing a tree search. play target s returns true if and only if strategy s succeeds in guessing floor target.

Fixpoint play (target : nat) (s : strategy) : bool :=
  match s with
  | Guess floor =>
    beq_nat floor target
  | Drop floor broken intact =>
    play target (if leb target floor then broken
                 else intact)

We can also find how many eggs a strategy needs and how many drops are performed in the worst case. drops just computes the strategy tree height, whereas eggs computes a "skewed" height, where right branches do not add to the final value.

Fixpoint eggs (s : strategy) : nat :=
  match s with
  | Guess _ => 0
  | Drop _ broken intact => max (S (eggs broken)) (eggs intact)

Fixpoint drops (s : strategy) : nat :=
  match s with
  | Guess _ => 0
  | Drop _ broken intact => S (max (drops broken) (drops intact))

Finally, using these concepts, we can describe what the solution for our problem is. winning lower range s says that strategy s is able to find range target floors starting at lower, while is_optimal range e d states that there is a winning strategy for guessing range target floors, uses at most e eggs and performing at most d drops, such that d is the smallest possible.

Definition winning (lower range : nat) (s : strategy) : Prop :=
  forall target, lower <= target < lower + range ->
                 play target s = true.

Definition is_optimal (range e d : nat) : Prop :=
  exists s : strategy,
    eggs s <= e /\
    drops s = d /\
    winning 0 range s /\
    forall s', eggs s' <= e ->
               winning 0 range s' ->
               d <= drops s'.

A simple first strategy

Before trying to find an optimal solution, let us try to find one that works. A simple strategy is to perform linear search, starting at the bottom and going up one floor at a time. As soon as the egg breaks, we know that we have found our target.

Fixpoint linear (lower range : nat) : strategy :=
  match range with
  | 0 => Guess lower
  | S range' => Drop lower (Guess lower) (linear (S lower) range')

linear lower range works on a range of S range floors, using at most one egg. Because of this, it is not very efficient, performing n drops in the worst case.

Lemma linear_winning lower range :
  winning lower (S range) (linear lower range).

Lemma linear_eggs lower range :
  eggs (linear lower range) = min 1 range.

Lemma linear_drops lower range :
  drops (linear lower range) = range.

Improving our strategy

Now that we know a naive search strategy, we will try to find an optimal one. Intuitively, it seems that it should be possible to perform some form of binary search, starting at the middle of our floor range, and continuing recursively depending on the outcome of our drop.
Although much better than a pure linear strategy, this is still far from being optimal: if our egg doesn't break on our first drop, we will still be using at most only one egg on that upper range. If we allowed ourselves to break one of them, presumably, we would be able to solve the upper range in less than 50 drops, which would in turn allow us to perform our first drop at a lower floor, e.g.

Definition solution_take_2 : strategy :=
  Drop 33 (linear 0 33)
          (Drop 66 (linear 34 32) (linear 67 32)).

Lemma solution_take_2_winning :
  winning 0 100 solution_take_2.

Lemma solution_take_2_eggs :
  eggs solution_take_2 = 2.

Lemma solution_take_2_drops :
  drops solution_take_2 = 34.

Our new solution performs much better, but there is still room for improvement. Once again, if our first two drops are below the target floor, we will be using only one egg to search through 33 floors, which could be done better if we had used both of them. This suggests that the optimal strategy should be some form of skewed binary search, where the upper range that is produced by an egg drop should use its available eggs in an optimal way.

Finding the optimum

How can we formalize the intuition that we just developed? The key insight is to reason by duality and, instead, ask "what is the largest range of floors we can scan using at most some number of eggs and drops?" When looking at the problem this way, it becomes clear that optimality has a recursive structure that is easy to describe: to find a floor using at most e eggs and d drops, we need to combine two optimal strategies: one using at most e-1 eggs and d-1 drops, for the case where our first drop causes the egg to break, and another using at most e eggs and d-1 drops, for the case where our egg does not break at first. We can readily express this idea as code. optimal_range e d computes the maximal range size that can be solved using e eggs and d drops at most, while optimal lower e d computes a strategy for doing so starting from floor lower.

Fixpoint optimal_range_minus_1 (e d : nat) {struct d} : nat :=
  match d, e with
  | S d', S e' => S (optimal_range_minus_1 e' d' +
                     optimal_range_minus_1 (S e') d')
  | _, _ => 0

Definition optimal_range e d := S (optimal_range_minus_1 e d).

Fixpoint optimal (lower e d : nat) {struct d} : strategy :=
  match d, e with
  | S d', S e' =>
    let floor := lower + optimal_range_minus_1 e' d' in
    Drop floor
         (optimal lower e' d')
         (optimal (S floor) (S e') d')
  | _, _ => Guess lower

It is easy to show that optimal lower e d scans optimal_range e d floors, as well that it uses the resources that we expect.

Lemma optimal_winning lower e d :
  winning lower (optimal_range e d) (optimal lower e d).

Lemma optimal_eggs lower e d :
  eggs (optimal lower e d) = min e d.

Lemma optimal_drops lower e d :
  drops (optimal lower e d) = min 1 e * d.

To actually show optimality, we need to show that optimal_range indeed computes what it is supposed to. We start by showing two inversion lemmas, relating the range that is scanned by a winning strategy to the range scanned by its sub-strategies.

Lemma winning_inv_guess lower range floor :
  winning lower range (Guess floor) -> range <= 1.

Lemma winning_inv_drop lower range floor broken intact :
  winning lower range (Drop floor broken intact) ->
  exists r1 r2 lower',
    range = r1 + r2 /\
    winning lower r1 broken /\
    winning lower' r2 intact.

Lemma optimal_range_correct :
  forall lower e d s range,
    eggs s <= e ->
    drops s <= d ->
    winning lower range s ->
    range <= optimal_range e d.

Combining this lemma with the ranges we had derived for linear and optimal, we can prove useful results about optimal_range.

Lemma optimal_range_monotone :
  forall e e' d d',
    e <= e' ->
    d <= d' ->
    optimal_range e d <= optimal_range e' d'.
  intros e e' d d' He Hd.
  apply (optimal_range_correct 0 e' d' (optimal 0 e d));
    [ rewrite optimal_eggs; lia
    | rewrite optimal_drops; destruct e; simpl; lia
    | apply optimal_winning ].

Lemma optimal_range_lower_bound :
  forall e d, d <= (optimal_range (S e) d).
  intros e d.
  cut (S d <= optimal_range (S e) d); try lia.
  apply (optimal_range_correct 0 (S e) d (linear 0 d));
    [ rewrite linear_eggs
    | rewrite linear_drops
    | apply linear_winning ]; lia.

Given that optimal_range is monotone, we can find what the optimal number of drops for a given range is by picking the smallest value of t such that range <= optimal_range e t. We formalize this idea by writing a generic function find_root that can find such values for any monotone function f, given a suitable initial guess.

Fixpoint find_root (f : nat -> nat) (target n : nat) : nat :=
  match n with
  | 0 => 0
  | S n' =>
    if leb target (f n') then
      find_root f target n'
      S n'

Lemma find_root_correct :
  forall f target n,
    (forall x y, x <= y -> f x <= f y) ->
    target <= f n ->
    let x := find_root f target n in
    target <= f x /\
    forall y, y < x -> f y < target.

By instantiating this theorem with optimal_range and applying the appropriate theorems, we obtain our final result. The proof of optimality goes by contradiction. Let t = find_optimum (S e) range. if we find another strategy s such that eggs s <= S e and drops s < t, we know that range <= optimal_range (S e) t by optimal_range_correct, but we must also have optimal_range (S e) t < range by the correctness of find_root.

Definition find_optimum e target :=
  find_root (optimal_range e) target target.

Lemma find_optimum_correct :
  forall e range,
    let d := find_optimum (S e) range in
    is_optimal range (S e) d.
  intros e range d.
  assert (H : range <= optimal_range (S e) d /\
              forall d', d' < d -> optimal_range (S e) d' < range).
  (* By correctness of find_root *)
  destruct H as [H1 H2].
  exists (optimal 0 (S e) d).
  rewrite optimal_drops, optimal_eggs.
  repeat split; try lia; simpl; try lia.
  - intros x Hx.
    apply optimal_winning. lia.
  - intros s Hs WIN.
    destruct (le_lt_dec d (drops s)) as [LE | LT]; trivial.
    assert (Hd : drops s <= drops s) by lia.
    assert (He : eggs s <= S e) by lia.
    (* optimal_range < range *)
    pose proof (H2 _ LT).
    (* range <= optimal_range *)
    pose proof (optimal_range_correct _ _ _ _ _ He Hd WIN).

Using this result, we can find the answer for our original problem.

Lemma solution :
  is_optimal 100 2 14.
Proof. apply find_optimum_correct. Qed.
comments powered by Disqus