Solution to The Impossible Bet | The 100 Prisoners Problem

minutephysics · 4 min read

Based on minutephysics's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Independent random guessing yields an all-100 success probability of (1/2)^100, which is effectively zero.

Briefing

The “impossible bet” in the 100 prisoners problem looks unwinnable because each person is limited to checking only 50 boxes, and naive random choices make the chance that all 100 find their own marked bills astronomically small—about 2^-100, essentially zero. Even small coordination tweaks (like making two people avoid picking the same box) only help marginally, and the advantage shrinks as the number of prisoners grows.

The winning insight is that the boxes aren’t just hiding information—they encode it. The room is identical every time: box numbers are fixed, and the bill inside each box points deterministically to another box. That means a prisoner can treat the contents as a set of directed links: if a prisoner opens box i and finds the bill labeled j, the prisoner should next open box j. Starting from the prisoner’s own number, this creates a “scavenger hunt” path that follows the chain formed by the bills’ labels.
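
The rule described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the video: the function name `follow_the_bill`, the list `boxes`, and the 0-based numbering are all assumptions made for convenience (the video numbers boxes from 1).

```python
def follow_the_bill(boxes, prisoner, max_opens):
    """Return True if `prisoner` finds their own bill within `max_opens` openings.

    Assumption: boxes[i] is the number on the bill inside box i, and
    boxes and prisoners are numbered from 0.
    """
    current = prisoner              # start at the box with your own number
    for _ in range(max_opens):
        bill = boxes[current]       # open the box and read the bill
        if bill == prisoner:        # this is our own bill
            return True
        current = bill              # otherwise the bill names the next box
    return False


# Toy example: 6 prisoners, each allowed 3 openings.
boxes = [2, 0, 1, 4, 3, 5]          # box 0 holds bill 2, box 1 holds bill 0, ...
results = [follow_the_bill(boxes, p, 3) for p in range(6)]
print(results)                      # [True, True, True, True, True, True]
```

In this toy arrangement the longest chain has three boxes, so all six prisoners succeed with three openings each; with only two openings, the prisoners on that three-box chain would fail.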

These chains behave like loops in a finite directed graph. Because each bill number appears in exactly one box, a chain can only close by returning to the box it started from, and the longest possible chain includes all 100 boxes. The crucial win condition is whether the cycle closes within the first 50 steps. If a prisoner’s path returns to the starting box within 50 openings, the last box opened must have held the bill pointing back to their own number, which means the prisoner has found their own marked bill. If the cycle is longer than 50, the prisoner runs out of allowed checks and fails.
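
To make the win condition concrete, the sketch below splits an arrangement into its loops and checks whether the longest one fits within the allowed number of openings. The helper names `cycle_lengths` and `group_wins` are hypothetical, and boxes are numbered from 0 here purely for convenience.

```python
def cycle_lengths(boxes):
    """Lengths of all loops in the arrangement (boxes[i] = bill inside box i)."""
    seen = [False] * len(boxes)
    lengths = []
    for start in range(len(boxes)):
        if seen[start]:
            continue                    # this box already belongs to a counted loop
        length, current = 0, start
        while not seen[current]:
            seen[current] = True
            current = boxes[current]    # follow the bill to the next box
            length += 1
        lengths.append(length)
    return lengths


def group_wins(boxes, max_opens):
    """Everyone succeeds exactly when no loop is longer than max_opens."""
    return max(cycle_lengths(boxes)) <= max_opens


boxes = [2, 0, 1, 4, 3, 5]
print(cycle_lengths(boxes))             # [3, 2, 1]
print(group_wins(boxes, 3))             # True
print(group_wins(boxes, 2))             # False
```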

Why this lifts the group’s odds so dramatically comes down to how random box arrangements generate chain lengths. For a random permutation of bills across boxes, any single prisoner’s cycle is 50 boxes or shorter exactly half the time, so individual odds are unchanged. What changes is the group probability: when all 100 prisoners follow the same “follow the bill” strategy from their own starting boxes, everyone succeeds whenever no cycle is longer than 50, which happens a little over 30% of the time (about 31%). That’s far larger than the near-zero probability from independent random guessing, and it’s also better than the 25% chance that just two prisoners would both succeed by choosing randomly.
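
The “little over 30%” figure can be reproduced exactly. The sketch below relies on a standard counting fact that the summary does not spell out: in a uniformly random arrangement of n boxes, the chance of a loop of length exactly k is 1/k whenever k is more than n/2. The function name `group_win_probability` is an assumption for illustration.

```python
from fractions import Fraction


def group_win_probability(n, max_opens):
    """Exact chance that no loop exceeds max_opens in a random arrangement of n boxes.

    Assumption: max_opens >= n / 2, so at most one loop can be too long and
    the probabilities for each too-long length simply add up.
    """
    if 2 * max_opens < n:
        raise ValueError("formula only valid when max_opens >= n / 2")
    p_fail = sum(Fraction(1, k) for k in range(max_opens + 1, n + 1))
    return 1 - p_fail


p = group_win_probability(100, 50)
print(round(float(p), 4))               # 0.3118
print(group_win_probability(2, 1))      # 1/2
```

For comparison, the same formula gives two prisoners with one opening each a 1/2 chance under this strategy, double the 25% they get from random choices.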

The strategy’s real power isn’t that any one prisoner becomes more likely to succeed in isolation; it’s that outcomes become strongly correlated. Because every prisoner’s fate depends on the same underlying chain structure, successes and failures cluster: if no chain is longer than 50, every prisoner closes their loop in time; if one chain is longer than 50, every prisoner on it (more than half the group) fails at once. In effect, the prisoners’ linked paths turn 100 separate coin flips into a single shared outcome, making an “impossible” collective bet one with a meaningful chance of success.
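
A quick simulation can illustrate that individual odds stay near one half while the group odds rise to roughly 31%. This is a rough Monte Carlo sketch under assumed parameters (2000 trials, a fixed seed, 0-based numbering); the function name `simulate` is hypothetical, and the estimates will vary slightly with the seed.

```python
import random


def simulate(n=100, max_opens=50, trials=2000, seed=0):
    """Estimate (group win rate, individual success rate) for the loop strategy."""
    rng = random.Random(seed)
    group_wins = 0
    individual_successes = 0
    for _trial in range(trials):
        boxes = list(range(n))
        rng.shuffle(boxes)                  # random arrangement of bills
        successes = 0
        for prisoner in range(n):
            current = prisoner
            for _step in range(max_opens):
                current = boxes[current]    # open a box, read its bill
                if current == prisoner:     # own bill found
                    successes += 1
                    break
        individual_successes += successes
        group_wins += (successes == n)      # group wins only if all succeed
    return group_wins / trials, individual_successes / (trials * n)


group_rate, individual_rate = simulate()
print(f"group win rate:      {group_rate:.3f}")
print(f"individual win rate: {individual_rate:.3f}")
```

The individual rate should hover around 0.5 and the group rate around 0.31, which is the correlation effect in numbers: the per-person odds are no better than guessing, but the failures pile up in the same arrangements.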

Cornell Notes

Random guessing makes the 100-prisoners problem effectively unwinnable because the probability that all 100 independently find their own bills is about 2^-100. The key fix is to use information inside the boxes: each bill label tells the next box to open. Starting at the prisoner’s own number, the prisoner follows a deterministic path defined by the permutation of bills, forming a chain that eventually cycles. Because every box is some prisoner’s starting point, the group wins exactly when every cycle closes within 50 steps. For a random arrangement of bills, no cycle exceeds 50 a little over 30% of the time, and in those cases all 100 prisoners succeed together.

Why does independent random selection make the collective win probability essentially zero?

Each prisoner must find their own bill within 50 openings. If everyone chooses 50 boxes at random, the chance a given prisoner finds their own bill is 50/100 = 1/2. Assuming independence, the chance all 100 succeed is (1/2)^100, which is about 7.9 × 10^-31, effectively zero.
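
The arithmetic is easy to verify with exact fractions. The variable names below are illustrative only.

```python
from fractions import Fraction

p_single = Fraction(50, 100)        # 50 of 100 boxes opened at random
p_all = p_single ** 100             # all 100 prisoners succeed independently

print(p_all == Fraction(1, 2**100)) # True
print(f"{float(p_all):.3e}")        # 7.889e-31
```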

What does it mean to “obtain information from the boxes,” and how is it used?

The room is fixed: box i always contains the same labeled bill each time. When a prisoner opens box i and sees a bill labeled j, that label indicates the next box to open. So the contents act like pointers, turning the search into a guided walk rather than a blind selection.

How does the strategy create “chains” and why do those chains always cycle?

Following the bill labels links boxes into directed chains: box 30 might point to box 82, which points to box 5, which points back to box 30, forming a loop. Because there are only 100 boxes, any path must eventually revisit a box, so every chain cycles.
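
The three-box loop from this answer can be traced directly. The sketch below stores only the three boxes mentioned (30, 82, and 5) in a dictionary; the helper name `trace_chain` is hypothetical, and the code assumes the chain eventually returns to its starting box, as it always does in a full arrangement.

```python
def trace_chain(boxes, start):
    """List the boxes visited, in order, before the chain returns to `start`."""
    chain = [start]
    current = boxes[start]
    while current != start:
        chain.append(current)
        current = boxes[current]    # follow the bill to the next box
    return chain


# Only the boxes from the example: box 30 holds bill 82, and so on.
boxes = {30: 82, 82: 5, 5: 30}
print(trace_chain(boxes, 30))       # [30, 82, 5]
print(trace_chain(boxes, 5))        # [5, 30, 82]
```

Whichever of the three prisoners starts the walk, the same loop is traversed, only from a different entry point.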

What exact condition guarantees a prisoner finds their own bill?

A prisoner starts at their own number. They win if their path returns to the starting box within 50 openings. Returning within 50 implies that the previous box’s bill label pointed back to the starting number—so the prisoner has encountered the bill that is theirs.

Why does the group win probability jump to over 30% even though each prisoner still has only 50 checks?

The jump comes from the distribution of chain lengths in a random permutation. A little over 30% of random arrangements contain no chain longer than 50, so in those arrangements every prisoner’s cycle closes in time. The strategy correlates outcomes: everyone succeeds together when all chains are short enough, and when one chain is too long, everyone on it fails together.
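
For readers who want the number itself, here is a short derivation sketch. It uses a standard counting argument that the summary does not spell out, with n = 100 boxes and a limit of 50 openings; the key observation is that a chain longer than n/2 can occur at most once in any arrangement, so the cases for each length k do not overlap.

```latex
% Assumptions: n = 100 boxes, uniformly random arrangement, limit of 50 openings.
\begin{align*}
\#\{\text{arrangements with a chain of length } k\}
  &= \binom{n}{k}\,(k-1)!\,(n-k)! = \frac{n!}{k},
  \qquad k > \tfrac{n}{2},\\[4pt]
P(\text{some chain longer than } 50)
  &= \sum_{k=51}^{100} \frac{1}{k} \approx 0.688,\\[4pt]
P(\text{all prisoners succeed})
  &= 1 - \sum_{k=51}^{100} \frac{1}{k} \approx 0.312.
\end{align*}
```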

Review Questions

  1. Compute the collective win probability under the assumption that each prisoner independently has a 1/2 chance to find their bill. How does it compare to the strategy’s ~30% success rate?
  2. Describe the step-by-step rule for choosing the next box under the chain-following strategy, starting from a prisoner’s own number.
  3. Explain why the group’s success depends on cycle lengths rather than on individual random choices.

Key Points

  1. Independent random guessing yields an all-100 success probability of (1/2)^100, which is effectively zero.
  2. The room’s fixed arrangement lets prisoners extract information: each bill label indicates the next box to open.
  3. Starting at a prisoner’s own number and following bill labels creates deterministic chains that eventually cycle.
  4. A prisoner succeeds if their chain returns to the starting box within 50 steps; otherwise they fail.
  5. For random box arrangements, the probability that all relevant cycles close within 50 is a little over 30%.
  6. The strategy links everyone’s outcomes: successes and failures occur together because all paths depend on the same chain structure.

Highlights

The naive probability of all 100 succeeding under random box selection is about (1/2)^100—essentially zero.
Following the bill labels turns the search into a deterministic walk through a permutation’s cycles.
The group wins when every cycle closes within 50 openings, which happens a little over 30% of the time.
