Solution to The Impossible Bet | The 100 Prisoners Problem
Based on minutephysics's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
The “impossible bet” in the 100 prisoners problem looks unwinnable: each prisoner may open only 50 of the 100 boxes, so a random search succeeds half the time, and the chance that all 100 independently find their own marked bills is (1/2)^100 = 2^-100, about 8 × 10^-31. Small coordination tweaks (such as having two people avoid picking the same box) help only marginally, and the advantage shrinks as the number of prisoners grows.
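To make the size of that number concrete, the arithmetic can be checked exactly. This is a minimal sketch, assuming each prisoner succeeds independently with probability 1/2; the variable names are illustrative only.

```python
from fractions import Fraction

# Assumption: each of the 100 prisoners opens 50 of 100 boxes at random,
# so each succeeds independently with probability 1/2.
N_PRISONERS = 100
p_single = Fraction(1, 2)

# Independent events multiply, so the joint probability is (1/2)^100.
p_all_random = p_single ** N_PRISONERS

print(float(p_all_random))  # on the order of 1e-30
```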
The winning insight is that the boxes do not just hide information; they encode it. The room is identical every time a prisoner enters: the box numbers are fixed and the bills stay where they are, so the bill inside each box can be read as a pointer to another box. A prisoner can therefore treat the contents as a set of directed links: on opening box i and finding the bill labeled j, the prisoner next opens box j. Starting from the box with the prisoner’s own number, this creates a “scavenger hunt” path that follows the chain formed by the bills’ labels.
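The rule for a single prisoner can be sketched in a few lines. This is an illustration, not code from the video: the function name `follow_chain`, the list `boxes`, and the 0-based numbering are assumptions made for convenience (the puzzle itself numbers boxes from 1).

```python
def follow_chain(boxes, prisoner, max_opens=50):
    """Return True if `prisoner` finds their own bill within `max_opens` boxes.

    boxes[i] is the label on the bill inside box i (0-based, an assumption).
    """
    current = prisoner              # start at the box with your own number
    for _ in range(max_opens):
        bill = boxes[current]       # open the box and read the bill
        if bill == prisoner:
            return True             # this is our own marked bill
        current = bill              # the bill names the next box to open
    return False                    # ran out of allowed boxes


# Tiny example with 4 boxes: 0 -> 1 -> 2 -> 0 is a chain of length 3,
# and box 3 holds its own bill.
example = [1, 2, 0, 3]
print(follow_chain(example, 0, max_opens=2))  # chain too long for 2 opens
print(follow_chain(example, 0, max_opens=3))  # chain closes on the 3rd open
```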
These chains behave like loops in a finite directed graph. Because each bill sits in exactly one box, a chain can never merge into the middle of itself: it must eventually return to the box it started from. The longest possible chain includes all 100 boxes. The crucial win condition is whether the loop closes within the first 50 steps. The bill that closes the loop is the one that points back to the starting box, which is the bill carrying the prisoner’s own number, so a prisoner whose loop has length at most 50 finds their own marked bill. If the loop is longer than 50, the prisoner runs out of allowed checks and fails.
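The win condition can be checked directly from the loop lengths of an arrangement. The following is a sketch under the same assumptions as before (0-based labels, `boxes[i]` is the bill in box i); `cycle_lengths` and `group_wins` are hypothetical helper names.

```python
def cycle_lengths(boxes):
    """Return the lengths of all loops (cycles) in the arrangement `boxes`."""
    seen = [False] * len(boxes)
    lengths = []
    for start in range(len(boxes)):
        if seen[start]:
            continue                    # already counted as part of a loop
        length = 0
        current = start
        while not seen[current]:        # walk until the loop closes
            seen[current] = True
            current = boxes[current]
            length += 1
        lengths.append(length)
    return lengths


def group_wins(boxes, max_opens=None):
    """True if every loop is short enough for every prisoner to succeed."""
    if max_opens is None:
        max_opens = len(boxes) // 2     # half the boxes, as in the puzzle
    return max(cycle_lengths(boxes)) <= max_opens


print(cycle_lengths([1, 2, 0, 3]))  # one loop of length 3, one of length 1
print(group_wins([1, 2, 0, 3]))     # the length-3 loop exceeds 2 opens
print(group_wins([1, 0, 3, 2]))     # two loops of length 2 fit within 2 opens
```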
Why this lifts the group’s odds so dramatically comes down to how random arrangements of bills generate chain lengths. When all 100 prisoners follow the same “follow the bill” strategy from their own starting boxes, everyone succeeds exactly when the random permutation of bills contains no cycle longer than 50, and that happens with probability a little over 30% (about 31.2%). That is far larger than the near-zero probability from independent random guessing, and it even beats the 25% chance that just two prisoners, each guessing at random, would both succeed.
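The figure of a little over 30% can be reproduced exactly. The sketch below relies on a standard fact about uniformly random permutations that is not spelled out in the text above: for k greater than n/2, the probability that a permutation of n items contains a cycle of length k is 1/k, and at most one such long cycle can exist, so the failure probabilities simply add. The function name is hypothetical and n is assumed to be even.

```python
from fractions import Fraction

def exact_win_probability(n=100):
    """Probability that a random permutation of n items has no cycle
    longer than n/2 (assumes n is even and all arrangements equally likely)."""
    limit = n // 2
    # P(cycle of length k exists) = 1/k for k > n/2; these events are
    # mutually exclusive because two such cycles cannot both fit in n items.
    p_long_cycle = sum(Fraction(1, k) for k in range(limit + 1, n + 1))
    return 1 - p_long_cycle

print(float(exact_win_probability(100)))  # a little over 0.31
```

As the number of prisoners grows, the sum approaches ln 2, so the win probability stays above 1 - ln 2, roughly 30.7%, instead of shrinking toward zero.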
The strategy’s real power isn’t that any one prisoner becomes more likely to succeed in isolation; each still succeeds with probability 1/2. It’s that outcomes become synchronized. Every prisoner’s fate depends on the same underlying chain structure: if no chain is longer than 50, all 100 prisoners succeed, and if one chain is longer than 50, every prisoner on it (more than half the group) fails at once. Failures are concentrated in the same arrangements instead of being spread out independently, which turns an “impossible” collective bet into one with a meaningful chance of success.
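A small simulation can illustrate this: the individual success rate stays near one half while the group's win rate sits near 31%. This is a rough sketch under stated assumptions (uniformly random arrangements, 0-based labels, a fixed seed for repeatability); the function `simulate` and its parameters are illustrative, and the estimates vary with the seed and the number of trials.

```python
import random

def simulate(trials=1000, n=100, seed=0):
    """Estimate (individual success rate, group win rate) by simulation."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    limit = n // 2                     # each prisoner may open half the boxes
    total_successes = 0
    group_wins = 0
    for _ in range(trials):
        boxes = list(range(n))
        rng.shuffle(boxes)             # random placement of the bills
        successes = 0
        for prisoner in range(n):
            current = prisoner         # start at the box with your own number
            for _step in range(limit):
                current = boxes[current]    # open the box, read the bill
                if current == prisoner:     # the bill carries our own number
                    successes += 1
                    break
        total_successes += successes
        if successes == n:             # the group wins only if all succeed
            group_wins += 1
    return total_successes / (trials * n), group_wins / trials

individual_rate, group_rate = simulate()
print(individual_rate, group_rate)     # roughly 0.5 and roughly 0.31
```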
Cornell Notes
Random guessing makes the 100-prisoners problem effectively unwinnable because the probability that all 100 independently find their own bills is about 2^-100. The key fix is to use information inside the boxes: each bill label tells the prisoner which box to open next. Starting at the box with their own number, the prisoner follows a deterministic path defined by the permutation of bills, forming a chain that eventually cycles back to its start. Since every box is some prisoner’s starting point, the group wins exactly when every cycle closes within 50 steps. For a random arrangement of bills, that “all cycles within 50” condition holds a little over 30% of the time, and because all prisoners depend on the same cycles, the group wins or loses together.
Why does independent random selection make the collective win probability essentially zero?
What does it mean to “obtain information from the boxes,” and how is it used?
How does the strategy create “chains” and why do those chains always cycle?
What exact condition guarantees a prisoner finds their own bill?
Why does the group win probability jump to over 30% even though each prisoner still has only 50 checks?
Review Questions
- Compute the collective win probability under the assumption that each prisoner independently has a 1/2 chance to find their bill. How does it compare to the strategy’s ~30% success rate?
- Describe the step-by-step rule for choosing the next box under the chain-following strategy, starting from a prisoner’s own number.
- Explain why the group’s success depends on cycle lengths rather than on individual random choices.
Key Points
1. Independent random guessing yields an all-100 success probability of (1/2)^100, which is effectively zero.
2. The room’s fixed arrangement lets prisoners extract information: each bill label indicates the next box to open.
3. Starting at a prisoner’s own number and following bill labels creates deterministic chains that eventually cycle.
4. A prisoner succeeds if their chain returns to the starting box within 50 steps; otherwise they fail.
5. For random box arrangements, the probability that all relevant cycles close within 50 is a little over 30%.
6. The strategy links everyone’s outcomes: successes and failures occur together because all paths depend on the same chain structure.