
This Paradox Splits Smart People 50/50

Veritasium · 6 min read

Based on Veritasium's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing.

TL;DR

Newcomb’s paradox splits people because evidential decision theory treats the predictor’s accuracy as decision-relevant evidence, while causal decision theory treats only causal influence as decision-relevant.

Briefing

Newcomb’s paradox—where a near-perfect predictor offers a choice between taking only a “mystery” box and taking both the mystery box and a visible $1,000—splits people almost evenly because two reasonable ways of calculating “what you should do” treat correlation and causation differently. The setup is simple: the predictor has already placed $1,000,000 in the mystery box if it expects you to take only that box; if it expects you to take both, it leaves the mystery box empty. Since the prediction happens before you choose and your decision can’t change what’s already been arranged, the puzzle becomes a test of how people reason under uncertainty.

One camp—often called “one-boxers”—leans on evidential decision theory: the predictor’s past accuracy is evidence about what’s in the mystery box. If the predictor is highly reliable, then choosing the one-box option is strongly correlated with walking away with $1,000,000. In that view, expected value calculations favor one-boxing once the predictor’s accuracy is even slightly above random. The argument is essentially: “My choice is a signal of what the predictor already expects, and the predictor’s track record makes that signal meaningful.”

The other camp—“two-boxers”—leans on causal decision theory: the decision you make now can’t cause the contents of the already-set boxes to change. With that assumption, the mystery box’s contents are fixed, so taking both boxes guarantees an extra $1,000 on top of whatever the mystery box contains. In expected-utility terms, two-boxing dominates one-boxing because it adds $1,000 without reducing the chance of getting the $1,000,000. The paradox persists because each camp treats a different probability model as the “right” one: one-boxers use probabilities tied to evidence about the predictor’s accuracy; two-boxers use probabilities tied to what the choice can causally influence.
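
The two calculations can be put side by side. As a rough illustration (my own sketch, not a computation from the video), the snippet below assumes a 90% predictor accuracy and uses placeholder names; the point is only that the two theories plug the same payoffs into different probability models.

    # Hedged illustration: the 90% accuracy is an assumption, not a figure from the video.
    MYSTERY = 1_000_000   # payout if the mystery box is full
    BONUS = 1_000         # the visible $1,000
    ACCURACY = 0.9        # assumed probability that the predictor is correct

    # Evidential view: condition the contents on your own choice, because the
    # predictor's track record is evidence about what it foresaw.
    edt_one_box = ACCURACY * MYSTERY                # full box iff predicted to one-box
    edt_two_box = (1 - ACCURACY) * MYSTERY + BONUS  # full box only if mispredicted

    # Causal view: the contents are already fixed with some probability p_full
    # that the choice cannot influence; two-boxing adds $1,000 either way.
    def cdt_values(p_full: float) -> tuple[float, float]:
        return p_full * MYSTERY, p_full * MYSTERY + BONUS

    print(f"EDT: one-box {edt_one_box:,.0f} vs two-box {edt_two_box:,.0f}")
    print(f"CDT: one-box {cdt_values(0.5)[0]:,.0f} vs two-box {cdt_values(0.5)[1]:,.0f}")

With the assumed 90% accuracy, the evidential calculation favors one-boxing ($900,000 vs $101,000), while the causal calculation favors two-boxing for any fixed probability that the box is full — exactly the standoff described above.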

The disagreement isn’t just math—it points to deeper questions about rationality and free will. If a predictor is perfect, then the only way to get the $1,001,000 outcome would be to be the kind of person who one-boxes in advance and then to switch at the last second. That raises the question of whether free will exists in a way that can matter once the prediction is already locked in. The discussion connects this to “Why Ain’tcha Rich?”—the observation that if one-boxing is the rational strategy for maximizing money, then one-boxers should systematically win more—and to arguments from philosophers (including Gibbard and Harper) that sometimes “irrationality” can be rewarded when prediction is strong.

Finally, the conversation reframes the paradox as a problem about rules, not moment-to-moment choices. If people could pre-commit—through reputation across repeated trials, through mechanisms that let choices affect the past, or through explicit commitments—then the “worse” option (one-boxing) could become the best rule to live by. The same logic is compared to deterrence and game-theory commitments in Cold War mutual assured destruction (MAD): stability can depend on committing to an outcome that you hope never occurs. The core takeaway is that Newcomb’s paradox isn’t meaningless; it forces clarity about whether decisions should track evidence or causation, and what “rational” behavior means when your action is entangled with prediction.

Cornell Notes

Newcomb’s paradox asks whether to take only a “mystery” box or to take both the mystery box and a guaranteed $1,000 when a supercomputer has already predicted the choice. One-boxers use evidential decision theory: the predictor’s accuracy is evidence that the mystery box contains $1,000,000 if they choose one-box. Two-boxers use causal decision theory: the choice can’t change what was already placed in the boxes, so taking both always adds $1,000 without reducing the mystery-box payoff. The split reveals a deeper conflict between treating correlation as decision-relevant evidence versus treating only causal influence as decision-relevant. It also links to free will, rationality, and the value of pre-commitment.

Why do one-boxers think one-boxing can be optimal even though the boxes are already set?

One-boxers treat the predictor’s accuracy as evidence about what’s in the mystery box. If the supercomputer has reliably predicted thousands of similar cases, then choosing one-box is strongly correlated with the computer having placed $1,000,000 in the mystery box. In expected-utility terms, if the predictor’s correctness probability is C, one-boxing yields an expected value of C·$1,000,000 (and $0 with probability 1−C), while two-boxing yields (1−C)·$1,000,000 plus a guaranteed $1,000. One-boxing pulls ahead as soon as C exceeds roughly 50.05%, so even a predictor barely better than chance tips this calculation toward one-boxing.
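
The “slightly above 50%” threshold can be checked directly. This is a small sketch of the arithmetic (my own illustration, with assumed values): under the evidential calculation, the break-even correctness probability works out to C = 0.5005.

    MYSTERY, BONUS = 1_000_000, 1_000

    def edt_expected_values(c: float) -> tuple[float, float]:
        """Expected winnings under the evidential calculation for correctness probability c."""
        one_box = c * MYSTERY                # box is full iff the predictor foresaw one-boxing
        two_box = (1 - c) * MYSTERY + BONUS  # box is full only if the predictor was wrong
        return one_box, two_box

    # Break-even: c * 1,000,000 = (1 - c) * 1,000,000 + 1,000  =>  c = 0.5005
    for c in (0.50, 0.5005, 0.51, 0.99):
        one, two = edt_expected_values(c)
        better = "one-box" if one > two else ("two-box" if two > one else "tie")
        print(f"c = {c:6.4f}: one-box {one:>11,.0f}  two-box {two:>11,.0f}  -> {better}")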

Why do two-boxers say two-boxing dominates one-boxing?

Two-boxers argue that the decision made now cannot causally affect the contents of the already-prepared boxes. Since the mystery box’s contents are fixed before the choice, taking both boxes guarantees an extra $1,000 on top of whatever the mystery box contains. In causal decision theory, the probability that the mystery box is full is treated as fixed and independent of the choice, and the key move is that two-boxing adds $1,000 regardless of which prediction was made. Within that framework, two-boxing’s expected utility is always higher than one-boxing’s.
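
The dominance argument fits in a few lines. As a minimal check (an illustration, not a quote from the video), fix the box contents in either of the two possible states and compare the payoffs; two-boxing comes out exactly $1,000 ahead in both.

    MYSTERY, BONUS = 1_000_000, 1_000

    # The two possible states of the world, fixed before the choice is made.
    for contents in (0, MYSTERY):
        one_box = contents          # take only the mystery box
        two_box = contents + BONUS  # take both boxes
        print(f"mystery box holds {contents:>9,}: one-box {one_box:>9,}  two-box {two_box:>9,}")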

What hidden assumption divides the camps: evidence or causation?

Both sides accept two facts: (1) the predictor’s track record makes the outcomes highly correlated with the choice, and (2) the boxes are set up before the decision, so the choice can’t change the already-determined contents. The divide is which probability model to use. One-boxers use evidential decision theory, where the predictor’s past accuracy functions as evidence about the payoff conditional on choosing one-box. Two-boxers use causal decision theory, where only what the choice can cause should affect the probability model; since the choice can’t cause the contents, two-boxing simply adds $1,000.

How does the paradox connect to free will?

If the predictor is perfect, then whatever you end up choosing has already been anticipated, leaving no causal room for your decision to change the past arrangement. That leads to the claim that free will may be an illusion in the relevant sense: your action doesn’t alter what was already set. Yet the discussion also argues that even if free will is illusory, people must act as though it exists, because social and moral systems depend on treating agency as real.

Why does pre-commitment matter, and how does it change the “best rule”?

The discussion argues that the paradox can be reframed as a problem about the rules you live by. If someone can pre-commit to one-boxing—through mechanisms that let choices affect earlier states, through repeated trials that build a reputation, or through explicit commitments the predictor can know about—then being the kind of person who one-boxes becomes the best rule, because that is the disposition the predictor will have read. The same theme appears in deterrence: stable outcomes can depend on committing in advance to a worse action so that the other side believes retaliation is inevitable.
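
A rough simulation (my own sketch, with an assumed 90% predictor accuracy and a made-up trial count) illustrates the point: if the predictor reads a person’s standing policy rather than a last-second urge, committed one-boxers end up far richer over repeated trials, which is the “Why Ain’tcha Rich?” observation in numerical form.

    import random

    random.seed(0)
    MYSTERY, BONUS = 1_000_000, 1_000
    ACCURACY = 0.9    # assumed chance the predictor reads the policy correctly
    TRIALS = 10_000

    def lifetime_winnings(one_box_policy: bool) -> int:
        """Total winnings for an agent whose standing rule the predictor can read."""
        total = 0
        for _ in range(TRIALS):
            correct = random.random() < ACCURACY
            predicted_one_box = one_box_policy if correct else not one_box_policy
            contents = MYSTERY if predicted_one_box else 0
            total += contents if one_box_policy else contents + BONUS
        return total

    print(f"committed one-boxers: {lifetime_winnings(True):>14,}")
    print(f"committed two-boxers: {lifetime_winnings(False):>14,}")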

Review Questions

  1. In Newcomb’s paradox, what changes in the expected-utility calculation when switching from evidential decision theory to causal decision theory?
  2. What does it mean to say that two-boxing “dominates” one-boxing in the causal framework, and why doesn’t that settle the paradox for one-boxers?
  3. How do pre-commitment, repeated trials, and reputation potentially reconcile the “best rule” with the last-moment choice?

Key Points

  1. Newcomb’s paradox splits people because evidential decision theory treats the predictor’s accuracy as decision-relevant evidence, while causal decision theory treats only causal influence as decision-relevant.
  2. One-boxing can win in expected value when the predictor’s correctness probability is slightly above random, because one-boxing is correlated with the $1,000,000 outcome.
  3. Two-boxing can dominate in the causal framework because the choice can’t change the already-set contents, so taking both adds $1,000 without reducing the mystery-box payoff.
  4. The paradox highlights a real distinction between correlation and causation: the camps disagree on whether correlation should drive the probability model used for decisions.
  5. The disagreement connects to free will: if prediction is perfect and happens before the choice, the choice may be unable to affect what was already arranged.
  6. The discussion reframes the puzzle as a rules problem: pre-commitment (via reputation, repeated trials, or explicit commitments) can make the “worse” last-moment option the best long-run rule.
  7. Deterrence strategies like MAD illustrate how committing in advance to an outcome can stabilize a system even when the committed action is undesirable.

Highlights

The core fight is not about money amounts; it’s about whether a decision should respond to correlation (evidence) or to causal influence.
One-boxers use the predictor’s track record as evidence that choosing one-box aligns with the $1,000,000 placement.
Two-boxers treat the boxes as fixed and argue that taking both always adds $1,000, making two-boxing the causal dominant strategy.
Pre-commitment and repeated interaction can turn the paradox into a practical “rules to live by” problem rather than a one-shot dilemma.

Topics

  • Newcomb’s Paradox
  • Evidential vs Causal Decision Theory
  • Free Will
  • Rationality
  • Pre-commitment

Mentioned