Oh, wait, actually the best Wordle opener is not “crane”…

3Blue1Brown · 5 min read

Based on 3Blue1Brown's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

A repeated-letter coloring bug in the simulation changed pattern frequencies and therefore shifted the computed “optimal” opening word.

Briefing

A subtle bug in the Wordle-simulation code changed which opening word comes out “optimal,” overturning the earlier claim that “crane” is the best opener. The mistake came from how the program assigns Wordle colors when a guess contains repeated letters—using the wrong yellow/gray pattern in edge cases. After fixing that convention-handling bug and rerunning the full set of information-theory and score simulations, the theoretically best first guess for the (official) answer list shifted.
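
For concreteness, here is a minimal Python sketch of the coloring convention the fix restores: greens are assigned first and consume the answer's letter counts, then yellows are assigned only while unmatched copies of a letter remain, so extra repeats go gray. The function name and color representation are illustrative, not the video's actual code.

```python
from collections import Counter

def wordle_pattern(guess: str, answer: str) -> list[str]:
    colors = ["gray"] * len(guess)
    remaining = Counter(answer)
    # Pass 1: greens take priority and use up the answer's letter counts.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            colors[i] = "green"
            remaining[g] -= 1
    # Pass 2: yellows only while unmatched copies of the letter remain;
    # any further repeats of that letter stay gray.
    for i, g in enumerate(guess):
        if colors[i] == "gray" and remaining[g] > 0:
            colors[i] = "yellow"
            remaining[g] -= 1
    return colors

# Guess "speed" against answer "abide": the first "e" is yellow,
# the second "e" is gray (the answer has only one "e").
print(wordle_pattern("speed", "abide"))
# ['gray', 'gray', 'yellow', 'gray', 'yellow']
```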

The correction matters less for the core lesson about information and entropy than for the headline “best opener.” The original analysis ranked candidate first guesses by expected information gained after one step. Under that metric, “soare” topped the list, even though it looks like an odd choice. A deeper two-step search—evaluating the best possible second guess after each possible pattern from the first guess—reordered the rankings again: “soare” dropped to 14th, and “slane” rose to first. But the final ranking still wasn’t based purely on information. When the analysis switched from information heuristics to actual gameplay performance—simulating complete Wordle games against every word in the official answer set, with strategies restricted to the top candidates—the best average score came from “salet” (an alternate spelling of “sallet,” a light medieval helmet). For players who prefer real, common words, “trace” and “crate” performed almost as well, and both have the practical advantage of being valid answers themselves.

The transcript also clarifies how the earlier “crane” result happened: it was only “true” because the algorithms were effectively playing a slightly different version of the game due to the bug. Once the color-assignment logic matched Wordle’s actual conventions, the optimal first guess for that specific answer list changed.

Beyond the opener debate, the more durable takeaway is methodological. The analysis starts by treating each opening guess as a way to partition the remaining candidate answers into buckets defined by the observed pattern of greens, yellows, and grays. Expected information is computed by averaging a log-based quantity across those buckets, which corresponds to how much the guess is expected to shrink uncertainty. Then comes the key warning: greedy, one-step “expected information” rankings can miss the globally best strategy, which is why the two-step exhaustive search can reshuffle the leaderboard.
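
As a sketch of that computation, assuming a uniform prior over a candidate answer list and reusing the `wordle_pattern` helper above (patterns are converted to tuples so they can key a counter):

```python
import math
from collections import Counter

def expected_information(guess: str, answers: list[str]) -> float:
    # Partition the remaining answers into buckets keyed by feedback pattern,
    # then average -log2(p) across buckets, weighted by bucket probability.
    buckets = Counter(tuple(wordle_pattern(guess, a)) for a in answers)
    n = len(answers)
    return sum((c / n) * math.log2(n / c) for c in buckets.values())
```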

Finally, the transcript pushes back on the idea that this kind of optimization should dictate how humans play. The algorithmic approach is intentionally overfit to the official answer list and assumes exhaustive knowledge of the set and uniform probabilities. Human play is different: people rely on intuition (vowels, letter placement, word familiarity) rather than memorizing lists or computing best responses to every pattern. The real value of the exercise, the transcript argues, is training algorithmic “muscle”—learning how to quantify information and recognizing when deeper search beats greedy heuristics—rather than memorizing a single technically optimal opener.

Cornell Notes

A fixed bug in the simulation’s handling of repeated letters changed which opening word ranks as optimal in a Wordle information-theory analysis. The error affected how yellow/gray colors are assigned when a guess contains multiple instances of the same letter, so the earlier “crane” result applied to a slightly different game. After correcting the convention logic and rerunning everything, the best opener depends on the scoring method: one-step expected information favors “soare,” two-step search elevates “slane,” and full-game simulations across the official answer list put “salet” at the top, with “trace” and “crate” nearly tied. The broader lesson is that greedy, one-step heuristics can fail, and that exhaustive search and real-score simulation can reorder conclusions.

What exactly went wrong in the earlier Wordle simulation that produced the wrong “best opener” ranking?

The code mis-assigned Wordle colors for guesses containing repeated letters in certain edge cases. Example: if the guess is “speed” and the true answer is “abide,” the first “e” should be yellow (it exists in the answer but in a different position) while the second “e” should be gray (no second “e” in the answer). The buggy convention sometimes swapped these outcomes in a way that effectively made the algorithm play a slightly different version of Wordle. Fixing this repeated-letter coloring logic changed the computed pattern frequencies and therefore the information/score rankings.

How does the one-step “expected information” method rank opening guesses?

For a candidate first guess, the program counts how many official answers produce each possible feedback pattern (greens/yellows/grays). It then computes a log-based information quantity per bucket—interpretable as how much uncertainty would be reduced if that pattern appears—averages those values across buckets, and uses the result as the expected information gained from the opening guess. Under this one-step metric (with uniform probability over the official answer list), “soare” comes out best, even though it’s not a typical modern word.
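
A short usage sketch of that ranking, assuming `answers` holds the official answer list (loaded elsewhere) and using the `expected_information` helper above; the opener pool here is just a hypothetical handful:

```python
# Rank a few candidate openers by one-step expected information.
openers = ["soare", "crane", "trace", "crate", "slane"]
for word in sorted(openers, key=lambda g: expected_information(g, answers), reverse=True):
    print(word, round(expected_information(word, answers), 2))
```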

Why does a two-step search change the leaderboard, and what does it do differently?

One-step ranking assumes you stop after the first feedback pattern and only estimates learning from that single partition. Two-step search goes further: after choosing an opening word (like “soare”) and observing a specific pattern (like “all grays”), it restricts attention to only the remaining candidate answers consistent with that pattern. Then it repeats the analysis to find the best possible second guess for that restricted set, and it does this across all possible first-step patterns. Averaging those second-step results (weighted by how likely each first-step bucket is) yields a different ranking; “soare” drops to 14th and “slane” rises to first in the transcript’s results.
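
A sketch of that deeper search, building on the `wordle_pattern` and `expected_information` helpers above; the guess pool and weighting are simplified relative to the full analysis:

```python
import math

def two_step_score(opener: str, answers: list[str], guess_pool: list[str]) -> float:
    # Group the answers by the pattern the opener would produce against each.
    buckets: dict[tuple, list[str]] = {}
    for a in answers:
        buckets.setdefault(tuple(wordle_pattern(opener, a)), []).append(a)
    n = len(answers)
    total = 0.0
    for bucket in buckets.values():
        p = len(bucket) / n
        first = -math.log2(p)  # information already gained from step one
        # Best expected information from any second guess, restricted to
        # the answers still consistent with the observed pattern.
        second = max(expected_information(g, bucket) for g in guess_pool)
        total += p * (first + second)
    return total
```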

Why isn’t “expected information” the same as the best actual Wordle performance?

Expected information is a heuristic about uncertainty reduction, not a direct measure of the number of guesses needed in real play. The transcript describes switching from information-based ranking to simulation-based ranking: it runs complete game simulations across all 2,315 possible Wordle games (using the official answer set) with strategies restricted to top candidates from the information metrics. That full-score evaluation pushes “salet” to the top, while “trace” and “crate” land close behind, nearly tied with each other.
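
A rough sketch of that score-based evaluation, assuming a simple greedy policy (always guess the still-consistent answer with the highest expected information) and reusing the helpers above; the actual analysis searched more deeply, so treat this as illustrative:

```python
def average_score(opener: str, answers: list[str]) -> float:
    # Play every possible game and average the number of guesses needed.
    total = 0
    for secret in answers:
        candidates, guess, used = list(answers), opener, 1
        while guess != secret:
            fb = wordle_pattern(guess, secret)
            # Keep only answers consistent with the observed feedback.
            candidates = [a for a in candidates if wordle_pattern(guess, a) == fb]
            guess = max(candidates, key=lambda g: expected_information(g, candidates))
            used += 1
        total += used
    return total / len(answers)
```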

What does the transcript say about using these “optimal” openers in real human play?

It argues that the technically optimal opener is unlikely to be the best human strategy. The algorithm is intentionally overfit to the official answer list with uniform probabilities and assumes exhaustive computation, while humans don’t memorize the list or compute best responses to every feedback pattern. Instead, people use intuition—like vowel usage and letter placement—so the practical value of the computed opener is limited. The more important takeaway is learning how to quantify information and when greedy methods fall short of globally best performance.

Review Questions

  1. In the transcript’s example with repeated letters, what distinguishes the correct yellow/gray coloring when the true answer contains one instance versus two instances of the repeated letter?
  2. What is the conceptual difference between ranking by one-step expected information and ranking by two-step expected information (including the best second guess after each first-step pattern)?
  3. Why does simulation of full games across the official answer list produce a different “best opener” than information-only metrics?

Key Points

  1. A repeated-letter coloring bug in the simulation changed pattern frequencies and therefore shifted the computed “optimal” opening word.

  2. Fixing the bug overturned the earlier headline that “crane” is best, because the algorithms were effectively playing a slightly different Wordle.

  3. One-step expected information (based on bucket counts from the official answer list) ranks “soare” highest.

  4. Two-step exhaustive search (choosing the best second guess after each first-step pattern) reshuffles the rankings, with “slane” taking the top spot and “soare” dropping to 14th.

  5. Full-game simulations using actual average score across the official answer set put “salet” at the top, with “trace” and “crate” nearly tied.

  6. Greedy, one-step heuristics can miss globally better strategies that emerge only when deeper search considers future choices.

  7. The practical lesson is less about memorizing an opener and more about quantifying information and recognizing when deeper search beats greedy methods.

Highlights

The “crane” claim was only correct under a flawed repeated-letter coloring convention; fixing the bug changed the optimal opener.
Expected information after one step can crown “soare,” but adding a second-step search flips the top choice to “slane.”
When ranking by simulated average score rather than information heuristics, “salet” edges out the field, with “trace” and “crate” almost indistinguishable.
