These Illusions Fool Almost Everyone
Based on Veritasium's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
A string of classic audio illusions shows that hearing isn’t a simple matter of detecting frequencies; it’s an active construction that depends on context, expectations, and how sound is filtered by the body. The most striking early example pits a pure 100 hertz tone against a “lower” sound built from tones at 100, 150, and 200 hertz. Listeners reliably judge the harmonic-rich sound as lower, even though every frequency it contains is at or above 100 hertz. The brain infers a “missing fundamental” of 50 hertz, the greatest common divisor of the components, from the pattern of overtones, effectively turning a stack of higher frequencies into a lower perceived note.
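The inference in this example can be sketched numerically: for exact integer harmonics, the implied fundamental is simply the greatest common divisor of the component frequencies. This is a toy model (real pitch perception tolerates noise and slightly inharmonic components), but it matches the 100/150/200 hertz example:

```python
from functools import reduce
from math import gcd

def missing_fundamental(harmonics_hz):
    """Infer the implied fundamental as the greatest common divisor
    of the component frequencies -- a simplified stand-in for what
    the auditory system does with a harmonic series."""
    return reduce(gcd, harmonics_hz)

print(missing_fundamental([100, 150, 200]))  # 50
```

The complex is heard as a 50 hertz pitch even though the mixture contains no energy at 50 hertz.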
That missing-fundamental effect links directly to how instruments create timbre. In the Sydney Town Hall pipe organ, each pipe produces a fundamental frequency plus overtones (often integer multiples known as harmonics). Those quieter harmonics don’t stand out as separate notes, but they shape timbre—why a trumpet and a flute can share a pitch yet sound different. The organ’s massive 64-foot pipe illustrates how low notes are physically produced: a 64-foot pipe can generate about 8 hertz, which is more felt than heard, while the lowest typical organ note around 16 hertz sits near the edge of human hearing. Historically, organist Georg Joseph Vogler used a clever workaround: instead of building huge pipes for 16 hertz, he played harmonics of 16 hertz using shorter pipes, letting the brain reconstruct the missing low pitch.
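The pipe lengths quoted above follow from basic acoustics: an open pipe's fundamental is f = v / (2L). A minimal sketch, assuming a round speed of sound of 343 m/s and ignoring the end corrections and temperature effects that shift real organ pipes slightly:

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C
FEET_TO_M = 0.3048

def open_pipe_fundamental_hz(length_feet):
    """Fundamental frequency of an idealized open pipe: f = v / (2L).
    End corrections and temperature variation are ignored."""
    length_m = length_feet * FEET_TO_M
    return SPEED_OF_SOUND / (2 * length_m)

print(round(open_pipe_fundamental_hz(64), 1))  # 8.8
print(round(open_pipe_fundamental_hz(32), 1))  # 17.6
```

A 64-foot pipe lands near the quoted ~8 hertz, and a 32-foot pipe near the ~16 hertz lowest typical organ note.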
Other illusions push the idea further. The Shepard tone creates the impression of an endlessly rising pitch by stacking multiple tones separated by octaves while continuously swapping their loudness—fading out the high components as new low ones fade in—so the listener hears perpetual ascent even though frequencies never exceed hearing limits. A separate “scrambled melody” challenge shows how quickly the brain locks onto patterns: once the correct tune is known, the same scrambled notes become obvious on the second listen.
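The Shepard construction described above can be sketched directly: octave-spaced sine components sit under a fixed bell-shaped loudness curve, so as the whole stack drifts upward, new components fade in quietly at the bottom while old ones fade out at the top. The base frequency, octave count, and envelope shape here are illustrative choices, not values from the video, and phase continuity at the octave wrap is ignored:

```python
import math

SAMPLE_RATE = 8000
BASE_HZ = 110.0   # lowest octave component (assumed value)
N_OCTAVES = 6
CENTER = 3.0      # loudness peak, in octaves above the base

def shepard_sample(t, rise_octaves_per_sec=0.1):
    """One sample of a rising Shepard tone: octave-spaced sines
    under a fixed bell-shaped amplitude envelope."""
    shift = (rise_octaves_per_sec * t) % 1.0  # wraps every octave
    s = 0.0
    for k in range(N_OCTAVES):
        pos = k + shift                        # position in octave space
        freq = BASE_HZ * 2.0 ** pos
        amp = math.exp(-((pos - CENTER) ** 2))  # bell envelope
        s += amp * math.sin(2 * math.pi * freq * t)
    return s

# one second of samples
samples = [shepard_sample(n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
```

Written to a WAV file, this produces the start of the rise; the fixed envelope is what hides each component's entry at the bottom and exit at the top.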
Language-based illusions add another layer: the phantom word effect (credited to Dr. Diana Deutsch) demonstrates how mixed audio can be rearranged mentally into recognizable words, and how priming changes what people report hearing. Mondegreens—misheard lyrics—highlight how familiarity and linguistic expectations steer perception. Visual cues can also override audio: identical speech sounds can be heard as different words depending on mouth movement, and even silent-looking animations can flip what people perceive once sound is added.
Finally, the transcript ties perception to real-world hearing mechanics. The “cocktail party effect” explains how people can focus on one voice amid overlap by using prediction from language structure and by tracking where sound arrives from. Sound localization relies on multiple cues—volume differences between ears, frequency-dependent “shadowing,” tiny time delays across the head, and phase differences—while the pinna’s ridges filter frequencies in location-specific ways. Experiments that physically altered ear shape showed that localization can recover through brain adaptation, and modern spatial audio systems (like those from Apple and Sony) scan ears to personalize that filtering.
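One of those cues, the tiny time delay across the head, is easy to quantify. A minimal far-field sketch, where the head width and the simple path-difference model are assumptions (fuller models also account for the extra path around the skull):

```python
import math

SPEED_OF_SOUND = 343.0  # m/s
HEAD_WIDTH_M = 0.18     # assumed ear-to-ear distance

def itd_seconds(azimuth_deg):
    """Interaural time difference for a distant source, using the
    simple path-length model dt = d * sin(theta) / c."""
    theta = math.radians(azimuth_deg)
    return HEAD_WIDTH_M * math.sin(theta) / SPEED_OF_SOUND

print(f"{itd_seconds(90) * 1e6:.0f} us")  # source at one side: 525 us
print(f"{itd_seconds(0) * 1e6:.0f} us")   # straight ahead: 0 us
```

Note that sources directly ahead and directly behind both give a near-zero delay, which is why the pinna's direction-specific filtering carries most of the weight for front/back judgments.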
Across all these examples, the throughline is consistent: the auditory system is designed to make sense of ambiguity. Illusions don’t prove hearing is broken; they reveal the rules the brain uses to fill gaps, infer missing information, and decide what’s most likely true.
Cornell Notes
Audio illusions reveal that hearing is an inference engine, not a direct readout of sound frequency. A key example is the “missing fundamental,” where listeners perceive a low pitch even when it isn’t present—because harmonics supply enough information for the brain to reconstruct the fundamental. Shepard tones create the illusion of an endlessly rising pitch by using octave-separated components whose volumes fade in and out, keeping the perceived ascent going without exceeding hearing limits. Language and expectation further reshape perception through effects like the phantom word illusion and mondegreens, while visual cues can even change what identical audio is heard as. Real-world hearing depends on both physics (interference, localization cues) and learned processing, including how the pinna filters sound and how the brain adapts to altered ear shapes.
- Why can a sound built from higher harmonics feel like it contains a lower pitch that isn’t physically present?
- How do pipe organs demonstrate the difference between a fundamental tone and timbre?
- What makes the Shepard tone sound like it rises forever?
- How do expectation and language change what people hear in ambiguous audio?
- How does the brain locate where a sound comes from?
- What is the difference between beating and binaural beats?
Review Questions
- In the missing-fundamental example, what specific relationship between harmonics allows listeners to infer a low pitch that isn’t present as a single tone?
- How do Shepard tones use octave separation and changing loudness to maintain the illusion of continuous pitch ascent?
- Which localization cues become less reliable when a sound is directly in front of or behind a listener, and why does the pinna matter more then?
Key Points
1. Hearing depends on pattern inference: harmonic mixtures can produce a perceived “missing fundamental” even when that single low frequency isn’t present as a pure tone.
2. Timbre is shaped less by the loud fundamental and more by the overtone structure; different instruments sound distinct because their overtones have different frequencies and relative amplitudes.
3. The Shepard tone illusion relies on octave-stacked components whose volumes fade in and out, preventing listeners from detecting the point where the sound would otherwise “reset.”
4. Language familiarity and priming can steer perception in ambiguous audio, as seen in the phantom word effect and mondegreens.
5. Sound localization uses multiple cues—interaural volume, frequency-dependent shadowing, time delay, and phase—while the pinna’s ridges provide direction-specific filtering that the brain learns.
6. Beating comes from physical interference when close frequencies reach the same ear, while binaural beats arise when the brain compares different frequencies delivered separately to the left and right ears.
7. Audio illusions illustrate how the auditory system compensates for real-world ambiguity rather than proving hearing is fundamentally unreliable.
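The beating claim in point 6 is checkable on paper: two close tones summed in the same ear interfere, and the sum-to-product identity sin A + sin B = 2 sin((A+B)/2) cos((A−B)/2) shows the loudness envelope pulsing at the difference frequency. Binaural beats have no such physical envelope, since each ear receives only one tone; the pulsing is constructed by the brain. A quick numerical check of the identity, with illustrative frequencies:

```python
import math

def two_tone(t, f1=440.0, f2=444.0):
    """Sum of two close pure tones arriving at the same ear."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)

def sum_to_product(t, f1=440.0, f2=444.0):
    """Same signal rewritten via the sum-to-product identity:
    a carrier at the mean frequency under an envelope that
    pulses at |f1 - f2| Hz."""
    return 2 * math.cos(math.pi * (f1 - f2) * t) * math.sin(math.pi * (f1 + f2) * t)

# the two forms agree sample by sample
for n in range(100):
    t = n / 8000
    assert abs(two_tone(t) - sum_to_product(t)) < 1e-9

print("beat frequency:", abs(440.0 - 444.0), "Hz")  # 4.0 Hz
```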