These Illusions Fool Almost Everyone

Veritasium · 6 min read

Based on Veritasium's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Hearing depends on pattern inference: harmonic mixtures can produce a perceived “missing fundamental” even when that single low frequency isn’t present as a pure tone.

Briefing

A string of classic audio illusions shows that hearing isn’t a simple matter of detecting frequencies; it’s an active construction that depends on context, expectations, and how sound is filtered by the body. The most striking early example pits a pure 100 hertz tone against a sound built from 100 hertz plus 150 hertz and 200 hertz components. Listeners reliably judge the harmonic-rich mixture as lower, even though its lowest component is no lower than the pure tone: the brain infers a “missing fundamental” (50 hertz, of which 100, 150, and 200 hertz are all harmonics) from the pattern of overtones, effectively turning higher frequencies into a lower perceived note.

That missing-fundamental effect links directly to how instruments create timbre. In the Sydney Town Hall pipe organ, each pipe produces a fundamental frequency plus overtones (often integer multiples known as harmonics). Those quieter harmonics don’t stand out as separate notes, but they shape timbre—why a trumpet and a flute can share a pitch yet sound different. The organ’s massive 64-foot pipe illustrates how low notes are physically produced: a 64-foot pipe can generate about 8 hertz, which is more felt than heard, while the lowest typical organ note around 16 hertz sits near the edge of human hearing. Historically, organist Georg Joseph Vogler used a clever workaround: instead of building huge pipes for 16 hertz, he played harmonics of 16 hertz using shorter pipes, letting the brain reconstruct the missing low pitch.

Other illusions push the idea further. The Shepard tone creates the impression of an endlessly rising pitch by stacking multiple tones separated by octaves while continuously swapping their loudness—fading out the high components as new low ones fade in—so the listener hears perpetual ascent even though frequencies never exceed hearing limits. A separate “scrambled melody” challenge shows how quickly the brain locks onto patterns: once the correct tune is known, the same scrambled notes become obvious on the second listen.

Language-based illusions add another layer: the phantom word effect (credited to Dr. Diana Deutsch) demonstrates how mixed audio can be rearranged mentally into recognizable words, and how priming changes what people report hearing. Mondegreens—misheard lyrics—highlight how familiarity and linguistic expectations steer perception. Visual cues can also override audio: identical speech sounds can be heard as different words depending on mouth movement, and even silent-looking animations can flip what people perceive once sound is added.

Finally, the transcript ties perception to real-world hearing mechanics. The “cocktail party effect” explains how people can focus on one voice amid overlap by using prediction from language structure and by tracking where sound arrives from. Sound localization relies on multiple cues—volume differences between ears, frequency-dependent “shadowing,” tiny time delays across the head, and phase differences—while the pinna’s ridges filter frequencies in location-specific ways. Experiments that physically altered ear shape showed that localization can recover through brain adaptation, and modern spatial audio systems (like those from Apple and Sony) scan ears to personalize that filtering.

Across all these examples, the throughline is consistent: the auditory system is designed to make sense of ambiguity. Illusions don’t prove hearing is broken; they reveal the rules the brain uses to fill gaps, infer missing information, and decide what’s most likely true.

Cornell Notes

Audio illusions reveal that hearing is an inference engine, not a direct readout of sound frequency. A key example is the “missing fundamental,” where listeners perceive a low pitch even when it isn’t present—because harmonics supply enough information for the brain to reconstruct the fundamental. Shepard tones create the illusion of an endlessly rising pitch by using octave-separated components whose volumes fade in and out, keeping the perceived ascent going without exceeding hearing limits. Language and expectation further reshape perception through effects like the phantom word illusion and mondegreens, while visual cues can even change what identical audio is heard as. Real-world hearing depends on both physics (interference, localization cues) and learned processing, including how the pinna filters sound and how the brain adapts to altered ear shapes.

Why can a sound built from higher harmonics feel like it contains a lower pitch that isn’t physically present?

Listeners can hear a “missing fundamental.” In the opening test, sound A is a pure 100 Hz sine wave, while sound B mixes 100 Hz with higher components at 150 Hz and 200 Hz. Those three frequencies are all harmonics of 50 Hz, and even though that fundamental isn’t present as a tone in sound B, the brain infers it from the harmonic pattern. The transcript links this to waveform timing: adding the 150 Hz and 200 Hz components changes the overall period, so the combined signal’s repeating structure matches the missing fundamental’s period (1/50 of a second). That’s why higher-frequency mixtures can be perceived as lower notes when they align as harmonics of a common base.
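The waveform-timing claim is easy to check numerically. The sketch below (sample rate and duration are arbitrary choices, not from the video) builds the 100 + 150 + 200 Hz mixture and confirms that the combined signal repeats at the 50 Hz period, the greatest common divisor of its components:

```python
import numpy as np

# A mixture of 100, 150, and 200 Hz sines repeats at the period of their
# greatest common divisor, 50 Hz (0.02 s) -- the "missing fundamental"
# the brain infers.
sr = 48_000                      # assumed sample rate in Hz
t = np.arange(0, 0.1, 1 / sr)    # 100 ms of time samples
mix = (np.sin(2 * np.pi * 100 * t)
       + np.sin(2 * np.pi * 150 * t)
       + np.sin(2 * np.pi * 200 * t))

# Shift the signal by one 50 Hz period (0.02 s = 960 samples at 48 kHz)
# and confirm the waveform lines up with itself.
period_samples = sr // 50
print(np.allclose(mix[:-period_samples], mix[period_samples:]))  # True
```

Shifting by the 100 Hz period (480 samples) would fail the same check, because the 150 Hz component does not complete a whole number of cycles in 0.01 s.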

How do pipe organs demonstrate the difference between a fundamental tone and timbre?

Each organ pipe produces a fundamental frequency plus overtones. When pipes are the same length, they share the same fundamental frequency, but different materials and resonators change the overtone content. Those overtones are quieter and don’t register as separate pitches, yet they strongly affect timbre—how a listener distinguishes, for example, a trumpet from a flute. The transcript also notes that many instruments’ overtones fall at integer multiples of the fundamental, called harmonics, which is why the harmonic structure can drive pitch perception tricks like the missing fundamental.
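That split between pitch and timbre can be made concrete. In this sketch, two tones share a 200 Hz fundamental but weight their harmonics differently (the weights are illustrative, not measured instrument spectra): both repeat at the same period, so they carry the same pitch, yet their waveforms differ, which is what timbre reflects.

```python
import numpy as np

sr = 48_000                      # assumed sample rate in Hz
t = np.arange(0, 0.05, 1 / sr)   # 50 ms of time samples

def tone(weights, f0=200.0):
    """Sum of harmonics n * f0 with the given relative amplitudes."""
    return sum(w * np.sin(2 * np.pi * n * f0 * t)
               for n, w in enumerate(weights, start=1))

bright = tone([1.0, 0.8, 0.6, 0.4])   # strong upper harmonics
mellow = tone([1.0, 0.2, 0.05, 0.0])  # mostly fundamental

period = sr // 200  # both repeat every 240 samples -> same perceived pitch
print(np.allclose(bright[:-period], bright[period:]),   # True
      np.allclose(mellow[:-period], mellow[period:]),   # True
      np.allclose(bright, mellow))                      # False: different timbre
```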

What makes the Shepard tone sound like it rises forever?

A Shepard tone isn’t one note; it’s multiple tones separated by octaves. As the set shifts upward, the volumes are adjusted so the highest components fade out while new lower components fade in. This continuous loudness swapping prevents the listener from noticing the “reset” that would normally occur when frequencies move beyond the audible range. The result is a persistent illusion of upward motion, analogous to a barbershop pole effect in audio.
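The construction above can be sketched in a few lines. All parameter values here are illustrative assumptions: octave-spaced components get loudness from a bell curve over log-frequency, so components fade in at the bottom of the stack and out at the top as the whole stack slides upward.

```python
import numpy as np

sr = 22_050
t = np.arange(0, 0.5, 1 / sr)             # one half-second step of the rise

def shepard_step(shift, base=27.5, octaves=9, center=4.5, width=1.8):
    """One chunk of a Shepard tone; `shift` in [0, 1) slides the stack up."""
    out = np.zeros_like(t)
    for k in range(octaves):
        pos = k + shift                    # position in octaves above `base`
        # Bell-curve loudness over log-frequency: quiet at both edges.
        amp = np.exp(-((pos - center) ** 2) / (2 * width ** 2))
        out += amp * np.sin(2 * np.pi * base * 2 ** pos * t)
    return out

# Raising `shift` from 0 toward 1 moves every component up in pitch, yet
# shift = 1 reproduces almost the same component set as shift = 0: the
# topmost component has faded to near silence, so no "reset" is audible.
chunk_a = shepard_step(0.0)
chunk_b = shepard_step(0.5)
```

Playing successive chunks with increasing `shift` (and crossfading between them) yields the perpetual-rise effect.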

How do expectation and language change what people hear in ambiguous audio?

The phantom word illusion (Dr. Diana Deutsch) shows that when two different words are played simultaneously by different speakers, listeners can still “choose” a coherent word from the mixed signals—because the brain receives a pile of candidates and selects likely patterns. Priming shifts that selection: near exam week, reported words included “No brain,” “I’m tired,” and “No time.” Mondegreens work similarly: people mishear lyrics in ways that fit familiar language patterns, such as hearing “Pulitzer Prize” as “pullet surprise,” or interpreting football chants differently depending on what a listener expects to hear.

How does the brain locate where a sound comes from?

Sound localization uses several cues: (1) volume differences—sounds on the right are louder in the right ear due to head shadowing; (2) frequency-dependent shadowing—high frequencies attenuate more than low frequencies; (3) time delay—sound reaches one ear slightly before the other (about half a millisecond across the head); and (4) phase differences—whether a wave arrives at a peak or trough differs between ears. When sound is directly in front or behind, these cues can weaken because distances match, so the pinna’s shape becomes crucial: ridges reflect and filter frequencies differently depending on direction. Experiments with ear molds showed localization can be impaired immediately but improves over days as the brain adapts, and modern spatial audio systems personalize this filtering by scanning ear shape.
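The "about half a millisecond" figure follows from simple geometry. A back-of-envelope check, using assumed round numbers (head width ~0.2 m, speed of sound ~343 m/s):

```python
# Maximum interaural time delay for a sound arriving from the side:
# the extra distance to the far ear divided by the speed of sound.
head_width_m = 0.2          # assumed ear-to-ear distance
speed_of_sound_m_s = 343.0  # speed of sound in air at ~20 C

itd_s = head_width_m / speed_of_sound_m_s
print(f"{itd_s * 1e3:.2f} ms")  # prints "0.58 ms"
```

For a source directly ahead or behind, this delay drops to zero for both ears at once, which is why the pinna's direction-specific filtering has to take over there.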

What is the difference between beating and binaural beats?

Beating occurs when two close frequencies play together in the same place, letting their wave peaks and troughs periodically align and cancel, producing a pulsing sensation at the difference frequency (e.g., 261 Hz and 263 Hz produce two louder pulses per second). Binaural beats happen when one frequency is played in one ear and the other in the other ear; the tones don’t physically interfere in the air, but the brain combines them and produces a beat perception based on the phase difference it computes.
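The 261 Hz / 263 Hz example can be verified with the sum-to-product identity sin(a) + sin(b) = 2 sin((a+b)/2) cos((a−b)/2): the mixture is a 262 Hz carrier whose amplitude envelope pulses at the 2 Hz difference frequency. A numeric sketch (sample rate is an arbitrary assumption):

```python
import numpy as np

sr = 48_000
t = np.arange(0, 1.0, 1 / sr)   # one second of samples
mix = np.sin(2 * np.pi * 261 * t) + np.sin(2 * np.pi * 263 * t)

# Sum-to-product: the mixture equals a 262 Hz carrier times a 1 Hz cosine,
# so the loudness envelope is |2 cos(2*pi*1*t)|, which dips to silence
# twice per second.
carrier_form = 2 * np.sin(2 * np.pi * 262 * t) * np.cos(2 * np.pi * 1.0 * t)
envelope = np.abs(2 * np.cos(2 * np.pi * 1.0 * t))

# Count the envelope's near-silent troughs (local minima) in one second.
troughs = np.sum((envelope[1:-1] < envelope[:-2]) & (envelope[1:-1] < envelope[2:]))
print(np.allclose(mix, carrier_form), int(troughs))  # True 2
```

Binaural beats can't be reproduced this way, since the two tones never sum in the air: the comparison happens inside the brain.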

Review Questions

  1. In the missing-fundamental example, what specific relationship between harmonics allows listeners to infer a low pitch that isn’t present as a single tone?
  2. How do Shepard tones use octave separation and changing loudness to maintain the illusion of continuous pitch ascent?
  3. Which localization cues become less reliable when a sound is directly in front of or behind a listener, and why does the pinna matter more then?

Key Points

  1. Hearing depends on pattern inference: harmonic mixtures can produce a perceived “missing fundamental” even when that single low frequency isn’t present as a pure tone.

  2. Timbre is shaped less by the loud fundamental and more by the overtone structure; different instruments sound distinct because their overtones have different frequencies and relative amplitudes.

  3. The Shepard tone illusion relies on octave-stacked components whose volumes fade in and out, preventing listeners from detecting the point where the sound would otherwise “reset.”

  4. Language familiarity and priming can steer perception in ambiguous audio, as seen in the phantom word effect and mondegreens.

  5. Sound localization uses multiple cues—interaural volume, frequency-dependent shadowing, time delay, and phase—while the pinna’s ridges provide direction-specific filtering that the brain learns.

  6. Beating comes from interference when close frequencies are in the same ear space, while binaural beats arise when the brain compares different frequencies delivered to left and right ears.

  7. Audio illusions illustrate how the auditory system compensates for real-world ambiguity rather than proving hearing is fundamentally unreliable.

Highlights

A harmonic-rich sound can feel higher or contain a low pitch that isn’t physically present because the brain reconstructs the missing fundamental from overtone patterns.
The Shepard tone’s “endless rise” is engineered by fading octave components in and out, keeping the perceived ascent continuous.
The pinna isn’t just an outer ear shape—it acts like a frequency-direction filter, and the brain can relearn localization after ear shape changes.
Even when tones don’t interact in the air, binaural beats can still create a beat sensation because the brain mixes left-right inputs.
Visual information can flip what identical audio is heard as, showing that perception fuses senses rather than treating them independently.

Topics

  • Missing Fundamental
  • Shepard Tone
  • Timbre and Harmonics
  • Cocktail Party Effect
  • Sound Localization
  • Binaural Beats
  • Pinna Adaptation

Mentioned

  • Apple
  • Sony
  • Georg Joseph Vogler
  • Diana Deutsch
  • Alfred Mayer