
Distributions 1 | Motivation and Delta Function [dark version]

5 min read

Based on The Bright Side of Mathematics' video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their content.

TL;DR

The Heaviside step function H(x) has a jump at x=0, so its classical derivative fails to exist there.

Briefing

Distributions were introduced to make sense of derivatives and other operations that break down at “sharp” features—especially jumps like the Heaviside step function. In classical calculus, the derivative of the Heaviside function H(x) (0 for x<0 and 1 for x>0) fails to exist at x=0 because of the jump. Yet many applications—most notably differential equations and Fourier analysis—need a derivative-like object that still behaves correctly even when solutions have discontinuities or corners.

The motivation traces back to Paul Dirac’s 1927 work. Dirac wanted a new “delta function” δ(x) that would act as the derivative of the Heaviside function. Away from x=0, this is straightforward: H is constant on each side, so any derivative-like object should be zero there. The real challenge is the single point at x=0. Dirac also demanded that a fundamental theorem of calculus–type identity remain valid in an integral form: integrating δ(x) over a tiny interval around 0 should reproduce the jump in H. Concretely, for arbitrarily small ε, the integral of δ(x) from −ε to ε should equal H(ε)−H(−ε), which is 1.
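Dirac's integral condition can be illustrated numerically by approximating δ with a narrow bump of total mass 1 and shrinking its width. The Gaussian used below is a standard choice for such a "nascent delta", not one the source specifies; this is a sketch of the idea, not a definition of δ.

```python
import math

def gaussian_bump(x, sigma):
    # A narrow Gaussian of total mass 1: a "nascent delta" concentrating at 0.
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, a, b, n=100_000):
    # Midpoint rule on [a, b].
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

eps = 0.01
for sigma in (1e-2, 1e-3, 1e-4):
    # As sigma -> 0, the mass inside (-eps, eps) tends to H(eps) - H(-eps) = 1.
    print(sigma, integrate(lambda x: gaussian_bump(x, sigma), -eps, eps))
```

No single function is the limit of these bumps pointwise; only the integrals converge, which is exactly why a new kind of object is needed.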

But those requirements clash with what ordinary functions can do. If δ(x) were an ordinary function that is zero everywhere except at one point, then its integral over any interval would be zero—because changing a function at a single point does not affect integrals (in the sense of “almost everywhere,” i.e., with respect to the Lebesgue measure). That produces a contradiction: the integral would have to be 0, not 1. The situation worsens when one tries to differentiate δ again; even writing down δ′ requires a framework that can interpret such expressions without relying on pointwise values.
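The "a single point cannot carry mass" argument can be checked with a Riemann sum, a minimal sketch: at most one sample point can land on 0, and its contribution is bounded by the mesh width, which vanishes as the partition refines.

```python
def spike(x):
    # An ordinary function that is zero everywhere except at the single point 0.
    return 1.0 if x == 0.0 else 0.0

def riemann_sum(f, a, b, n):
    # Left-endpoint Riemann sum with n subintervals.
    h = (b - a) / n
    return sum(f(a + i * h) for i in range(n)) * h

# At most one sample hits 0, contributing at most h * 1, and h = (b - a)/n -> 0,
# so the integral is 0 -- never the required value 1.
for n in (10, 1000, 100_000):
    print(n, riemann_sum(spike, -1.0, 1.0, n))
```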

The response is to replace “functions” with a broader concept: distributions, also called generalized functions. Instead of treating δ as a literal function with a graph, distributions are treated as linear functionals acting on test functions. The guiding picture is that distributions behave like mass densities. For an ordinary function f(x), integrating f(x) against a test function φ(x) over a region produces a real number that can be interpreted as “mass measured” in the bump-shaped region where φ is nonzero. For the delta distribution, all the mass is concentrated at x=0. When a test function φ is used, the resulting number corresponds to φ evaluated at the point of concentration—so δ “picks out” the value at 0.
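The functional picture can be sketched directly in code: an ordinary function induces a distribution by integration against a test function, while δ has no inducing function and simply evaluates at 0. The names `regular` and `delta` below are illustrative, not from the source.

```python
import math

def regular(f, a=-10.0, b=10.0, n=200_000):
    # Distribution induced by an ordinary function f: phi -> integral of f * phi.
    def functional(phi):
        h = (b - a) / n
        # Midpoint rule; test functions are assumed negligible outside [a, b].
        return sum(f(a + (i + 0.5) * h) * phi(a + (i + 0.5) * h) for i in range(n)) * h
    return functional

def delta(phi):
    # The delta distribution is not given by any function: it evaluates phi at 0.
    return phi(0.0)

phi = lambda x: math.exp(-x * x)    # a localized test function (decays rapidly)
print(delta(phi))                   # picks out phi(0) = 1.0
print(regular(lambda x: 1.0)(phi))  # "mass" of phi measured by the constant function 1
```

Both objects are linear maps from test functions to numbers; δ is simply one that no pointwise-defined density can reproduce.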

This shift—from pointwise objects to measurement-by-integration—lets δ satisfy the integral jump condition and supports further algebraic operations like differentiation in a consistent way. The next step is to formalize what counts as a test function and to define precisely how distributions act as linear maps on them.

Cornell Notes

The Heaviside step function H(x) jumps from 0 to 1 at x=0, so its classical derivative fails to exist at that point. Dirac introduced the delta function δ(x) as a generalized derivative of H, requiring that integrating δ over any small interval around 0 reproduces the jump: ∫_{-ε}^{ε} δ(x) dx = 1. That can’t happen with an ordinary function that is zero everywhere except at a single point, because such a function would integrate to 0. Distributions fix this by treating δ not as a function with a graph, but as a linear functional acting on test functions φ: the delta distribution returns φ(0), reflecting a “point mass” density at x=0. This framework also sets up meaningful operations like differentiating δ.

Why does the classical derivative of the Heaviside function break down at x=0?

H(x) equals 0 for x<0 and 1 for x>0, so it has a jump discontinuity at x=0. Classical differentiation requires the function to be smooth enough for the limit defining the derivative to exist, but the jump means the slope is not defined at that single point. The derivative is fine away from 0 (it’s 0 on each side) but fails at x=0.
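The breakdown at the jump can be made concrete with difference quotients, a short numerical sketch (the value H(0)=1 is a convention; it does not affect the argument):

```python
def H(x):
    # Heaviside step; the value assigned at 0 itself is a convention.
    return 1.0 if x >= 0.0 else 0.0

# The symmetric difference quotient at 0 is (H(h) - H(-h)) / (2h) = 1/(2h),
# which diverges as h -> 0, so no classical derivative exists at the jump.
for h in (1e-1, 1e-3, 1e-5):
    print(h, (H(h) - H(-h)) / (2 * h))
```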

What integral condition does Dirac want δ(x) to satisfy, and why does it matter?

Dirac wants δ to behave like the derivative of H in an integral form: for small ε, ∫_{-ε}^{ε} δ(x) dx should equal H(ε)−H(−ε). Since H(ε)=1 for ε>0 and H(−ε)=0, the difference is always 1. This preserves a fundamental theorem of calculus–style relationship between differentiation and integration, even across a discontinuity.

Why can’t δ(x) be an ordinary function that is zero everywhere except at x=0?

If δ were an ordinary function that vanishes except at one point, then its integral over any interval would be 0. In measure-theoretic terms, changing a function at a single point doesn’t affect integrals because the point has measure zero. That contradicts the required condition ∫_{-ε}^{ε} δ(x) dx = 1.

How do distributions reinterpret δ so the integral condition becomes possible?

Distributions treat δ as a generalized object: a linear functional acting on test functions φ. In the “point mass” picture, δ concentrates all mass at x=0. When δ acts on φ, the output corresponds to φ(0). So instead of relying on δ’s pointwise values or a literal graph, the framework defines δ through how it produces numbers via integration against φ.

What are test functions, and what role do they play in measuring a distribution?

Test functions φ are chosen to be nice enough (at this stage of the video, continuous and localized) and typically look like a small bump: φ is zero outside a small region. A distribution is then evaluated by integrating the product of the distribution with φ (conceptually), producing a real number that represents the "mass" the distribution assigns to the region. For δ, this measurement collapses to the value at the concentration point: φ(0).
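A classic example of such a bump is exp(−1/(1−x²)) on (−1, 1) and zero elsewhere; this specific formula is a standard choice for illustration, not taken from the source.

```python
import math

def bump(x):
    # Smooth, compactly supported: identically zero outside the interval (-1, 1).
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

print(bump(0.0))  # maximum value, e^(-1)
print(bump(1.5))  # 0.0: the function vanishes outside its small region
# Applying the delta distribution to this test function returns bump(0) = e^(-1).
```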

Review Questions

  1. What specific contradiction arises if δ is assumed to be an ordinary function that is zero almost everywhere?
  2. How does the “point mass” interpretation explain why δ acting on a test function yields φ(0)?
  3. What additional difficulty appears when trying to differentiate δ again, and why does that push toward distributions?

Key Points

  1. The Heaviside step function H(x) has a jump at x=0, so its classical derivative fails to exist there.

  2. Dirac’s delta function δ(x) was introduced to act like the derivative of H while preserving an integral jump condition.

  3. The requirement ∫_{-ε}^{ε} δ(x) dx = H(ε)−H(−ε) forces the integral to equal 1 for all small ε.

  4. An ordinary function that is zero everywhere except at one point would still integrate to 0, creating a contradiction.

  5. Distributions replace pointwise functions with linear functionals that act on test functions.

  6. The delta distribution behaves like a point mass at x=0: when applied to a test function φ, it returns φ(0).

  7. The framework is designed to support operations like differentiation even when classical derivatives are undefined at sharp features.

Highlights

The delta function is motivated by the need to differentiate a jump: H(x) changes from 0 to 1 at x=0, but classical derivatives break at that point.
Dirac’s integral condition demands ∫_{-ε}^{ε} δ(x) dx = 1, matching the jump H(ε)−H(−ε).
A single-point “ordinary function” can’t satisfy that integral requirement because single points have no effect on integrals.
Distributions treat δ as a measurement device: it outputs φ(0) when tested against a bump function φ.
