Distributions 10 | Distributional Derivative

Based on The Bright Side of Mathematics's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Distributional derivatives are defined via duality: derivatives are transferred from the distribution onto test functions using integration-by-parts logic.

Briefing

Distributional derivatives make differentiation an operation that is defined for every generalized function, even when classical derivatives fail at jumps or singularities. Starting from the duality pairing between a distribution T and a smooth test function Φ, differentiation is defined by shifting derivatives off T and onto Φ, following the pattern of integration by parts. The result is a new distribution that is well-defined for every T in D′.

For smooth functions, this definition matches the classical derivative. If f is continuously differentiable (f ∈ C¹) and T_f denotes the regular distribution associated with f, then the regular distribution of f′ acts on a test function through the ordinary integral ∫ f′(x)Φ(x) dx. Integration by parts moves the derivative from f onto Φ and introduces a minus sign, and the boundary terms vanish because test functions have compact support (so Φ is zero outside a sufficiently large interval). What remains is −⟨T_f, Φ′⟩, which is exactly the value the definition assigns to ⟨∂T_f, Φ⟩, so ∂T_f is the distribution associated with f′. In higher dimensions, the same mechanism extends to partial derivatives: for a multi-index α, the distributional partial derivative ∂^α T is defined so that ⟨∂^α T, Φ⟩ equals (−1)^|α| ⟨T, ∂^α Φ⟩, where |α| counts how many partial derivatives are applied.
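
The one-dimensional computation can be written out step by step. This is a sketch under the assumptions that f ∈ C¹(ℝ) and that the support of Φ lies inside an interval [−a, a]; the symbol a is introduced here only for illustration.

```latex
\begin{aligned}
\langle T_{f'}, \Phi \rangle
  &= \int_{-a}^{a} f'(x)\,\Phi(x)\,dx \\
  &= \Big[ f(x)\,\Phi(x) \Big]_{-a}^{a} - \int_{-a}^{a} f(x)\,\Phi'(x)\,dx \\
  &= -\int_{-a}^{a} f(x)\,\Phi'(x)\,dx
     \qquad \text{since } \Phi(\pm a) = 0 \\
  &= -\langle T_f, \Phi' \rangle
   = \langle \partial T_f, \Phi \rangle .
\end{aligned}
```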

The payoff is that distributional derivatives exist without requiring pointwise differentiability. Classical differentiation demands that derivatives exist as functions, but distributional differentiation only requires that T act continuously and linearly on test functions. That means one can differentiate objects like the Heaviside step function, which has a jump at 0 and therefore lacks a classical derivative there. Using the definition with T = Θ (the regular distribution of the Heaviside function), the pairing ⟨∂Θ, Φ⟩ = −⟨Θ, Φ′⟩ reduces to an integral involving Φ′. Because Θ(x)=0 for x<0 and Θ(x)=1 for x>0, the integral collapses to the positive half-line, and the fundamental theorem of calculus, together with the compact support of Φ, yields −(0 − Φ(0)) = Φ(0). This matches the defining action of the Dirac delta distribution δ, so the distributional derivative of the Heaviside function is the Dirac delta: ∂Θ = δ.
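
The Heaviside computation can be checked numerically for one concrete test function. The sketch below is only an illustration: the bump function, the helper names (`bump`, `bump_prime`, `midpoint_integral`) and the midpoint quadrature are assumptions made here and do not come from the video.

```python
import math

def bump(x):
    """Smooth test function supported on (-1, 1); an illustrative choice."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def bump_prime(x):
    """Classical derivative of bump (chain rule); zero outside (-1, 1)."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def midpoint_integral(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# <dTheta, Phi> = -<Theta, Phi'> = -(integral of Phi' over [0, 1]),
# because Theta vanishes for x < 0 and Phi vanishes for |x| >= 1.
pairing_with_dtheta = -midpoint_integral(bump_prime, 0.0, 1.0)

# <delta, Phi> = Phi(0)
pairing_with_delta = bump(0.0)

print(pairing_with_dtheta, pairing_with_delta)
```

Both printed numbers should agree up to quadrature error, which is the statement ∂Θ = δ evaluated on this particular Φ.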

A second example shows how singularities behave under differentiation. Differentiating the Dirac delta distribution itself produces another distribution that is not regular: ⟨∂δ, Φ⟩ = −Φ′(0). In other words, the derivative of δ is again well-defined in D′, but it no longer corresponds to an ordinary function. The trade-off is clear: distributional derivatives always exist, yet the resulting object may become “less regular” than the original.

Overall, the core insight is operational: differentiation is redefined through duality and integration by parts so that derivatives can be computed for any distribution, and classical derivatives are recovered automatically when the underlying object is smooth enough.

Cornell Notes

Distributional derivatives redefine differentiation using duality: a distribution T is differentiated by transferring derivatives from T onto a test function Φ, with a sign determined by the order of differentiation. For a multi-index α, the rule is ⟨∂^α T, Φ⟩ = (−1)^|α| ⟨T, ∂^α Φ⟩, which guarantees the result is again a distribution. When T comes from a smooth function f, this construction reproduces the classical partial derivatives of f. The method also works for non-smooth and singular objects: the Heaviside step function’s distributional derivative equals the Dirac delta, and differentiating δ yields a new (non-regular) distribution acting as −Φ′(0).

How does the definition of a distributional derivative avoid needing classical derivatives of T?

It never differentiates T pointwise. Instead, it defines the derivative through the pairing with test functions: for any Φ in the test-function space, ⟨∂^α T, Φ⟩ is defined by moving ∂^α onto Φ. This uses integration by parts (conceptually) and the compact support of Φ to eliminate boundary terms. As a result, the derivative exists for every T in D′, regardless of whether T corresponds to a differentiable function.
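
One way to see that only the test function is ever differentiated is to model a one-dimensional distribution as a Python callable that takes a test function and returns a number. This is a minimal sketch, not a general implementation: the names `distributional_derivative`, `numerical_derivative`, `delta`, `bump` and `phi` are hypothetical, and a central difference stands in for the exact derivative of Φ.

```python
import math

def bump(x):
    """Smooth bump supported on (-1, 1); an illustrative test function."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def phi(x):
    """Bump shifted to be centred at 0.3, so its slope at 0 is non-zero."""
    return bump(x - 0.3)

def numerical_derivative(f, h=1e-4):
    """Central-difference stand-in for the exact derivative of a test function."""
    return lambda x: (f(x + h) - f(x - h)) / (2.0 * h)

def distributional_derivative(T):
    """Build dT purely from the pairing rule <dT, f> = -<T, f'>."""
    return lambda f: -T(numerical_derivative(f))

def delta(f):
    """Dirac delta: <delta, f> = f(0)."""
    return f(0.0)

d_delta = distributional_derivative(delta)      # acts as f -> -f'(0)
d2_delta = distributional_derivative(d_delta)   # acts as f -> +f''(0)

print(d_delta(phi), -numerical_derivative(phi)(0.0))
```

Note that `distributional_derivative` never inspects `T` itself; it only feeds `T` a differentiated test function, which is why the construction works for δ even though δ is not a function.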

Why do boundary terms disappear when deriving the sign rule (−1)^|α|?

Test functions Φ are smooth and have compact support, so Φ vanishes outside some interval [−a, a]. When integration by parts shifts a derivative from f onto Φ, the boundary contribution involves Φ(±a). Since Φ(−a)=Φ(a)=0 by compact support, the boundary term is zero. Each integration-by-parts step contributes a minus sign, so applying |α| derivatives yields (−1)^|α|.
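
Two small instances of the sign rule may help. They are worked examples chosen here for illustration: a second derivative in one dimension, and the multi-index α = (1, 2) on ℝ² with coordinates written x₁, x₂, so that |α| = 3.

```latex
\begin{aligned}
\langle \partial^2 T, \Phi \rangle
  &= -\langle \partial T, \Phi' \rangle
   = (-1)^2 \langle T, \Phi'' \rangle
   = \langle T, \Phi'' \rangle, \\
\langle \partial^{(1,2)} T, \Phi \rangle
  &= (-1)^{3} \langle T, \partial_{x_1} \partial_{x_2}^{2} \Phi \rangle
   = -\langle T, \partial_{x_1} \partial_{x_2}^{2} \Phi \rangle .
\end{aligned}
```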

What is the distributional derivative of the Heaviside step function Θ?

Using ⟨∂Θ, Φ⟩ = −⟨Θ, Φ′⟩, the integral reduces because Θ(x)=0 for x<0 and Θ(x)=1 for x>0. The calculation becomes −∫_0^a Φ′(x) dx, which equals −(Φ(a)−Φ(0)). With a chosen outside the support of Φ, Φ(a)=0, leaving −(−Φ(0))=Φ(0) (matching the delta pairing). Therefore ∂Θ equals the Dirac delta distribution δ.

How does differentiating the Dirac delta δ change the type of object?

The derivative of δ is defined by ⟨∂δ, Φ⟩ = −Φ′(0). This produces a distribution that acts on test functions via their derivatives at the origin, not via values of a regular function. So the derivative exists in D′, but it is no longer a regular distribution (it does not correspond to an ordinary function).

When do distributional derivatives coincide with classical derivatives?

When the distribution is regular and comes from a sufficiently smooth function. For example, if f ∈ C^∞ (or at least f ∈ C^k with k = |α|, so C¹ suffices for first derivatives), the distributional partial derivative ∂^α T_f is exactly the regular distribution of the classical partial derivative ∂^α f. The duality/integration-by-parts computation reproduces the same result.
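
The agreement with the classical derivative can also be tested numerically in one dimension. The sketch below is an illustration under assumptions made here: the smooth function f(x) = sin(3x) + x², the bump test function, and the midpoint quadrature are arbitrary choices, and all helper names are hypothetical.

```python
import math

def bump(x):
    """Smooth test function supported on (-1, 1); an illustrative choice."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def bump_prime(x):
    """Classical derivative of bump; zero outside (-1, 1)."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def midpoint_integral(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def f(x):
    """A smooth function, so T_f is a regular distribution."""
    return math.sin(3.0 * x) + x * x

def f_prime(x):
    """Classical derivative of f."""
    return 3.0 * math.cos(3.0 * x) + 2.0 * x

# Definition of the distributional derivative: <dT_f, Phi> = -<T_f, Phi'>
distributional = -midpoint_integral(lambda x: f(x) * bump_prime(x), -1.0, 1.0)

# Regular distribution of the classical derivative: <T_f', Phi>
classical = midpoint_integral(lambda x: f_prime(x) * bump(x), -1.0, 1.0)

print(distributional, classical)
```

The two values should coincide up to quadrature error for any smooth f; only the integration window [−1, 1] depends on the chosen test function.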

Review Questions

  1. State the general formula for the distributional partial derivative ∂^α T in terms of the pairing with a test function Φ.
  2. Why does the distributional derivative of the Heaviside step function produce the Dirac delta? Outline the key steps in the pairing calculation.
  3. What does ⟨∂δ, Φ⟩ equal, and why does this show the derivative of δ is not a regular distribution?

Key Points

  1. Distributional derivatives are defined via duality: derivatives are transferred from the distribution onto test functions using integration-by-parts logic.
  2. For a multi-index α, the sign in the definition is (−1)^|α| because each derivative transfer introduces a minus sign.
  3. Compact support of test functions eliminates boundary terms, making the definition consistent and well-defined.
  4. For regular distributions associated with smooth functions, distributional derivatives match classical derivatives.
  5. The Heaviside step function Θ has no classical derivative at the jump, but its distributional derivative exists and equals the Dirac delta δ.
  6. Differentiating δ yields a new, non-regular distribution acting as −Φ′(0), illustrating that regularity can decrease under distributional differentiation.

Highlights

The distributional partial derivative is defined by ⟨∂^α T, Φ⟩ = (−1)^|α| ⟨T, ∂^α Φ⟩, guaranteeing existence for every distribution T in D′.
Integration by parts works cleanly because test functions vanish outside a compact set, killing boundary terms.
The derivative of the Heaviside step function is the Dirac delta: ∂Θ = δ.
Even though δ is already singular, its distributional derivative is still well-defined: ⟨∂δ, Φ⟩ = −Φ′(0).
