Distributions 10 | Distributional Derivative [dark version]

Based on The Bright Side of Mathematics's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Distributional derivatives are defined by transferring derivatives from a distribution T to a test function Φ using ⟨∂^α T, Φ⟩ = (−1)^{|α|} ⟨T, ∂^α Φ⟩.

Briefing

The core idea is a definition of derivatives that works for every distribution, even when classical differentiation fails, by shifting derivatives onto test functions through integration by parts. Starting from a smooth function f, the usual derivative satisfies a duality identity: pairing the distribution associated with f′ against a test function Φ equals pairing the distribution associated with f against −Φ′. That calculation motivates the general rule: for any distribution T, its distributional derivative is defined by ⟨∂^α T, Φ⟩ = (−1)^{|α|} ⟨T, ∂^α Φ⟩, where α is a multi-index and |α|, the sum of its entries, counts how many partial derivatives occur. This definition guarantees the derivative exists in the space of distributions D′ without requiring any pointwise differentiability of T.

The practical payoff is twofold. First, the definition automatically produces a new distribution in D′: linearity follows from the linearity of T, and continuity follows because differentiation maps convergent sequences of test functions to convergent sequences. Second, for regular distributions coming from sufficiently smooth functions, the distributional derivative matches the classical partial derivative. In other words, the distributional framework extends classical differentiation rather than replacing it: when T corresponds to a C^∞ function, the two notions coincide.

The video then demonstrates why this matters by working two canonical examples. The first is the Heaviside step function H(x), which jumps from 0 to 1 at the origin and therefore has no classical derivative there. Using the distributional definition in one dimension, the pairing ⟨H′, Φ⟩ becomes −∫ H(x)Φ′(x) dx. Because H(x)=0 for x<0 and H(x)=1 for x>0, the integral reduces to −∫_0^a Φ′(x) dx = −(Φ(a) − Φ(0)), which equals Φ(0) once a is chosen large enough that Φ(a)=0 (possible because Φ has compact support). Since Φ(0) is exactly how the Dirac delta distribution δ acts on test functions, the derivative of the Heaviside function is identified as δ.
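The Heaviside computation can be checked numerically. The following is a minimal Python sketch, not taken from the video: the shifted bump function `bump`, its derivative `bump_prime`, and the midpoint-rule helper `midpoint_integral` are illustrative choices assumed here. It evaluates −∫ H(x)Φ′(x) dx and compares the result with Φ(0).

```python
import math

def bump(x, center=0.3):
    """Smooth test function supported on (center - 1, center + 1)."""
    u = x - center
    if abs(u) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - u * u))

def bump_prime(x, center=0.3):
    """Exact derivative of bump, obtained with the chain rule."""
    u = x - center
    if abs(u) >= 1.0:
        return 0.0
    return bump(x, center) * (-2.0 * u) / (1.0 - u * u) ** 2

def heaviside(x):
    return 1.0 if x > 0 else 0.0

def midpoint_integral(f, a, b, n=20_000):
    """Composite midpoint rule on [a, b] with n equal cells."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# <H', Phi> is defined as -integral of H(x) * Phi'(x) dx.
# [-2, 2] contains the support of the bump, so nothing is cut off.
pairing = -midpoint_integral(lambda x: heaviside(x) * bump_prime(x), -2.0, 2.0)

print("<H', Phi> =", pairing)
print("Phi(0)    =", bump(0.0))
```

The bump is centered at 0.3 rather than 0 so that Φ′(0) ≠ 0; the integrand then has a genuine jump at the origin, and the agreement with Φ(0) is not an accident of symmetry.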

The second example starts with the Dirac delta distribution δ and differentiates it distributionally. Applying the same rule yields ⟨δ′, Φ⟩ = −⟨δ, Φ′⟩ = −Φ′(0). Together the two examples show a key trade-off: distributional derivatives always exist, but the result need not be a regular distribution. H is regular while its derivative δ is not, and δ′ is again not regular, yet it is perfectly well-defined as the distribution that sends Φ to minus the derivative of Φ at the origin.

Overall, the message is that differentiation can be made universally well-defined in distribution theory by transferring derivatives from the potentially singular object to the smooth test functions, with the sign (−1)^{|α|} tracking how many times integration by parts is performed.

Cornell Notes

Distributional derivatives are defined for every distribution T by moving derivatives onto test functions, a rule motivated by integration by parts. For a multi-index α, the distributional partial derivative ∂^α T is the unique distribution satisfying ⟨∂^α T, Φ⟩ = (−1)^{|α|} ⟨T, ∂^α Φ⟩ for all test functions Φ. This guarantees existence in D′ without needing classical differentiability of T. When T is the regular distribution of a sufficiently smooth function, the distributional derivative agrees with the classical partial derivative. The examples show the payoff: the derivative of the Heaviside step function is the Dirac delta, and differentiating δ produces another non-regular distribution that sends Φ to −Φ′(0).

Why does integration by parts lead to the sign (−1)^{|α|} in the definition of distributional derivatives?

In the smooth case, pairing f′ with a test function Φ gives ∫ f′(x)Φ(x) dx. Integration by parts shifts the derivative from f to Φ and introduces a minus sign: ∫ f′Φ dx = −∫ fΦ′ dx, with boundary terms vanishing because test functions have compact support (so Φ is zero outside a bounded interval). For higher-order partial derivatives, repeating this shift for each derivative contributes one minus sign per derivative, so the total factor becomes (−1)^{|α|}, where |α| is the sum of the entries of the multi-index α.
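As a worked instance of the sign count, here is the one-dimensional computation written out, followed by the defining rule applied to the multi-index α = (2, 1) in two variables. The notation T_f for the regular distribution of a smooth function f is an assumption introduced here for illustration.

```latex
\begin{align*}
\langle T_{f'}, \Phi \rangle
  &= \int_{\mathbb{R}} f'(x)\,\Phi(x)\,dx
   = \underbrace{\big[ f(x)\,\Phi(x) \big]_{-\infty}^{\infty}}_{=\,0}
     - \int_{\mathbb{R}} f(x)\,\Phi'(x)\,dx
   = -\langle T_f, \Phi' \rangle, \\[4pt]
\langle \partial^{(2,1)} T, \Phi \rangle
  &= (-1)^{2+1}\,\langle T, \partial_{x_1}^{2}\partial_{x_2}\Phi \rangle
   = -\langle T, \partial_{x_1}^{2}\partial_{x_2}\Phi \rangle .
\end{align*}
```

Three derivatives are moved in the second line, so three minus signs accumulate and the overall factor is −1.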

How does the definition ensure the derivative of a distribution always exists, even when classical derivatives do not?

The distributional derivative ∂^α T is defined by its action on test functions: ⟨∂^α T, Φ⟩ = (−1)^{|α|} ⟨T, ∂^α Φ⟩. Since ∂^α Φ is again a test function, the right-hand side is always meaningful for any T in D′. Linearity follows from the linearity of T, and continuity follows because differentiation preserves convergence in the test-function space, so ∂^α T is guaranteed to be a valid element of D′.
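The definition translates almost literally into code if a distribution is modeled as a function that takes a test function and returns a number. The sketch below is an illustration under assumptions, not something from the video: `derivative_of` uses a central finite difference as a stand-in for the exact derivative Φ′, and the names `delta`, `distributional_derivative`, and `bump` are hypothetical.

```python
import math

def bump(x, center=0.3):
    """Smooth test function supported on (center - 1, center + 1)."""
    u = x - center
    if abs(u) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - u * u))

def derivative_of(phi, h=1e-4):
    """Central-difference approximation of phi' (approximate, not exact)."""
    return lambda x: (phi(x + h) - phi(x - h)) / (2.0 * h)

def delta(phi):
    """Dirac delta: <delta, phi> = phi(0)."""
    return phi(0.0)

def distributional_derivative(T):
    """Return T' defined by <T', phi> = -<T, phi'>."""
    return lambda phi: -T(derivative_of(phi))

delta_prime = distributional_derivative(delta)
delta_double_prime = distributional_derivative(delta_prime)

print("<delta, Phi>   =", delta(bump))
print("<delta', Phi>  =", delta_prime(bump))         # approximates -Phi'(0)
print("<delta'', Phi> =", delta_double_prime(bump))  # approximates +Phi''(0)
```

Nothing in `distributional_derivative` inspects T itself; it only differentiates the test function. That is the reason the derivative exists for every distribution, and applying it twice reproduces the sign (−1)² = +1.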

What calculation shows that the distributional derivative of the Heaviside function H is the Dirac delta δ?

Using the one-dimensional definition, ⟨H′, Φ⟩ = −∫ H(x)Φ′(x) dx. Because H(x)=0 for x<0 and H(x)=1 for x>0, the integral reduces to −∫_0^a Φ′(x) dx (choosing a large enough so Φ(a)=0). By the fundamental theorem of calculus, −∫_0^a Φ′(x) dx = −(Φ(a)−Φ(0)) = Φ(0). Since δ acts by ⟨δ, Φ⟩ = Φ(0), it follows that H′ = δ.

What does it mean that the derivative of δ is not a regular distribution?

A regular distribution corresponds to integration against a locally integrable function. The derivative of δ is defined by ⟨δ′, Φ⟩ = −⟨δ, Φ′⟩ = −Φ′(0). This action depends on the derivative of Φ at a single point and cannot be written as ∫ g(x)Φ(x) dx for any locally integrable g, so δ′ is not a regular distribution. The framework still guarantees existence, but the result need not be regular.

When do distributional derivatives match classical derivatives?

When T is a regular distribution associated with a sufficiently smooth function (the transcript emphasizes C^∞). In that case, the distributional derivative defined through ⟨∂^α T, Φ⟩ = (−1)^{|α|} ⟨T, ∂^α Φ⟩ coincides with the classical partial derivative of the underlying function. So distributional differentiation extends classical differentiation rather than contradicting it.
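The agreement with the classical derivative can be sanity-checked numerically for one smooth function. This is a minimal sketch under assumptions: f = sin is an arbitrary example chosen here, and `bump`, `bump_prime`, and `midpoint_integral` are illustrative helpers rather than anything from the video. It compares −∫ f Φ′ dx (the distributional derivative of f paired with Φ) against ∫ f′ Φ dx (the classical derivative paired with Φ).

```python
import math

def bump(x):
    """Smooth test function supported on (-1, 1)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def bump_prime(x):
    """Exact derivative of bump, obtained with the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def midpoint_integral(f, a, b, n=20_000):
    """Composite midpoint rule on [a, b] with n equal cells."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Distributional side: derivative moved onto the test function.
distributional = -midpoint_integral(lambda x: math.sin(x) * bump_prime(x), -2.0, 2.0)
# Classical side: f' = cos paired directly with the test function.
classical = midpoint_integral(lambda x: math.cos(x) * bump(x), -2.0, 2.0)

print("distributional pairing:", distributional)
print("classical pairing:     ", classical)
```

The two printed values should agree up to quadrature error, which is the numerical shadow of the integration-by-parts identity with vanishing boundary terms.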

Review Questions

  1. State the defining formula for the distributional partial derivative ∂^α T in terms of the pairing with a test function Φ.
  2. Compute ⟨H′, Φ⟩ for the Heaviside function H and explain why it equals Φ(0).
  3. What is ⟨δ′, Φ⟩ and how does it differ from ⟨δ, Φ⟩?

Key Points

  1. Distributional derivatives are defined by transferring derivatives from a distribution T to a test function Φ using ⟨∂^α T, Φ⟩ = (−1)^{|α|} ⟨T, ∂^α Φ⟩.
  2. Test functions have compact support, so boundary terms vanish when integration by parts is used to motivate the definition.
  3. For regular distributions coming from smooth functions, distributional partial derivatives match classical partial derivatives.
  4. The derivative of the Heaviside step function H is the Dirac delta δ because ⟨H′, Φ⟩ evaluates to Φ(0).
  5. Differentiating δ yields a new distribution δ′ that acts by ⟨δ′, Φ⟩ = −Φ′(0).
  6. Distributional derivatives always exist in D′, but the result need not be regular (as with H′ = δ and with δ′).

Highlights

A universal derivative rule emerges: differentiate distributions by differentiating test functions instead, with the factor (−1)^{|α|}.
Even a jump discontinuity becomes differentiable in distribution theory: H′ = δ.
The Dirac delta's derivative is defined cleanly: δ′ sends Φ to −Φ′(0).
Regularity is not guaranteed: distributional differentiation can turn a regular object into a singular one.

Topics

  • Distributional Derivative
  • Test Functions
  • Heaviside and Dirac Delta
  • Multi-Index Derivatives
  • Integration by Parts

Mentioned

  • D′