Functional Analysis 19 | Hölder's Inequality [dark version]

TL;DR

Hölder’s inequality for vectors in ℓ^p spaces takes the compact form ‖X·Y‖_1 ≤ ‖X‖_p ‖Y‖_{p′} using componentwise multiplication.

Briefing

Hölder’s inequality for vectors in ℓ^p spaces is proved in two steps: first establish Young’s inequality for positive numbers, then apply it term by term inside the ℓ^1 sum that appears after normalizing the vectors. The payoff is a compact, memorable form of Hölder’s inequality that links three norms (‖·‖_p, ‖·‖_{p′}, and ‖·‖_1) through componentwise multiplication.

The setup fixes p>1 and defines the Hölder conjugate p′ by the reciprocal relation 1/p + 1/p′ = 1, equivalently p′ = p/(p−1). For a vector X, the ℓ^q norm is ‖X‖_q = (∑_j |X_j|^q)^{1/q} for 1 ≤ q < ∞. Writing X·Y for the vector with components X_j Y_j, Hölder’s inequality takes the form

‖X·Y‖_1 ≤ ‖X‖_p · ‖Y‖_{p′}.
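As a quick sanity check of the statement above, the following minimal sketch evaluates both sides for finite vectors. The helper names `lp_norm` and `holder_sides` and the sample vectors are illustrative assumptions, not part of the original notes.

```python
def lp_norm(x, q):
    """l^q norm of a finite sequence, for 1 <= q < infinity."""
    return sum(abs(t) ** q for t in x) ** (1.0 / q)


def holder_sides(x, y, p):
    """Return (||x.y||_1, ||x||_p * ||y||_{p'}) for p > 1."""
    p_conj = p / (p - 1.0)  # solves 1/p + 1/p' = 1
    lhs = sum(abs(a * b) for a, b in zip(x, y))  # l^1 norm of the componentwise product
    rhs = lp_norm(x, p) * lp_norm(y, p_conj)
    return lhs, rhs


# Hypothetical sample vectors, chosen only for illustration.
x = [1.0, -2.0, 3.0]
y = [0.5, 4.0, -1.0]
for p in (1.5, 2.0, 3.0):
    lhs, rhs = holder_sides(x, y, p)
    # A small tolerance guards against floating-point rounding.
    print(p, lhs, rhs, lhs <= rhs + 1e-12)
```

For p = 2 the right-hand side is the product of two Euclidean norms, so this case reduces to the Cauchy-Schwarz inequality.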

To prove it, the argument first handles trivial cases: if X=0 or Y=0, then both sides are zero and the inequality holds automatically. Otherwise, the proof normalizes by dividing X by ‖X‖_p and Y by ‖Y‖_{p′}, so the normalized vectors have ℓ^p norm and ℓ^{p′} norm equal to 1, respectively.

The key tool is Young’s inequality, which asserts that for positive a and b,

a b ≤ a^p/p + b^{p′}/p′.

Young’s inequality is derived from convexity of the exponential function. Take the convexity inequality f(λx+(1−λ)y) ≤ λ f(x) + (1−λ) f(y) with f = exp, λ = 1/p (so 1−λ = 1/p′), x = ln(a^p), and y = ln(b^{p′}). On the left, λx + (1−λ)y = ln a + ln b = ln(ab), so applying the exponential gives ab. On the right, exp(ln(a^p)) = a^p and exp(ln(b^{p′})) = b^{p′}, which yields a^p/p + b^{p′}/p′.
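The convexity argument can be written as a single chain. This is a sketch assuming the inputs are x = ln(a^p) and y = ln(b^{p′}), the natural reading of "inputs involving ln(a^p) and ln(b^{p′})":

```latex
\begin{align*}
ab &= \exp\bigl(\ln a + \ln b\bigr)
    = \exp\Bigl(\tfrac{1}{p}\ln(a^{p}) + \tfrac{1}{p'}\ln(b^{p'})\Bigr) \\
   &\le \tfrac{1}{p}\exp\bigl(\ln(a^{p})\bigr) + \tfrac{1}{p'}\exp\bigl(\ln(b^{p'})\bigr)
    = \frac{a^{p}}{p} + \frac{b^{p'}}{p'}.
\end{align*}
```

The only inequality in the chain is the convexity step; everything else is the logarithm and exponential undoing each other.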

With Young’s inequality in hand, the Hölder proof proceeds inside the ℓ^1 norm: after normalization, the ℓ^1 sum becomes ∑_j (|X_j|/‖X‖_p)(|Y_j|/‖Y‖_{p′}), a sum of products of two nonnegative terms. Young’s inequality is applied to each summand (a summand with a zero factor satisfies it trivially), producing two sums whose denominators ‖X‖_p^p and ‖Y‖_{p′}^{p′} come from the normalization. Each of those sums equals 1 by the definition of the norms, so the bound is 1/p + 1/p′ = 1. Multiplying through by ‖X‖_p ‖Y‖_{p′} leaves exactly ‖X·Y‖_1 ≤ ‖X‖_p ‖Y‖_{p′}.
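For nonzero X and Y, the normalization argument can be laid out as follows. This is a sketch in the notation of these notes; the inequality step is Young’s inequality applied to each summand with a = |X_j|/‖X‖_p and b = |Y_j|/‖Y‖_{p′}:

```latex
\begin{align*}
\frac{\|X\cdot Y\|_1}{\|X\|_p\,\|Y\|_{p'}}
  &= \sum_j \frac{|X_j|}{\|X\|_p}\cdot\frac{|Y_j|}{\|Y\|_{p'}} \\
  &\le \sum_j \left( \frac{1}{p}\,\frac{|X_j|^{p}}{\|X\|_p^{p}}
        + \frac{1}{p'}\,\frac{|Y_j|^{p'}}{\|Y\|_{p'}^{p'}} \right) \\
  &= \frac{1}{p}\cdot 1 + \frac{1}{p'}\cdot 1 = 1.
\end{align*}
```

Multiplying both sides by ‖X‖_p ‖Y‖_{p′} gives Hölder’s inequality.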

The result is positioned as a stepping stone toward Minkowski’s inequality, described as the triangle inequality in ℓ^p spaces, which will be tackled next.

Cornell Notes

For p>1, Hölder’s inequality links three norms via componentwise multiplication: ‖X·Y‖_1 ≤ ‖X‖_p ‖Y‖_{p′}, where the Hölder conjugate p′ satisfies 1/p + 1/p′ = 1. The proof reduces to a simpler inequality for positive numbers, Young’s inequality: ab ≤ a^p/p + b^{p′}/p′. Young’s inequality follows from convexity of the exponential function, using the parameter λ=1/p and logarithmic substitutions that turn the left side into ab. Hölder then follows by normalizing X and Y by their ℓ^p and ℓ^{p′} norms and applying Young’s inequality term by term inside the ℓ^1 sum. This structure also sets up later derivations like Minkowski’s inequality.

How is the Hölder conjugate p′ determined, and why does it matter in the inequality?

For any p>1, p′ is defined by the reciprocal condition 1/p + 1/p′ = 1. This relation is what makes the coefficients in Young’s inequality add up correctly: after Young’s inequality is applied to the normalized vectors, the two power sums each equal 1 and the remaining constants are 1/p + 1/p′ = 1, which bounds the normalized ℓ^1 sum by 1.
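Solving the reciprocal condition for p′ gives p′ = p/(p−1). The sketch below computes it and checks the defining relation numerically; the function name `holder_conjugate` is a hypothetical helper introduced for illustration.

```python
def holder_conjugate(p):
    """Return p' with 1/p + 1/p' = 1, for p > 1."""
    if p <= 1:
        raise ValueError("p must be greater than 1")
    return p / (p - 1.0)


for p in (1.5, 2.0, 4.0):
    p_conj = holder_conjugate(p)
    # The last column should be 1 up to floating-point rounding.
    print(p, p_conj, 1.0 / p + 1.0 / p_conj)
```

Conjugation is symmetric: applying `holder_conjugate` to p′ returns p, and p = 2 is its own conjugate.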

What is Young’s inequality, and how does convexity of the exponential function produce it?

Young’s inequality says that for positive a and b, ab ≤ a^p/p + b^{p′}/p′. The derivation uses that the exponential function is convex, so for a convex combination parameter λ in [0,1], f(λx+(1−λ)y) ≤ λ f(x) + (1−λ) f(y). Choosing λ=1/p and substituting inputs built from ln(a^p) and ln(b^{p′}) turns the left side into ab (via log/exponential inverse rules) and the right side into a^p/p + b^{p′}/p′.
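Young’s inequality can be spot-checked numerically. The sketch below computes the gap between the two sides; `young_gap` and the sample values are illustrative assumptions. The equality case a^p = b^{p′} is not discussed in these notes and is included only as a sanity check.

```python
def young_gap(a, b, p):
    """Return a^p/p + b^{p'}/p' - a*b, which Young's inequality says is >= 0."""
    p_conj = p / (p - 1.0)  # Hoelder conjugate of p
    return a ** p / p + b ** p_conj / p_conj - a * b


p = 3.0

# Generic positive inputs: the gap should be strictly positive.
print(young_gap(2.0, 0.5, p))

# Equality case: pick b with b^{p'} = a^p, i.e. b = a^(p - 1).
a = 2.0
b = a ** (p - 1.0)
print(young_gap(a, b, p))  # zero up to floating-point rounding
```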

Why does the Hölder proof normalize vectors by ‖X‖_p and ‖Y‖_{p′}?

Normalization ensures that the sums appearing after applying Young’s inequality match the denominators used to scale X and Y. Concretely, dividing X by ‖X‖_p makes ∑_j |X_j|^p/‖X‖_p^p = 1, and dividing Y by ‖Y‖_{p′} makes ∑_j |Y_j|^{p′}/‖Y‖_{p′}^{p′} = 1. After term-by-term application of Young’s inequality, these identities reduce the upper bound to 1/p + 1/p′ = 1, and multiplying by ‖X‖_p ‖Y‖_{p′} gives ‖X·Y‖_1 ≤ ‖X‖_p ‖Y‖_{p′}.
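The effect of normalization can be observed on a small example. In this sketch, `normalize` and the sample vectors are hypothetical: both power sums come out as 1, and the ℓ^1 sum of the normalized product stays at or below 1.

```python
def normalize(x, q):
    """Divide x by its l^q norm; the zero vector is rejected."""
    norm = sum(abs(t) ** q for t in x) ** (1.0 / q)
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [t / norm for t in x]


p = 3.0
p_conj = p / (p - 1.0)  # Hoelder conjugate of p

u = normalize([1.0, -2.0, 3.0], p)       # X / ||X||_p
v = normalize([0.5, 4.0, -1.0], p_conj)  # Y / ||Y||_{p'}

power_sum_u = sum(abs(t) ** p for t in u)        # expected: 1
power_sum_v = sum(abs(t) ** p_conj for t in v)   # expected: 1
normalized_l1 = sum(abs(a * b) for a, b in zip(u, v))  # expected: at most 1

print(power_sum_u, power_sum_v, normalized_l1)
```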

How does componentwise multiplication connect to the ℓ^1 norm in Hölder’s inequality?

Using the shorthand X·Y for the vector with components (X_j Y_j), the ℓ^1 norm becomes ‖X·Y‖_1 = ∑_j |X_j Y_j| = ∑_j |X_j| |Y_j|. After normalization, each summand is a product of two nonnegative quantities, so Young’s inequality applies directly inside the sum (a summand with a zero factor satisfies it trivially).

What role do the zero-vector cases play?

If X=0 or Y=0, then X·Y=0, so ‖X·Y‖_1=0. Also ‖X‖_p or ‖Y‖_{p′} is 0, making the right-hand side ‖X‖_p ‖Y‖_{p′} equal to 0. The inequality holds without further work, so the proof can focus on nonzero vectors where division by norms is valid.

Review Questions

State Hölder’s inequality in the form involving ‖X·Y‖_1 and explain how p′ is defined from p.
Derive Young’s inequality from convexity of the exponential function: what substitutions turn the convexity inequality into ab ≤ a^p/p + b^{p′}/p′?
In the Hölder proof, where exactly do the identities ∑_j |X_j|^p/‖X‖_p^p = 1 and ∑_j |Y_j|^{p′}/‖Y‖_{p′}^{p′} = 1 enter?

Key Points

1. Hölder’s inequality for vectors in ℓ^p spaces takes the compact form ‖X·Y‖_1 ≤ ‖X‖_p ‖Y‖_{p′} using componentwise multiplication.
2. The Hölder conjugate p′ is defined by 1/p + 1/p′ = 1, and this identity is essential when constants combine.
3. Young’s inequality for positive numbers, ab ≤ a^p/p + b^{p′}/p′, is the core algebraic tool behind Hölder’s inequality.
4. Young’s inequality can be proved via convexity of the exponential function, using a convex combination parameter λ=1/p and logarithmic substitutions.
5. The Hölder proof handles X=0 or Y=0 separately, since both sides become 0 immediately.
6. For nonzero vectors, dividing X by ‖X‖_p and Y by ‖Y‖_{p′} forces the relevant power sums to equal 1, making the final bound collapse cleanly.
7. Hölder’s inequality is presented as a stepping stone toward Minkowski’s inequality (the triangle inequality in ℓ^p spaces).

Highlights

Hölder’s inequality is expressed as ‖X·Y‖_1 ≤ ‖X‖_p ‖Y‖_{p′}, where p′ satisfies 1/p + 1/p′ = 1.

Young’s inequality ab ≤ a^p/p + b^{p′}/p′ is derived from the convexity of the exponential function.

Normalization by ‖X‖_p and ‖Y‖_{p′} makes the power sums equal to 1, allowing Young’s inequality to be applied term-by-term inside ∑_j |X_j Y_j|.

The proof’s structure is: convexity → Young → apply inside the ℓ^1 sum → constants sum to 1 by the conjugate relation.

Topics

Hölder's Inequality
Young's Inequality
Convexity
Lp Norms
Minkowski Inequality