Functional Analysis 19 | Hölder's Inequality [dark version]
Based on The Bright Side of Mathematics's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Hölder’s inequality for vectors in ℓ^p spaces takes the compact form ‖X·Y‖_1 ≤ ‖X‖_p ‖Y‖_{p′} using componentwise multiplication.
Briefing
Hölder’s inequality for vectors in ℓ^p spaces is proved using a two-step strategy: first establish Young’s inequality for positive numbers, then apply it term-by-term inside the ℓ^1-sum that appears after normalizing vectors. The payoff is a compact, memorable form of Hölder’s inequality that links three norms—‖·‖_p, ‖·‖_{p′}, and ‖·‖_1—through componentwise multiplication.
The setup fixes p>1 and defines the Hölder conjugate p′ by the reciprocal relation 1/p + 1/p′ = 1, that is, p′ = p/(p−1). For a vector X, the ℓ^q norm is written as ‖X‖_q = (∑_j |X_j|^q)^{1/q} for 1 ≤ q < ∞. Using a shorthand where X·Y denotes the vector with components (X_j Y_j), Hölder’s inequality takes the form
‖X·Y‖_1 ≤ ‖X‖_p · ‖Y‖_{p′}.
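To make the statement concrete, here is a minimal numerical sketch for finite vectors. The helper names `holder_conjugate`, `lp_norm`, and `holder_sides` and the sample vectors are illustrative assumptions, not taken from the video; a check like this only illustrates the inequality and proves nothing.

```python
def holder_conjugate(p: float) -> float:
    """Return p' with 1/p + 1/p' = 1, assuming p > 1."""
    if p <= 1:
        raise ValueError("p must be greater than 1")
    return p / (p - 1)


def lp_norm(x, q: float) -> float:
    """Return (sum_j |x_j|^q)^(1/q) for a finite sequence x and 1 <= q < infinity."""
    return sum(abs(t) ** q for t in x) ** (1.0 / q)


def holder_sides(x, y, p: float):
    """Return (||x.y||_1, ||x||_p * ||y||_{p'}) for two equal-length sequences."""
    # Left side: l^1 norm of the componentwise product.
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    # Right side: product of the l^p norm of x and the l^{p'} norm of y.
    rhs = lp_norm(x, p) * lp_norm(y, holder_conjugate(p))
    return lhs, rhs


# Hypothetical sample data, chosen only for illustration.
x = [1.0, -2.0, 3.0]
y = [0.5, 4.0, -1.0]
lhs, rhs = holder_sides(x, y, p=3.0)
print(lhs <= rhs)
```

For p = 2 the conjugate is also 2, and the inequality reduces to the Cauchy-Schwarz inequality.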
To prove it, the argument first handles trivial cases: if X=0 or Y=0, then both sides are zero and the inequality holds automatically. Otherwise, both norms are nonzero, and the proof normalizes by dividing X by ‖X‖_p and Y by ‖Y‖_{p′}, so that the normalized vectors have ℓ^p norm and ℓ^{p′} norm equal to 1.
The key tool is Young’s inequality, which asserts that for positive a and b,
a b ≤ a^p/p + b^{p′}/p′.
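A quick numerical sanity check of this inequality can look as follows. This is only a sketch: the function name `young_gap`, the sample pairs, and the floating-point tolerance are assumptions made for illustration.

```python
def young_gap(a: float, b: float, p: float) -> float:
    """Return a^p/p + b^{p'}/p' - a*b, which Young's inequality says is >= 0."""
    p_conj = p / (p - 1)  # Hölder conjugate p', assuming p > 1
    return a ** p / p + b ** p_conj / p_conj - a * b


p = 3.0
# Hypothetical positive sample pairs (a, b).
samples = [(0.1, 5.0), (1.0, 1.0), (2.0, 0.3), (4.0, 7.0)]
gaps = [young_gap(a, b, p) for a, b in samples]

# A small negative tolerance absorbs floating-point rounding near equality.
print(all(g >= -1e-12 for g in gaps))

# Equality case: choosing b = a^(p-1) gives a^p = b^{p'}, so the gap vanishes.
a = 2.0
print(abs(young_gap(a, a ** (p - 1), p)) < 1e-9)
```

The equality case reflects a standard fact: the two sides agree exactly when a^p = b^{p′}.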
Young’s inequality is derived from convexity of the exponential function. By choosing the convex combination parameter λ = 1/p (so 1−λ = 1/p′) and applying the convexity inequality f(λx+(1−λ)y) ≤ λ f(x) + (1−λ) f(y) with f = exp to inputs built from ln(a^p) and ln(b^{p′}), the proof converts the left-hand side into ab, because the exponential and the logarithm are inverse to each other. The right-hand side simplifies directly to a^p/p + b^{p′}/p′.
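Written out, the convexity argument can be sketched as the chain below. The specific choice x = ln(a^p), y = ln(b^{p′}) is one natural reading of the "carefully selected inputs" and is an assumption about how the substitution is done in the video.

```latex
% Assumptions: a, b > 0, p > 1, 1/p + 1/p' = 1, f = \exp,
% \lambda = 1/p, x = \ln(a^p), y = \ln(b^{p'}).
\begin{align*}
ab &= \exp(\ln a + \ln b) \\
   &= \exp\!\Big(\tfrac{1}{p}\ln(a^{p}) + \tfrac{1}{p'}\ln(b^{p'})\Big) \\
   &\le \tfrac{1}{p}\exp\!\big(\ln(a^{p})\big) + \tfrac{1}{p'}\exp\!\big(\ln(b^{p'})\big)
      && \text{(convexity of } \exp\text{)} \\
   &= \frac{a^{p}}{p} + \frac{b^{p'}}{p'}.
\end{align*}
```

The second line uses ln(a^p) = p ln a and ln(b^{p′}) = p′ ln b, which is why the factors 1/p and 1/p′ cancel inside the exponential.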
With Young’s inequality in hand, the Hölder proof proceeds inside the ℓ^1 norm: after normalization, the ℓ^1 sum becomes a sum over j of products of two nonnegative terms, |X_j|/‖X‖_p and |Y_j|/‖Y‖_{p′}. Young’s inequality is applied to each summand (a summand with a zero factor satisfies it trivially), producing two sums weighted by 1/p and 1/p′. Each of those sums equals 1 because of how ‖X‖_p and ‖Y‖_{p′} were defined, so the bound is 1/p + 1/p′ = 1. Multiplying back by ‖X‖_p ‖Y‖_{p′} leaves exactly ‖X·Y‖_1 ≤ ‖X‖_p ‖Y‖_{p′}.
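For nonzero X and Y, the normalization step and the term-by-term use of Young’s inequality can be written as a single chain. This is a sketch in the notation of this summary, not a transcription of the video’s board work.

```latex
% Assumptions: X \neq 0, Y \neq 0, p > 1, 1/p + 1/p' = 1.
\begin{align*}
\frac{\|X \cdot Y\|_1}{\|X\|_p \, \|Y\|_{p'}}
  &= \sum_j \frac{|X_j|}{\|X\|_p} \cdot \frac{|Y_j|}{\|Y\|_{p'}} \\
  &\le \sum_j \left( \frac{1}{p} \frac{|X_j|^{p}}{\|X\|_p^{p}}
        + \frac{1}{p'} \frac{|Y_j|^{p'}}{\|Y\|_{p'}^{p'}} \right)
      && \text{(Young, applied to each summand)} \\
  &= \frac{1}{p} \cdot \frac{\sum_j |X_j|^{p}}{\|X\|_p^{p}}
     + \frac{1}{p'} \cdot \frac{\sum_j |Y_j|^{p'}}{\|Y\|_{p'}^{p'}} \\
  &= \frac{1}{p} + \frac{1}{p'} = 1.
\end{align*}
```

Multiplying both sides by ‖X‖_p ‖Y‖_{p′} recovers the stated form of the inequality.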
The result is positioned as a stepping stone toward Minkowski’s inequality, described as the triangle inequality in ℓ^p spaces, which will be tackled next.
Cornell Notes
For p>1, Hölder’s inequality links three norms via componentwise multiplication: ‖X·Y‖_1 ≤ ‖X‖_p ‖Y‖_{p′}, where the Hölder conjugate p′ satisfies 1/p + 1/p′ = 1. The proof reduces to a simpler, scalar inequality for positive numbers, Young’s inequality: ab ≤ a^p/p + b^{p′}/p′. Young’s inequality is obtained from convexity of the exponential function by applying the convexity inequality with the parameter λ=1/p and logarithmic substitutions that turn the left side into ab. Hölder then follows by normalizing X and Y by their ℓ^p and ℓ^{p′} norms and applying Young’s inequality term-by-term inside the ℓ^1 sum. This structure also sets up later derivations like Minkowski’s inequality.
How is the Hölder conjugate p′ determined, and why does it matter in the inequality?
What is Young’s inequality, and how does convexity of the exponential function produce it?
Why does the Hölder proof normalize vectors by ‖X‖_p and ‖Y‖_{p′}?
How does componentwise multiplication connect to the ℓ^1 norm in Hölder’s inequality?
What role do the zero-vector cases play?
Review Questions
- State Hölder’s inequality in the form involving ‖X·Y‖_1 and explain how p′ is defined from p.
- Derive Young’s inequality from convexity of the exponential function: what substitutions turn the convexity inequality into ab ≤ a^p/p + b^{p′}/p′?
- In the Hölder proof, where exactly do the identities ∑_j |X_j|^p/‖X‖_p^p = 1 and ∑_j |Y_j|^{p′}/‖Y‖_{p′}^{p′} = 1 enter?
Key Points
1. Hölder’s inequality for vectors in ℓ^p spaces takes the compact form ‖X·Y‖_1 ≤ ‖X‖_p ‖Y‖_{p′} using componentwise multiplication.
2. The Hölder conjugate p′ is defined by 1/p + 1/p′ = 1, and this identity is what makes the final bound 1/p + 1/p′ equal to 1.
3. Young’s inequality for positive numbers, ab ≤ a^p/p + b^{p′}/p′, is the core algebraic tool behind Hölder’s inequality.
4. Young’s inequality can be proved via convexity of the exponential function, using a convex combination parameter λ=1/p and logarithmic substitutions.
5. The Hölder proof handles X=0 or Y=0 separately, since both sides become 0 immediately.
6. For nonzero vectors, dividing X by ‖X‖_p and Y by ‖Y‖_{p′} forces the relevant power sums to equal 1, making the final bound collapse cleanly.
7. Hölder’s inequality is presented as a stepping stone toward Minkowski’s inequality (the triangle inequality in ℓ^p spaces).