Measure Theory 16 | Proof of the Substitution Rule for Measure Spaces [dark version]
Based on The Bright Side of Mathematics's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
The substitution rule equates ∫_Y G d(H_*μ) with ∫_X (G∘H) dμ when H is measurable and the relevant integrals exist.
Briefing
The substitution rule for measure spaces lets integrals be transferred across a measurable map: integrating a function on Y with respect to the image measure can be replaced by integrating the same function composed with the map on X. The key requirement is a measurable function H: X → Y between measure spaces (X, μ) and (Y, ν), where ν is the image measure H_*μ, defined by (H_*μ)(C) = μ(H^{-1}(C)) for every measurable set C ⊂ Y. For a measurable function G on Y, if at least one of the two integrals ∫_Y G d(H_*μ) and ∫_X (G∘H) dμ exists, then so does the other, and the two are equal.
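The statement can be checked by hand on a finite measure space, where every integral is a weighted sum. The following is a minimal sketch, not part of the original lecture: the sets X = {0, ..., 5} and Y = {0, 1, 2}, the point masses in `mu`, the map `H`, the function `G`, and the helper `pushforward` are all hypothetical choices made for illustration, with every subset treated as measurable.

```python
from fractions import Fraction

# Hypothetical finite example: X = {0,...,5}, Y = {0,1,2}, all subsets measurable.
# mu assigns the point mass (x + 1)/21 to each x, so the total mass is 1.
mu = {x: Fraction(x + 1, 21) for x in range(6)}


def H(x):
    """Measurable map X -> Y (every map is measurable on a finite power set)."""
    return x % 3


def G(y):
    """Function on Y to integrate; it takes a negative value at y = 0."""
    return y * y - 1


def pushforward(measure, mapping):
    """Image measure: nu({y}) = measure of the preimage of {y} under mapping."""
    nu = {}
    for x, mass in measure.items():
        y = mapping(x)
        nu[y] = nu.get(y, Fraction(0)) + mass
    return nu


nu = pushforward(mu, H)

# Integral of G over Y with respect to the image measure H_* mu.
lhs = sum(G(y) * mass for y, mass in nu.items())

# Integral of G composed with H over X with respect to mu.
rhs = sum(G(H(x)) * mass for x, mass in mu.items())

print(lhs, rhs)
```

Exact rational arithmetic via `fractions.Fraction` is used so that the two sides can be compared with `==` instead of a floating-point tolerance.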
The proof starts with the simplest measurable functions on Y: characteristic functions χ_C of measurable sets C ⊂ Y. For such functions, the integral on the Y-side is just the measure of C. Because the measure on Y is the image measure H_*μ, that quantity equals μ(H^{-1}(C)). On the X-side, the composed function χ_C∘H takes value 1 exactly when H(x) ∈ C, which is the same as x ∈ H^{-1}(C). Therefore, integrating χ_C∘H over X again produces μ(H^{-1}(C)). This establishes the substitution identity for characteristic functions.
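Written as a single chain of equalities, the characteristic-function step reads as follows. This is a sketch in LaTeX (assuming the `amsmath` package for `align*`); the only fact used beyond the definition of the image measure is that χ_C∘H is the characteristic function of H^{-1}(C).

```latex
\begin{align*}
\int_Y \chi_C \, d(H_*\mu)
  &= (H_*\mu)(C)
   = \mu\bigl(H^{-1}(C)\bigr) \\
  &= \int_X \chi_{H^{-1}(C)} \, d\mu
   = \int_X (\chi_C \circ H) \, d\mu .
\end{align*}
```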
Next comes the extension to simple functions. Any nonnegative simple function can be written as a finite linear combination of characteristic functions, say G = Σ_{i=1}^n λ_i χ_{C_i} with coefficients λ_i ≥ 0 and measurable sets C_i ⊂ Y. Linearity of the integral allows the substitution rule to be applied term-by-term, moving from the characteristic-function case to the full simple-function case. At this stage, the equality holds for all nonnegative simple measurable G by combining the already-proved set-level identity with the integral’s linearity.
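The term-by-term argument can be spelled out as a short derivation. This is a sketch in LaTeX (assuming `amsmath`), with G = Σ λ_i χ_{C_i}, λ_i ≥ 0, and each C_i measurable as in the paragraph above; the middle equality is the characteristic-function case applied to each C_i.

```latex
\begin{align*}
\int_Y G \, d(H_*\mu)
  &= \sum_{i=1}^{n} \lambda_i \int_Y \chi_{C_i} \, d(H_*\mu)
   = \sum_{i=1}^{n} \lambda_i \int_X (\chi_{C_i} \circ H) \, d\mu \\
  &= \int_X \Bigl( \sum_{i=1}^{n} \lambda_i \, \chi_{C_i} \circ H \Bigr) d\mu
   = \int_X (G \circ H) \, d\mu .
\end{align*}
```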
To reach general nonnegative measurable functions, the proof uses the definition of the (extended) Lebesgue integral via the supremum over simple functions. For a nonnegative measurable G on Y, one considers simple functions G̃ on Y that lie pointwise below G. Composing with H transfers these simple functions to X: G̃∘H is a simple function on X and remains bounded above by G∘H. Since the substitution rule already holds for each such simple function, taking the supremum over all admissible G̃ transfers the identity to G. Strictly speaking, the supremum on X runs over all simple functions below G∘H, not only those of the form G̃∘H, so this step directly gives one inequality; approximating G from below by an increasing sequence of simple functions and applying the monotone convergence theorem gives equality.
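The supremum step can be made explicit. The sketch below (LaTeX, assuming `amsmath`) first records the inequality obtained from simple functions of the form G̃∘H, then shows one standard way to obtain equality: choose simple functions 0 ≤ G_1 ≤ G_2 ≤ ... on Y converging pointwise to G, so that G_n∘H increases pointwise to G∘H, and apply monotone convergence on both sides. The sequence G_n is an assumption of this sketch and is not named in the summary above.

```latex
\begin{align*}
\int_Y G \, d(H_*\mu)
  &= \sup\Bigl\{ \int_Y \tilde G \, d(H_*\mu) :
       \tilde G \text{ simple},\ 0 \le \tilde G \le G \Bigr\} \\
  &= \sup\Bigl\{ \int_X (\tilde G \circ H) \, d\mu :
       \tilde G \text{ simple},\ 0 \le \tilde G \le G \Bigr\}
   \le \int_X (G \circ H) \, d\mu , \\[1ex]
\int_Y G \, d(H_*\mu)
  &= \lim_{n \to \infty} \int_Y G_n \, d(H_*\mu)
   = \lim_{n \to \infty} \int_X (G_n \circ H) \, d\mu
   = \int_X (G \circ H) \, d\mu .
\end{align*}
```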
Finally, the argument handles arbitrary measurable G by splitting it into positive and negative parts. The substitution rule applies separately to the nonnegative components, and if one of the two integrals exists (in the extended sense), the other exists as well. The result is a complete proof of the substitution rule: integrating G over Y with respect to H_*μ equals integrating G∘H over X with respect to μ, provided the relevant integrability condition is satisfied.
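The last step rests on the fact that taking positive and negative parts commutes with composition: with G^+ = max(G, 0) and G^- = max(-G, 0), one has (G∘H)^± = G^±∘H. A sketch of the resulting computation in LaTeX (assuming `amsmath`), valid whenever at least one of the two integrals on the right of the first line is finite:

```latex
\begin{align*}
\int_Y G \, d(H_*\mu)
  &= \int_Y G^{+} \, d(H_*\mu) - \int_Y G^{-} \, d(H_*\mu) \\
  &= \int_X (G^{+} \circ H) \, d\mu - \int_X (G^{-} \circ H) \, d\mu \\
  &= \int_X (G \circ H)^{+} \, d\mu - \int_X (G \circ H)^{-} \, d\mu
   = \int_X (G \circ H) \, d\mu .
\end{align*}
```

Because each nonnegative integral on the Y-side equals its counterpart on the X-side, the difference is well defined on one side exactly when it is well defined on the other.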
Cornell Notes
The substitution rule transfers integrals across a measurable map H: X → Y. When Y carries the image measure H_*μ, the identity ∫_Y G d(H_*μ) = ∫_X (G∘H) dμ holds whenever one side exists (then the other does too). The proof begins with characteristic functions χ_C of measurable sets C ⊂ Y, where both sides reduce to μ(H^{-1}(C)). It then extends to simple functions by writing them as finite linear combinations of characteristic functions and using linearity of the integral. For general nonnegative measurable G, the integral is defined as a supremum over simple functions below G; composing those simple functions with H preserves simplicity and the inequality, so the supremum matches on both sides. Arbitrary measurable G follows by splitting into positive and negative parts.
Why does the substitution rule become a set-measure identity for characteristic functions χ_C?
How does the proof extend from characteristic functions to simple functions?
What role does the supremum definition of the integral play for nonnegative measurable functions?
Why can the proof handle an arbitrary measurable G by splitting into positive and negative parts?
What is the minimal measurability/integrability structure needed for the rule to work?
Review Questions
- In the characteristic-function case, which set on X determines the value of ∫_X (χ_C∘H) dμ?
- How does linearity of the integral combine with the characteristic-function result to prove the substitution rule for simple functions?
- Why does the supremum over simple functions below G on Y translate into the same supremum behavior for G∘H on X?
Key Points
1. The substitution rule equates ∫_Y G d(H_*μ) with ∫_X (G∘H) dμ when H is measurable and the relevant integrals exist.
2. For characteristic functions χ_C, both sides reduce to μ(H^{-1}(C)) via the definition of image measure and the preimage characterization of χ_C∘H.
3. Simple functions follow by writing them as finite linear combinations of characteristic functions and applying linearity of the integral term-by-term.
4. Nonnegative measurable functions use the integral’s definition as a supremum over simple functions below G, with composition preserving simplicity and inequalities.
5. Arbitrary measurable functions are handled by decomposing into positive and negative parts and applying the nonnegative case separately.
6. If one of the two integrals exists (extended sense), the other exists as well due to the positive/negative decomposition and the supremum construction.