Probability Theory 14 | Expectation and Change-of-Variables [dark version]
Based on the video by The Bright Side of Mathematics on YouTube. If you like this content, support the original creator by watching, liking, and subscribing.
Briefing
Expectation turns a random variable into a single number: the average value it fluctuates around. For a random variable X, the expectation E[X] is defined as an “abstract integral” over the probability space, written ∫_Ω X dP (when it exists). In continuous settings, this same idea becomes the familiar mean computed from a probability density function (PDF): E[X] = ∫ x f_X(x) dx, where f_X is nonnegative and integrates to 1. The expectation sits at the “center of mass” of the distribution (for a symmetric density, the point of symmetry), which makes it a natural notion of the typical value.
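As a quick numerical sanity check (a minimal sketch, not from the video; the normal density and the parameters mu and sigma are illustrative assumptions), one can verify with SciPy that a PDF integrates to 1 and that E[X] = ∫ x f_X(x) dx lands at the center of a symmetric density:

```python
import numpy as np
from scipy import integrate
from scipy.stats import norm

# Illustrative parameters (an assumption for this sketch, not from the video).
mu, sigma = 2.0, 1.0
pdf = lambda x: norm.pdf(x, loc=mu, scale=sigma)

# A PDF is nonnegative and integrates to 1 over the real line.
total, _ = integrate.quad(pdf, -np.inf, np.inf)

# E[X] = ∫ x f_X(x) dx: the "center of mass" of the distribution.
mean, _ = integrate.quad(lambda x: x * pdf(x), -np.inf, np.inf)

print(total)  # ≈ 1.0
print(mean)   # ≈ 2.0, the midpoint of this symmetric density
```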
The core technical tool behind transforming expectations is the change-of-variables formula for integrals under probability measures. When a function g is applied to a random variable X, forming a new random variable g(X), the expectation of g(X) can be rewritten using a substitution rule that tracks how probability mass moves. Concretely, the integral over the original sample space can be converted into an integral over the image space (the real line) by using the distribution of X. This is expressed through the pushforward measure: the probability measure that X induces on the real line, often denoted P_X. In practice, this means the abstract integral ∫_Ω g(X(ω)) dP(ω) can be rewritten as an integral (or sum) over x ∈ ℝ with respect to P_X.
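The following Monte Carlo sketch (my own illustration; X ~ N(0, 1) and g(x) = x² are assumed choices, not from the video) shows the two sides of the change-of-variables identity agreeing: averaging g over realizations of X approximates the sample-space integral, while integrating g against the density f_X evaluates the real-line side.

```python
import numpy as np
from scipy import integrate
from scipy.stats import norm

rng = np.random.default_rng(0)
g = lambda x: x**2  # illustrative choice of g (an assumption)

# Sample-space side: approximate ∫_Ω g(X(ω)) dP(ω) by averaging g over
# many realizations of X. Here X ~ N(0, 1), again an illustrative choice.
samples = rng.standard_normal(1_000_000)
sample_side = g(samples).mean()

# Real-line side: integrate g against the distribution of X. Since X has
# a density, ∫_ℝ g(x) dP_X(x) = ∫ g(x) f_X(x) dx.
real_line_side, _ = integrate.quad(lambda x: g(x) * norm.pdf(x),
                                   -np.inf, np.inf)

print(sample_side)     # ≈ 1.0 (Monte Carlo estimate)
print(real_line_side)  # ≈ 1.0 (E[X²] = Var(X) = 1 for a standard normal)
```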
Once everything is expressed in terms of P_X, the continuous and discrete cases become straightforward. If X has a PDF, the expectation reduces to an ordinary integral with f_X(x). If X is discrete, it becomes a series using the probability mass function (PMF) p_X(x), where p_X(x) = P(X = x). This unified viewpoint—abstract integration plus change of variables—explains why the same expectation formulas appear across different probability models.
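Compiled into one display (my own LaTeX rendering of the formulas stated above), the change-of-variables chain and its two specializations read:

```latex
\mathbb{E}[g(X)]
  = \int_{\Omega} g(X(\omega)) \, \mathrm{d}P(\omega)
  = \int_{\mathbb{R}} g(x) \, \mathrm{d}P_X(x)
  = \begin{cases}
      \displaystyle \int_{\mathbb{R}} g(x)\, f_X(x) \, \mathrm{d}x
        & \text{if } X \text{ has a PDF } f_X, \\[6pt]
      \displaystyle \sum_{x} g(x)\, p_X(x)
        & \text{if } X \text{ is discrete with PMF } p_X.
    \end{cases}
```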
The video closes with a concrete example: rolling a fair die once. Here X takes the values {1, 2, 3, 4, 5, 6} with probability 1/6 each, so E[X] is the finite sum 1·(1/6) + 2·(1/6) + … + 6·(1/6) = 21/6 = 3.5. A key takeaway is that E[X] need not be an actual outcome of the random variable: 3.5 lies outside the set {1, …, 6}. That distinction matters: expectation is a probability-weighted average of all possible outcomes, not a guaranteed result.
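The die computation, written directly as a PMF sum (a small sketch using exact fractions; the variable names are mine):

```python
from fractions import Fraction

# The die example: X is uniform on {1, ..., 6}, so p_X(x) = 1/6 for each x.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# E[X] = Σ x · p_X(x), a finite sum because X takes finitely many values.
expectation = sum(x * p for x, p in pmf.items())

print(expectation)         # 7/2
print(float(expectation))  # 3.5 — not itself a possible die outcome
```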
Overall, the message is that expectation is defined abstractly but computed concretely—either by integrals with PDFs or sums with PMFs—because change-of-variables lets probability be transferred from the sample space to the real line via the distribution of X.
Cornell Notes
Expectation E[X] is the average value a random variable fluctuates around, defined (when it exists) by the abstract integral E[X] = ∫_Ω X dP. In continuous cases, it becomes E[X] = ∫ x f_X(x) dx using the PDF f_X, while in discrete cases it becomes a sum over x using the PMF p_X(x). Applying a function g to X creates a new random variable g(X), and the expectation of g(X) is handled by a change-of-variables rule that replaces integration over Ω with integration over ℝ using the pushforward distribution P_X. This framework explains why expectation formulas look different in continuous versus discrete settings but share the same underlying structure. The die example shows E[X] = 3.5, even though 3.5 is not a possible die roll.
- How is expectation E[X] defined in full generality, and what does it mean intuitively?
- Why does the change-of-variables formula matter when computing E[g(X)]?
- What do the continuous and discrete expectation formulas look like once everything is expressed using the distribution P_X?
- In the die example, how is E[X] computed and why is the result 3.5?
- Why can the expectation lie outside the set of possible outcomes?
Review Questions
- What is the relationship between the abstract integral ∫_Ω X dP and the integral/sum formulas involving f_X or p_X?
- How does applying a function g to a random variable change the expectation, and what role does the pushforward distribution P_X play?
- In the discrete case, how do you compute E[X] from the PMF, and why does the sum become finite for a die?
Key Points
1. Expectation E[X] is defined as an abstract integral over the sample space: E[X] = ∫_Ω X dP (when it exists).
2. In continuous settings, expectation is computed using the PDF: E[X] = ∫ x f_X(x) dx.
3. In discrete settings, expectation is computed using the PMF: E[X] = Σ x p_X(x).
4. For a transformed random variable g(X), change-of-variables rewrites the expectation using the distribution P_X induced by X on ℝ.
5. The pushforward measure P_X is the probability measure on the real numbers that makes integration over ℝ match integration over Ω.
6. Expectation is a weighted average and does not have to be an actual outcome of the random variable (e.g., a die’s expectation is 3.5).