
Probability Theory 14 | Expectation and Change-of-Variables [dark version]

4 min read

Based on the video by The Bright Side of Mathematics on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their channel.

TL;DR

Expectation E[X] is defined as an abstract integral over the sample space: E[X] = ∫_Ω X dP (when it exists).

Briefing

Expectation turns a random variable into a single number: the average value it fluctuates around. For a random variable X, the expectation E[X] is defined as an “abstract integral” over the probability space, written as ∫_Ω X dP (when it exists). In continuous settings, this same idea becomes the familiar mean computed from a probability density function (PDF): E[X] = ∫ x f_X(x) dx, where f_X(x) is nonnegative and integrates to 1. The expectation lands at the “center of mass” of the distribution—like the midpoint of a symmetric density—making it intuitive as the typical value.
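The "center of mass" intuition can be checked numerically. The sketch below (a minimal midpoint Riemann sum, not a rigorous integrator) approximates E[X] = ∫ x f_X(x) dx for the standard normal density, which is symmetric about 0, so the balance point should come out at 0:

```python
import math

# Standard normal PDF: f(x) = exp(-x^2/2) / sqrt(2*pi), a symmetric density,
# so the "center of mass" picture predicts E[X] = 0.
def f(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Approximate E[X] = ∫ x f(x) dx with a midpoint Riemann sum on [-8, 8]
# (the tails beyond ±8 carry negligible mass).
def expectation(pdf, lo=-8.0, hi=8.0, n=50_000):
    h = (hi - lo) / n
    return sum((lo + (i + 0.5) * h) * pdf(lo + (i + 0.5) * h)
               for i in range(n)) * h

print(expectation(f))  # ≈ 0.0 by symmetry
```

The symmetric grid makes the positive and negative contributions of x·f(x) cancel, mirroring the midpoint intuition in the text.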

The core technical tool behind transforming expectations is the change-of-variables formula for integrals under probability measures. When a new function g is applied to a random variable—forming a new random variable g(X)—the expectation of g(X) can be rewritten using a substitution rule that tracks how probability mass moves. Concretely, the integral over the original sample space can be converted into an integral over the image space (the real line) by using the distribution of X. This is expressed through the pushforward measure: the measure induced on real numbers by X, often denoted P_X. In practice, that means the abstract integral ∫_Ω g(X(ω)) dP(ω) can be rewritten as an integral (or sum) over x ∈ ℝ with respect to P_X.
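The pushforward idea can be made concrete with a small finite example (my own illustration, not from the video): take Ω to be the 36 ordered outcomes of two fair dice and X(ω) the sum of the pair. Summing X(ω)·P({ω}) over Ω and summing x·P_X({x}) over the image give the same number, which is exactly the change-of-variables identity:

```python
from fractions import Fraction
from collections import defaultdict

# Sample space Ω: ordered pairs of two fair dice, each with probability 1/36.
omega = [(a, b) for a in range(1, 7) for b in range(1, 7)]
P = Fraction(1, 36)

X = lambda w: w[0] + w[1]  # random variable: sum of the two dice

# Abstract integral over Ω:  Σ_ω X(ω) P({ω}).
lhs = sum(X(w) * P for w in omega)

# Pushforward measure P_X on ℝ: collect the mass X transports onto each value x.
P_X = defaultdict(Fraction)
for w in omega:
    P_X[X(w)] += P

# Integral over ℝ with respect to P_X:  Σ_x x P_X({x}).
rhs = sum(x * p for x, p in P_X.items())

print(lhs, rhs)  # both equal 7
```

The two sums agree because P_X is defined precisely so that integrating over ℝ matches integrating over Ω.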

Once everything is expressed in terms of P_X, the continuous and discrete cases become straightforward. If X has a PDF, the expectation reduces to an ordinary integral with f_X(x). If X is discrete, it becomes a series using the probability mass function (PMF) p_X(x), where p_X(x) = P(X = x). This unified viewpoint—abstract integration plus change of variables—explains why the same expectation formulas appear across different probability models.

The video closes with a concrete example: rolling a fair die once. Here X takes values {1,2,3,4,5,6} with probability 1/6 each, so E[X] is the finite sum 1·(1/6) + 2·(1/6) + … + 6·(1/6). Computing that sum yields 3.5. A key takeaway is that E[X] need not be an actual outcome of the random variable; it can be a value outside the set {1,…,6}. That distinction matters: expectation is a statistical mean of all possible outcomes weighted by their probabilities, not a guaranteed result.
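The die computation is a one-liner; using exact fractions makes the weighted-average structure explicit:

```python
from fractions import Fraction

# Fair die: PMF p_X(k) = 1/6 for k in {1, ..., 6}.
# E[X] = Σ k · p_X(k) = (1+2+3+4+5+6)/6 = 21/6.
E = sum(k * Fraction(1, 6) for k in range(1, 7))

print(E)         # 7/2
print(float(E))  # 3.5 — not itself a possible die roll
```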

Overall, the message is that expectation is defined abstractly but computed concretely—either by integrals with PDFs or sums with PMFs—because change-of-variables lets probability be transferred from the sample space to the real line via the distribution of X.

Cornell Notes

Expectation E[X] is the average value a random variable fluctuates around, defined (when it exists) by the abstract integral E[X] = ∫_Ω X dP. In continuous cases, it becomes E[X] = ∫ x f_X(x) dx using the PDF f_X, while in discrete cases it becomes a sum over x using the PMF p_X(x). Applying a function g to X creates a new random variable g(X), and the expectation of g(X) is handled by a change-of-variables rule that replaces integration over Ω with integration over ℝ using the pushforward distribution P_X. This framework explains why expectation formulas look different in continuous versus discrete settings but share the same underlying structure. The die example shows E[X] = 3.5, even though 3.5 is not a possible die roll.

How is expectation E[X] defined in full generality, and what does it mean intuitively?

Expectation is defined as an abstract integral over the sample space: E[X] = ∫_Ω X dP (assuming it exists). Intuitively, it is the mean of the random variable—where the values of X “balance” when weighted by their probabilities. In continuous examples with a symmetric PDF, the expectation sits at the center of the density, reflecting that balancing idea.

Why does the change-of-variables formula matter when computing E[g(X)]?

When a function g is applied to X, the new random variable is g(X). The change-of-variables rule lets the integral over Ω be rewritten as an integral over the real line using the distribution of X (the pushforward measure P_X). This substitution tracks how probability mass moves under the mapping ω ↦ X(ω), so computations can be done using PDFs/PMFs on ℝ instead of working directly on Ω.

What do the continuous and discrete expectation formulas look like once everything is expressed using the distribution P_X?

For continuous X with PDF f_X, the abstract integral becomes an ordinary integral: E[g(X)] = ∫ g(x) f_X(x) dx. For discrete X with PMF p_X, it becomes a sum (series): E[g(X)] = Σ g(x) p_X(x), where p_X(x) = P(X = x). The key ingredient is that the measure on ℝ is exactly the distribution induced by X.
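As a small illustration of the discrete formula (my own example, with g(x) = x² chosen arbitrarily), E[g(X)] for a fair die can be computed directly from p_X without ever working out the distribution of g(X):

```python
from fractions import Fraction

g = lambda x: x * x  # the transformation applied to X

# E[g(X)] = Σ g(x) p_X(x) over the support of X (a fair die here).
# No need to derive the PMF of g(X) itself.
E_gX = sum(g(k) * Fraction(1, 6) for k in range(1, 7))

print(E_gX)  # 91/6
```

Note that E[g(X)] = 91/6 differs from g(E[X]) = (7/2)² = 49/4, a reminder that expectation does not commute with nonlinear functions.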

In the die example, how is E[X] computed and why is the result 3.5?

For a fair die, X ∈ {1,2,3,4,5,6} and each value has probability 1/6. So E[X] = Σ_{k=1}^6 k·(1/6) = (1+2+3+4+5+6)/6 = 21/6 = 3.5. The calculation is a finite sum because the support of X is finite.

Why can the expectation lie outside the set of possible outcomes?

Expectation is a probability-weighted average of all outcomes, not a guaranteed outcome itself. For the die, the average of {1,…,6} is 3.5, but 3.5 cannot occur as a single die roll. The expectation is therefore a summary statistic of the distribution, not a value that must be realized.

Review Questions

  1. What is the relationship between the abstract integral ∫_Ω X dP and the integral/sum formulas involving f_X or p_X?
  2. How does applying a function g to a random variable change the expectation, and what role does the pushforward distribution P_X play?
  3. In the discrete case, how do you compute E[X] from the PMF, and why does the sum become finite for a die?

Key Points

  1. Expectation E[X] is defined as an abstract integral over the sample space: E[X] = ∫_Ω X dP (when it exists).

  2. In continuous settings, expectation is computed using the PDF: E[X] = ∫ x f_X(x) dx.

  3. In discrete settings, expectation is computed using the PMF: E[X] = Σ x p_X(x).

  4. For a transformed random variable g(X), change-of-variables rewrites the expectation using the distribution P_X induced by X on ℝ.

  5. The pushforward measure P_X is the probability measure on real numbers that makes integration over ℝ match integration over Ω.

  6. Expectation is a weighted average and does not have to be an actual outcome of the random variable (e.g., a die’s expectation is 3.5).

Highlights

Expectation is the probability-weighted mean of a random variable, defined abstractly as ∫_Ω X dP.
Change-of-variables transfers integration from the sample space to ℝ using the pushforward distribution P_X.
Continuous expectations use PDFs (integrals), while discrete expectations use PMFs (sums).
A fair die’s expected value is 3.5, even though 3.5 cannot be rolled.
