
Probability Theory 10 | Random Variables [dark version]

5 min read

Based on The Bright Side of Mathematics's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

A random variable is a measurable map X: (Ω, A) → (Ω̃, Ã), not just any function from outcomes to numbers.

Briefing

Random variables turn the outcomes of a random experiment into a single, well-defined numerical object—by requiring that the mapping from outcomes to numbers respects the event structure of probability theory. Concretely, a random variable is a function X from a sample space Ω into another set Ω̃ (often the real numbers), but it only counts as a random variable if every “measurable” event in Ω̃ pulls back to a measurable event in Ω. This measurability condition is what makes probabilities like P(X ∈ Ã) mathematically legitimate.

The discussion starts with a familiar experiment: rolling two distinguishable dice (a red one and a green one). The sample space is the Cartesian product {1,…,6}×{1,…,6}, with the σ-algebra taken as the power set and probabilities given by the uniform distribution. If the game only cares about the sum of the two dice, the relevant object is the random variable X defined by X(ω1, ω2)=ω1+ω2. Here, the input is an outcome pair from Ω, and the output is a number—exactly the kind of “information extraction” random variables are meant to provide.
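The finite dice setup above can be made concrete in a few lines of code. This is a minimal sketch (not from the video): it enumerates the sample space Ω = {1,…,6}×{1,…,6}, defines the sum map X, and computes an example probability under the uniform measure.

```python
from itertools import product
from fractions import Fraction

# Sample space: all ordered pairs (red die, green die).
omega = list(product(range(1, 7), repeat=2))  # 36 outcomes

# The random variable X extracts the sum of the two dice.
def X(outcome):
    w1, w2 = outcome
    return w1 + w2

# Uniform probability measure: each outcome has weight 1/36.
def prob(event):
    return Fraction(len(event), len(omega))

# Example: P(X = 7) -- the probability that the sum equals 7.
p_seven = prob([w for w in omega if X(w) == 7])  # 6/36 = 1/6
```

Because Ω is finite and the σ-algebra is the power set, every subset is an event, so `prob` can be applied to any list of outcomes.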

To formalize the idea, the transcript gives the general definition using measurable (event) spaces. One starts with a measurable space (Ω, A) and another (Ω̃, Ã). A map X:Ω→Ω̃ is called a random variable if it is measurable in the measure-theoretic sense: for every event à in Ã, the pre-image X^{-1}(Ã) must lie in A. That requirement ensures that once a probability measure P is fixed on (Ω, A), the probability of events described in terms of X can be computed.

Two examples show why the condition matters. In the dice-sum setup, the σ-algebra on Ω is the full power set, so any pre-image of a set in Ω̃ automatically lands in A. That makes X a random variable without any real work. But if the σ-algebra on Ω is shrunk to the smallest possible one—{∅, Ω}—then measurability can fail. For instance, consider the event “the sum equals 2.” Its pre-image is the set of outcomes where both dice show 1, which is neither empty nor all of Ω, so it is not in the reduced σ-algebra. In that case, X is not a random variable.

The closing notation ties measurability to probability calculations. Once X is a random variable and P is defined on (Ω, A), probabilities of the form P(X ∈ Ã) are well-defined because X^{-1}(Ã) belongs to A. The transcript also introduces the shorthand P(X ∈ Ã) for P(X^{-1}(Ã)), emphasizing that the left-hand side is a compact way to refer to the probability of the corresponding pre-image set in Ω. The key takeaway: random variables are not just functions to numbers—they are functions whose event structure matches the σ-algebras so probabilities can be assigned consistently.

Cornell Notes

A random variable is a function X from a sample space (Ω, A) to another measurable space (Ω̃, Ã) that is measurable: for every measurable set à in Ã, the pre-image X^{-1}(Ã) must be in A. This measurability requirement is what allows probabilities like P(X ∈ Ã) to be computed using the probability measure P on Ω. The dice-sum example shows the idea in practice: with A as the full power set, the sum map X(ω1, ω2)=ω1+ω2 is automatically measurable. But if A is reduced to {∅, Ω}, then events such as “the sum equals 2” have pre-images that are neither empty nor all of Ω, so measurability fails and X is not a random variable. The shorthand notation for probabilities involving X is justified because it stands for probabilities of these pre-image sets.

What makes a function X:Ω→Ω̃ qualify as a random variable, beyond simply mapping outcomes to numbers?

It must be measurable. For every measurable event à in the target σ-algebra Ã, the pre-image X^{-1}(Ã) must be an event in the source σ-algebra A. This ensures that once a probability measure P is defined on (Ω, A), the probability of events described through X is well-defined.

How does the two-dice “sum” example illustrate the definition of a random variable?

The sample space is Ω={1,…,6}×{1,…,6} with A as the power set, and probabilities are uniform. The random variable is X(ω1, ω2)=ω1+ω2, mapping each outcome pair to a real number (the sum). Because A is the full power set, any pre-image of any set in the real σ-algebra automatically lies in A, so measurability is satisfied.

Why does shrinking the σ-algebra on Ω from the power set to {∅, Ω} break measurability in the dice example?

With A={∅, Ω}, only two events are measurable. Consider the target event “X=2” (sum equals 2). Its pre-image is the set of outcomes where both dice show 1. That set is a proper nontrivial subset of Ω, so it is neither ∅ nor Ω, meaning it is not in A. Since X^{-1}(Ã) must lie in A for all measurable Ã, measurability fails.

What is the role of pre-images X^{-1}(Ã) in probability calculations involving X?

Pre-images translate events about the output of X back into events about the original outcomes in Ω. Probabilities are computed on Ω using P, so P(X ∈ Ã) is really P(X^{-1}(Ã)). The measurability condition guarantees that X^{-1}(Ã) is in A, so P can be applied.
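This pre-image identity is easy to verify numerically. The sketch below (an assumed illustration in the dice setting) defines P on Ω and computes P(X ∈ Ã) directly as the measure of the pre-image X^{-1}(Ã).

```python
from itertools import product
from fractions import Fraction

omega = list(product(range(1, 7), repeat=2))

def X(outcome):
    return outcome[0] + outcome[1]

def P(event):
    # Uniform measure on Omega: |event| / |Omega|.
    return Fraction(len(event), len(omega))

def prob_X_in(A_tilde):
    # P(X in A~) is, by definition, P applied to the pre-image X^{-1}(A~).
    return P([w for w in omega if X(w) in A_tilde])

# Example: P(X in {2, 12}) = P({(1,1), (6,6)}) = 2/36 = 1/18.
p_extremes = prob_X_in({2, 12})
```

The point is that `prob_X_in` never evaluates a measure on Ω̃; it always routes through the pre-image and applies P on Ω, exactly as the shorthand notation suggests.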

Why does the transcript introduce notation like P(X ∈ Ã), and what does it really mean?

The notation is a shorthand. The expression P(X ∈ Ã) stands for the probability of the set of outcomes ω in Ω such that X(ω) lies in Ã. Formally, it corresponds to P(X^{-1}(Ã)). The shorthand is convenient but relies on the fact that measurability makes the pre-image an event in A.

Review Questions

  1. In terms of σ-algebras, what exact condition must hold for X^{-1}(Ã) for every à in Ã?
  2. Give an example of how changing the σ-algebra A on Ω can turn a previously valid random variable into an invalid one.
  3. Explain, using pre-images, what probability P(X ∈ Ã) actually refers to.

Key Points

  1. A random variable is a measurable map X: (Ω, A) → (Ω̃, Ã), not just any function from outcomes to numbers.

  2. Measurability means: for every measurable set à in Ã, the pre-image X^{-1}(Ã) must belong to the σ-algebra A.

  3. With A equal to the full power set of Ω, measurability becomes automatic for any function into a measurable space.

  4. If A is reduced to {∅, Ω}, many output events (like “X equals a specific value”) can have pre-images that are not measurable, so X may fail to be a random variable.

  5. Probabilities involving X, such as P(X ∈ Ã), are computed as P(X^{-1}(Ã)) using the probability measure on Ω.

  6. Shorthand notation for P(X ∈ Ã) is justified because measurability guarantees the corresponding pre-image is an event in A.

Highlights

Random variables are functions whose pre-images of measurable sets stay measurable—this is the technical requirement that makes probability assignments consistent.
The sum of two dice is automatically a random variable when the σ-algebra on outcomes is the full power set.
Reducing the σ-algebra on the outcome space to {∅, Ω} can make “sum equals 2” non-measurable, so the same sum map stops being a random variable.
Notation like P(X ∈ Ã) is shorthand for the probability of the pre-image set {ω ∈ Ω : X(ω) ∈ Ã}.
Measurability is the bridge between events about outputs (in Ω̃) and events about original outcomes (in Ω).
