Probability Theory 13 | Independence for Random Variables [dark version]

4 min read

Based on The Bright Side of Mathematics's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Independence of real-valued random variables X and Y is defined by independence of threshold events {X ≤ x} and {Y ≤ y} for all real x and y.

Briefing

Independence for random variables is defined by checking whether the events created from their values behave independently; that property can then be expressed compactly using distribution functions. For two real-valued random variables X and Y on a probability space (Ω, 𝒜, P), independence means that for any real numbers x and y, the events {X ≤ x} and {Y ≤ y} are independent. Because these events are pre-images of intervals under X and Y, the definition is grounded in the measurable structure of random variables: pre-images of sets like (−∞, x] generate σ-algebras, and independence requires that every pair of events, one drawn from the σ-algebra generated by X and one from the σ-algebra generated by Y, factorizes in probability.

This event-based requirement translates into a clean statement about cumulative distribution functions. Independence holds exactly when the joint cumulative distribution function can be written as a product of the marginals: the probability of {X ≤ x, Y ≤ y} equals P(X ≤ x)·P(Y ≤ y). In notation, this is expressed as F_{X,Y}(x, y) = F_X(x)F_Y(y), where F_{X,Y} is the joint CDF (a “generalized” CDF because it depends on two inputs) and F_X, F_Y are the usual CDFs of X and Y. The practical takeaway is that independence can be verified by checking that the joint CDF factorizes into the product of the individual CDFs.
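As an informal numerical illustration (not from the original video), the Python sketch below estimates both sides of this factorization from simulated data. The chosen distributions (a standard normal for X, an exponential for Y) and the threshold values are arbitrary; for independently generated samples, the two printed numbers should agree up to Monte Carlo error.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Two independently generated samples (distributions chosen arbitrarily).
x_samples = rng.standard_normal(n)
y_samples = rng.exponential(scale=1.0, size=n)

def empirical_cdf(values, t):
    """Estimate P(V <= t) as the fraction of samples that are <= t."""
    return np.mean(values <= t)

def empirical_joint_cdf(xs, ys, x, y):
    """Estimate F_{X,Y}(x, y) = P(X <= x, Y <= y) from sample pairs."""
    return np.mean((xs <= x) & (ys <= y))

for x, y in [(-0.5, 0.7), (0.0, 1.0), (1.2, 2.5)]:
    joint = empirical_joint_cdf(x_samples, y_samples, x, y)
    product = empirical_cdf(x_samples, x) * empirical_cdf(y_samples, y)
    print(f"F_XY({x}, {y}) ~ {joint:.4f}   F_X({x})*F_Y({y}) ~ {product:.4f}")
```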

A standard situation where this factorization appears is when X and Y come from separate components of a product probability space. Suppose Ω = Ω₁ × Ω₂ carries the product probability measure, with X depending only on the Ω₁ coordinate and Y depending only on the Ω₂ coordinate. Concretely, if Ω₁ represents the outcome of the first die and Ω₂ represents the outcome of the second die, then X can be defined as “the first die’s number” and Y as “the second die’s number.” Because X ignores the second coordinate and Y ignores the first, the events {X ≤ x} and {Y ≤ y} constrain separate coordinates of the sample space; under the product measure, the probabilities of such events multiply, so the factorization, and hence independence, is automatic.
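A minimal sketch of this two-dice situation, assuming fair dice and the uniform product measure on the 36 outcomes (the thresholds x = 2 and y = 3 are arbitrary picks for illustration): the exact joint probability and the product of the marginals come out identical.

```python
from itertools import product as cartesian_product
from fractions import Fraction

# Product sample space for two fair dice: Omega = Omega1 x Omega2 = {1..6} x {1..6},
# with each of the 36 outcomes carrying probability 1/36 (the product measure).
omega = list(cartesian_product(range(1, 7), range(1, 7)))
p_outcome = Fraction(1, len(omega))

X = lambda w: w[0]  # depends only on the first coordinate (first die)
Y = lambda w: w[1]  # depends only on the second coordinate (second die)

def P(event):
    """Exact probability of the set of outcomes satisfying `event`."""
    return sum(p_outcome for w in omega if event(w))

x, y = 2, 3
joint = P(lambda w: X(w) <= x and Y(w) <= y)
factored = P(lambda w: X(w) <= x) * P(lambda w: Y(w) <= y)
print(joint, factored)  # 1/6 1/6, since (2/6)*(3/6) = 6/36
```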

The same logic extends beyond pairs to whole families of random variables. For an indexed collection {X_i}_{i∈I}, the family is independent if, for every finite subset J ⊂ I, the probability of simultaneously satisfying {X_j ≤ x_j for all j∈J} equals the product of the individual probabilities ∏_{j∈J} P(X_j ≤ x_j). This definition generalizes two-variable independence by requiring the same factorization property for every finite group of variables, with the understanding that the index set I may be infinite. The result is a scalable criterion: independence for many random variables is enforced by consistent product behavior across all finite subcollections.
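To make the finite-subcollection condition concrete, here is a small simulation sketch (the family size, the subset J, and the thresholds are all arbitrary choices, not from the video): five independent die rolls stand in for {X_i}, and the joint threshold probability over J is compared with the product of the marginals.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

# A finite piece of a family {X_i}: five independent fair-die variables,
# each represented by n simulated rolls.
family = {i: rng.integers(1, 7, size=n) for i in range(1, 6)}

# Finite subcollection J and a threshold x_j for each j in J.
J = [1, 3, 4]
thresholds = {1: 2, 3: 5, 4: 3}

joint_event = np.ones(n, dtype=bool)
product_of_marginals = 1.0
for j in J:
    joint_event &= family[j] <= thresholds[j]                 # event {X_j <= x_j} holds
    product_of_marginals *= np.mean(family[j] <= thresholds[j])

print(np.mean(joint_event), product_of_marginals)
# both should be close to (2/6)*(5/6)*(3/6) ≈ 0.139
```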

Cornell Notes

Independence for real-valued random variables X and Y is defined by whether the events generated by their values factorize in probability. Specifically, X and Y are independent when for all real x and y, P(X ≤ x, Y ≤ y) = P(X ≤ x)·P(Y ≤ y). This condition is equivalent to the joint CDF factoring as F_{X,Y}(x,y) = F_X(x)F_Y(y), where the joint CDF depends on two inputs. A common way independence arises is from a product sample space Ω = Ω₁×Ω₂ where X depends only on Ω₁ and Y depends only on Ω₂ (e.g., two dice). The idea generalizes to families {X_i}: every finite subcollection must satisfy the same product rule for probabilities of simultaneous threshold events.

How does independence for random variables relate to independence for events?

Independence for random variables is built from event independence by using pre-images of value thresholds. For real-valued X and Y, consider events of the form {X ≤ x} and {Y ≤ y}. X and Y are independent exactly when these events are independent for every choice of real numbers x and y, meaning P(X ≤ x, Y ≤ y) = P(X ≤ x)·P(Y ≤ y).

Why do cumulative distribution functions (CDFs) appear in the independence definition?

Because the probabilities in the event-based condition are precisely CDF values. P(X ≤ x) is the CDF F_X(x). Likewise, P(Y ≤ y) is F_Y(y). The joint probability P(X ≤ x, Y ≤ y) is the joint CDF F_{X,Y}(x,y). Independence becomes the factorization rule F_{X,Y}(x,y) = F_X(x)F_Y(y).

What structural setup guarantees independence for X and Y?

A product sample space Ω = Ω₁×Ω₂, equipped with the product probability measure, together with coordinate-separable random variables. If X depends only on the Ω₁ coordinate (X(ω₁,ω₂)=f(ω₁)) and Y depends only on the Ω₂ coordinate (Y(ω₁,ω₂)=g(ω₂)), then {X ≤ x} constrains only the first coordinate and {Y ≤ y} constrains only the second. Under the product measure, the probabilities of such coordinate-wise events multiply, which yields P(X ≤ x, Y ≤ y) = P(X ≤ x)·P(Y ≤ y).
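For a quick numeric check with two fair dice and thresholds x = 2, y = 3: the event {X ≤ 2, Y ≤ 3} contains 2·3 = 6 of the 36 equally likely outcomes, so P(X ≤ 2, Y ≤ 3) = 6/36 = 1/6 = (2/6)·(3/6) = P(X ≤ 2)·P(Y ≤ 3), exactly the claimed factorization.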

How does the definition of independence extend from two variables to many?

For a family {X_i}_{i∈I}, independence requires the same factorization for every finite subcollection. For any finite J ⊂ I and any real thresholds {x_j}_{j∈J}, the probability of all threshold events occurring together must equal the product of the individual probabilities: P(∩_{j∈J}{X_j ≤ x_j}) = ∏_{j∈J} P(X_j ≤ x_j).

Why does the definition only demand checking finite subsets J, even if I is infinite?

The rule is stated for all finite groups of variables because independence is enforced through finite-dimensional joint behavior, and the joint distribution of the whole family is determined by these finite-dimensional pieces. So even when I is infinite, requiring the product factorization for every finite J is enough to formalize “no dependence” across the entire collection.

Review Questions

  1. State the event-based condition for independence of two real-valued random variables X and Y.
  2. Write the CDF factorization condition that is equivalent to independence of X and Y.
  3. For an indexed family {X_i}, what equality must hold for every finite subset J ⊂ I to declare the family independent?

Key Points

  1. Independence of real-valued random variables X and Y is defined by independence of threshold events {X ≤ x} and {Y ≤ y} for all real x and y.
  2. Independence is equivalent to the joint CDF factorizing: F_{X,Y}(x,y) = F_X(x)F_Y(y).
  3. Pre-images of intervals like (−∞, x] under X and Y generate σ-algebras used to formalize measurable independence.
  4. Independence often becomes automatic when the sample space splits as Ω = Ω₁×Ω₂ and X depends only on Ω₁ while Y depends only on Ω₂ (e.g., two dice).
  5. For an independent family {X_i}_{i∈I}, the product rule must hold for every finite subcollection J ⊂ I.
  6. Checking independence for all finite subsets J is the core generalization from two variables to infinitely many.

Highlights

Independence for random variables is exactly the factorization of probabilities for all threshold events: P(X ≤ x, Y ≤ y) = P(X ≤ x)P(Y ≤ y).
The joint CDF becomes a product of marginals under independence: F_{X,Y}(x,y) = F_X(x)F_Y(y).
Two dice provide a canonical example: each variable depends on a different coordinate of the product sample space, making independence immediate.
For many variables, independence means every finite subcollection satisfies the same product rule for simultaneous inequalities.

Topics

  • Independence
  • Joint CDF
  • Product Sample Spaces
  • Random Variable Families
  • σ-Algebras