Probability Theory 12 | Cumulative Distribution Function [dark version]

TL;DR

A cumulative distribution function is defined by F_X(x)=P(X≤x), accumulating probability from −∞ to the threshold x.

Briefing

Every real-valued random variable comes with a cumulative distribution function (CDF) that turns probability questions into a single, monotone curve on the real line. The CDF, usually written as F_X(x), is defined by F_X(x)=P(X≤x): it accumulates all probability mass from −∞ up to the threshold x. That “accumulation” is why the function is called cumulative. The definition refers only to the probability of the event X≤x, so it works the same way whether X is discrete (the mass sits on individual points) or absolutely continuous (probability comes from a density).

The CDF has three key properties that follow directly from the rules of probability. First, as x→−∞, the intervals (−∞,x] shrink toward the empty set, so by continuity of the probability measure F_X(x)→0. Second, as x→+∞, the intervals grow to cover the entire real line, so F_X(x)→1. Third, F_X(x) is monotone increasing in the non-strict sense: if x1<x2, then (−∞,x1] is a subset of (−∞,x2], so F_X(x1)≤F_X(x2), and flat stretches are allowed. Unlike a probability density function (PDF), which can rise and fall, the CDF never decreases.

A subtle but important detail is right-continuity. CDFs may have jumps, especially for discrete random variables. Right-continuity means the value at x0 matches the limit taken from values slightly larger than x0. Graphically, at a jump point, the filled-in point sits at the upper side of the jump because the event X=x0 contributes nonzero probability, and that mass is included when computing P(X≤x) for thresholds at x0.

The video then grounds the theory in the normal distribution, the central example for CDFs. For the standard normal case with mean μ=0 and standard deviation σ=1, the PDF is the Gaussian bell curve: f(x)= (1/√(2π))·e^(−x^2/2). When the distribution is absolutely continuous, the CDF is obtained by integrating the PDF from −∞ to x: F(x)=∫_{−∞}^{x} f(t) dt. Symmetry of the normal density implies F(0)=1/2, since half the probability lies to the left of zero.
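The symmetry step can be spelled out in two lines. The derivation below is a sketch that uses only f(−t)=f(t) and the fact that the total probability is 1; the substitution variable s is introduced here for illustration.

```latex
\begin{aligned}
F(0) &= \int_{-\infty}^{0} f(t)\,dt
      = \int_{0}^{\infty} f(-s)\,ds
      = \int_{0}^{\infty} f(s)\,ds
      \qquad (t = -s), \\
1 &= \int_{-\infty}^{0} f(t)\,dt + \int_{0}^{\infty} f(t)\,dt
   = 2\,F(0)
   \quad\Longrightarrow\quad F(0) = \tfrac{1}{2}.
\end{aligned}
```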

To make these functions concrete, the transcript describes plotting both the PDF and CDF over a grid (for example, from −10 to 10 with a small step size). The PDF appears as a bell curve peaking at x=0, while the CDF rises smoothly from 0 toward 1 as x increases. Finally, the video connects the theory to simulation: it uses R’s normal generator (rnorm) to draw many samples (e.g., 6000) and compares the resulting histogram to the bell curve. The takeaway is that the CDF provides a unified, threshold-based probability function, while the PDF supplies the density whose integral produces that cumulative curve.

Cornell Notes

A cumulative distribution function (CDF) for a real-valued random variable X is defined as F_X(x)=P(X≤x). It accumulates probability from −∞ up to the cutoff x, and the same definition applies to both discrete and continuous cases. Every CDF satisfies F_X(x)→0 as x→−∞ and F_X(x)→1 as x→+∞, and it is monotone increasing (it never decreases). CDFs are right-continuous and may have jumps when X assigns positive probability to single points. For the normal distribution with μ=0 and σ=1, the CDF is the integral of the Gaussian PDF from −∞ to x, and symmetry gives F(0)=1/2. Simulation with rnorm and histograms helps visualize how samples match the bell curve.

How is the CDF defined, and what does it mean probabilistically?

The CDF is written F_X(x) and defined by F_X(x)=P(X≤x). For any real threshold x, it adds up all probability mass for outcomes of X that fall at or below that threshold—equivalently, it is the probability of the interval (−∞,x] under the distribution of X.
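To see the definition at work, here is a minimal Python sketch for a discrete variable. The fair six-sided die and the names PMF and cdf are assumptions chosen for illustration, not something taken from the video.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die, each face with probability 1/6.
PMF = {face: Fraction(1, 6) for face in range(1, 7)}


def cdf(x, pmf=PMF):
    """Return F_X(x) = P(X <= x) by summing the mass at or below x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))


print(cdf(0))    # 0: no face is <= 0
print(cdf(3))    # 1/2: faces 1, 2, 3
print(cdf(3.5))  # 1/2: still only faces 1, 2, 3
print(cdf(6))    # 1: every face
```

The threshold x does not have to be a possible outcome: cdf(3.5) is well defined and equals cdf(3), because no mass lies strictly between 3 and 3.5.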

Why must a CDF approach 0 and 1 at the extremes?

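As x→−∞, the intervals (−∞,x] shrink toward the empty set: no real number lies in all of them. The empty set has probability 0 and probability measures are continuous along such shrinking families of events, so F_X(x)→0. As x→+∞, the intervals grow to cover all real numbers, an event of probability 1, so by the same continuity argument F_X(x)→1.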
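The limits can be checked numerically for the standard normal. This is a sketch using Python's statistics.NormalDist from the standard library; the sample thresholds are arbitrary choices made for illustration.

```python
from statistics import NormalDist

# Standard normal: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Evaluate F at thresholds moving from far left to far right.
thresholds = (-10, -5, -2, 0, 2, 5, 10)
tail_values = {x: std_normal.cdf(x) for x in thresholds}

for x, value in tail_values.items():
    print(f"F({x:>3}) = {value:.12f}")
```

The printed values sit very close to 0 on the far left and very close to 1 on the far right. A finite table like this only illustrates the limits; the proof is the continuity argument above.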

What guarantees that a CDF is monotone increasing?

If x1<x2, then (−∞,x1] is a subset of (−∞,x2]. Probability measures are monotone with respect to set inclusion, so P(X≤x1)≤P(X≤x2). That directly yields F_X(x1)≤F_X(x2). The inequality is not strict: the CDF stays flat across any stretch that carries no probability.
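A related identity makes the monotonicity concrete: for x1<x2, F_X(x2)−F_X(x1)=P(x1<X≤x2), which can never be negative. The Python sketch below checks both facts on a grid, again assuming a hypothetical fair die as the example distribution.

```python
from fractions import Fraction

# Hypothetical example distribution: a fair six-sided die.
PMF = {face: Fraction(1, 6) for face in range(1, 7)}


def cdf(x):
    """F_X(x) = P(X <= x) for the die."""
    return sum((p for value, p in PMF.items() if value <= x), Fraction(0))


# Evaluate F on an increasing grid from -2.0 to 10.0 in steps of 0.25.
grid = [k / 4 for k in range(-8, 41)]
values = [cdf(x) for x in grid]

# Non-decreasing: each value is at least as large as the previous one.
is_nondecreasing = all(a <= b for a, b in zip(values, values[1:]))

# P(2 < X <= 5) as a difference of CDF values: faces 3, 4, 5.
prob_between = cdf(5) - cdf(2)

print(is_nondecreasing)  # True
print(prob_between)      # 1/2
```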

What does right-continuity mean for a CDF, and why do jumps happen?

Right-continuity means F_X(x0)=lim_{x→x0+}F_X(x). A CDF may jump at x0 when P(X=x0)>0 (typical for discrete variables). Those point-mass probabilities are included in P(X≤x) as soon as the threshold reaches x0, which is why the graph’s filled-in value corresponds to the upper side of the jump.
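The size of a jump equals the point mass: P(X=x0)=F_X(x0)−lim_{x→x0−}F_X(x). A minimal Python sketch of this, assuming a hypothetical fair die, computes the CDF value and its left limit with two different binary searches. The helper names cdf and left_limit are illustrative.

```python
from bisect import bisect_left, bisect_right
from fractions import Fraction
from itertools import accumulate

# Hypothetical example: support points and probabilities of a fair die.
values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

# cumulative[k] is the total mass of the first k support points.
cumulative = [Fraction(0)] + list(accumulate(probs))


def cdf(x):
    """F(x) = P(X <= x): counts support points at or below x."""
    return cumulative[bisect_right(values, x)]


def left_limit(x):
    """F(x-) = P(X < x): counts support points strictly below x."""
    return cumulative[bisect_left(values, x)]


x0 = 3
jump = cdf(x0) - left_limit(x0)

print(left_limit(x0))   # 1/3: value just before the jump
print(cdf(x0))          # 1/2: value at x0, the upper side of the jump
print(cdf(x0 + 0.001))  # 1/2: approaching from the right gives the same value
print(jump)             # 1/6: equals P(X = 3)
```

Because the definition uses ≤ rather than <, the mass at x0 is already counted at x0 itself, which is what places the filled-in point on the upper side.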

How is the normal CDF computed from the normal PDF?

For an absolutely continuous distribution, the CDF is the integral of the PDF. For the standard normal, the PDF is f(x)=(1/√(2π))·e^(−x^2/2). The CDF is F(x)=∫_{−∞}^{x} f(t) dt. The transcript also notes symmetry: because the normal density is symmetric around 0, F(0)=1/2.
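The integral has no elementary closed form, so in practice it is evaluated numerically or through the error function. The Python sketch below compares a trapezoid-rule approximation with the identity F(x)=(1+erf(x/√2))/2. The lower cutoff of −10 and the step count are assumptions made for illustration; the mass below −10 is negligible.

```python
import math


def pdf(x):
    """Standard normal density f(x) = exp(-x^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def cdf_numeric(x, lower=-10.0, steps=20_000):
    """Trapezoid approximation of the integral of pdf from lower to x."""
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    total = 0.5 * (pdf(lower) + pdf(x))
    for k in range(1, steps):
        total += pdf(lower + k * h)
    return total * h


def cdf_erf(x):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


for x in (-1.0, 0.0, 1.0, 2.0):
    print(f"x={x:+.1f}  numeric={cdf_numeric(x):.6f}  erf={cdf_erf(x):.6f}")
```

Both columns should agree to several decimal places, and the row for x=0 should show one half, in line with the symmetry argument.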

How do plots and simulation connect the PDF, CDF, and histograms?

Plotting the PDF over a range (e.g., −10 to 10) shows the bell curve peaking at x=0. Plotting the CDF over the same range shows a curve rising from 0 toward 1. Simulation using R’s rnorm generates many samples; a histogram of those samples (e.g., 6000 draws) visually matches the bell curve, reinforcing the distributional shape.
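The same experiment can be sketched without a plotting library. The video uses R's rnorm; the Python version below swaps in random.gauss from the standard library, and the seed, bin width, and text histogram are assumptions chosen for illustration. It also compares the empirical CDF of the samples with the theoretical one.

```python
import random
from statistics import NormalDist

random.seed(12)  # arbitrary seed so the run is reproducible
samples = [random.gauss(0.0, 1.0) for _ in range(6000)]


def empirical_cdf(data, x):
    """Fraction of samples at or below x."""
    return sum(1 for s in data if s <= x) / len(data)


std_normal = NormalDist()
checkpoints = (-2, -1, 0, 1, 2)
deviations = [abs(empirical_cdf(samples, x) - std_normal.cdf(x))
              for x in checkpoints]
for x, dev in zip(checkpoints, deviations):
    print(f"x={x:+d}  empirical={empirical_cdf(samples, x):.4f}  "
          f"theory={std_normal.cdf(x):.4f}  |diff|={dev:.4f}")

# Crude text histogram: 16 bins of width 0.5 covering [-4, 4).
bins = [0] * 16
for s in samples:
    if -4.0 <= s < 4.0:
        index = min(int((s + 4.0) / 0.5), len(bins) - 1)
        bins[index] += 1

for i, count in enumerate(bins):
    left_edge = -4.0 + 0.5 * i
    print(f"{left_edge:+.1f}  {'#' * (count // 20)}")
```

The rows of # marks should trace out a rough bell shape centered near 0, and the empirical CDF values should sit close to the theoretical ones, with the gap shrinking as the sample size grows.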

Review Questions

What three properties must every CDF satisfy, and how do they follow from probability measure behavior?
In what situations does a CDF have jumps, and how does right-continuity determine the plotted value at a jump point?
For the standard normal distribution, how do you express the CDF in terms of the PDF, and why is F(0)=1/2?

Key Points

1. A cumulative distribution function is defined by F_X(x)=P(X≤x), accumulating probability from −∞ to the threshold x.
2. CDFs apply uniformly to discrete and absolutely continuous random variables, even though the underlying probability mechanism differs.
3. Every CDF satisfies F_X(x)→0 as x→−∞ and F_X(x)→1 as x→+∞.
4. A CDF is monotone increasing because (−∞,x1] is always a subset of (−∞,x2] when x1<x2.
5. CDFs are right-continuous and may have jumps when the random variable assigns positive probability to exact points.
6. For the standard normal, the PDF is (1/√(2π))·e^(−x^2/2), and the CDF is the integral of that PDF from −∞ to x.
7. Simulation with rnorm and histograms can visually confirm the bell-curve shape associated with the normal distribution.

Highlights

A CDF is a threshold probability: F_X(x)=P(X≤x), so it turns a distribution into a single “how much is at or below x?” function.

Unlike PDFs, CDFs never decrease; they rise from 0 to 1 as x moves from −∞ to +∞.

Right-continuity matters at jump points: the plotted value at x0 matches the limit from the right because point masses get included in P(X≤x).

For the standard normal, symmetry forces the CDF to hit 1/2 at x=0.

The normal CDF is obtained by integrating the Gaussian PDF from −∞ to x, linking density and cumulative probability directly.

Topics

Cumulative Distribution Function
Right Continuity
Normal Distribution
Probability Measures
R Simulation