Derivative formulas through geometry | Chapter 3, Essence of calculus

3Blue1Brown · 5 min read

Based on 3Blue1Brown's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Derivatives measure the ratio of an infinitesimal output change dF to an infinitesimal input change dx, capturing how tiny nudges propagate.

Briefing

Calculating derivatives stops being a memorization exercise when each rule is tied to a single geometric idea: a derivative measures how a quantity changes under an infinitesimal nudge. The through-line is that tiny changes—dx and the corresponding tiny change in output—determine the slope-like ratio dF/dx. That perspective matters because most real-world models in calculus are built from familiar “pure” functions—polynomials, exponentials, trig functions—so learning how their derivatives behave gives a practical language for rates of change in concrete situations.

The discussion begins with f(x)=x². A tangent-line viewpoint suggests the slope increases with x, but the exact formula comes from unpacking what x² means geometrically. Interpret x as the side length of a square; then f(x) is the square’s area. Increasing x to x+dx adds two thin strips of area, one along each of two adjacent sides, plus a tiny corner square. The strips contribute 2x·dx, while the corner contributes dx². Dividing by dx turns the corner term into a lone dx, which vanishes as dx→0. That leaves d(x²)/dx = 2x, and the slope interpretation becomes precise: the derivative is the area gained per unit of side length added.
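The area bookkeeping can be checked numerically. The following is a minimal sketch, not something from the video: the function name `square_area_change` and the sample side length 3.0 are illustrative assumptions.

```python
def square_area_change(x: float, dx: float) -> dict:
    """Split the area gained when a square's side grows from x to x + dx."""
    strips = 2 * x * dx              # two thin rectangles, each x by dx
    corner = dx * dx                 # the small corner square
    total = (x + dx) ** 2 - x ** 2   # actual change in area
    return {"strips": strips, "corner": corner, "total": total}


# Hypothetical sample point x = 3.0; the ratio should approach 2 * 3 = 6.
for dx in (0.1, 0.01, 0.001):
    parts = square_area_change(3.0, dx)
    ratio = parts["total"] / dx
    corner_share = parts["corner"] / dx   # equals dx, so it shrinks to 0
    print(dx, ratio, corner_share)
```

As dx shrinks, the printed ratio settles toward 6 while the corner's share of the ratio shrinks in step with dx.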

Next comes f(x)=x³, treated as the volume of a cube with side x. Nudging x to x+dx adds new volume that, as dx shrinks, is overwhelmingly made of three thin “face” slabs. Each slab has face area x² and thickness dx, so the leading volume change is 3x²·dx. Edge and corner contributions scale like dx² or dx³, so they still carry a factor of dx after dividing by dx and vanish as dx→0. The result is d(x³)/dx = 3x², so the graph’s slope at each point is exactly 3x².
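The same check works for the cube. This sketch assumes the standard decomposition of (x+dx)³ − x³ into three face slabs, three edge bars, and one corner cube; the function name `cube_volume_change` and the sample side length 2.0 are illustrative.

```python
def cube_volume_change(x: float, dx: float) -> dict:
    """Split the volume gained when a cube's side grows from x to x + dx."""
    faces = 3 * x ** 2 * dx          # three slabs, each x by x by dx
    edges = 3 * x * dx ** 2          # three bars, each x by dx by dx
    corner = dx ** 3                 # one tiny cube
    total = (x + dx) ** 3 - x ** 3   # actual change in volume
    return {"faces": faces, "edges": edges, "corner": corner, "total": total}


# Hypothetical sample point x = 2.0; the ratio should approach 3 * 2**2 = 12.
for dx in (0.1, 0.01, 0.001):
    parts = cube_volume_change(2.0, dx)
    print(dx, parts["total"] / dx, (parts["edges"] + parts["corner"]) / dx)
```

The second printed column is the contribution of the edge and corner pieces to the ratio, and it shrinks toward zero along with dx.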

From these cases, the power rule emerges: d(x^n)/dx = n·x^(n−1). The geometric reason is an expansion logic. When x is nudged to x+dx, the product (x+dx)^n has a first term x^n (the “old” volume/area), and the leading change comes from choosing exactly one dx factor and the remaining n−1 factors as x. There are n ways to pick which factor contributes dx, producing n·x^(n−1)·dx as the dominant term; all other terms contain dx² or higher and disappear in the dx→0 limit.
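The counting argument can be made concrete by grouping the terms of (x+dx)^n by how many factors supply dx. This is a sketch for positive whole-number n only; `expansion_terms` and the sample values are hypothetical names and numbers chosen for illustration.

```python
from math import comb


def expansion_terms(x: float, dx: float, n: int) -> list:
    """Terms of (x + dx)**n, indexed by k = number of factors that supply dx."""
    return [comb(n, k) * x ** (n - k) * dx ** k for k in range(n + 1)]


# Hypothetical sample values.
x, dx, n = 2.0, 1e-4, 5
terms = expansion_terms(x, dx, n)

first_order = terms[1]          # n ways to pick the single dx: n * x**(n-1) * dx
higher_order = sum(terms[2:])   # every term here has dx**2 or a higher power

print(first_order / dx)         # close to n * x**(n-1), which is 80 here
print(higher_order / dx)        # small, and it shrinks as dx shrinks
```

Here `comb(n, 1)` equals n, which is the count of ways to choose the one factor that contributes dx.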

The same geometric mindset is applied to f(x)=1/x. Instead of treating 1/x as x^(−1) and hopping the exponent down, the function is pictured as a rectangle (a “puddle”) of fixed area 1: width x forces height 1/x. Increasing x by dx adds area on the right, so the height must drop by an amount d(1/x) that cancels the gained area. That cancellation yields a negative derivative consistent with the power-rule outcome. The segment also invites the reader to reason similarly for √x.
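The area balance in the rectangle picture can also be tested numerically. In this sketch the strip gained on the right and the strip lost from the top are computed separately; the name `puddle_height_change` and the sample width 2.0 are assumptions made for illustration.

```python
def puddle_height_change(x: float, dx: float) -> dict:
    """Track a rectangle of area 1 as its width grows from x to x + dx."""
    old_height = 1.0 / x
    new_height = 1.0 / (x + dx)
    dh = new_height - old_height   # negative: the rectangle gets shorter
    gained = new_height * dx       # strip added on the right
    lost = x * (-dh)               # strip removed from the top
    return {"dh": dh, "gained": gained, "lost": lost}


# Hypothetical sample width x = 2.0; dh / dx should approach -1 / 2**2 = -0.25.
for dx in (0.1, 0.01, 0.001):
    parts = puddle_height_change(2.0, dx)
    print(dx, parts["dh"] / dx, parts["gained"] - parts["lost"])
```

The gained and lost areas agree up to floating-point rounding, which is the cancellation the picture relies on.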

Finally, sine is handled through the unit circle. Walking an arc length θ on a radius-1 circle makes sin(θ) the vertical height. A tiny step dθ along the circumference changes the height by d(sin θ). Zoomed in, the circle locally resembles a straight line, creating similar right triangles whose geometry shows that d(sin θ)/dθ equals the adjacent-over-hypotenuse ratio—exactly cos(θ). The video closes by suggesting the same geometric approach can extend to derivatives of sums, products, and compositions in the next chapter.
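A quick numerical check of the unit-circle claim: walk a small arc dθ, measure the change in height, and compare the ratio with cos θ. The function name `sine_nudge_ratio` and the sample angle π/3 are illustrative assumptions.

```python
import math


def sine_nudge_ratio(theta: float, dtheta: float) -> float:
    """Change in height of a unit-circle point per unit of arc length walked."""
    height_before = math.sin(theta)
    height_after = math.sin(theta + dtheta)
    return (height_after - height_before) / dtheta


# Hypothetical sample angle; the ratio should approach cos(theta).
theta = math.pi / 3
for dtheta in (0.1, 0.01, 0.001):
    print(dtheta, sine_nudge_ratio(theta, dtheta), math.cos(theta))
```

The gap between the two printed values narrows as dθ shrinks, matching the idea that the arc looks straight when zoomed in.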

Cornell Notes

Derivatives are presented as the ratio of two infinitesimal changes: the tiny change in output (dF) divided by the tiny change in input (dx). Using geometry, x² is treated as the area of a square; increasing side length by dx adds mostly 2x·dx area, while dx² is negligible, giving d(x²)/dx = 2x. Similarly, x³ is the volume of a cube; increasing side length by dx adds mostly 3x²·dx volume, yielding d(x³)/dx = 3x². The power rule d(x^n)/dx = n·x^(n−1) follows from counting the dominant terms in (x+dx)^n: exactly one dx factor contributes at leading order. For trig, the unit circle shows d(sin θ)/dθ = cos θ by comparing similar triangles formed by a tiny arc step dθ.

Why does the derivative of x² become 2x when dx is “tiny”?

Model x² as the area of a square with side length x. When x increases to x+dx, the added area splits into two thin rectangles of area x·dx each (total 2x·dx) plus a corner square of area dx². The derivative is the ratio dF/dx, so the full ratio is (2x·dx + dx²)/dx = 2x + dx. The leftover dx comes from the corner square and goes to zero as dx→0, leaving d(x²)/dx = 2x.

How does the cube picture produce the derivative of x³ as 3x²?

Interpret x³ as the volume of a cube with side x. Increasing to x+dx adds a new volume that is dominated by three thin slabs corresponding to the cube’s faces. Each slab has base area x² and thickness dx, so the leading volume change is 3x²·dx. Smaller edge and corner pieces scale like dx² or dx³, so after dividing by dx they still contain a factor of dx and vanish as dx→0. Therefore d(x³)/dx = 3x².

What geometric/combinatorial idea makes the power rule d(x^n)/dx = n·x^(n−1) work?

When x is nudged to x+dx, the expression (x+dx)^n expands into many terms. The dominant change from x^n comes from terms that contain exactly one dx factor and n−1 factors of x. There are n choices for which of the n factors supplies the dx, so the leading change is n·x^(n−1)·dx. All other terms include dx² or higher, so they disappear in the dx→0 limit after dividing by dx. That leaves d(x^n)/dx = n·x^(n−1).
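Written out, the argument looks like the following. This is a sketch for positive whole-number n, using the binomial coefficient notation as an assumption about how to present the count of terms.

```latex
\begin{align*}
(x + dx)^n &= x^n + \binom{n}{1} x^{n-1}\,dx + \binom{n}{2} x^{n-2}\,dx^2 + \cdots + dx^n \\
\frac{(x + dx)^n - x^n}{dx} &= n\,x^{n-1} + \binom{n}{2} x^{n-2}\,dx + \cdots + dx^{n-1} \\
&\longrightarrow n\,x^{n-1} \quad \text{as } dx \to 0
\end{align*}
```

Every term after the first on the second line still carries at least one factor of dx, which is why only n·x^(n−1) survives.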

How can 1/x be differentiated without using the exponent rule directly?

Picture 1/x as the height of a rectangle with fixed area 1. Let the width be x, so the height must be 1/x. Increasing x by dx adds a thin strip on the right with area of about (1/x)·dx, so the height must change by a negative amount d(1/x), trimming area x·d(1/x) from the top, to keep the total at 1. Balancing the two to leading order gives x·d(1/x) = −(1/x)·dx, which is the same derivative as treating 1/x as x^(−1): d(1/x)/dx = −1/x².
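The fixed-area condition can be expanded term by term. In this sketch, dh is shorthand (an assumption of notation, not from the source) for the height change d(1/x).

```latex
\begin{align*}
(x + dx)\left(\tfrac{1}{x} + dh\right) &= 1 \\
1 + x\,dh + \tfrac{dx}{x} + dx\,dh &= 1 \\
x\,dh &= -\tfrac{dx}{x} - dx\,dh \\
\frac{dh}{dx} &= -\frac{1}{x^2} - \frac{dh}{x}
\;\longrightarrow\; -\frac{1}{x^2} \quad \text{as } dx \to 0
\end{align*}
```

The leftover term dh/x vanishes because dh itself shrinks to zero along with dx.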

Why does the derivative of sin θ equal cos θ on the unit circle?

On a unit circle, sin θ is the vertical height of the point reached after traveling arc length θ. A tiny increase dθ moves the point slightly along the circle, changing the height by d(sin θ). Zoomed in, the arc looks like a straight segment of length dθ, and it forms the hypotenuse of a tiny right triangle whose vertical leg is d(sin θ). That tiny triangle is similar to the large triangle inside the circle, with the vertical leg adjacent to an angle equal to θ, so the ratio d(sin θ)/dθ is adjacent over hypotenuse. By the definition of cosine on a right triangle, that ratio equals cos θ, so d(sin θ)/dθ = cos θ.

Review Questions

  1. For f(x)=x², which part of the added area becomes negligible when computing dF/dx, and why does it vanish as dx→0?
  2. In the expansion of (x+dx)^n, which terms survive when forming the ratio d(x^n)/dx, and how does the count of surviving terms produce the factor n?
  3. Using the unit circle, what geometric ratio corresponds to d(sin θ)/dθ, and how does it match the definition of cosine?

Key Points

  1. Derivatives measure the ratio of an infinitesimal output change dF to an infinitesimal input change dx, capturing how tiny nudges propagate.
  2. For x², interpreting the function as square area shows the leading area increase is 2x·dx, while dx² is negligible, giving d(x²)/dx = 2x.
  3. For x³, interpreting the function as cube volume shows the leading volume increase is 3x²·dx from the three face slabs, giving d(x³)/dx = 3x².
  4. The power rule follows from counting the leading terms in (x+dx)^n: exactly one dx factor contributes at first order, producing d(x^n)/dx = n·x^(n−1).
  5. Fixed-area geometry can differentiate 1/x: increasing width forces height to drop so the area stays constant, yielding d(1/x)/dx = −1/x².
  6. On the unit circle, a small arc step dθ produces similar-triangle geometry that turns d(sin θ)/dθ into cos θ exactly.
  7. Graph intuition helps with shape, but exact derivatives come from unpacking what the function represents geometrically.

Highlights

The derivative of x² emerges from area bookkeeping: the dominant new area is 2x·dx, while the dx² corner term disappears when dividing by dx.
For x³, the leading change in volume comes from three thin face slabs with a combined volume of 3x²·dx; edge and corner contributions are higher order in dx.
The power rule is a “one dx at a time” argument: among the many terms in (x+dx)^n, only those with a single dx survive the dx→0 limit.
A fixed-area rectangle turns 1/x into a geometry problem, forcing a negative height change that matches −1/x².
The unit circle makes d(sin θ)/dθ = cos θ inevitable: the derivative becomes adjacent-over-hypotenuse for angle θ.

Topics

  • Derivative Intuition
  • Geometric Derivatives
  • Power Rule
  • Unit Circle Trig
  • Infinitesimals