
Multivariable Calculus 10 | Directional Derivative [dark version]


Based on The Bright Side of Mathematics's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Directional derivatives measure the rate of change of f at x~ along an arbitrary direction v using the limit of [f(x~ + h v) − f(x~)]/h as h → 0.

Briefing

Directional derivatives extend partial derivatives by measuring how a multivariable function changes when moving in an arbitrary direction, not just along a coordinate axis. For a function f: R^N → R and a point x~ in its domain, the directional derivative along a direction vector v is defined (when the limit exists) as

lim_{h→0} [f(x~ + h v) − f(x~)] / h.

The key idea is that “changing in the direction v” means shifting the input by h times v in all coordinates at once—vector addition in R^N—so the derivative captures the slope of f along the line through x~ pointing in direction v. Because v is typically taken as a unit vector, the magnitude of v doesn’t distort the meaning; only the direction matters.
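The limit definition can be checked numerically: as h shrinks, the difference quotient should approach the true directional derivative. The function f(x, y) = x² + 3xy, the point (1, 2), and the 45° unit vector below are assumptions chosen for illustration, not taken from the video.

```python
import math

def f(x, y):
    """Example scalar field f: R^2 -> R (an assumed illustration)."""
    return x**2 + 3 * x * y

def directional_diff_quotient(f, x0, v, h):
    """Forward difference quotient [f(x0 + h*v) - f(x0)] / h."""
    shifted = tuple(xi + h * vi for xi, vi in zip(x0, v))
    return (f(*shifted) - f(*x0)) / h

# Unit vector at 45 degrees and a sample point.
v = (1 / math.sqrt(2), 1 / math.sqrt(2))
x0 = (1.0, 2.0)

# Hand-computed gradient at (1, 2): (2x + 3y, 3x) = (8, 3),
# so the exact directional derivative is (8 + 3) / sqrt(2).
exact = 11 / math.sqrt(2)
for h in (1e-1, 1e-3, 1e-5):
    print(h, directional_diff_quotient(f, x0, v, h))
```

Each halving of h roughly halves the error of the forward quotient, which matches the first-order accuracy of the one-sided difference.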

In two dimensions, contour lines make the intuition concrete: partial derivatives correspond to moving along horizontal or vertical lines (fixing one coordinate and varying the other), while directional derivatives ask what happens when movement follows some other slanted line. That slanted line again turns the multivariable problem into a one-dimensional change along a curve, which motivates the general limit definition above.
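The claim that partial derivatives are the axis-aligned special case can be made concrete by feeding a standard basis vector into the same difference quotient. The example function below is an assumption for illustration.

```python
def f(x, y):
    """Assumed example function; df/dx = 2x + 3y."""
    return x**2 + 3 * x * y

def directional_diff_quotient(f, x0, v, h):
    """Forward difference quotient [f(x0 + h*v) - f(x0)] / h."""
    shifted = tuple(xi + h * vi for xi, vi in zip(x0, v))
    return (f(*shifted) - f(*x0)) / h

# v = e_1 = (1, 0): only the first coordinate varies, so the
# directional derivative reduces to the partial derivative df/dx,
# which is 2*1 + 3*2 = 8 at the point (1, 2).
along_e1 = directional_diff_quotient(f, (1.0, 2.0), (1.0, 0.0), 1e-6)
print(along_e1)
```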

Notation varies across textbooks, so confusion is common. The directional derivative may be written D_v f, ∂_v f, or ∇_v f, with the direction v attached as an index. A frequent compact form is v · ∇f, where ∇f denotes the gradient. The gradient itself is a vector, but its dot product with v produces a scalar directional derivative, so the same symbols can play different roles depending on context.

The cleanest computation happens when f is totally differentiable at x~. Under that condition, the directional derivative limit always exists and can be rewritten as an ordinary one-variable derivative. The shift x~ + h v is packaged as a curve γ(t) = x~ + t v, so the directional derivative becomes the derivative of the composition f(γ(t)) at t = 0. Applying the multivariable chain rule yields a product of Jacobian matrices:

J_F(x~) · J_γ(0).

Because γ(t) = x~ + t v has constant derivative J_γ(0) = v, this collapses to a simple formula:

Directional derivative along v = J_F(x~) v.

Since the Jacobian of a scalar-valued function is the gradient, this is equivalently the inner product ∇f(x~) · v. This identity explains why the gradient is central: it encodes the function’s steepest-change information, and dotting it with v extracts the rate of change specifically along direction v—setting up the geometric interpretation for the next step in the series.
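The identity "directional derivative = ∇f(x~) · v" can be sanity-checked by comparing the dot-product formula against the limit definition. The function sin(x)·y, the point, and the unit vector (3/5, 4/5) below are illustrative assumptions.

```python
import math

def f(x, y):
    return math.sin(x) * y  # assumed example function

def grad_f(x, y):
    """Hand-computed gradient of f: (y*cos(x), sin(x))."""
    return (y * math.cos(x), math.sin(x))

x0 = (0.5, 2.0)
v = (3 / 5, 4 / 5)  # a unit vector: (3/5)^2 + (4/5)^2 = 1

# Route 1: the formula grad f(x0) . v
g = grad_f(*x0)
via_gradient = g[0] * v[0] + g[1] * v[1]

# Route 2: the limit definition with a small h
h = 1e-6
shifted = (x0[0] + h * v[0], x0[1] + h * v[1])
via_limit = (f(*shifted) - f(*x0)) / h

print(via_gradient, via_limit)
```

The two routes agree up to the truncation error of the finite difference, which is the content of the identity.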

Cornell Notes

Directional derivatives generalize partial derivatives by measuring the rate of change of a scalar function f: R^N → R at a point x~ when moving in an arbitrary direction v. The definition uses a difference quotient along the line x~ + h v and takes the limit as h → 0. When f is totally differentiable at x~, the limit always exists and can be computed via the multivariable chain rule by introducing a curve γ(t) = x~ + t v. This yields a compact result: the directional derivative along v equals the Jacobian of f at x~ times v, which is the same as the dot product ∇f(x~) · v. The formula also clarifies why gradient-based notation often appears for directional derivatives.

How is “direction” built into the definition of a directional derivative in R^N?

Direction comes from a vector v in R^N. Moving “by h in direction v” means replacing the input x~ with x~ + h v, using vector addition across all coordinates at once. The directional derivative is then lim_{h→0} [f(x~ + h v) − f(x~)]/h. Taking v as a unit vector ensures only the direction matters, not the vector’s length.

Why do directional derivatives reduce to ordinary derivatives when f is totally differentiable?

With total differentiability at x~, one can define the curve γ(t) = x~ + t v, so that f(x~ + h v) = f(γ(h)). The multivariable limit then becomes the one-variable derivative d/dt of f(γ(t)) evaluated at t = 0, which makes the directional derivative computable using standard derivative rules plus the chain rule.
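A minimal numerical sketch of this reduction, with an assumed quadratic example: build γ, form the one-variable composition g(t) = f(γ(t)), and differentiate g at t = 0.

```python
def f(x, y):
    return x * y + y**2  # assumed example; grad f = (y, x + 2y)

def gamma(t, x0, v):
    """The line gamma(t) = x0 + t*v through x0 in direction v."""
    return (x0[0] + t * v[0], x0[1] + t * v[1])

x0, v = (1.0, 1.0), (0.6, 0.8)

def g(t):
    """One-variable composition g(t) = f(gamma(t))."""
    return f(*gamma(t, x0, v))

# Central difference for g'(0), i.e. the directional derivative along v.
# Hand check: grad f(1,1) = (1, 3), so grad . v = 0.6 + 2.4 = 3.0.
h = 1e-6
g_prime_0 = (g(h) - g(-h)) / (2 * h)
print(g_prime_0)
```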

What role does the multivariable chain rule play in deriving the final formula?

Applying the chain rule to the composition f(γ(t)) gives a product of Jacobians: J_F(γ(t)) · J_γ(t). Evaluating at t = 0 gives J_F(x~) · J_γ(0). Since γ(t) = x~ + t v has constant derivative J_γ(0) = v, the product simplifies directly to J_F(x~) v.
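The Jacobian product can be spelled out as an explicit matrix multiplication: for scalar-valued f, J_F(x~) is a 1×N row and J_γ(0) = v an N×1 column, so the product is a 1×1 scalar. The example function and direction below are assumptions for illustration.

```python
def matmul(A, B):
    """Naive matrix product for small nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# For f(x, y) = x**2 * y at x~ = (1, 2), the Jacobian is the 1x2 row
# [2*x*y, x**2] = [4, 1] (hand-computed for this assumed example).
J_f = [[4.0, 1.0]]

# gamma(t) = x~ + t*v has constant Jacobian, so J_gamma(0) = v as a 2x1 column.
v = [[0.6], [0.8]]

# Chain rule: d/dt f(gamma(t)) at t = 0 is the 1x1 product J_f(x~) J_gamma(0),
# here 4*0.6 + 1*0.8 = 3.2.
print(matmul(J_f, v))
```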

How does the gradient connect to directional derivatives?

For a scalar-valued function f, the Jacobian J_F(x~) is the gradient ∇f(x~). Therefore J_F(x~) v equals ∇f(x~) · v. The dot product produces a scalar directional derivative, which explains why many notations write the directional derivative as v · ∇f.

Why is it easy to get confused by notation across different books?

Directional derivatives are denoted in multiple ways, such as D_v f, ∂_v f, or ∇_v f, with the direction v attached as an index. Additionally, the gradient symbol ∇f can appear in the directional derivative formula through the dot product ∇f · v. The meaning stays the same, but the symbols differ, so careful reading is required.

Review Questions

  1. Given f: R^N → R and a unit vector v, write the definition of the directional derivative at x~ and explain what x~ + h v represents.
  2. Assuming f is totally differentiable at x~, derive (or state) the formula for the directional derivative along v in terms of ∇f(x~) and v.
  3. Explain how introducing γ(t) = x~ + t v turns the directional derivative into a one-variable derivative at t = 0.

Key Points

  1. Directional derivatives measure the rate of change of f at x~ along an arbitrary direction v using the limit of [f(x~ + h v) − f(x~)]/h as h → 0.

  2. Changing in direction v means shifting the input by h times v in every coordinate: x~ + h v.

  3. Partial derivatives are special cases of directional derivatives when v points along a coordinate axis.

  4. When f is totally differentiable at x~, the directional derivative always exists and equals J_F(x~) v.

  5. For scalar-valued f, J_F(x~) is the gradient ∇f(x~), so the directional derivative becomes ∇f(x~) · v.

  6. Directional-derivative notation varies across textbooks, so the same concept may appear as D_v f, ∂_v f, or v · ∇f.

Highlights

Directional derivatives use the line x~ + h v, not a single coordinate change, so they capture slope along any chosen direction.
Under total differentiability, the directional derivative collapses to a simple chain-rule result: ∇f(x~) · v.
The gradient’s dot product with v turns a geometric “steepness” vector into the specific rate of change along v.
Notation for directional derivatives differs widely, but the underlying limit definition is consistent.