Multivariable Calculus 10 | Directional Derivative

TL;DR

Directional derivatives measure the instantaneous change of f at x̃ along an arbitrary direction v, not just along coordinate axes.

Briefing

Directional derivatives extend partial derivatives by measuring how a multivariable function changes when moving in an arbitrary direction, not just along coordinate axes. For a function f: R^n → R and a point x̃, the directional derivative along a direction vector v is defined—when the relevant limit exists—using a difference quotient that moves the input from x̃ to x̃ + h v. The key idea is that “changing in the direction v” means shifting all coordinates at once according to the vector v, rather than changing only one component as in partial derivatives.

In two dimensions, contour lines provide an intuitive picture: partial derivatives correspond to moving through x̃ parallel to the x1- or x2-axis, which turns the multivariable problem into an ordinary one-variable derivative along a line. The directional derivative generalizes this by allowing movement along any line through x̃, determined by v. Because the difference quotient scales with the length of v (replacing v by c v multiplies the result by c), v is typically taken as a unit vector so that the value depends only on the direction.

Formally, the directional derivative of f at x̃ along v is the limit as h → 0 of [f(x̃ + h v) − f(x̃)] / h. Different textbooks use different notations for this same quantity, including variants of d with a subscript v, a capital D, or the “∂” symbol with an index v. Confusion can also arise because gradient notation sometimes doubles as a directional derivative expression: when f is totally differentiable at x̃, the directional derivative can be written as the dot product of v with the gradient of f.
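The limit definition can be watched numerically by shrinking h. The following is a minimal sketch, not a general-purpose routine: the function f(x1, x2) = x1^2 + 3 x1 x2, the point (1, 2), the direction (0.6, 0.8), and the helper name difference_quotient are all assumptions chosen for illustration and do not come from the source.

```python
import math

def f(x):
    # Hypothetical example function f(x1, x2) = x1^2 + 3*x1*x2 (an assumption).
    return x[0] ** 2 + 3.0 * x[0] * x[1]

def difference_quotient(f, x, v, h):
    """Return [f(x + h*v) - f(x)] / h for a base point x and a direction v."""
    shifted = [xi + h * vi for xi, vi in zip(x, v)]
    return (f(shifted) - f(x)) / h

x = [1.0, 2.0]   # base point
v = [0.6, 0.8]   # direction; math.hypot(0.6, 0.8) is 1 up to rounding

# Print the quotient for a sequence of shrinking step sizes.
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(h, difference_quotient(f, x, v, h))
```

For this particular f the quotient works out by hand to 7.2 + 1.8 h, so the printed values settle toward 7.2 as h shrinks.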

The computation becomes clean when f is totally differentiable at x̃. Under that condition, the directional derivative exists and can be rewritten as an ordinary derivative of a one-variable function. The method is to define a curve γ(t) = x̃ + t v, so that f(x̃ + h v) becomes f(γ(h)) and the difference quotient becomes that of the one-variable function t ↦ f(γ(t)) at t = 0. Applying the multivariable chain rule shows that the derivative of the composition f(γ(t)) at t = 0 equals the Jacobian of f at γ(0) = x̃ multiplied by the derivative of γ at 0. Since γ′(0) is exactly the constant vector v, the result simplifies to: directional derivative along v = (Jacobian of f at x̃) · v.
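Written out as a chain of equalities, the argument looks as follows. The name g for the one-variable composition is introduced here only for convenience and is not notation from the source.

```latex
\begin{aligned}
g(t) &:= f(\gamma(t)), \qquad \gamma(t) = \tilde{x} + t\,v, \\
\lim_{h \to 0} \frac{f(\tilde{x} + h v) - f(\tilde{x})}{h}
  &= \lim_{h \to 0} \frac{g(h) - g(0)}{h} = g'(0), \\
g'(0) &= J_f(\gamma(0))\,\gamma'(0) = J_f(\tilde{x})\,v .
\end{aligned}
```

The last line is the only place where total differentiability of f at x̃ is used, since that is the hypothesis of the chain rule.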

Because f maps into R, the Jacobian of f at x̃ is a 1 × n row vector whose entries are the partial derivatives of f, that is, the gradient ∇f(x̃) written as a row. Multiplying this row by v is therefore a dot product, which yields the compact geometric formula: the directional derivative along v equals ∇f(x̃) · v. This identity also clarifies why the gradient is central in multivariable calculus: it packages all directional rates of change into a single vector, and taking a dot product with v extracts the rate in the chosen direction.
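A small sketch of the gradient formula, cross-checked against the difference quotient. The example function, its hand-derived gradient, the point, the direction, and all helper names are assumptions made for illustration.

```python
def f(x):
    # Hypothetical example: f(x1, x2) = x1^2 + 3*x1*x2 (an assumption).
    return x[0] ** 2 + 3.0 * x[0] * x[1]

def grad_f(x):
    # Gradient worked out by hand for the example above:
    # df/dx1 = 2*x1 + 3*x2, df/dx2 = 3*x1.
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def directional_derivative(grad, x, v):
    """Gradient formula grad f(x) . v, valid where f is totally differentiable."""
    return dot(grad(x), v)

x = [1.0, 2.0]
v = [0.6, 0.8]

via_gradient = directional_derivative(grad_f, x, v)

# Cross-check against the difference quotient with a small step h.
h = 1e-6
shifted = [xi + h * vi for xi, vi in zip(x, v)]
via_quotient = (f(shifted) - f(x)) / h

print(via_gradient, via_quotient)
```

Here ∇f(1, 2) = (8, 3), so the dot product with (0.6, 0.8) is 4.8 + 2.4 = 7.2, and the difference quotient agrees up to the size of h.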

Cornell Notes

Directional derivatives measure the instantaneous rate of change of a multivariable function f: R^n → R at a point x̃ when moving in an arbitrary direction v. The definition uses a limit of a difference quotient that shifts the input from x̃ to x̃ + h v, turning the problem into a one-variable derivative along the line determined by v. When f is totally differentiable at x̃, the directional derivative always exists and can be computed via the chain rule using the curve γ(t) = x̃ + t v. The result is (Jacobian of f at x̃)·v, which for real-valued f equals ∇f(x̃)·v. This formula explains the notation and gives the gradient a geometric meaning: it encodes all directional derivatives at a point.

How does the directional derivative generalize partial derivatives?

Partial derivatives change only one coordinate at a time (e.g., x̃ + h e1 or x̃ + h e2). Directional derivatives change all coordinates simultaneously by moving from x̃ to x̃ + h v, where v specifies the direction. In 2D, partial derivatives correspond to moving along the coordinate axes; directional derivatives correspond to moving along any line through x̃.
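As a worked special case, taking v to be the i-th standard basis vector recovers the i-th partial derivative directly from the limit definition:

```latex
\lim_{h \to 0} \frac{f(\tilde{x} + h\,e_i) - f(\tilde{x})}{h}
  = \frac{\partial f}{\partial x_i}(\tilde{x}),
\qquad
e_i = (0, \dots, 0, \underbrace{1}_{i\text{-th entry}}, 0, \dots, 0).
```

When f is totally differentiable at x̃, the same value comes out of the gradient formula, because ∇f(x̃) · e_i picks out the i-th entry of the gradient.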

What is the exact definition of the directional derivative along v at x̃?

It is the limit, if it exists: lim(h→0) [f(x̃ + h v) − f(x̃)] / h. Typically v is chosen as a unit vector so only the direction matters, not the vector’s length.
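The reason length matters can be read off the definition. For a scalar c ≠ 0, substituting k = c h in the difference quotient gives:

```latex
\lim_{h \to 0} \frac{f(\tilde{x} + h\,(c\,v)) - f(\tilde{x})}{h}
  = c \cdot \lim_{k \to 0} \frac{f(\tilde{x} + k\,v) - f(\tilde{x})}{k}.
```

So doubling the length of v doubles the value, and normalizing v to length 1 removes this ambiguity.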

Why can directional derivatives be written using the gradient and a dot product?

When f is totally differentiable at x̃, define γ(t) = x̃ + t v. Then f(x̃ + h v) becomes f(γ(h)), so the directional derivative is the derivative of f(γ(t)) at t=0, and the chain rule gives this derivative as J_f(x̃) · γ′(0). Since γ′(0) = v, the directional derivative equals J_f(x̃)·v. For f: R^n → R, J_f(x̃) is the row vector of partial derivatives, i.e. ∇f(x̃) written as a row, so the directional derivative is ∇f(x̃)·v.
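The hypothesis of total differentiability is not a formality. The sketch below uses a standard textbook function, f(x1, x2) = x1^2 x2 / (x1^2 + x2^2) with f(0, 0) = 0, which is not taken from the source: at the origin every directional derivative exists, yet the dot product built from the partial derivatives gives a different number. Helper names are assumptions.

```python
def f(x):
    # Textbook example (an assumption, not from the source):
    # f(x1, x2) = x1^2 * x2 / (x1^2 + x2^2), with f(0, 0) = 0.
    x1, x2 = x
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return x1 * x1 * x2 / (x1 * x1 + x2 * x2)

def difference_quotient(f, x, v, h):
    shifted = [xi + h * vi for xi, vi in zip(x, v)]
    return (f(shifted) - f(x)) / h

origin = [0.0, 0.0]
v = [0.6, 0.8]  # unit vector

# f vanishes on both coordinate axes, so both partial quotients at the origin are 0.
partials = [difference_quotient(f, origin, e, 1e-6)
            for e in ([1.0, 0.0], [0.0, 1.0])]
gradient_formula = sum(p * vi for p, vi in zip(partials, v))

# Along v the quotient equals 0.6^2 * 0.8 = 0.288 for every h != 0.
along_v = difference_quotient(f, origin, v, 1e-6)

print(gradient_formula, along_v)
```

The mismatch (0 versus 0.288) shows that this f is not totally differentiable at the origin, which is exactly the case the gradient formula does not cover.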

What notational pitfalls should be expected when reading different materials?

Directional derivatives can appear with different symbols (e.g., d with a subscript v, a capital D, or a ∂ with index v). Also, the gradient notation can be used in a way that represents directional derivatives via ∇f(x̃)·v, so it’s important not to confuse the gradient vector itself with the directional derivative quantity.

How does the chain rule enter the computation?

By converting the multivariable change into a one-variable composition. The curve γ(t) = x̃ + t v traces the line in direction v. Then the directional derivative becomes the ordinary derivative of f∘γ at t=0, computed using the multivariable chain rule: J_f(γ(t))·J_γ(t), evaluated at t=0.
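The composition can also be sketched in code: build γ, form the one-variable function g(t) = f(γ(t)), and compare an ordinary numerical derivative of g with the chain-rule expression. The example function, its hand-derived gradient, and the names gamma, g, chain_rule_derivative and central_difference are assumptions for illustration.

```python
def f(x):
    # Hypothetical example: f(x1, x2) = x1^2 + 3*x1*x2 (an assumption).
    return x[0] ** 2 + 3.0 * x[0] * x[1]

def grad_f(x):
    # Hand-derived gradient of the example: (2*x1 + 3*x2, 3*x1).
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]

x0 = [1.0, 2.0]  # base point
v = [0.6, 0.8]   # direction

def gamma(t):
    """Point on the line through x0 in direction v at parameter t."""
    return [xi + t * vi for xi, vi in zip(x0, v)]

def g(t):
    """One-variable composition g(t) = f(gamma(t))."""
    return f(gamma(t))

def chain_rule_derivative(t):
    # J_f(gamma(t)) times J_gamma(t); the derivative of gamma is v for every t.
    return sum(gi * vi for gi, vi in zip(grad_f(gamma(t)), v))

def central_difference(func, t, h=1e-5):
    """Ordinary one-variable numerical derivative of func at t."""
    return (func(t + h) - func(t - h)) / (2.0 * h)

# t = 0 gives the directional derivative at x0; t = 0.5 checks the rule elsewhere on the line.
for t in (0.0, 0.5):
    print(t, chain_rule_derivative(t), central_difference(g, t))
```

At t = 0 both columns are the directional derivative at the base point; the second row only confirms that the chain-rule expression matches g′(t) at another point of the line.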

Review Questions

Given f: R^n → R and a unit vector v, write the limit definition of the directional derivative at x̃.
Explain why choosing γ(t) = x̃ + t v helps compute the directional derivative using the chain rule.
If f is totally differentiable at x̃, what is the relationship between the directional derivative along v and ∇f(x̃)?

Key Points

1. Directional derivatives measure the instantaneous change of f at x̃ along an arbitrary direction v, not just along coordinate axes.
2. The definition uses the limit lim(h→0) [f(x̃ + h v) − f(x̃)]/h, where x̃ + h v represents moving in direction v.
3. Using a unit vector v ensures the derivative depends only on direction, not on the magnitude of v.
4. Different textbooks use different notations for directional derivatives, so symbol recognition matters.
5. When f is totally differentiable at x̃, the directional derivative exists and equals J_f(x̃)·v.
6. For real-valued functions f: R^n → R, J_f(x̃) is the gradient ∇f(x̃), so the directional derivative equals ∇f(x̃)·v.

Highlights

Directional derivatives turn “move in a direction” into the limit of a difference quotient using x̃ + h v.

For totally differentiable f, the directional derivative along v collapses to a simple formula: ∇f(x̃)·v.

The curve γ(t) = x̃ + t v converts a multivariable rate into an ordinary derivative via the chain rule.

Gradient notation often doubles as a directional-derivative formula through the dot product with v.