Multivariable Calculus 8 | Gradient

Based on The Bright Side of Mathematics's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

For f: R^n → R, the gradient ∇f(x̃) is the transpose of the 1×n Jacobian row vector, so its components are the partial derivatives of f at x̃.

Briefing

The gradient in multivariable calculus is built from partial derivatives and turns a scalar function into a vector field, so that the “direction of steepest change” becomes a geometric object. For a totally differentiable function f: R^n → R, the gradient at a point x̃ is formed from the Jacobian matrix of f, which has one row and n columns; transposing it yields an n-dimensional column vector. This gradient is commonly written grad f(x̃) or, using the nabla symbol, ∇f(x̃). This vector-valued viewpoint matters because it connects calculus to geometry: derivatives become arrows that can be compared with other vectors and paired with them in an inner product.

A concrete example uses f(x1, x2) = x1^2 + x2^2. The function forms a paraboloid in 3D, while its contour lines in the plane are circles. Computing partial derivatives gives ∂f/∂x1 = 2x1 and ∂f/∂x2 = 2x2, so the gradient is ∇f(x1, x2) = (2x1, 2x2). Visualizing this across the plane produces a field of arrows pointing outward from the origin; the arrows are shorter near (0,0) and longer farther away, reflecting how the rate of change grows with distance.
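The partial derivatives above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper `numerical_gradient`, the step size `h`, and the sample point are illustrative choices, not part of the original video.

```python
import numpy as np

def f(x):
    """Example from the text: f(x1, x2) = x1^2 + x2^2."""
    return x[0] ** 2 + x[1] ** 2

def grad_f(x):
    """Analytic gradient from the text: (2*x1, 2*x2)."""
    return np.array([2.0 * x[0], 2.0 * x[1]])

def numerical_gradient(func, x, h=1e-6):
    """Central-difference approximation of the gradient.

    Hypothetical helper for illustration; h is an assumed step size.
    """
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h  # perturb only the i-th coordinate
        grad[i] = (func(x + step) - func(x - step)) / (2.0 * h)
    return grad

point = np.array([1.5, -0.5])  # arbitrary sample point
analytic = grad_f(point)
approx = numerical_gradient(f, point)

print("analytic gradient:", analytic)
print("finite-difference gradient:", approx)
```

The two printed vectors should agree up to rounding error, which is a quick sanity check that the components of the gradient really are the partial derivatives.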

The gradient’s geometric power shows up when the chain rule is applied along a curve. Consider a curve γ: R → R^2 that traces the unit circle, for instance γ(t) = (cos t, sin t). The composition f∘γ maps each t to the scalar value f(cos t, sin t). By the multivariable chain rule, the derivative of the composition is the product of Jacobians J_f(γ(t)) · J_γ(t). Here, J_f(γ(t)) is the row vector of partial derivatives evaluated at (cos t, sin t), namely [2cos t, 2sin t], and J_γ(t) is the derivative of (cos t, sin t), the column vector (−sin t, cos t)^T. Their product is 2cos t(−sin t) + 2sin t(cos t) = 0 for every t, meaning the rate of change of f along the circular path is always zero.

That result can be rewritten using the gradient. Since ∇f is the transpose of the Jacobian of f, the chain-rule product becomes an inner product in R^2: ⟨∇f(γ(t)), γ′(t)⟩. In this example, the inner product is zero, so the gradient vector at each point on the circle is orthogonal (perpendicular) to the curve’s tangent direction γ′(t). The immediate geometric takeaway is that moving along a level set (here, circles of constant x1^2 + x2^2) produces no change in f, because the gradient points perpendicular to the motion. The deeper meaning of this orthogonality is saved for the next installment.
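The orthogonality can also be observed numerically. This is a small sketch, assuming NumPy; the function names `gamma` and `gamma_prime` and the choice of nine sample values of t are illustrative assumptions.

```python
import numpy as np

def grad_f(x):
    """Gradient of f(x1, x2) = x1^2 + x2^2, as given in the text."""
    return np.array([2.0 * x[0], 2.0 * x[1]])

def gamma(t):
    """Circular curve from the text: gamma(t) = (cos t, sin t)."""
    return np.array([np.cos(t), np.sin(t)])

def gamma_prime(t):
    """Tangent vector of the curve: gamma'(t) = (-sin t, cos t)."""
    return np.array([-np.sin(t), np.cos(t)])

# Sample parameter values around the circle (arbitrary choice).
ts = np.linspace(0.0, 2.0 * np.pi, 9)

# Inner product <grad f(gamma(t)), gamma'(t)> at each sample.
inner_products = np.array(
    [np.dot(grad_f(gamma(t)), gamma_prime(t)) for t in ts]
)

print(inner_products)
```

Every entry should be zero up to floating-point rounding, matching the claim that the gradient is perpendicular to the tangent at each point of the circle.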

Cornell Notes

For a scalar function f: R^n → R, the gradient ∇f(x̃) is the transpose of the Jacobian of f, producing an n-dimensional vector whose components are the partial derivatives of f. In the example f(x1, x2) = x1^2 + x2^2, the gradient is ∇f(x1, x2) = (2x1, 2x2), forming an outward-pointing vector field whose magnitude grows away from the origin. Applying the multivariable chain rule to a circular curve γ(t) = (cos t, sin t) shows that the derivative of f∘γ is always zero. Recasting the chain-rule product as an inner product reveals why: ⟨∇f(γ(t)), γ′(t)⟩ = 0, so the gradient is perpendicular to the curve’s tangent direction. This links calculus to geometry and foreshadows a broader interpretation of level sets.

How is the gradient ∇f(x̃) constructed from the Jacobian for a scalar-valued function?

For f: R^n → R, the Jacobian J_f(x̃) is a 1×n matrix containing the partial derivatives: [∂f/∂x1, ∂f/∂x2, …, ∂f/∂xn] evaluated at x̃. Transposing that row vector gives an n×1 column vector, which is the gradient ∇f(x̃). In other words, ∇f(x̃) = (∂f/∂x1(x̃), …, ∂f/∂xn(x̃))^T. Besides the nabla symbol, the gradient is also commonly written grad f(x̃).
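Written out in matrix form, the construction looks as follows. The typesetting is a sketch using the symbols from the text.

```latex
J_f(\tilde{x})
  = \begin{pmatrix}
      \dfrac{\partial f}{\partial x_1}(\tilde{x}) & \cdots & \dfrac{\partial f}{\partial x_n}(\tilde{x})
    \end{pmatrix}
  \in \mathbb{R}^{1 \times n},
\qquad
\nabla f(\tilde{x})
  = J_f(\tilde{x})^{T}
  = \begin{pmatrix}
      \dfrac{\partial f}{\partial x_1}(\tilde{x}) \\ \vdots \\ \dfrac{\partial f}{\partial x_n}(\tilde{x})
    \end{pmatrix}
  \in \mathbb{R}^{n}
```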

Why does the example f(x1, x2) = x1^2 + x2^2 produce a vector field that points outward from the origin?

The gradient is ∇f(x1, x2) = (2x1, 2x2). At any point (x1, x2), this vector points in the same direction as (x1, x2) itself, scaled by 2. Since (x1, x2) points radially outward from the origin, the gradient arrows also point outward. Near (0,0), both components are small, so arrows are short; farther away, the components grow, so arrows lengthen.
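The radial direction and the growing length can be confirmed at a few sample points. This is an illustrative sketch, assuming NumPy; the sample points are arbitrary and avoid the origin, where the gradient is the zero vector and has no direction.

```python
import numpy as np

def grad_f(x):
    """Gradient from the text: (2*x1, 2*x2), i.e. twice the position vector."""
    return 2.0 * np.asarray(x, dtype=float)

# Arbitrary nonzero sample points (an assumption for illustration).
sample_points = np.array([[0.1, 0.0], [1.0, 1.0], [-2.0, 3.0], [0.0, -4.0]])

cross_terms = []    # 2D cross product: zero when the vectors are parallel
dot_products = []   # positive when they point the same way
length_ratios = []  # |grad f(p)| / |p|

for p in sample_points:
    g = grad_f(p)
    cross_terms.append(p[0] * g[1] - p[1] * g[0])
    dot_products.append(np.dot(p, g))
    length_ratios.append(np.linalg.norm(g) / np.linalg.norm(p))

print("cross terms:", cross_terms)
print("dot products:", dot_products)
print("length ratios:", length_ratios)
```

Vanishing cross terms together with positive dot products mean each gradient arrow points the same way as its position vector, and a constant length ratio of 2 means the arrow length is proportional to the distance from the origin.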

What does the chain rule along a curve γ(t) = (cos t, sin t) reveal about f(x1, x2) = x1^2 + x2^2?

Using the multivariable chain rule, the derivative of the composition f∘γ is computed via Jacobians: J_f(γ(t)) · J_γ(t). For this f, J_f at (cos t, sin t) becomes [2cos t, 2sin t]. The derivative of γ is γ′(t) = (−sin t, cos t), which forms J_γ(t). Multiplying gives 2cos t(−sin t) + 2sin t(cos t) = 0 for all t, so f does not change along the circular path.

How does rewriting the Jacobian product using the gradient turn the result into a geometric statement?

Because ∇f is the transpose of J_f, the chain-rule product can be expressed as an inner product: ⟨∇f(γ(t)), γ′(t)⟩. In this example, that inner product equals 0. In linear algebra, an inner product of zero means the vectors are orthogonal. So ∇f(γ(t)) is perpendicular to the tangent direction γ′(t) at every point on the circle.
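The chain of equalities behind this rewriting can be written out as follows, using the symbols from the text. The last line specializes to f(x1, x2) = x1^2 + x2^2 and γ(t) = (cos t, sin t).

```latex
\begin{aligned}
\frac{d}{dt}\,(f \circ \gamma)(t)
  &= J_f(\gamma(t))\, J_\gamma(t) \\
  &= \nabla f(\gamma(t))^{T}\, \gamma'(t) \\
  &= \langle \nabla f(\gamma(t)),\, \gamma'(t) \rangle \\
  &= 2\cos t \cdot (-\sin t) + 2\sin t \cdot \cos t = 0
\end{aligned}
```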

What geometric meaning does “inner product equals zero” carry in this context?

It means the gradient points in a direction perpendicular to the curve’s motion. Since the curve γ(t) traces a circle where x1^2 + x2^2 stays constant, moving along that circle produces no change in f. The transcript frames this as the gradient giving an immediate geometric view: the gradient is normal to the level set (the circle), while the curve’s tangent lies along the level set.

Review Questions

  1. For a scalar function f: R^n → R, what are the dimensions of its Jacobian and how does transposing it produce the gradient?
  2. In the example f(x1, x2) = x1^2 + x2^2, compute ∇f(x1, x2) and explain how its direction relates to the point (x1, x2).
  3. Why does ⟨∇f(γ(t)), γ′(t)⟩ = 0 imply that f remains constant along the curve γ(t)?

Key Points

  1. For f: R^n → R, the gradient ∇f(x̃) is the transpose of the 1×n Jacobian row vector, so its components are the partial derivatives of f at x̃.
  2. The function f(x1, x2) = x1^2 + x2^2 has gradient ∇f(x1, x2) = (2x1, 2x2), forming an outward-pointing vector field.
  3. Contour lines of x1^2 + x2^2 are circles, matching the idea that f depends only on distance from the origin.
  4. Along a curve γ(t) = (cos t, sin t), the multivariable chain rule computes the derivative of f∘γ using Jacobian multiplication.
  5. For this specific example, the Jacobian product J_f(γ(t)) · J_γ(t) equals 0 for every t, so f does not change along the circle.
  6. Expressing the chain-rule result as an inner product shows ⟨∇f(γ(t)), γ′(t)⟩ = 0, meaning the gradient is perpendicular to the curve’s tangent direction.
  7. Orthogonality between the gradient and the tangent direction explains why level-set motion produces no change in the function value.

Highlights

The gradient turns a scalar function into a vector field by collecting all partial derivatives into one vector.
For f(x1, x2) = x1^2 + x2^2, ∇f(x1, x2) = (2x1, 2x2), so arrows point radially outward and grow with distance from the origin.
Along the circular curve γ(t) = (cos t, sin t), the chain rule yields a derivative of f∘γ equal to zero for all t.
Rewriting the chain-rule product as ⟨∇f(γ(t)), γ′(t)⟩ shows the gradient is perpendicular to the circle’s tangent direction.