Multivariable Calculus 8 | Gradient
Based on The Bright Side of Mathematics's video on YouTube.
Briefing
The gradient in multivariable calculus is built from partial derivatives and turns a scalar function into a vector field, making “direction of steepest change” a geometric object. For a totally differentiable function f: R^n → R, the gradient at a point x̃ is formed from the Jacobian matrix of f, which has one row and n columns; transposing it yields an n-dimensional column vector. This gradient is commonly written ∇f(x̃), using the nabla symbol ∇; the notation grad f(x̃) is also in use. This vector-valued viewpoint matters because it connects calculus to geometry: derivatives become arrows that can be compared with other vectors and paired with them through the inner product.
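Written out, the construction described above looks as follows. This is a notational sketch only; the symbol J_f for the Jacobian matches the usage later in these notes.

```latex
J_f(\tilde{x}) =
\begin{pmatrix}
  \dfrac{\partial f}{\partial x_1}(\tilde{x}) & \cdots & \dfrac{\partial f}{\partial x_n}(\tilde{x})
\end{pmatrix}
\in \mathbb{R}^{1 \times n},
\qquad
\nabla f(\tilde{x}) = J_f(\tilde{x})^{T} =
\begin{pmatrix}
  \dfrac{\partial f}{\partial x_1}(\tilde{x}) \\
  \vdots \\
  \dfrac{\partial f}{\partial x_n}(\tilde{x})
\end{pmatrix}
\in \mathbb{R}^{n}.
```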
A concrete example uses f(x1, x2) = x1^2 + x2^2. The graph of f is a paraboloid in 3D, while its contour lines in the plane are circles centered at the origin. Computing partial derivatives gives ∂f/∂x1 = 2x1 and ∂f/∂x2 = 2x2, so the gradient is ∇f(x1, x2) = (2x1, 2x2). Visualizing this across the plane produces a field of arrows pointing outward from the origin. The arrow at (x1, x2) has length 2·sqrt(x1^2 + x2^2), twice the distance from the origin, so the arrows are short near (0,0), vanish at the origin itself, and grow longer farther away.
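The example gradient can be checked numerically. The sketch below is illustrative only: the helper `numerical_grad` and the sample points are assumptions chosen for this check, not part of the original lecture. It compares the analytic gradient (2x1, 2x2) with a central-difference approximation and prints the arrow length at each point.

```python
import math

def f(x1, x2):
    """Example function from the text: f(x1, x2) = x1^2 + x2^2."""
    return x1**2 + x2**2

def grad_f(x1, x2):
    """Analytic gradient (2*x1, 2*x2), read off from the partial derivatives."""
    return (2.0 * x1, 2.0 * x2)

def numerical_grad(func, x1, x2, h=1e-6):
    """Central-difference approximation of the gradient (hypothetical helper)."""
    d1 = (func(x1 + h, x2) - func(x1 - h, x2)) / (2.0 * h)
    d2 = (func(x1, x2 + h) - func(x1, x2 - h)) / (2.0 * h)
    return (d1, d2)

if __name__ == "__main__":
    # Sample points are arbitrary; they only illustrate the outward-pointing field.
    for point in [(0.1, 0.0), (1.0, 1.0), (-2.0, 3.0)]:
        g = grad_f(*point)
        print("point:", point,
              "gradient:", g,
              "length:", math.hypot(*g),
              "numerical:", numerical_grad(f, *point))
```

Each gradient is the position vector scaled by 2, which is why the arrows point away from the origin and lengthen with distance.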
The gradient’s geometric power shows up when the chain rule is applied along a curve. Consider a curve γ: R → R^2 that traces the unit circle, γ(t) = (cos t, sin t). The composition f∘γ maps each t to the scalar value f(cos t, sin t). By the multivariable chain rule, the derivative of the composition is a product of Jacobians: (f∘γ)′(t) = J_f(γ(t)) · J_γ(t). Here, J_f(γ(t)) is the row vector of partial derivatives evaluated at (cos t, sin t), namely (2 cos t, 2 sin t), and J_γ(t) is the column vector of component derivatives, (−sin t, cos t)^T. Multiplying gives −2 cos t sin t + 2 sin t cos t = 0 for every t, meaning the rate of change of f along the circular path is always zero. This agrees with the direct computation f(γ(t)) = cos^2 t + sin^2 t = 1, which is constant.
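The Jacobian product in the chain rule can be reproduced in a few lines. This is a minimal sketch assuming plain nested lists as matrices; the helper names (`jacobian_f`, `jacobian_gamma`, `matmul`, `chain_rule_derivative`) are made up for illustration.

```python
import math

def jacobian_f(x1, x2):
    """1x2 Jacobian (row vector) of f(x1, x2) = x1^2 + x2^2."""
    return [[2.0 * x1, 2.0 * x2]]

def gamma(t):
    """Unit-circle curve from the text: gamma(t) = (cos t, sin t)."""
    return (math.cos(t), math.sin(t))

def jacobian_gamma(t):
    """2x1 Jacobian (column vector) of gamma: (-sin t, cos t)^T."""
    return [[-math.sin(t)], [math.cos(t)]]

def matmul(a, b):
    """Matrix product of nested lists (hypothetical helper, no external library)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def chain_rule_derivative(t):
    """(f o gamma)'(t) computed as J_f(gamma(t)) * J_gamma(t), a 1x1 matrix."""
    x1, x2 = gamma(t)
    return matmul(jacobian_f(x1, x2), jacobian_gamma(t))[0][0]

if __name__ == "__main__":
    # The parameter values are arbitrary samples along the circle.
    for t in [0.0, 0.5, 1.0, 2.0, math.pi]:
        print("t =", t, "derivative of f(gamma(t)):", chain_rule_derivative(t))
```

The 1×2 row times the 2×1 column collapses to a single number, which is the ordinary derivative of the one-variable function f∘γ.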
That result can be rewritten using the gradient. Since ∇f is the transpose of the Jacobian of f, the chain-rule product becomes an inner product in R^2: ⟨∇f(γ(t)), γ′(t)⟩. In this example, the inner product is zero, so the gradient vector at each point on the circle is orthogonal (perpendicular) to the curve’s tangent direction γ′(t). The immediate geometric takeaway is that moving along a level set (here, circles of constant x1^2 + x2^2) produces no change in f, because the gradient points perpendicular to the motion. The deeper meaning of this orthogonality is saved for the next installment.
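The inner-product form lends itself to a quick numerical comparison. In the sketch below, the circle is the curve from the text. The straight line `line(t) = (t, 2t)` is a hypothetical contrast case that does not appear in the original; it is included only to show a curve that is not a level set and therefore gives a nonzero inner product.

```python
import math

def grad_f(x1, x2):
    """Gradient of f(x1, x2) = x1^2 + x2^2."""
    return (2.0 * x1, 2.0 * x2)

def inner(u, v):
    """Standard inner product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def circle(t):
    """Curve from the text: gamma(t) = (cos t, sin t)."""
    return (math.cos(t), math.sin(t))

def circle_velocity(t):
    """Tangent vector gamma'(t) = (-sin t, cos t)."""
    return (-math.sin(t), math.cos(t))

def line(t):
    """Hypothetical contrast curve (not from the source): a line through the origin."""
    return (t, 2.0 * t)

def line_velocity(t):
    """Constant tangent vector of the hypothetical line."""
    return (1.0, 2.0)

def rate_along(curve, velocity, t):
    """<grad f(curve(t)), curve'(t)>, the rate of change of f along the curve."""
    return inner(grad_f(*curve(t)), velocity(t))

if __name__ == "__main__":
    for t in [0.3, 1.0, 2.5]:
        print("t =", t,
              "circle:", rate_along(circle, circle_velocity, t),
              "line:", rate_along(line, line_velocity, t))
```

Along the circle the value vanishes, matching the orthogonality statement; along the assumed line it equals 10t, so f changes whenever t ≠ 0.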
Cornell Notes
For a scalar function f: R^n → R, the gradient ∇f(x̃) is the transpose of the Jacobian of f, producing an n-dimensional vector whose components are the partial derivatives of f. In the example f(x1, x2) = x1^2 + x2^2, the gradient is ∇f(x1, x2) = (2x1, 2x2), forming an outward-pointing vector field whose magnitude grows away from the origin. Applying the multivariable chain rule to a circular curve γ(t) = (cos t, sin t) shows that the derivative of f∘γ is always zero. Recasting the chain-rule product as an inner product reveals why: ⟨∇f(γ(t)), γ′(t)⟩ = 0, so the gradient is perpendicular to the curve’s tangent direction. This links calculus to geometry and foreshadows a broader interpretation of level sets.
How is the gradient ∇f(x̃) constructed from the Jacobian for a scalar-valued function?
Why does the example f(x1, x2) = x1^2 + x2^2 produce a vector field that points outward from the origin?
What does the chain rule along a curve γ(t) = (cos t, sin t) reveal about f(x1, x2) = x1^2 + x2^2?
How does rewriting the Jacobian product using the gradient turn the result into a geometric statement?
What geometric meaning does “inner product equals zero” carry in this context?
Review Questions
- For a scalar function f: R^n → R, what are the dimensions of its Jacobian and how does transposing it produce the gradient?
- In the example f(x1, x2) = x1^2 + x2^2, compute ∇f(x1, x2) and explain how its direction relates to the point (x1, x2).
- Why does ⟨∇f(γ(t)), γ′(t)⟩ = 0 for every t imply that f remains constant along the curve γ?
Key Points
- For f: R^n → R, the gradient ∇f(x̃) is the transpose of the 1×n Jacobian row vector, so its components are the partial derivatives of f at x̃.
- The function f(x1, x2) = x1^2 + x2^2 has gradient ∇f(x1, x2) = (2x1, 2x2), forming an outward-pointing vector field.
- Contour lines of x1^2 + x2^2 are circles, matching the idea that f depends only on distance from the origin.
- Along a curve γ(t) = (cos t, sin t), the multivariable chain rule computes the derivative of f∘γ using Jacobian multiplication.
- For this specific example, the Jacobian product J_f(γ(t)) · J_γ(t) equals 0 for every t, so f does not change along the circle.
- Expressing the chain-rule result as an inner product shows ⟨∇f(γ(t)), γ′(t)⟩ = 0, meaning the gradient is perpendicular to the curve’s tangent direction.
- Orthogonality between the gradient and the tangent direction explains why level-set motion produces no change in the function value.