Multivariable Calculus 5 | Total Derivative [dark version]
Based on the YouTube video by The Bright Side of Mathematics. If you like this content, support the original creator by watching, liking, and subscribing to their channel.
Total differentiability replaces the one-variable derivative’s scalar with a linear map L that best approximates f near a point.
Briefing
Total differentiability in several variables generalizes the one-dimensional idea of “best linear approximation” from a number to a linear map. Instead of approximating a function near a point by a line, a function f: ℝⁿ → ℝᵐ is approximated near a point x̃ by a linear transformation L acting on the increment h, with an error term that becomes negligible compared with the size of h. This matters because partial derivatives alone don’t capture how all variables change together; total derivatives do.
In one dimension, differentiability at x̃ means there exists a constant B (the derivative) and a remainder term R(h) such that f(x̃ + h) = f(x̃) + B·h + R(h), where R(h)/h → 0 as h → 0. The transcript then lifts this structure to higher dimensions by replacing the scalar B·h with a linear map L(h). For f: ℝⁿ → ℝᵐ, total differentiability at x̃ requires a linear map L and a remainder function Φ(h) so that f(x̃ + h) = f(x̃) + L(h) + Φ(h), with the condition that Φ(h)/||h|| → 0 as h → 0 (||h|| is the Euclidean norm of the vector h). The remainder must shrink faster than the vector’s length, ensuring the approximation works along any path: if h approaches the zero vector through any sequence, the normalized error still vanishes.
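The remainder condition can be checked numerically. The sketch below uses a hypothetical smooth map f(x, y) = (xy, x + y²) (our own illustrative choice, not from the video): it forms Φ(h) = f(x̃ + h) − f(x̃) − L(h) and shows that ‖Φ(h)‖/‖h‖ shrinks as h → 0 along one direction.

```python
import numpy as np

# Hypothetical smooth map f: R^2 -> R^2 (illustrative, not from the video),
# used to check the remainder condition Phi(h)/||h|| -> 0 numerically.
def f(v):
    x, y = v
    return np.array([x * y, x + y**2])

def jacobian(v):
    # Entries J[i][j] = partial derivative of f_i with respect to x_j.
    x, y = v
    return np.array([[y,   x],
                     [1.0, 2.0 * y]])

x_tilde = np.array([1.0, 2.0])
J = jacobian(x_tilde)
direction = np.array([0.6, -0.8])          # a unit vector; any direction works

for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    h = t * direction
    phi = f(x_tilde + h) - f(x_tilde) - J @ h   # remainder Phi(h)
    print(t, np.linalg.norm(phi) / np.linalg.norm(h))
```

For this f the ratio decreases roughly in proportion to ‖h‖, which is even stronger than the definition requires; the definition only asks that it tend to zero.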
The linear map L is the multivariable derivative, commonly written Df(x̃) (notation varies between sources). When L is expressed with respect to the standard bases, its matrix is the Jacobian J_F, whose entries are the partial derivatives ∂f_i/∂x_j; the approximation then becomes a matrix–vector multiplication J_F · h. In the special case n = m = 1, the Jacobian reduces to the 1×1 matrix containing f′(x̃), showing that the multivariable definition truly includes the one-dimensional derivative.
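The n = m = 1 collapse can be made concrete with a minimal sketch, assuming f(x) = x² as our own example: the Jacobian is the 1×1 matrix [f′(x̃)], and the matrix–vector product reproduces the familiar tangent-line approximation.

```python
import numpy as np

# Sketch of the n = m = 1 special case (f(x) = x^2 is our own choice):
# the Jacobian is the 1x1 matrix [f'(x_tilde)], and J @ h is just f'(x_tilde)*h.
f = lambda x: x**2
x_tilde, h = 3.0, 0.1

J = np.array([[2.0 * x_tilde]])                 # 1x1 Jacobian: [f'(3)] = [6]
approx = f(x_tilde) + (J @ np.array([h]))[0]    # f(3) + 6 * 0.1 = 9.6
exact = f(x_tilde + h)                          # (3.1)^2 = 9.61
print(approx, exact)
```

The gap between 9.6 and 9.61 is exactly the remainder R(h) = h² = 0.01, which vanishes faster than h, as the definition demands.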
A concrete two-dimensional example makes the idea tangible. Consider a function that flips coordinates: f: ℝ² → ℝ² with f(h₁, h₂) = (h₂, h₁). At the origin, f(0,0) = (0,0), and the function is already linear in h. That means the remainder term is identically zero, and the total derivative exists with Jacobian matrix [[0, 1], [1, 0]]. So the total derivative at (0,0) is exactly the coordinate-flipping linear transformation, captured by the Jacobian.
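The coordinate-flip example above can be verified directly: because f is itself linear, the identity f(0 + h) = f(0) + J·h holds exactly for every h, so the remainder Φ(h) is identically zero rather than merely vanishing in the limit.

```python
import numpy as np

# The coordinate-flip map from the example: f(h1, h2) = (h2, h1).
f = lambda v: np.array([v[1], v[0]])

J = np.array([[0.0, 1.0],
              [1.0, 0.0]])            # Jacobian at the origin

# Phi(h) = f(0 + h) - f(0) - J h is the zero vector for any h, large or small.
for h in [np.array([0.3, -0.7]), np.array([5.0, 2.0])]:
    phi = f(h) - f(np.zeros(2)) - J @ h
    print(phi)                        # prints the zero vector each time
```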
The takeaway is that total derivatives are not just “more partial derivatives.” They are linear maps (or Jacobians) that provide a single best linear approximation to a multivariable function near a point, with a rigorously controlled error term relative to the size of the input change.
Cornell Notes
Total differentiability generalizes one-variable differentiability by replacing the derivative’s single number with a linear map. For f: ℝⁿ → ℝᵐ, total differentiability at x̃ means there exists a linear map L such that f(x̃ + h) = f(x̃) + L(h) + Φ(h), where the remainder satisfies Φ(h)/||h|| → 0 as h → 0. The condition uses the Euclidean norm of h, ensuring the error shrinks faster than the size of the input change along any sequence approaching the zero vector. When L is expressed as a matrix, it is the Jacobian matrix J_F, which turns the approximation into a matrix–vector multiplication. In one dimension, the Jacobian collapses to the usual derivative f′(x̃).
How does total differentiability mirror the one-dimensional definition of differentiability?
Why does the remainder condition use ||h|| instead of dividing by h directly?
What does it mean that the condition must hold for any sequence of vectors h → 0?
How are the multivariable derivative and the Jacobian matrix related?
In the two-dimensional coordinate-flip example, why is the remainder term zero?
Review Questions
- State the formal condition for total differentiability of f: ℝⁿ → ℝᵐ at a point x̃, including the role of the Euclidean norm.
- Explain the difference between partial derivatives and the total derivative in terms of what the linear approximation captures.
- For a function f: ℝ² → ℝ² that flips coordinates, write the Jacobian matrix at (0,0) and justify why the remainder term is zero.
Key Points
1. Total differentiability replaces the one-variable derivative’s scalar with a linear map L that best approximates f near a point.
2. For f: ℝⁿ → ℝᵐ, total differentiability at x̃ requires f(x̃ + h) = f(x̃) + L(h) + Φ(h) with Φ(h)/||h|| → 0 as h → 0.
3. The denominator uses the Euclidean norm ||h|| to measure the size of a vector increment.
4. The remainder condition must hold along any sequence h → 0, ensuring the approximation works in all directions.
5. When the linear map is written in coordinates, it becomes the Jacobian matrix J_F, turning the approximation into J_F · h.
6. In the one-dimensional case, the Jacobian reduces to the usual derivative f′(x̃), showing consistency with classical calculus.