Multivariable Calculus 5 | Total Derivative
Based on The Bright Side of Mathematics's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Total differentiability generalizes the single-variable idea of approximating a function near a point by a linear model.
Briefing
Total differentiability in several variables generalizes the familiar “best linear approximation” from single-variable calculus. Instead of approximating a function near a point by a line with a single slope, multivariable calculus approximates it by a linear map that carries small input changes (vectors) to output changes. The key requirement is that the difference between the function’s actual change and the linear prediction becomes negligible compared with the size of the input change as that input change approaches the zero vector.
In one dimension, differentiability at a point x₀ means there exists a number B and a remainder term r(h) such that f(x₀+h) = f(x₀) + B·h + r(h), with the remainder shrinking faster than h as h → 0. The multivariable version keeps the same structure but replaces “multiply by a slope” with “apply a linear map.” For a function f: ℝⁿ → ℝᵐ, total differentiability at a point x₀ requires a linear map L (the multivariable derivative) and an error term that becomes small relative to the length of the input perturbation. Concretely, one writes f(x₀+h) = f(x₀) + L(h) + r(h), where the remainder satisfies r(h)/||h|| → 0 as h → 0. Because h is a vector, the normalization uses the Euclidean (ℓ₂) norm ||h||, i.e., the distance from h to the zero vector. The definition is pointwise: it must hold at the chosen x₀, and the limit must hold along every path by which h approaches 0.
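The remainder condition can be checked numerically. The sketch below uses a function and test point of my own choosing (f(x, y) = (x² + y, xy) at x₀ = (1, 2), not an example from the video) and shows that ||r(h)||/||h|| shrinks as h shrinks, as the definition demands:

```python
import numpy as np

# Illustrative (not from the video): f(x, y) = (x**2 + y, x*y), a map R^2 -> R^2.
def f(v):
    x, y = v
    return np.array([x**2 + y, x * y])

# Its Jacobian, computed by hand: rows are gradients of the two components.
def jacobian(v):
    x, y = v
    return np.array([[2 * x, 1.0],
                     [y,     x]])

x0 = np.array([1.0, 2.0])
J = jacobian(x0)

# r(h) = f(x0 + h) - f(x0) - J @ h; total differentiability requires
# ||r(h)|| / ||h|| -> 0 as h -> 0.
for scale in [1e-1, 1e-2, 1e-3, 1e-4]:
    h = scale * np.array([0.6, -0.8])   # fixed direction, shrinking length
    r = f(x0 + h) - f(x0) - J @ h
    print(scale, np.linalg.norm(r) / np.linalg.norm(h))
```

For a smooth function like this one, the remainder is quadratic in h, so the printed ratio drops roughly in proportion to ||h||.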
This linear map L is the “total derivative” and is often denoted Df(x₀). When L is represented by a matrix, that matrix is the Jacobian matrix J_f of f at x₀, and the linear approximation becomes ordinary matrix-vector multiplication. The transcript emphasizes that in higher dimensions the derivative is not merely a number: it is an entire linear transformation capturing how all input directions affect all output components.
The discussion also shows how the definition reduces to the one-dimensional case when n = m = 1: the total derivative becomes a 1×1 matrix whose entry is f′(x₀), and L(h) is just f′(x₀)·h. A two-dimensional example then makes the idea concrete. Consider a function that swaps two components: f(h₁, h₂) = (h₂, h₁). At the origin, f(0,0) = (0,0), and the function is already linear, so the remainder term is identically zero. The corresponding Jacobian matrix at (0,0) is [0 1; 1 0], which, when multiplied by the input vector (h₁, h₂), produces (h₂, h₁). This example illustrates how total differentiability can be checked by finding the correct linear map (or Jacobian) that matches the function’s first-order behavior near the point.
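The swap example from the notes can be verified directly: because f itself is linear, the Jacobian reproduces it exactly and the remainder vanishes for every h, not just small ones. A minimal sketch:

```python
import numpy as np

# The coordinate-swap example from the notes: f(h1, h2) = (h2, h1).
def f(v):
    return np.array([v[1], v[0]])

x0 = np.zeros(2)
J = np.array([[0.0, 1.0],
              [1.0, 0.0]])   # Jacobian at the origin: the matrix that swaps components

# Since f is linear, r(h) = f(x0 + h) - f(x0) - J @ h is exactly zero for all h.
rng = np.random.default_rng(0)
for _ in range(5):
    h = rng.standard_normal(2)
    r = f(x0 + h) - f(x0) - J @ h
    assert np.allclose(r, 0.0)
print("remainder is identically zero for the swap map")
```

Here there is no need to take h → 0: the linear approximation is not merely first-order accurate but exact.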
The takeaway is that total differentiability is the formal condition guaranteeing a reliable linear approximation in multiple dimensions, with the Jacobian providing the practical matrix form of the derivative.
Cornell Notes
Total differentiability in several variables is the multivariable version of “best linear approximation.” For f: ℝⁿ → ℝᵐ, being totally differentiable at x₀ means there exists a linear map L such that f(x₀+h) = f(x₀) + L(h) + r(h), where the remainder r(h) satisfies r(h)/||h|| → 0 as h → 0. Because h is a vector, the error is measured relative to the vector’s length (Euclidean norm). The linear map L is the total derivative and can be represented by the Jacobian matrix J_f at x₀. In the 1D case, the Jacobian reduces to the usual derivative f′(x₀); in a 2D example that swaps coordinates, the Jacobian is the matrix that performs the swap exactly, with zero remainder.
How does the definition of differentiability in one variable translate to several variables?
Why does the multivariable definition divide by ||h|| instead of h?
What exactly is the “total derivative” in this setting?
How does the definition recover the usual derivative when n = m = 1?
In the coordinate-swap example f(h₁,h₂) = (h₂,h₁), why is the remainder term zero at the origin?
Review Questions
- State the formal condition for f: ℝⁿ → ℝᵐ to be totally differentiable at x₀, including how the remainder term behaves.
- Explain the role of the Jacobian matrix in representing the total derivative.
- Why does the multivariable definition use the Euclidean norm ||h|| when measuring the size of the error?
Key Points
1. Total differentiability generalizes the single-variable idea of approximating a function near a point by a linear model.
2. For f: ℝⁿ → ℝᵐ, total differentiability at x₀ requires a linear map L such that f(x₀+h) = f(x₀) + L(h) + r(h).
3. The remainder must satisfy r(h)/||h|| → 0 as h → 0, using the Euclidean norm of the input perturbation.
4. The total derivative is the linear map L; when expressed as a matrix, it is the Jacobian matrix J_f at x₀.
5. In one dimension, the Jacobian reduces to the usual derivative f′(x₀).
6. A function that is already linear (like swapping two coordinates) is totally differentiable with zero remainder at the point where the linear form matches exactly.