Multivariable Calculus 16 | Taylor's Theorem [dark version]

5 min read

Based on the YouTube video by The Bright Side of Mathematics. If you like this content, support the original creator by watching, liking, and subscribing.

TL;DR

For f ∈ C^{k+1}(R^n), values near x̃ satisfy f(x̃ + h) = T_k(x̃, h) + R_k, with a remainder that vanishes faster than ‖h‖^k as h → 0.

Briefing

Taylor’s theorem for multivariable functions generalizes the familiar 1D idea of approximating a smooth function near an expansion point with a polynomial whose coefficients come from derivatives. The key takeaway is that for a function f: R^n → R that is sufficiently differentiable (specifically f ∈ C^{k+1}(R^n)), the value at a nearby point x̃ + h can be written as a k-th order Taylor polynomial plus a remainder term that becomes negligible faster than ‖h‖^k as h → 0. This matters because it turns local behavior of multivariable functions into an explicit polynomial approximation—central for analysis, optimization, and numerical methods.

The construction starts with the same logic as in one dimension: the first-order (linear) approximation uses differentiability to match the tangent behavior. In multiple dimensions, the linear term is expressed using the Jacobian matrix J_f evaluated at the expansion point x̃, producing a linear map applied to the increment vector h. The quality of this approximation is tracked by an error term Φ(h) that satisfies Φ(h)/‖h‖ → 0 as h → 0.
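The shrinking error ratio can be checked numerically. A minimal sketch, using an assumed example function f(x, y) = x²y + sin(y) with its gradient (the one-row Jacobian) computed by hand:

```python
import math

# Numerical check of the linear approximation, with an assumed example
# f(x, y) = x^2*y + sin(y); its gradient (the 1-row Jacobian) is hand-computed.
def f(x, y):
    return x**2 * y + math.sin(y)

def grad_f(x, y):
    return (2 * x * y, x**2 + math.cos(y))

x0, y0 = 1.0, 0.5                # expansion point x~
gx, gy = grad_f(x0, y0)

def linear_error_ratio(t):
    hx, hy = t, -t                               # increment h, shrinking with t
    linear = f(x0, y0) + gx * hx + gy * hy       # f(x~) + J_f(x~) h
    phi = f(x0 + hx, y0 + hy) - linear           # error term Phi(h)
    return abs(phi) / math.hypot(hx, hy)         # Phi(h) / ||h||

for t in (1e-1, 1e-2, 1e-3):
    print(t, linear_error_ratio(t))              # ratio shrinks roughly like t
```

The printed ratios decrease roughly linearly in t, which is the numerical signature of Φ(h)/‖h‖ → 0.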

For the quadratic approximation, the multivariable version replaces “the second derivative” with the Hessian matrix. The quadratic term takes the form (1/2) h^T H_f(x̃) h, where the vector h appears on both sides so that the scalar output matches the second-order curvature information encoded in the Hessian (the 1/2 mirrors the 1/2! in the 1D Taylor series). As with the linear case, a remainder term Ψ(h) measures the error, and the condition for a true quadratic approximation is that Ψ(h)/‖h‖^2 → 0 as h → 0.
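The quadratic-order condition can be verified the same way. A sketch with an assumed example f(x, y) = x²y + sin(y), whose gradient and Hessian are computed by hand:

```python
import math

# Numerical check of the quadratic approximation, with an assumed example
# f(x, y) = x^2*y + sin(y); gradient and Hessian are hand-computed.
def f(x, y):
    return x**2 * y + math.sin(y)

def grad_f(x, y):
    return (2 * x * y, x**2 + math.cos(y))

def hess_f(x, y):
    # Symmetric matrix of second partials: [[f_xx, f_xy], [f_yx, f_yy]].
    return ((2 * y, 2 * x),
            (2 * x, -math.sin(y)))

x0, y0 = 1.0, 0.5
gx, gy = grad_f(x0, y0)
H = hess_f(x0, y0)

def quad_error_ratio(t):
    hx, hy = t, -t
    # h^T H h written out for n = 2 (H is symmetric here)
    hTHh = H[0][0]*hx*hx + 2*H[0][1]*hx*hy + H[1][1]*hy*hy
    model = f(x0, y0) + gx*hx + gy*hy + 0.5 * hTHh
    psi = f(x0 + hx, y0 + hy) - model            # error term Psi(h)
    return abs(psi) / (hx*hx + hy*hy)            # Psi(h) / ||h||^2

for t in (1e-1, 1e-2, 1e-3):
    print(t, quad_error_ratio(t))                # still shrinks roughly like t
```

Even after dividing by ‖h‖², the printed ratios keep shrinking, confirming Ψ(h)/‖h‖² → 0.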

With these building blocks, the full k-th order Taylor theorem is stated using multi-index notation. For multi-indices α, the Taylor polynomial T_k(x̃, h) is a sum over all α with |α| ≤ k of the form (∂^α f(x̃)/α!) h^α. Here, ∂^α denotes the mixed partial derivative corresponding to α, α! is the product of factorials of the components of α, and h^α is the corresponding product of powers of the components of h. The theorem then adds a remainder term R_k that captures the approximation error.
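Written out, the statement reads (using one common, Lagrange-type form of the remainder, valid for f ∈ C^{k+1}):

```latex
f(\tilde{x} + h)
  = \underbrace{\sum_{|\alpha| \le k} \frac{\partial^{\alpha} f(\tilde{x})}{\alpha!}\, h^{\alpha}}_{T_k(\tilde{x},\,h)}
  + \underbrace{\sum_{|\alpha| = k+1} \frac{\partial^{\alpha} f(c)}{\alpha!}\, h^{\alpha}}_{R_k},
\qquad
\alpha! = \alpha_1! \cdots \alpha_n!, \quad
h^{\alpha} = h_1^{\alpha_1} \cdots h_n^{\alpha_n},
```

where c is a point on the line segment between x̃ and x̃ + h.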

The remainder term mirrors the 1D structure: it uses derivatives of order k+1 evaluated at an intermediate point c, which lies on the line segment between x̃ and x̃ + h. In the multivariable setting, “intermediate” means c is a point on that straight path in R^n, generalizing the single-variable point between x̃ and x̃ + h. The result reduces to the standard 1D Taylor theorem when n = 1, making the multivariable formula a direct extension rather than a new, unrelated concept. The practical implication is clear: once f has enough continuous derivatives, its local behavior near x̃ is well-approximated by a polynomial whose coefficients are determined by partial derivatives at x̃, with a controlled error that shrinks rapidly as h becomes small.

Cornell Notes

Multivariable Taylor’s theorem extends the 1D polynomial approximation idea to functions f: R^n → R near an expansion point x̃. If f has continuous partial derivatives up to order k+1 (f ∈ C^{k+1}(R^n)), then for a small increment h the value f(x̃ + h) equals a k-th order Taylor polynomial T_k(x̃, h) plus a remainder R_k. The polynomial uses multi-index notation: T_k sums terms (∂^α f(x̃)/α!) h^α over all multi-indices α with |α| ≤ k. The remainder depends on derivatives of order k+1 evaluated at an intermediate point c on the line segment between x̃ and x̃ + h. This matters because it provides a systematic way to approximate multivariable functions locally with a controlled error.

How does the linear approximation of a multivariable function near x̃ relate to the Jacobian matrix?

For f: R^n → R, the linear approximation near x̃ uses the Jacobian matrix J_f(x̃). The increment h is an n-dimensional vector, and the linear term is a Jacobian-based linear map applied to h (often written as J_f(x̃)·h via matrix-vector multiplication). The approximation error is tracked by a term Φ(h) such that Φ(h)/‖h‖ → 0 as h → 0, meaning the linear part captures the dominant behavior.

What replaces the “second derivative” in the quadratic approximation, and how does it enter the formula?

The Hessian matrix H_f(x̃) replaces the second derivative. The quadratic term is expressed as (1/2) h^T H_f(x̃) h, a quadratic form that turns the curvature information in the Hessian into a scalar. Since a vector cannot simply be squared, h enters on both sides, transposed on the left. The quadratic remainder Ψ(h) satisfies Ψ(h)/‖h‖^2 → 0 as h → 0, ensuring the error shrinks faster than the square of the step size.

How is the k-th order Taylor polynomial written using multi-index notation?

The Taylor polynomial T_k(x̃, h) is a sum over all multi-indices α with |α| ≤ k: T_k(x̃, h) = Σ_{|α|≤k} (∂^α f(x̃)/α!) h^α. Here ∂^α denotes the mixed partial derivative corresponding to α, α! is the product of factorials of α’s components, and h^α is the product of the corresponding powers of h’s components. This structure ensures the highest derivative order appearing is k.
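The multi-index sum can be implemented directly. A sketch in R² with an assumed example f(x, y) = exp(x + 2y), chosen because every mixed partial has the closed form ∂^α f = 2^{α₂} exp(x + 2y):

```python
import math
from itertools import product

# Sketch of T_k in R^2 for an assumed example f(x, y) = exp(x + 2y);
# every mixed partial of this f is d^alpha f = 2**a2 * exp(x + 2y).
def taylor_poly(x0, y0, hx, hy, k):
    total = 0.0
    base = math.exp(x0 + 2 * y0)
    for a1, a2 in product(range(k + 1), repeat=2):   # candidate multi-indices alpha
        if a1 + a2 > k:                              # keep only |alpha| <= k
            continue
        deriv = 2**a2 * base                         # d^alpha f(x0, y0)
        coeff = deriv / (math.factorial(a1) * math.factorial(a2))  # divide by alpha!
        total += coeff * hx**a1 * hy**a2             # multiply by h^alpha
    return total

exact = math.exp(0.1 + 2 * 0.05)                     # f(x~ + h) at x~ = 0, h = (0.1, 0.05)
for k in (1, 2, 3):
    print(k, abs(taylor_poly(0.0, 0.0, 0.1, 0.05, k) - exact))  # error drops with k
```

Raising k adds higher multi-index terms, and the printed approximation error drops by roughly an order of magnitude per step for this small h.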

What does the remainder term R_k depend on, and what is the role of the intermediate point c?

The remainder term R_k involves derivatives of order k+1 evaluated at an intermediate point c. That point c lies on the line segment between x̃ and x̃ + h, meaning it is on the path c = x̃ + t h for some t between 0 and 1. This mirrors the 1D “between the endpoints” idea, but in R^n it becomes a point on the straight line connecting the two points.
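For the k = 0 case this is the mean value theorem along the segment, and the intermediate point can even be located numerically. A sketch with an assumed example f(x, y) = x² + 3y²:

```python
# k = 0 case: f(x~+h) - f(x~) = grad f(c) . h for some c = x~ + t*h, 0 < t < 1.
# Example function and segment are assumed for illustration.
def f(x, y):
    return x**2 + 3 * y**2

def grad_f(x, y):
    return (2 * x, 6 * y)

x0, y0 = 0.0, 0.0
hx, hy = 1.0, 0.5
target = f(x0 + hx, y0 + hy) - f(x0, y0)

def residual(t):
    # grad f(c) . h minus the actual increment, along c = x~ + t*h
    gx, gy = grad_f(x0 + t * hx, y0 + t * hy)
    return gx * hx + gy * hy - target

# Bisection: the residual is continuous and changes sign on [0, 1].
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if residual(lo) * residual(mid) <= 0:
        hi = mid
    else:
        lo = mid
t_star = 0.5 * (lo + hi)
print("t =", t_star, " c =", (x0 + t_star * hx, y0 + t_star * hy))
```

The recovered parameter t lies strictly between 0 and 1, so c = x̃ + t h sits in the interior of the segment, exactly as the theorem promises.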

Why does the multivariable theorem reduce to the familiar 1D Taylor theorem when n = 1?

When n = 1, the increment h becomes a scalar and multi-index notation collapses to ordinary derivatives. The Jacobian becomes the usual derivative, the Hessian becomes the second derivative, and the intermediate point c lies between x̃ and x̃ + h exactly as in the standard 1D remainder formula. The multivariable structure is therefore a direct generalization rather than a separate theory.

Review Questions

  1. What conditions on f (in terms of C^{k+1}(R^n)) are needed to guarantee a k-th order Taylor expansion with a controlled remainder?
  2. Write the general form of the k-th order Taylor polynomial T_k(x̃, h) using multi-index notation, including the roles of ∂^α, α!, and h^α.
  3. In the remainder term, where does the intermediate point c lie relative to x̃ and x̃ + h, and why is that geometric description important?

Key Points

  1. For f ∈ C^{k+1}(R^n), values near x̃ satisfy f(x̃ + h) = T_k(x̃, h) + R_k, with a remainder that vanishes faster than ‖h‖^k as h → 0.

  2. The linear approximation uses the Jacobian matrix J_f(x̃) applied to the increment vector h, with an error term Φ(h) satisfying Φ(h)/‖h‖ → 0.

  3. The quadratic approximation uses the Hessian matrix H_f(x̃) in the scalar form (1/2) h^T H_f(x̃) h, with a remainder Ψ(h) satisfying Ψ(h)/‖h‖^2 → 0.

  4. The k-th order Taylor polynomial is built from mixed partial derivatives at x̃ using multi-index notation: Σ_{|α|≤k} (∂^α f(x̃)/α!) h^α.

  5. The remainder term R_k depends on derivatives of order k+1 evaluated at an intermediate point c on the line segment between x̃ and x̃ + h.

  6. Setting n = 1 collapses the multivariable formulas back to the standard one-dimensional Taylor theorem, including the “between points” intermediate location.

Highlights

The multivariable Taylor polynomial uses multi-index sums over all derivative orders up to k, with coefficients (∂^α f(x̃)/α!) and monomials h^α.
Quadratic behavior is captured by the Hessian through the scalar expression (1/2) h^T H_f(x̃) h.
Remainders are evaluated at an intermediate point c located on the straight line between x̃ and x̃ + h, generalizing the 1D intermediate-point idea.
The approximation quality is controlled by remainder terms that vanish faster than the corresponding power of ‖h‖ (linear: faster than ‖h‖; quadratic: faster than ‖h‖^2).

Topics

  • Taylor’s Theorem
  • Multivariable Differentiation
  • Jacobian Approximation
  • Hessian Quadratic Form
  • Multi-Index Notation
