Abstract Linear Algebra 38 | Invariant Subspaces


Based on the video by The Bright Side of Mathematics on YouTube. If you like this content, support the original creators by watching, liking, and subscribing.

TL;DR

A subspace U is invariant under a linear map L exactly when L(U) ⊆ U, so vectors never leave U under the transformation.

Briefing

Invariant subspaces are the key structural tool behind Jordan normal form: a subspace U of a vector space V is called invariant under a linear map L when applying L to any vector in U never leaves U. Formally, invariance means L(U) ⊆ U, so the image of U under L is either equal to U or a smaller subspace contained in it. This property matters because it lets mathematicians restrict attention to U itself—turning L into a new linear map that acts only within U—making complicated linear-algebra problems more manageable.
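The defining condition L(U) ⊆ U can be checked numerically. Below is a minimal sketch (NumPy, with a hypothetical 3×3 matrix; the helper name `is_invariant` is chosen here for illustration) that tests whether each image L·u is expressible in a basis of U:

```python
import numpy as np

def is_invariant(L, U_basis, tol=1e-10):
    """Test L(U) ⊆ U: each image L @ u must be a linear combination
    of the basis vectors of U (least-squares residual near zero)."""
    U = np.column_stack(U_basis)
    for u in U_basis:
        coeffs, *_ = np.linalg.lstsq(U, L @ u, rcond=None)
        if not np.allclose(U @ coeffs, L @ u, atol=tol):
            return False
    return True

# Hypothetical example: a block-triangular map on R^3.
L = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
e1, e2, e3 = np.eye(3)
print(is_invariant(L, [e1, e2]))  # span(e1, e2): True
print(is_invariant(L, [e2, e3]))  # span(e2, e3): False, since L e2 = e1 + 2 e2
```

The block-triangular shape is exactly what invariance of span(e1, e2) looks like in matrix form.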

That restriction becomes especially powerful when studying a square matrix A over C. The path toward Jordan normal form starts with eigenvalues: once an eigenvalue λ is fixed, the generalized eigenspace machinery builds a nested chain of subspaces using powers of (A − λI). The generalized eigenspace chain is organized by kernels: for each exponent k, one considers ker((A − λI)^k). As k increases, these kernels grow until they stabilize at the fitting index d (also called the size parameter for the chain). The fitting index satisfies 1 ≤ d ≤ n, where n is the matrix size, and different matrices can have different fitting indices.
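The kernel chain can be computed directly. A minimal sketch (assuming a hypothetical matrix with a 2×2 Jordan block for eigenvalue 2 plus a separate eigenvalue 5) uses dim ker(N^k) = n − rank(N^k) and reads off the fitting index as the first exponent where the dimension stops growing:

```python
import numpy as np

# Hypothetical example: one 2x2 Jordan block for eigenvalue 2,
# plus an unrelated eigenvalue 5, so the chain stabilizes at d = 2.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
lam, n = 2.0, A.shape[0]
N = A - lam * np.eye(n)

# dim ker(N^k) = n - rank(N^k); compute one extra power to detect stabilization.
dims = [int(n - np.linalg.matrix_rank(np.linalg.matrix_power(N, k)))
        for k in range(1, n + 2)]
d = next(k for k in range(1, n + 1) if dims[k - 1] == dims[k])
print(dims[:n], d)  # [1, 2, 2] 2: kernels grow, then stop at the fitting index
```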

A parallel chain exists for ranges: the same stabilization phenomenon occurs for the images of (A − λI)^k. As the exponent increases, the ranges shrink, again ending at the fitting index d. The crucial payoff is that the final subspaces from both chains (the stabilized kernel and the stabilized range) are invariant under the matrix A. In other words, once the chain has reached its terminal step, those terminal subspaces behave like “closed worlds” under the action of A.
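Both claims can be spot-checked on a concrete hypothetical example (a 3×3 matrix with a Jordan block for eigenvalue 2 and a separate eigenvalue 5, so d = 2). The rank test used below relies on the fact that the column space of B is invariant under A exactly when appending the columns of A·B adds no new directions:

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
N = A - 2.0 * np.eye(3)

# The range chain shrinks: rank(N^k) for k = 1, 2, 3.
ranks = [int(np.linalg.matrix_rank(np.linalg.matrix_power(N, k)))
         for k in (1, 2, 3)]
print(ranks)  # [2, 1, 1], stabilized at the fitting index d = 2

def invariant_under(A, B):
    """col(B) is invariant under A iff A @ B adds no new column directions."""
    return np.linalg.matrix_rank(np.hstack([B, A @ B])) == np.linalg.matrix_rank(B)

Nd = np.linalg.matrix_power(N, 2)                            # terminal power N^d
ker_basis = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])   # ker(N^2) = span(e1, e2)
print(invariant_under(A, Nd), invariant_under(A, ker_basis)) # True True
```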

The invariance proof uses the simplification of working with N = A − λI instead of A directly. Since A = N + λI, applying A to a vector x can be expressed as Ax = Nx + λx. If both Nx and x lie in a candidate invariant subspace, then Ax automatically lies there too. For the kernel side, take x in ker(N^d). Then N^d x = 0, and applying N gives N^d(Nx) = N(N^d x) = N·0 = 0, so Nx remains in ker(N^d). That shows ker(N^d) is invariant under N, and therefore invariant under A.
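The commutation step of that argument can be verified numerically. A minimal sketch, assuming a hypothetical nilpotent-plus-invertible N with fitting index d = 2:

```python
import numpy as np

# Hypothetical N = A - 2I for a matrix whose kernel chain stabilizes at d = 2.
N = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 3.0]])
Nd = N @ N                         # N^d with d = 2

x = np.array([3.0, -1.0, 0.0])     # some x in ker(N^2) = span(e1, e2)
assert np.allclose(Nd @ x, 0)      # confirms N^d x = 0
# Commutation step: N^d (N x) = N (N^d x) = N 0 = 0, so N x stays in ker(N^d).
print(np.allclose(Nd @ (N @ x), 0))  # True
```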

For the range side, take y in ran(N^d). By definition of range, y = N^d x for some x. Then Ny = N(N^d x) = N^d(Nx), which is again an element of ran(N^d) because it has the same “N^d times something” form. This establishes invariance of ran(N^d) under N, and hence under A.
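The range-side argument admits the same kind of spot check: any y = N^d x is mapped by N to N^d(Nx), which is again an N^d-image. A sketch with a hypothetical N of fitting index d = 2:

```python
import numpy as np

N = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 3.0]])   # hypothetical N with fitting index d = 2
Nd = N @ N                        # terminal power N^d

x = np.array([1.0, 2.0, 3.0])     # arbitrary preimage
y = Nd @ x                        # y lies in ran(N^d) by construction
# N y = N (N^d x) = N^d (N x): still of the form "N^d times something".
print(np.allclose(N @ y, Nd @ (N @ x)))  # True
```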

These invariant subspaces—built from the stabilized generalized eigenspace kernel and range—set up the decomposition into Jordan blocks. The details of that decomposition are reserved for the next step, but the structural groundwork is now in place: once the fitting index is reached, the terminal kernel and range provide invariant subspaces that can be used to carve the matrix into its Jordan components.

Cornell Notes

Invariant subspaces are subspaces U such that L(U) ⊆ U for a linear map L; this means applying L to vectors in U never leaves U. For a complex eigenvalue λ of a matrix A, generalized eigenspaces are built from powers of N = A − λI using chains of kernels ker(N^k) and ranges ran(N^k). These chains stabilize at the fitting index d (1 ≤ d ≤ n). The stabilized terminal subspaces ker(N^d) and ran(N^d) are invariant under N, and therefore invariant under A as well. This invariance is the structural ingredient needed to decompose A into Jordan blocks in later work.

What exactly makes a subspace U “invariant” under a linear map L?

U is invariant under L when L(U) ⊆ U. Equivalently, for every vector u ∈ U, the image Lu must also lie in U. In that case, the restriction L|_U defines a new linear map from U to U, allowing analysis to happen entirely inside U.

How do generalized eigenspaces use the operator (A − λI), and what role does the fitting index d play?

Fix an eigenvalue λ and set N = A − λI. The generalized eigenspace chain is built from ker(N^k). As k increases, ker(N^k) grows until it stabilizes at the fitting index d, meaning ker(N^d) = ker(N^{d+1}) = … . The fitting index satisfies 1 ≤ d ≤ n and can differ across matrices.

Why can invariance proofs be simplified by switching from A to N = A − λI?

Because A = N + λI. For any vector x, Ax = Nx + λx. If x and Nx both lie in a candidate subspace U, then Ax is a linear combination of elements of U and must also lie in U. So it’s enough to prove invariance under N; invariance under A then follows.
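A one-line numeric sanity check of this identity (hypothetical 3×3 matrix, eigenvalue 2):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])   # hypothetical example matrix
lam = 2.0
N = A - lam * np.eye(3)

x = np.array([1.0, -2.0, 4.0])
# A x = N x + lam * x, so if x and N x lie in U, so does A x.
print(np.allclose(A @ x, N @ x + lam * x))  # True
```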

How is invariance of ker(N^d) under N shown?

Take x ∈ ker(N^d), so N^d x = 0. Then compute N^d(Nx) = N(N^d x) = N·0 = 0. This means Nx ∈ ker(N^d). Therefore ker(N^d) is invariant under N (and thus under A).

How is invariance of ran(N^d) under N shown?

Take y ∈ ran(N^d). By definition, y = N^d x for some x. Then Ny = N(N^d x) = N^d(Nx), which has the same form N^d(something). Hence Ny ∈ ran(N^d), proving ran(N^d) is invariant under N (and thus under A).

Review Questions

  1. Given a linear map L and subspace U, how would you test whether U is invariant under L using the condition L(U) ⊆ U?
  2. What does it mean for the kernel chain ker((A − λI)^k) to stabilize at the fitting index d?
  3. Why do both ker(N^d) and ran(N^d) end up invariant under A, and what algebraic identities make the proofs work?

Key Points

  1. A subspace U is invariant under a linear map L exactly when L(U) ⊆ U, so vectors never leave U under the transformation.

  2. Invariant subspaces allow restriction of L to U, turning the problem into a smaller linear-algebra setting.

  3. For a fixed eigenvalue λ, define N = A − λI and build chains ker(N^k) and ran(N^k) to study generalized eigenstructure.

  4. The fitting index d is the step where the kernel chain stabilizes (and the range chain stabilizes as well), with 1 ≤ d ≤ n.

  5. The stabilized kernel ker(N^d) is invariant under N because applying N to a vector in ker(N^d) keeps it in ker(N^d).

  6. The stabilized range ran(N^d) is invariant under N because applying N to a vector of the form N^d x yields N^d(Nx), which has the same form.

  7. These invariant subspaces are the structural setup needed for decomposing A into Jordan blocks next.

Highlights

Invariant subspaces are “closed” under the linear map: once a vector is in U, applying L keeps it in U.
The fitting index d marks where the generalized eigenspace chains stop changing: ker(N^d) and ran(N^d) become terminal.
Invariance of ker(N^d) follows from N^d(Nx) = N(N^d x) = N·0 = 0: the image of a kernel vector is still annihilated by N^d.
Invariance of ran(N^d) follows from Ny = N(N^d x) = N^d(Nx), preserving the range’s defining structure.
Working with N = A − λI simplifies proofs because A = N + λI lets Ax be written using Nx and x.
