
Abstract Linear Algebra 20 | Gram-Schmidt Orthonormalization [dark version]


Based on The Bright Side of Mathematics' video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their content.

TL;DR

Gram–Schmidt orthonormalization converts a basis {u1,…,uk} of a k-dimensional inner-product subspace into an orthonormal basis {b1,…,bk}.

Briefing

Gram–Schmidt orthonormalization turns any basis of a finite-dimensional inner-product subspace into an orthonormal basis that matches the geometry of that space, making tasks like orthogonal projections and related computations far simpler. Starting with a k-dimensional subspace U inside an inner-product space V, the process takes an arbitrary basis {u1, u2, …, uk} and produces an orthonormal basis {b1, b2, …, bk} in which any two distinct basis vectors are orthogonal and each vector has length 1 under the given inner product.

The procedure is often described as “orthogonalization plus normalization.” First, each new vector is adjusted to remove components in the directions of the previously constructed orthonormal vectors. Then the remaining “normal component” is scaled to unit length. The key point is that “length” and “norm” must be computed using the inner product’s induced norm: for a vector x, its norm is √⟨x, x⟩. In abstract settings there is no default Euclidean norm—using the correct inner-product-based norm is essential.
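
As a concrete illustration, here is a minimal Python sketch of an induced norm, assuming a hypothetical weighted inner product ⟨x, y⟩ = xᵀAy on R³; the matrix A, the names inner and norm, and the test vector are illustrative choices, not part of the original lecture.

```python
import numpy as np

# Hypothetical weighted inner product on R^3: <x, y> = x^T A y,
# where A is symmetric positive definite.
A = np.diag([2.0, 1.0, 3.0])

def inner(x, y):
    return x @ A @ y

def norm(x):
    # Induced norm: ||x|| = sqrt(<x, x>).
    return np.sqrt(inner(x, x))

x = np.array([1.0, 0.0, 0.0])
print(norm(x))            # sqrt(2) ~ 1.414 under this inner product
print(np.linalg.norm(x))  # 1.0 under the Euclidean norm: a different answer
```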

The construction begins with u1. One normalizes it to get b1 by scaling u1 by 1/||u1||, where ||u1|| = √⟨u1, u1⟩. Next, to build b2, the method removes from u2 its projection onto the one-dimensional subspace spanned by b1. Because b1 is already an orthonormal basis element, the projection of u2 onto span{b1} is simply ⟨u2, b1⟩ b1. Subtracting this projected component from u2 yields a vector orthogonal to b1; call it b̃2 = u2 − ⟨u2, b1⟩ b1. Finally, b2 is obtained by normalizing b̃2 to unit length.
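
Continuing the sketch above (same hypothetical inner and norm), the first two steps look like this; u1 and u2 are arbitrary example vectors:

```python
def normalize(x):
    return x / norm(x)

u1 = np.array([1.0, 1.0, 0.0])
u2 = np.array([1.0, 0.0, 1.0])

b1 = normalize(u1)                   # b1 = u1 / ||u1||
b2_tilde = u2 - inner(u2, b1) * b1   # remove the component along b1
b2 = normalize(b2_tilde)             # scale the remainder to unit length

print(inner(b1, b2))  # ~0: orthogonal under the chosen inner product
print(inner(b2, b2))  # ~1: unit length under the induced norm
```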

For b3, the same pattern repeats but with a larger previously built orthonormal set. Since {b1, b2} spans a two-dimensional subspace, the orthogonal projection of u3 onto that subspace is the sum of its projections onto each basis direction: ⟨u3, b1⟩ b1 + ⟨u3, b2⟩ b2. The orthogonal component is then b̃3 = u3 − (⟨u3, b1⟩ b1 + ⟨u3, b2⟩ b2), and normalization produces b3.
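
A one-line check (assuming a real inner product) shows why b̃3 is orthogonal to b1: expanding by linearity and using ⟨b1, b1⟩ = 1 and ⟨b2, b1⟩ = 0,

```latex
\langle \tilde{b}_3, b_1 \rangle
  = \langle u_3, b_1 \rangle
  - \langle u_3, b_1 \rangle \langle b_1, b_1 \rangle
  - \langle u_3, b_2 \rangle \langle b_2, b_1 \rangle
  = \langle u_3, b_1 \rangle - \langle u_3, b_1 \rangle = 0.
```

The same computation with b2 in place of b1 gives ⟨b̃3, b2⟩ = 0.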

After that, the algorithm continues inductively until bk. At the final step, uk is projected onto the (k−1)-dimensional span of {b1, …, b(k−1)} by summing one-dimensional projections: Σ_{j=1}^{k−1} ⟨uk, bj⟩ bj. Subtracting this from uk gives the orthogonal component, and scaling yields bk. The result is a full orthonormal basis of U, built through repeated subtraction of projection components and normalization at each stage, exactly tailored to the inner product's geometry.
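
Putting the whole loop together, here is a compact sketch of the algorithm, reusing the hypothetical inner from the earlier snippets; the function name gram_schmidt and the example vectors are illustrative:

```python
def gram_schmidt(us, inner):
    """Orthonormalize the vectors in `us` with respect to `inner`.

    Step i: b~_i = u_i - sum_j <u_i, b_j> b_j, then b_i = b~_i / ||b~_i||,
    where ||x|| = sqrt(<x, x>) is the norm induced by the inner product.
    """
    bs = []
    for u in us:
        # Subtract the projection onto the span of the already-built b_j.
        b_tilde = u - sum(inner(u, b) * b for b in bs)
        bs.append(b_tilde / np.sqrt(inner(b_tilde, b_tilde)))
    return bs

us = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
bs = gram_schmidt(us, inner)

# Sanity check: the Gram matrix <b_i, b_j> should be the identity.
gram = np.array([[inner(bi, bj) for bj in bs] for bi in bs])
print(np.round(gram, 10))
```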

Cornell Notes

Gram–Schmidt orthonormalization converts a basis {u1,…,uk} of a k-dimensional inner-product subspace U into an orthonormal basis {b1,…,bk}. Each step constructs a new vector by removing from ui the components lying in the span of the already-built orthonormal vectors, using orthogonal projections. Because the earlier vectors are orthonormal, projections simplify to one-dimensional terms like ⟨ui, bj⟩ bj, and the orthogonal component becomes ui minus the sum of those projections. The final step at every stage normalizes the orthogonal component using the inner-product norm ||x|| = √⟨x,x⟩. The payoff is an orthonormal basis that makes later projection calculations straightforward.

Why does Gram–Schmidt require the norm ||x|| = √⟨x,x⟩ rather than a default Euclidean length?

In an abstract inner-product space, “length” is defined by the inner product itself. The induced norm is ||x|| = √⟨x,x⟩, and normalization must use that quantity. If a special inner product is used, the correct norm changes accordingly; using the standard Euclidean norm would generally be wrong.
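
For instance, under the hypothetical inner product ⟨x, y⟩ = 2x1y1 + x2y2 on R², the induced norm of (1, 0) is

```latex
\|(1,0)\| = \sqrt{\langle (1,0),\,(1,0) \rangle} = \sqrt{2 \cdot 1^2 + 0^2} = \sqrt{2},
```

whereas the Euclidean length is 1, so dividing by the Euclidean length would not produce a unit vector for this inner product.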

How is b2 constructed from u2 once b1 is already orthonormal?

With b1 orthonormal, the projection of u2 onto span{b1} is ⟨u2,b1⟩ b1. The orthogonal component is b̃2 = u2 − ⟨u2,b1⟩ b1, which is orthogonal to b1 by construction. Then b2 is obtained by normalizing: b2 = b̃2/||b̃2||.
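
The orthogonality claim can be verified directly (assuming a real inner product), using ⟨b1, b1⟩ = 1:

```latex
\langle \tilde{b}_2, b_1 \rangle
  = \langle u_2, b_1 \rangle - \langle u_2, b_1 \rangle \langle b_1, b_1 \rangle
  = \langle u_2, b_1 \rangle - \langle u_2, b_1 \rangle
  = 0.
```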

What changes when moving from b2 to b3?

Instead of projecting onto a one-dimensional span, the method projects onto the two-dimensional span of {b1,b2}. The projection of u3 onto that span is the sum of its one-dimensional projections: ⟨u3,b1⟩ b1 + ⟨u3,b2⟩ b2. Subtracting this from u3 gives the orthogonal component b̃3 = u3 − (⟨u3,b1⟩ b1 + ⟨u3,b2⟩ b2), followed by normalization to get b3.

What is the general projection formula used at the final step to build bk?

At step k, the already-built vectors {b1,…,b(k−1)} form an orthonormal basis for a (k−1)-dimensional subspace. The orthogonal projection of uk onto that subspace is Σ_{j=1}^{k−1} ⟨uk,bj⟩ bj. The orthogonal component is uk − Σ_{j=1}^{k−1} ⟨uk,bj⟩ bj, and normalizing it produces bk.
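
In compact form, the general step reads:

```latex
\tilde{b}_k = u_k - \sum_{j=1}^{k-1} \langle u_k, b_j \rangle \, b_j,
\qquad
b_k = \frac{\tilde{b}_k}{\|\tilde{b}_k\|},
\qquad
\|x\| = \sqrt{\langle x, x \rangle}.
```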

Why does having an orthonormal set make projection calculations easier?

When the basis vectors bj are orthonormal, each projection onto span{bj} becomes a simple scalar multiple ⟨u, bj⟩ bj. That turns the projection onto a larger span into a sum of these one-dimensional projections, avoiding more complicated coefficient-solving.
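
To make the contrast concrete, here is a hypothetical sketch continuing the running NumPy example: projecting onto the plane spanned by the original (non-orthonormal) u1, u2 requires solving the normal equations with the Gram matrix, whereas the orthonormal b1, b2 give the coefficients directly. The vector w and all variable names are illustrative.

```python
w = np.array([1.0, 2.0, 3.0])   # an arbitrary vector to project
v1, v2 = us[0], us[1]           # non-orthonormal basis of the plane
b1, b2 = bs[0], bs[1]           # orthonormal basis of the same plane

# General basis: solve G c = r with G_ij = <v_i, v_j>, r_i = <w, v_i>.
G = np.array([[inner(v1, v1), inner(v1, v2)],
              [inner(v2, v1), inner(v2, v2)]])
r = np.array([inner(w, v1), inner(w, v2)])
c = np.linalg.solve(G, r)
proj_general = c[0] * v1 + c[1] * v2

# Orthonormal basis: coefficients are plain inner products, no solve needed.
proj_onb = inner(w, b1) * b1 + inner(w, b2) * b2

print(np.allclose(proj_general, proj_onb))  # True: same projection
```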

Review Questions

  1. Given an inner product ⟨·,·⟩, what formula defines the norm used for normalization in Gram–Schmidt?
  2. In Gram–Schmidt, why does subtracting the orthogonal projection guarantee the new vector is orthogonal to the previously constructed vectors?
  3. Write the expression for the orthogonal projection of uk onto span{b1,…,b(k−1)} when {b1,…,b(k−1)} is orthonormal.

Key Points

  1. Gram–Schmidt orthonormalization converts a basis {u1,…,uk} of a k-dimensional inner-product subspace into an orthonormal basis {b1,…,bk}.

  2. Each new vector is formed by subtracting its orthogonal projection onto the span of the previously constructed orthonormal vectors.

  3. Projections simplify because previously constructed vectors are orthonormal: projecting u onto span{bj} yields ⟨u,bj⟩ bj.

  4. The orthogonal component is the “normal part”: ui minus the sum of its projections onto the earlier bj directions.

  5. Normalization must use the inner-product-induced norm ||x|| = √⟨x,x⟩, not any default Euclidean length.

  6. The process repeats inductively until bk, producing an orthonormal basis tailored to the space’s geometry.

Highlights

The algorithm builds orthonormal vectors one at a time: normalize u1, then repeatedly subtract projection components and normalize the remainder.
Orthogonal projection onto an orthonormal span becomes a clean sum of one-dimensional projections: Σ ⟨u,bj⟩ bj.
Correct normalization in abstract spaces depends on the inner product via ||x|| = √⟨x,x⟩.
At step k, the projection of uk onto span{b1,…,b(k−1)} is Σ_{j=1}^{k−1} ⟨uk,bj⟩ bj, and bk comes from normalizing the leftover orthogonal component.

Topics

Mentioned

  • ONB (orthonormal basis)