Three-dimensional linear transformations | Chapter 5, Essence of linear algebra
Based on 3Blue1Brown's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
A 3D linear transformation is completely determined by the images of the three standard basis vectors i-hat, j-hat, and k-hat.
Briefing
Linear transformations in three dimensions are fully determined by where they send the three standard basis vectors—so a 3D “grid-squishing” process can be encoded by just nine numbers in a 3×3 matrix. The key move is to imagine every point in space as the tip of a vector, then require that the transformation keeps grid lines parallel and evenly spaced while fixing the origin. Under those rules, the entire mapping from input vectors to output vectors is captured once the images of i-hat, j-hat, and k-hat are known.
In 3D, the standard basis vectors are the unit directions along the x, y, and z axes: i-hat, j-hat, and k-hat. Instead of tracking the full 3D grid (which quickly becomes visually cluttered), the transformation can be understood by following only these three axes. Record the coordinates of where i-hat lands, where j-hat lands, and where k-hat lands; then place those three coordinate triples as the columns of a 3×3 matrix. That matrix becomes a complete description of the linear transformation.
A concrete example makes the encoding feel tangible: a 90-degree rotation around the y-axis. Under this rotation, i-hat moves to the point (0, 0, −1), j-hat stays put at (0, 1, 0), and k-hat moves to (1, 0, 0). Using those three column vectors, the resulting 3×3 matrix stores the rotation in a compact form.
To apply the transformation to an arbitrary vector with coordinates (x, y, z), the process mirrors the 2D case. Each coordinate acts like an instruction for how much to scale the corresponding basis vector, and the scaled basis vectors add up to form the original vector. Crucially, this “scale-and-add” structure is compatible with the transformation: the same linear combination logic works before and after applying the matrix. Practically, that means multiplying the vector’s coordinates against the matrix’s columns and summing the results to get the output coordinates.
Matrix multiplication then takes on a geometric meaning: when two 3×3 matrices are multiplied, the right-hand matrix’s transformation happens first, followed by the left-hand matrix’s transformation. This ordering matters because the product represents composition of linear maps, not just number crunching. That composition viewpoint is especially important in computer graphics and robotics, where rotations in 3D are easier to understand and implement when they’re broken into successive, simpler rotations.
Finally, the discussion points toward the next topic—determinants—signaling that the next step will be about extracting deeper information from matrices beyond how they move basis vectors and compose transformations.
Cornell Notes
Three-dimensional linear transformations are determined entirely by how they move the standard basis vectors i-hat, j-hat, and k-hat. By recording the destination coordinates of these three vectors as the columns of a 3×3 matrix, the transformation is encoded using nine numbers. To transform any vector (x, y, z), use the same linear “scale-and-add” idea from 2D: scale the basis vectors by x, y, z, and combine them in a way consistent with the matrix. Matrix multiplication corresponds to composing transformations: for A·B, the transformation from B happens first, then A. This framework underpins practical tasks like rotations in computer graphics and robotics.
Why does knowing where i-hat, j-hat, and k-hat go fully determine a 3D linear transformation?
How does a 3×3 matrix encode a 3D transformation?
What does it mean to rotate 90 degrees around the y-axis in terms of basis vectors?
How do you compute the image of an arbitrary vector (x, y, z) using the matrix?
What is the geometric meaning of multiplying two matrices A·B?
Why are these matrix-composition ideas especially useful in computer graphics and robotics?
Review Questions
- Given a 3×3 matrix, how would you recover where it sends i-hat, j-hat, and k-hat?
- If a vector v is transformed by B and then by A, which product represents the combined transformation: A·B or B·A—and why?
- For a linear transformation, why does the scale-and-add structure guarantee that knowing the basis vectors is enough to determine the transformation everywhere?
Key Points
- 1
A 3D linear transformation is completely determined by the images of the three standard basis vectors i-hat, j-hat, and k-hat.
- 2
Placing the destination coordinates of i-hat, j-hat, and k-hat as the columns of a 3×3 matrix encodes the transformation using nine numbers.
- 3
To transform any vector (x, y, z), scale the matrix columns by x, y, z and add the results to get the output coordinates.
- 4
Matrix multiplication corresponds to composition of transformations: for A·B, B acts first, then A.
- 5
Rotations in 3D can be represented and combined efficiently by multiplying matrices, which supports workflows in computer graphics and robotics.
- 6
Understanding 3D transformations through basis vectors avoids the visual clutter of manipulating the entire 3D grid.
- 7
The next step after mastering composition and encoding is extracting additional information from matrices via determinants.