Geometry of determinants
Fri, Feb 07, 2025
Our main mission is to gain some level of understanding as to why determinants behave the way they do.
\[ \newcommand{\rowopswap}[2]{R_{#1}\leftrightarrow R_{#2}} \newcommand{\rowopmult}[2]{#1R_{#2}} \newcommand{\rowopadd}[3]{#1R_{#2}+R_{#3}} \newcommand{\elemswap}[2]{E_{#1,#2}} \newcommand{\elemmult}[2]{E_{#2}\left(#1\right)} \newcommand{\elemadd}[3]{E_{#2,#3}\left(#1\right)} \newcommand{\detname}[1]{\det\left(#1\right)} \newcommand{\submatrix}[3]{#1\left(#2|#3\right)} \newcommand{\matrixentry}[2]{\left\lbrack#1\right\rbrack_{#2}} \newcommand{\detbars}[1]{\left\lvert#1\right\rvert} \newcommand{\transpose}[1]{#1^{T}} \newcommand{\inverse}[1]{#1^{-1}} \]
Here are a couple of nice facts about determinants: the absolute value of the determinant of \(A\) is the factor by which the linear transformation \(\vec{x}\mapsto A\vec{x}\) scales area (or \(n\)-dimensional volume), and the sign of the determinant tells us whether the transformation preserves or reverses orientation.
A good way to understand the behavior of just about any kind of function is via the geometric action that it induces. In the case of a linear transformation, we might try to understand the degree to which it stretches out area or, more generally, \(n\)-dimensional volume.
Typically, this is done by envisioning the effect of the transformation on the unit square (or \(n\)-dimensional unit cube).
Press “Apply” to see the action induced on the unit square by the matrix \[\begin{bmatrix}4&2\\2&-2\end{bmatrix}.\]
Ultimately, we’ve got to figure out how this geometric behavior follows from the algebraic definition of the determinant.
Last time, we learned the somewhat crazy but common definition of the determinant, namely:
\[\begin{align*} \detname{A}&= \matrixentry{A}{11}\detname{\submatrix{A}{1}{1}} -\matrixentry{A}{12}\detname{\submatrix{A}{1}{2}} +\matrixentry{A}{13}\detname{\submatrix{A}{1}{3}}-\\ &\quad \matrixentry{A}{14}\detname{\submatrix{A}{1}{4}} +\cdots +(-1)^{n+1}\matrixentry{A}{1n}\detname{\submatrix{A}{1}{n}}\text{.} \end{align*}\]
The formula is an example of a Laplace expansion.
This is not the only way to compute a determinant, though.
Actually, you can expand about any row or column. That is, for any row \(i\), we have the row expansion
\[\begin{align*} \detname{A}&= (-1)^{i+1}\matrixentry{A}{i1}\detname{\submatrix{A}{i}{1}}+ (-1)^{i+2}\matrixentry{A}{i2}\detname{\submatrix{A}{i}{2}}\\ &\quad+(-1)^{i+3}\matrixentry{A}{i3}\detname{\submatrix{A}{i}{3}}+ \cdots+ (-1)^{i+n}\matrixentry{A}{in}\detname{\submatrix{A}{i}{n}} \end{align*}\]
and, for any column \(j\), the column expansion
\[\begin{align*} \detname{A}&= (-1)^{1+j}\matrixentry{A}{1j}\detname{\submatrix{A}{1}{j}}+ (-1)^{2+j}\matrixentry{A}{2j}\detname{\submatrix{A}{2}{j}}\\ &\quad+(-1)^{3+j}\matrixentry{A}{3j}\detname{\submatrix{A}{3}{j}}+ \cdots+ (-1)^{n+j}\matrixentry{A}{nj}\detname{\submatrix{A}{n}{j}} \end{align*}\]
For example, here’s the expansion about the fourth row of \[\begin{equation*} \tiny A=\begin{bmatrix} -2 & 3 & 0 & 1\\ 9 & -2 & 0 & 1\\ 1 & 3 & -2 & -1\\ 4 & 1 & 2 & 6 \end{bmatrix}\text{:} \end{equation*}\]
\[\begin{align*} \detbars{A} &= (4)(-1)^{4+1} \begin{vmatrix} 3 & 0 & 1\\ -2 & 0 & 1\\ 3 & -2 & -1 \end{vmatrix} +(1)(-1)^{4+2} \begin{vmatrix} -2 & 0 & 1\\ 9 & 0 & 1\\ 1 & -2 & -1 \end{vmatrix}\\ &\quad\quad+(2)(-1)^{4+3} \begin{vmatrix} -2 & 3 & 1\\ 9 & -2 & 1\\ 1 & 3 & -1 \end{vmatrix} +(6)(-1)^{4+4} \begin{vmatrix} -2 & 3 & 0 \\ 9 & -2 & 0 \\ 1 & 3 & -2 \end{vmatrix}\\ &= (-4)(10)+(1)(-22)+(-2)(61)+6(46)=92\text{.} \end{align*}\]
Expanding about the third column instead (taking advantage of the zeros there), we get the same value:
\[\begin{align*} \detbars{A} &= (0)(-1)^{1+3} \begin{vmatrix} 9 & -2 & 1\\ 1 & 3 & -1\\ 4 & 1 & 6 \end{vmatrix} + (0)(-1)^{2+3} \begin{vmatrix} -2 & 3 & 1\\ 1 & 3 & -1\\ 4 & 1 & 6 \end{vmatrix} +\\ &\quad\quad(-2)(-1)^{3+3} \begin{vmatrix} -2 & 3 & 1\\ 9 & -2 & 1\\ 4 & 1 & 6 \end{vmatrix} + (2)(-1)^{4+3} \begin{vmatrix} -2 & 3 & 1\\ 9 & -2 & 1\\ 1 & 3 & -1 \end{vmatrix}\\ &=0+0+(-2)(-107)+(-2)(61)=92\text{.} \end{align*}\]
It’s easy to check that these alternate expansions agree for all \(2\times2\) matrices (try it!).
The result extends to higher dimensions using induction.
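If it helps to see the recursion spelled out, here’s a minimal sketch in plain JavaScript (not one of this notebook’s interactive cells) of the expansion about the first row; the function name detLaplace is just made up for illustration.

```js
// A sketch of the Laplace expansion about the first row, applied recursively.
// The matrix A is given as an array of rows.
function detLaplace(A) {
  const n = A.length;
  if (n === 1) return A[0][0]; // base case: a 1x1 determinant is its single entry
  let det = 0;
  for (let j = 0; j < n; j++) {
    // Submatrix A(1|j+1): delete the first row and the (j+1)st column.
    const sub = A.slice(1).map((row) => row.filter((_, k) => k !== j));
    det += (j % 2 === 0 ? 1 : -1) * A[0][j] * detLaplace(sub);
  }
  return det;
}

detLaplace([
  [-2, 3, 0, 1],
  [9, -2, 0, 1],
  [1, 3, -2, -1],
  [4, 1, 2, 6],
]); // 92, agreeing with the expansions above
```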
Sometimes, the specific form of a matrix may make one row or column easier to expand about.
As we’ll see in our next example…
The determinant of a triangular matrix is always the product of the terms on the diagonal; you can see why by expanding along the first row or column.
\[\begin{align*} &\begin{vmatrix} 2 & 3 & -1 & 3 & 3\\ 0 & -1 & 5 & 2 & -1\\ 0 & 0 & 3 & 9 & 2\\ 0 & 0 & 0 & -1 & 3\\ 0 & 0 & 0 & 0 & 5 \end{vmatrix} =2(-1)^{1+1} \begin{vmatrix} -1 & 5 & 2 & -1\\ 0 & 3 & 9 & 2\\ 0 & 0 & -1 & 3\\ 0 & 0 & 0 & 5 \end{vmatrix} \\ &=2(-1)(-1)^{1+1} \begin{vmatrix} 3 & 9 & 2\\ 0 & -1 & 3\\ 0 & 0 & 5 \end{vmatrix} =2(-1)(3)(-1)^{1+1} \begin{vmatrix} -1 & 3\\ 0 & 5 \end{vmatrix}\\ &=2(-1)(3)(-1)(-1)^{1+1} \begin{vmatrix} 5 \end{vmatrix} =2(-1)(3)(-1)(5)=30 \end{align*}\]
The process of triangulation (row reduction to triangular form) followed by multiplication along the diagonal is actually a rather efficient way to compute the determinant of a large matrix.
The triangular form also connects the determinant to area much more directly.
Our next step, in fact, will be to use a special class of matrices to keep track of the row reduction process, which will help us better understand the connection between the determinant and area.
The elementary row operations, it turns out, affect the value of a determinant in predictable ways. In fact, swapping two rows changes the sign of the determinant, multiplying a row by a constant \(c\) multiplies the determinant by \(c\), and adding a multiple of one row to another leaves the determinant unchanged.
Thus, another way to compute a determinant is to row reduce the matrix to triangular form, keep track of how each operation changes the determinant, and multiply down the diagonal.
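Here’s how that procedure might look as a minimal sketch in plain JavaScript (again, not one of this notebook’s cells); the name detByElimination is made up for illustration, and the result agrees with the cofactor expansions above up to floating-point round-off.

```js
// A sketch of computing a determinant by row reduction to triangular form:
// row swaps flip the sign, row additions leave the determinant alone, and the
// determinant of the triangular result is the product of its diagonal entries.
function detByElimination(A) {
  const n = A.length;
  const M = A.map((row) => row.slice()); // work on a copy
  let det = 1;
  for (let j = 0; j < n; j++) {
    // Find a row at or below row j with a nonzero entry in column j.
    let pivot = j;
    while (pivot < n && M[pivot][j] === 0) pivot++;
    if (pivot === n) return 0; // no pivot available: the determinant is zero
    if (pivot !== j) {
      [M[j], M[pivot]] = [M[pivot], M[j]]; // row swap: flip the sign
      det = -det;
    }
    det *= M[j][j]; // a diagonal entry of the eventual triangular form
    for (let i = j + 1; i < n; i++) {
      // Add a multiple of row j to row i; this doesn't change the determinant.
      const factor = M[i][j] / M[j][j];
      for (let k = j; k < n; k++) M[i][k] -= factor * M[j][k];
    }
  }
  return det;
}

detByElimination([
  [-2, 3, 0, 1],
  [9, -2, 0, 1],
  [1, 3, -2, -1],
  [4, 1, 2, 6],
]); // 92 (up to round-off), as in the expansions above
```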
Each elementary row operation can be encoded by a so-called elementary matrix, obtained by applying that row operation to the identity matrix.
Furthermore, multiplying a matrix on the left by an elementary matrix has the same effect as applying the corresponding row operation directly!
Here’s an example illustrating the correspondence between elementary row operations and elementary matrices for a \(3\times4\) matrix.
\[\tiny A= \begin{bmatrix} 2 & 1 & 3 & 1\\ 1 & 3 & 2 & 4\\ 5 & 0 & 3 & 1 \end{bmatrix}\]
\[\begin{align*} \tiny \rowopswap{1}{3}:\ & \tiny \begin{bmatrix} 5 & 0 & 3 & 1\\ 1 & 3 & 2 & 4\\ 2 & 1 & 3 & 1 \end{bmatrix} & \tiny \elemswap{1}{3}:\ & \tiny \begin{bmatrix} 0 & 0 & 1\\ 0 & 1 & 0\\ 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 2 & 1 & 3 & 1\\ 1 & 3 & 2 & 4\\ 5 & 0 & 3 & 1 \end{bmatrix} = \begin{bmatrix} 5 & 0 & 3 & 1\\ 1 & 3 & 2 & 4\\ 2 & 1 & 3 & 1 \end{bmatrix}\\ \tiny \rowopmult{2}{2}:\ & \tiny \begin{bmatrix} 5 & 0 & 3 & 1\\ 2 & 6 & 4 & 8\\ 2 & 1 & 3 & 1 \end{bmatrix} & \tiny \elemmult{2}{2}:\ & \tiny \begin{bmatrix} 1 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 5 & 0 & 3 & 1\\ 1 & 3 & 2 & 4\\ 2 & 1 & 3 & 1 \end{bmatrix} = \begin{bmatrix} 5 & 0 & 3 & 1\\ 2 & 6 & 4 & 8\\ 2 & 1 & 3 & 1 \end{bmatrix}\\ \tiny \rowopadd{2}{3}{1}:\ & \tiny \begin{bmatrix} 9 & 2 & 9 & 3\\ 2 & 6 & 4 & 8\\ 2 & 1 & 3 & 1 \end{bmatrix} & \tiny \elemadd{2}{3}{1}:\ & \tiny \begin{bmatrix} 1 & 0 & 2\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 5 & 0 & 3 & 1\\ 2 & 6 & 4 & 8\\ 2 & 1 & 3 & 1 \end{bmatrix} = \begin{bmatrix} 9 & 2 & 9 & 3\\ 2 & 6 & 4 & 8\\ 2 & 1 & 3 & 1 \end{bmatrix} \end{align*}\]
Adding a constant times one row to another simply skews the picture, which preserves the area.
viewof add_step = Inputs.button(tex`\text{Add }2R_2 + R_1`);
add_mat = {
  // Show the identity before the first click; afterwards, show the matrix
  // stored for the shear step in add_pic.
  const step = add_step % 2;
  if (step == 0) {
    return tex.block`\begin{bmatrix}1&0\\0&1\end{bmatrix}`;
  }
  return tex.block`${math
    .parse(math.matrix(add_pic.data.steps[0].emInv).toString())
    .toTex()}`;
}
add_op = {
  // Describe the most recent row operation, or note the initial configuration.
  const step = add_step % add_pic.data.steps.length;
  if (step == 0) {
    return md`Initial configuration`;
  } else {
    return md`${add_pic.data.steps[step - 1].opInv}`;
  }
}
Swapping rows preserves area but changes orientation.
viewof swap_step = Inputs.button("Swap rows");
swap_mat = {
  const step = swap_step % swap_pic.data.steps.length;
  return tex.block`${math
    .parse(math.matrix(swap_pic.data.steps[step].em).toString())
    .toTex()}`;
}
swap_op = {
  const step = swap_step % swap_pic.data.steps.length;
  if (step == 0) {
    return md`Initial configuration`;
  } else {
    return md`${swap_pic.data.steps[step - 1].opInv}`;
  }
}
Multiplying a row by a constant scales the area by that same factor (in absolute value).
viewof mult_step = Inputs.button(tex`\text{One row }\times2`);
mult_mat = {
  const step = mult_step % 2;
  if (step == 0) {
    return tex.block`\begin{bmatrix}1&0\\0&1\end{bmatrix}`;
  }
  return tex.block`${math
    .parse(math.matrix(mult_pic.data.steps[0].emInv).toString())
    .toTex()}`;
}
mult_op = {
  const step = mult_step % mult_pic.data.steps.length;
  if (step == 0) {
    return md`Initial configuration`;
  } else {
    return md`${mult_pic.data.steps[step - 1].opInv}`;
  }
}
Let’s think about the fact that a row swap changes the sign of a determinant.
First, it’s super easy to see for \(2\times2\) determinants:
\[ \begin{vmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{vmatrix} = a_{11}a_{22} - a_{21}a_{12} \] and \[ \begin{vmatrix}a_{21}&a_{22}\\a_{11}&a_{12}\end{vmatrix} = a_{21}a_{12} - a_{11}a_{22}. \]
Typically, these proofs start with the \(2\times 2\) case and then extend to higher dimensions via induction: for a larger matrix, expand about a row not involved in the swap; each cofactor is then a smaller determinant with the same two rows swapped, so by induction each term changes sign, and therefore so does the whole determinant.
What happens when the matrix \(A\) is singular?
In this case, when we row reduce \(A\), we no longer get the identity. Thus, \(A\vec{x}=\vec{0}\) has infinitely many solutions and the null space has positive dimension.
As a result, the column space of \(A\) (which is the range of the associated linear transformation) cannot be all of \(\mathbb R^n\). Thus, the matrix cannot be invertible.
This should all be reflected in the geometric behavior of the linear transformation.
Here’s a look at the geometric effect of multiplication by the matrix \[ A = \begin{bmatrix}2&4\\1&2\end{bmatrix}. \]
The “squishing” of the two-dimensional space into one is exactly why the range cannot be all of \(\mathbb R^2\).
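Concretely, the second column of \(A\) is twice the first, so both columns land on the same line through the origin, and \[\detname{A}=\begin{vmatrix}2&4\\1&2\end{vmatrix}=(2)(2)-(4)(1)=0,\] which is consistent with the image of the unit square having zero area.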
It’s often easy to see when a small matrix is singular. A matrix is certainly singular if it has a row or column of all zeros, or if one row or column is a multiple of another, as in the example above, where the second column is twice the first.
Finally, I feel it’s time to mention transposes. It’s easy: if \(A\) is an \(m\times n\) matrix, then the transpose of \(A\) is the matrix \(A^T\) satisfying \[[A^T]_{ij} = [A]_{ji}.\] For example, \[ \begin{bmatrix} 1&2&3\\4&5&6 \end{bmatrix}^T = \begin{bmatrix} 1&4\\2&5\\3&6 \end{bmatrix}. \]
The transpose will prove quite useful when we do algebraic manipulations with the dot product starting next time, so let’s investigate some of its properties. In what follows, we’ll assume that \(A\) and \(B\) are sized so that the operations are defined.
In particular, \(\transpose{(\transpose{A})}=A\), \(\transpose{(A+B)}=\transpose{A}+\transpose{B}\), \(\transpose{(cA)}=c\transpose{A}\), and \(\detname{\transpose{A}}=\detname{A}\). The first three seem quite obvious, and the last seems at least believable since you can expand a determinant about the first row or the first column.
The transpose also interacts nicely with matrix multiplication and with the inverse: \(\transpose{(AB)}=\transpose{B}\transpose{A}\), \(\inverse{(AB)}=\inverse{B}\inverse{A}\), and \(\inverse{(\transpose{A})}=\transpose{(\inverse{A})}\).
These are perhaps a bit more mysterious and deserve a closer look.
Note that, if \(A\) is \(m\times n\), then \(B\) must be \(n\times p\) for the product \(AB\) to make sense, in which case both sides of \(\transpose{(AB)}=\transpose{B}\transpose{A}\) are at least well defined. We can show that they’re equal by comparing their entries:
\[\begin{align*} \matrixentry{\transpose{(AB)}}{ji} &=\matrixentry{AB}{ij} =\sum_{k=1}^{n}\matrixentry{A}{ik}\matrixentry{B}{kj} =\sum_{k=1}^{n}\matrixentry{B}{kj}\matrixentry{A}{ik} \\ &=\sum_{k=1}^{n}\matrixentry{\transpose{B}}{jk}\matrixentry{\transpose{A}}{ki} =\matrixentry{\transpose{B}\transpose{A}}{ji} \end{align*}\]
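If you’d like a concrete sanity check of this identity, here’s a small snippet of plain JavaScript (not one of the notebook’s cells) with made-up helper names transpose and multiply:

```js
// Check transpose(A B) = transpose(B) transpose(A) on a small example.
const transpose = (M) => M[0].map((_, j) => M.map((row) => row[j]));
const multiply = (A, B) =>
  A.map((row) => B[0].map((_, j) => row.reduce((s, a, k) => s + a * B[k][j], 0)));

const A = [[1, 2, 3], [4, 5, 6]];    // 2x3
const B = [[1, 0], [2, -1], [0, 3]]; // 3x2

JSON.stringify(transpose(multiply(A, B))) ===
  JSON.stringify(multiply(transpose(B), transpose(A))); // true
```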
To talk about the inverse of a product, \(A\) and \(B\) must both be square, of the same size, and invertible. Assuming so, note that \[(AB)(\inverse{B}\inverse{A}) = A(B\inverse{B})\inverse{A} = AI\inverse{A} =A\inverse{A}=I.\] Thus, \(\inverse{B}\inverse{A}\) satisfies the definition of \(\inverse{(AB)}\).
To show that \(\inverse{(\transpose{A})}=\transpose{(\inverse{A})}\), it suffices to show that \(\transpose{(\inverse{A})}\) satisfies the definition of the inverse of \(\transpose{A}\). We can use the product rule for transposes to see this: \[ \transpose{A}\transpose{(\inverse{A})} = \transpose{(\inverse{A}A)} = \transpose{I} = I. \]
Finally, it’s worth mentioning that any matrix that satisfies \[\transpose{A} = A\] is called symmetric. The identity matrix is symmetric, for example.
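As a quick application of the product rule for transposes, note that \(A\transpose{A}\) is symmetric for any matrix \(A\): \[\transpose{(A\transpose{A})} = \transpose{(\transpose{A})}\transpose{A} = A\transpose{A}.\]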
Comments on the example
The vectors \(\vec{\imath}=\langle 1,0 \rangle\) and \(\vec{\jmath}=\langle 0,1 \rangle\) map to the column vectors \(\langle 4,2 \rangle\) and \(\langle 2,-2 \rangle\) of the matrix.
The resulting area is clearly much larger. In fact, the value of the determinant is \(-12\) so the area of the result should be \(12\).
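Indeed, \[\detname{A}=\begin{vmatrix}4&2\\2&-2\end{vmatrix}=(4)(-2)-(2)(2)=-12,\] so the parallelogram spanned by \(\langle 4,2\rangle\) and \(\langle 2,-2\rangle\) has area \(12\).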
The minus sign indicates that there’s a flip in the transformation, which is clear to see.