Determinants and elementary matrices
The determinant is a function that accepts a square matrix and returns a real number that conveys a wealth of information about the matrix and the linear transformation that it defines. Determinants can be characterized either geometrically or algebraically, and it’s the relationship between those two perspectives that makes the determinant such a useful concept.
The basics of determinants
Let’s talk about determinants from an intuitive perspective first. To begin, the determinant should be defined for any square matrix and should be a real number. That is, if \(n\in\mathbb N\) and \(A\in\mathbb R^{n\times n}\), then its determinant, denoted \[\det(A) \text{ or }|A|,\] is a well-defined element of \(\mathbb R\).
So, what does it tell us?
Geometric interpretation
To be precise, the absolute value of the determinant of the matrix \(A\) represents the generalized volume of the image of the \(n\)-dimensional unit cube under the linear transformation defined by \(A\). The sign of the determinant is positive when that linear transformation preserves orientation and negative when it reverses orientation. Since a linear transformation acts uniformly throughout \(\mathbb R^n\), this same interpretation is true for any solid object in \(\mathbb R^n\).
We’re going to illustrate this with some pictures that look like so:
Note that the sides of the rectangle emanating from the origin are colored red and blue. If the red side is your thumb, then this could be the back of your left hand. A change in orientation would switch those so that we’re looking at the front of your left hand.
None of this is really useful, though, unless we can compute it.
Algebraic definition
Matrices look like
\[ A_1 = [a_{11}] \text{ or } A_2 = \begin{bmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{bmatrix} \text{ or } A_3=\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\ a_{31}&a_{32}&a_{33}\end{bmatrix} \] or really any dimension, \[ A_n = \begin{bmatrix} a_{11}&a_{12}&\cdots&a_{1n} \\ a_{21}&a_{22}&\cdots&a_{2n} \\ \vdots&\vdots&\ddots&\vdots \\ a_{n1}&a_{n2}&\cdots&a_{nn} \end{bmatrix}. \]
We’ll focus first on the intuition for lower dimensional matrices.
Determinants for \(1\times1\) matrices
A \(1\times1\) matrix looks like \([a_{11}]\), a one-dimensional vector looks like \([x_1]\), and “matrix” multiplication results in \[ [a_{11}][x_1] = [a_{11} x_1]. \] This means that linear functions on \(\mathbb R^1\) correspond directly with the linear functions \(f(x)=ax\) that we learn about in pre-calculus. In particular,
- \(f\) stretches or compresses the line by the factor \(|a|\),
- \(f\) reverses orientation exactly when \(a<0\), and
- \(f\) is one-to-one (and, therefore, invertible) as long as \(a\neq0\).
Given the desired geometric properties of the determinant, it makes sense to define \[ \det\left([a_{11}]\right) = a_{11}. \]
Determinants for \(2\times2\) matrices
The determinant of a \(2\times2\) matrix \[ A = \begin{bmatrix} a&b\\c&d \end{bmatrix} \] can be computed by \[\det(A) = ad-bc\] and characterized as the area of the parallelogram spanned by the columns of \(A\). We’ll explain shortly how this formula arises.
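The \(ad-bc\) formula translates directly into code. Here’s a minimal sketch, where the helper name `det2` is our own:

```python
# A minimal sketch of the 2x2 determinant formula det(A) = ad - bc.
# The function name det2 is our own, chosen for illustration.

def det2(A):
    """Determinant of a 2x2 matrix given as a list of rows [[a, b], [c, d]]."""
    (a, b), (c, d) = A
    return a * d - b * c

A = [[3, 1],
     [1, 2]]
print(det2(A))  # 3*2 - 1*1 = 5
```

The absolute value, \(|{\det(A)}| = 5\), is the area of the parallelogram spanned by the columns \([3,1]^T\) and \([1,2]^T\).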
Determinants for \(3\times3\) matrices
Determinants of \(3\times3\) matrices can be computed in terms of determinants of \(2\times2\) matrices using a technique called cofactor expansion. This technique generalizes so that we can define determinants of \(4\times4\) matrices in terms of \(3\times3\) matrices and then \(5\times 5\), etc.
Row operations and determinants
The elementary row operations affect the determinant in predictable ways. If we apply those operations to place a matrix into its reduced row echelon form and keep track of which operations we used, we can then determine the value of the determinant. In fact, we don’t even need to go that far; we need only take it to triangular form.
In this section we’ll see how these ideas apply in the \(2\times2\) case.
Elementary matrices
The elementary matrices are exactly those matrices obtained by applying an elementary row operation to the identity matrix. For example,
\[ \begin{bmatrix} 1&0\\0&1 \end{bmatrix} \xrightarrow{\,-2R_1+R_2 \to R_2\,} \begin{bmatrix}1&0\\-2&1\end{bmatrix} \]
In addition, multiplication by an elementary matrix performs that operation! For example, \[ \begin{bmatrix}1&0\\-2&1\end{bmatrix} \begin{bmatrix}a&b\\c&d\end{bmatrix} = \begin{bmatrix}a&b\\-2a+c&-2b+d\end{bmatrix}. \]
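We can verify this numerically. The sketch below (with our own helper name `matmul`) checks that left-multiplying by the elementary matrix for \(-2R_1+R_2 \to R_2\) performs exactly that row operation:

```python
# Check that left-multiplication by an elementary matrix performs the
# corresponding row operation. The helper name matmul is our own.

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

E = [[1, 0],
     [-2, 1]]          # elementary matrix for -2*R1 + R2 -> R2
A = [[5, 7],
     [11, 13]]
print(matmul(E, A))    # [[5, 7], [-2*5 + 11, -2*7 + 13]] = [[5, 7], [1, -1]]
```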
As a result, the nonsingular matrices are exactly those matrices that can be expressed as a product of elementary matrices. Furthermore, it’s not hard to show that \[ \det(E_1E_2) = \det(E_1)\det(E_2) \] for elementary matrices \(E_1\) and \(E_2\). Thus, this result extends to all nonsingular matrices.
Elementary matrices and determinants
Before the exam, we explored the ways in which a linear transformation behaves geometrically. Now, we’re going to do something similar to illustrate the effects of the elementary matrices.
\(R_i \to a R_i\)
Consider the elementary matrix obtained by multiplying a row of the identity by the positive number \(a\): \[\begin{bmatrix}a&0\\0&1\end{bmatrix}.\] If we multiply this matrix by every point in the unit square, we’ll stretch or squish the square by that factor. If \(a=2\), this looks like so:
In this case, the value of the determinant should be exactly \(a\).
If \(a\) is negative, we change the orientation of the image and the sign of the determinant as well as (potentially) the magnitude. Here’s what that looks like when \(a=-1/2\).
\[\begin{bmatrix}-1/2&0\\0&1\end{bmatrix}\]
What if \(a=0\)?
Multiplying a row by zero changes the value of the determinant to zero, as expected. That’s not an elementary row operation, though, since it destroys information about the system and does not preserve the solution. In the current context, it doesn’t help us understand the determinant.
\(R_i \leftrightarrow R_j\)
Swapping two rows in the identity matrix also induces a change of orientation and sign \[ \begin{bmatrix} 1&0\\0&1 \end{bmatrix} \xrightarrow{\,R_1 \leftrightarrow R_2\,} \begin{bmatrix}0&1\\1&0\end{bmatrix} \]
Note this is consistent with our proposed formula \(\det(A) = ad-bc\) for the determinant of a \(2\times2\) matrix: the swap matrix above has determinant \(0\cdot0 - 1\cdot1 = -1\).
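A quick numerical check of that consistency, using our own `det2` helper for the \(ad-bc\) formula: the swap matrix has determinant \(-1\), and swapping the rows of any \(2\times2\) matrix flips the sign of its determinant.

```python
# The swap matrix reverses orientation, and a row swap negates the
# determinant. det2 is our own name for the 2x2 formula ad - bc.

def det2(A):
    (a, b), (c, d) = A
    return a * d - b * c

S = [[0, 1],
     [1, 0]]                   # R1 <-> R2 applied to the identity
A = [[3, 1],
     [1, 2]]
print(det2(S))                 # -1: orientation reversed
print(det2(A))                 #  5
print(det2([A[1], A[0]]))      # -5: swapping rows flips the sign
```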
\(aR_j+R_i\to R_i\)
Adding a constant multiple of one row to another preserves the area and should preserve the value of the determinant. For example:
\[ \begin{bmatrix} 1&0\\0&1 \end{bmatrix} \xrightarrow{\,-2R_2+R_1\to R_1\,} \begin{bmatrix}1&-2\\0&1\end{bmatrix} \]
To understand the result, recall that every time we apply matrix multiplication to our image of the unit square, the legs of the image are exactly the columns of the matrix.
Deriving the \(2\times2\) formula
Now, let’s use the elementary matrices to derive the formula \[ \det(A) = ad-bc, \text{ for } A = \begin{bmatrix}a&b\\c&d\end{bmatrix}. \]
Recall that our expectation is that
- \(|\det(A)|\) will tell us the area of the parallelogram spanned by the columns of \(A\) and
- \(\text{sign}(\det(A))\) will tell us whether the orientation has been preserved.
We’re going to do this by applying the elementary row operations to reduce it to a matrix whose determinant is easy to tell from the form of that matrix. In addition, we’ll keep track of those operations so that we can tell how to compute the value of the determinant of our original from the easy target matrix.
Easy form
First off, let’s note the determinant of any diagonal matrix is exactly the product of the terms on the diagonal. Consider, for example, the matrix \[\begin{bmatrix}3&0\\0&-1/2\end{bmatrix}.\] The geometric effect here is to stretch by the factor \(3\) horizontally and to squish by the factor \(1/2\) vertically, together with a reflection about the \(x\)-axis.
Thus, the value of the determinant here should be \(-3/2\), which agrees with the product of the terms on the diagonal, as advertised.
Note that if we reflect in both directions, the orientation of the image is ultimately preserved. If we modify the previous example to \[\begin{bmatrix}-3&0\\0&-1/2\end{bmatrix},\] then the resulting image looks like so:
The value of the determinant should be \(+3/2\), which is again the product of the terms on the diagonal.
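Both diagonal examples can be checked against the \(ad-bc\) formula (here via our own `det2` helper), since for a diagonal matrix the cross terms vanish:

```python
# For a diagonal 2x2 matrix, ad - bc reduces to the product of the
# diagonal entries. det2 is our own name for the 2x2 formula.

def det2(A):
    (a, b), (c, d) = A
    return a * d - b * c

print(det2([[3, 0], [0, -0.5]]))    # -1.5: one reflection reverses orientation
print(det2([[-3, 0], [0, -0.5]]))   #  1.5: two reflections preserve it
```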
Reduction to the easy form
OK, let’s use the elementary row operations to reduce the general \(2\times2\) matrix to diagonal form; i.e. we’ll reduce \[ \begin{bmatrix}a&b\\c&d\end{bmatrix} \text{ to } \begin{bmatrix}\alpha&0\\0&\delta\end{bmatrix}. \]
Assuming that \(a\neq0\), we can do this in just two steps:
\[ \begin{aligned} \begin{bmatrix}a&b\\c&d\end{bmatrix} &\xrightarrow{\,-\frac{c}{a}R_1+R_2\to R_2\,} \begin{bmatrix}a&b\\0&-\frac{c}{a}b + d\end{bmatrix} \\ &\xrightarrow{\,-\frac{b}{-\frac{c}{a}b + d}R_2+R_1\to R_1\,} \begin{bmatrix}a&0\\0&-\frac{c}{a}b + d\end{bmatrix} \end{aligned} \]
The determinant of this last matrix can be determined by simply multiplying the terms on the diagonal.
Note that neither step changes the value of the determinant; thus, the determinant of the original matrix is exactly the determinant of the target, which is the product of the terms on the diagonal: \[ a\left(-\frac{c}{a}b + d\right) = ad-bc. \] If \(a=0\) but \(c\neq0\), then we can first switch the rows to get \[ \begin{aligned} \begin{bmatrix}c&d\\a&b\end{bmatrix} &\xrightarrow{\,-\frac{a}{c}R_1+R_2\to R_2\,} \begin{bmatrix}c&d\\0&-\frac{a}{c}d + b\end{bmatrix} \\ &\xrightarrow{\,-\frac{d}{-\frac{a}{c}d + b}R_2+R_1\to R_1\,} \begin{bmatrix}c&0\\0&-\frac{a}{c}d + b\end{bmatrix} \end{aligned} \] Now, the determinant of that last matrix is \[ c\left(-\frac{a}{c}d + b\right) = bc-ad = -(ad-bc). \] This is the negative of the value that we expect for the determinant of the original matrix, which makes sense since we swapped the rows.
Finally, if \(a=c=0\), then the first column of \(A\) is the zero vector, so the image of the unit square collapses onto the line spanned by the second column \(\begin{bmatrix}b\\d\end{bmatrix}\), and the value of the determinant must be zero. This agrees with the formula \(ad-bc\).
Easier form
It’s worth noting that the last step in the reduction to diagonal form was not really necessary, if our objective is simply to find an easy way to compute the determinant. The reason is that this step doesn’t change any of the diagonal entries, so it doesn’t change the result of the computation. Thus, it’s sufficient to put the matrix into triangular form and then multiply the terms on the diagonal.
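This triangular-form procedure works for any \(n\times n\) matrix: row additions leave the determinant alone, each row swap flips its sign, and a column with no available pivot forces the determinant to zero. Here’s a sketch, with our own function name `det_by_reduction`:

```python
# Compute a determinant by reducing to upper triangular form:
# row additions preserve the determinant, each row swap flips the sign,
# and the result is the signed product of the diagonal.
# The name det_by_reduction is our own.

def det_by_reduction(A):
    A = [row[:] for row in A]       # work on a copy
    n = len(A)
    sign = 1
    for j in range(n):
        # find a row at or below position j with a nonzero entry in column j
        pivot = next((i for i in range(j, n) if A[i][j] != 0), None)
        if pivot is None:
            return 0                # no pivot available: determinant is 0
        if pivot != j:
            A[j], A[pivot] = A[pivot], A[j]
            sign = -sign            # each swap flips the sign
        # zero out the entries below the pivot (preserves the determinant)
        for i in range(j + 1, n):
            m = A[i][j] / A[j][j]
            A[i] = [A[i][k] - m * A[j][k] for k in range(n)]
    prod = sign
    for j in range(n):
        prod *= A[j][j]
    return prod

print(det_by_reduction([[1, 2], [3, 4]]))   # 1*4 - 2*3 = -2.0
```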
Higher dimensions
There are a couple of observations that help us extend these ideas to higher dimensions.
Row ops are 2D
The elementary row operations affect just two rows at a time, so their geometric effect in \(\mathbb R^n\) is the same as in the two-dimensional case: scaling a row scales the generalized volume, swapping two rows reverses orientation, and adding a multiple of one row to another changes nothing. Thus, we can perform the same row operations to compute the determinant.
Here’s an example illustrating the effect of the elementary operation \(2R_2+R_3 \to R_3\) to get the 3D elementary matrix \[ \begin{bmatrix} 1&0&0 \\ 0&1&0 \\ 0&2&1 \end{bmatrix}. \]
Recursive definition
Suppose we have a high-dimensional matrix: \[ A = \begin{bmatrix} a_{11}&a_{12}&\cdots&a_{1n} \\ a_{21}&a_{22}&\cdots&a_{2n} \\ \vdots&\vdots&\ddots&\vdots \\ a_{n1}&a_{n2}&\cdots&a_{nn} \end{bmatrix}. \] Assuming \(a_{11}\neq0\), we zero out the first column below \(a_{11}\) using only row-addition operations (which don’t change the determinant) to get \[ \begin{bmatrix} a_{11}&a_{12}&\cdots&a_{1n} \\ 0&a_{22}'&\cdots&a_{2n}' \\ \vdots&\vdots&\ddots&\vdots \\ 0&a_{n2}'&\cdots&a_{nn}' \end{bmatrix} = \begin{bmatrix} a_{11} & * \\ 0 & B \end{bmatrix}. \] Now, if we continue the row reduction towards triangular form, then we should find ourselves triangulating \(B\). Thus, \[\det(A) = a_{11}\det(B).\] As a result, we can express the determinant of an \(n\times n\) matrix in terms of the determinant of an \((n-1)\times(n-1)\) matrix. The details of this lead to the so-called cofactor expansion formulae.
\[ \det(A) = \sum_{j=1}^n (-1)^{i+j}\, a_{ij}\, \det(M_{ij}) = \sum_{i=1}^n (-1)^{i+j}\, a_{ij}\, \det(M_{ij}). \]
In this formula, \(M_{ij}\) refers to the so-called \((n-1)\times(n-1)\) minor matrix obtained by deleting the \(i^{\text{th}}\) row and \(j^{\text{th}}\) column of \(A\). The first sum expands along a fixed row \(i\); the second expands along a fixed column \(j\). Either choice yields the same value.
Examples
For example, expanding along the first row:
\[ \det\begin{bmatrix}2&0&1\\1&3&2\\1&1&2\end{bmatrix} = 2\det\begin{bmatrix}3&2\\1&2\end{bmatrix} - 0\det\begin{bmatrix}1&2\\1&2\end{bmatrix} + 1\det\begin{bmatrix}1&3\\1&1\end{bmatrix} = 2(4) - 0 + 1(-2) = 6. \]
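Cofactor expansion along the first row also translates directly into a short recursive program. This is a sketch with our own helper names (`det_cofactor`, `minor`); note that the indices are 0-based in the code, so the sign \((-1)^{i+j}\) becomes \((-1)^j\) along the top row:

```python
# Recursive determinant via cofactor expansion along the first row.
# The names det_cofactor and minor are our own.

def minor(A, i, j):
    """The submatrix obtained by deleting row i and column j (0-indexed)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det_cofactor(A):
    n = len(A)
    if n == 1:
        return A[0][0]              # base case: det([a]) = a
    return sum((-1) ** j * A[0][j] * det_cofactor(minor(A, 0, j))
               for j in range(n))

A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 2]]
print(det_cofactor(A))
# expanding along the first row:
#   2*det([[3,2],[1,2]]) - 0*det([[1,2],[1,2]]) + 1*det([[1,3],[1,1]])
# = 2*4 - 0 + 1*(-2) = 6
```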
Theoretical implications
- A matrix \(A\) is nonsingular if and only if \(\det(A) \neq 0\)
- \(\det(AB) = \det(A)\det(B)\)
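A numerical spot check of both facts in the \(2\times2\) case, using our own helper names (`det2`, `matmul2`): a matrix whose rows are linearly dependent has determinant zero, and the determinant of a product equals the product of the determinants.

```python
# Spot-checking the two theoretical properties for 2x2 matrices.
# det2 and matmul2 are our own helper names.

def det2(A):
    (a, b), (c, d) = A
    return a * d - b * c

def matmul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 1], [4, 2]]        # second row is a multiple of the first
print(det2(A))              # 0: A is singular

B = [[1, 2], [3, 4]]
C = [[5, 6], [7, 8]]
print(det2(matmul2(B, C)))  # 4: equals det2(B) * det2(C)
print(det2(B) * det2(C))    # 4: (-2) * (-2)
```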