Mon, Mar 30, 2026
Just before our last exam, we learned about eigenvalues and eigenvectors of \(n\times n\) matrices and how they help us identify various subspaces of \(\mathbb R^n\) that are invariant under the action of the matrix. Today, we’re going to discuss how that allows us to express matrices in various forms, depending upon the basis that we use to describe \(\mathbb R^n\). The simplest such form is a diagonal matrix.
\[ \newcommand{\vect}[1]{\mathbf{#1}} \newcommand{\inverse}[1]{#1^{-1}} \newcommand{\similar}[2]{\inverse{#2}#1#2} \]
Recall that \(T\) generally denotes a linear transformation mapping \(\mathbb R^n \to \mathbb R^n\) and that \(T\) has some matrix representation, which we write \[T(\vect{x}) = A\vect{x}.\]
We’ve defined a vector \(\vect{x}\) to be an eigenvector of \(T\) with eigenvalue \(\lambda\) if \[T(\vect{x}) = A\vect{x} = \lambda \vect{x}.\]
A real eigenvector spans a one-dimensional subspace that is invariant under the action of the matrix.
A collection of real eigenvectors with the same eigenvalue forms a (potentially) multi-dimensional subspace that is invariant under the action of the matrix.
Complex eigenvalues and eigenvectors are more complicated but it turns out that they lead to a transformation involving rotation within a two-dimensional subspace of \(\mathbb R^n\); that transformation might also involve scaling.
We saw quite a few examples of eigenvalue/eigenvector pairs a couple of weeks ago. Let’s revisit two very simple examples that really get at the heart of the issue that’s relevant for us here today.
The action of \[A = \begin{bmatrix}2&0\\0&1/2\end{bmatrix}\] preserves the subspace of \(\mathbb R^2\) spanned by \([ 1,0 ]^T\) and the subspace spanned by \([ 0,1 ]^T\).
The action of \[B = \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix}\] preserves the subspace of \(\mathbb R^2\) spanned by \([ 1,1 ]^T\) and the subspace spanned by \([ -2,1 ]^T\).
In both examples, the first subspace has eigenvalue \(2\) and the second has eigenvalue \(1/2\).
Recall that checking whether a real number \(\lambda\) and a vector \(\vect{x}\) form an eigenvalue/eigenvector pair is generally easy: just compute. For the second example, we have \[ \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix} \begin{bmatrix}1\\1\end{bmatrix} = \begin{bmatrix}1+1\\1/2+3/2\end{bmatrix} = \begin{bmatrix}2\\2\end{bmatrix} = 2\begin{bmatrix}1\\1\end{bmatrix} \] and \[ \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix} \begin{bmatrix}-2\\1\end{bmatrix} = \begin{bmatrix}-2+1\\-1+3/2\end{bmatrix} = \begin{bmatrix}-1\\1/2\end{bmatrix} = \frac{1}{2}\begin{bmatrix}-2\\1\end{bmatrix} \]
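The same check is easy to run numerically. Here is a minimal sketch, assuming NumPy is available; the helper name `is_eigenpair` is made up for illustration.

```python
import numpy as np

def is_eigenpair(M, lam, x):
    """Return True if x is nonzero and M @ x equals lam * x up to rounding."""
    x = np.asarray(x, dtype=float)
    return bool(np.any(x != 0) and np.allclose(M @ x, lam * x))

B = np.array([[1.0, 1.0],
              [0.5, 1.5]])

print(is_eigenpair(B, 2.0, [1, 1]))    # the pair checked by hand above
print(is_eigenpair(B, 0.5, [-2, 1]))   # the second pair checked by hand above
print(is_eigenpair(B, 2.0, [1, 0]))    # [1, 0] is not an eigenvector of B
```

The comparison uses `np.allclose` rather than exact equality because floating point products rarely match exactly.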
The matrices \(A\) and \(B\) act on \(\mathbb R^2\) in a similar way, though along different directions. By linearity, it appears that the rest of the space moves along with the eigenvectors in a similar fashion.
We might say that the two matrices are similar.
The action of \(A\) is more transparent, though, since \(A\) is diagonal.
Definition: We say that a matrix \(A\) is similar to the matrix \(B\) if there is a non-singular matrix \(S\) such that \[ A = SBS^{-1}. \]
When \(A\) is similar to \(B\), then \(B\) is also similar to \(A\). Thus, we might refer to \(A\) and \(B\) as similar matrices.
Let \(A\) and \(S\) be the matrices
\[\begin{align*} A=\begin{bmatrix} -13 & -8 & -4 \\ 12 & 7 & 4 \\ 24 & 16 & 7 \end{bmatrix}&& S=\begin{bmatrix} 1 & 1 & 2 \\ -2 & -1 & -3 \\ 1 & -2 & 0 \end{bmatrix}\text{.} \end{align*}\]
You can check easily enough that \[ S^{-1} = \begin{bmatrix} -6 & -4 & -1 \\ -3 & -2 & -1 \\ 5 & 3 & 1 \end{bmatrix} \]
As it turns out, \[ \begin{align*} \similar{A}{S}&= \begin{bmatrix} -6 & -4 & -1 \\ -3 & -2 & -1 \\ 5 & 3 & 1 \end{bmatrix} \begin{bmatrix} -13 & -8 & -4 \\ 12 & 7 & 4 \\ 24 & 16 & 7 \end{bmatrix} \begin{bmatrix} 1 & 1 & 2 \\ -2 & -1 & -3 \\ 1 & -2 & 0 \end{bmatrix}\\ &= \begin{bmatrix} -1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -1 \end{bmatrix}\text{.} \end{align*} \] Thus, that last matrix is similar to \(A\).
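If you’d rather not multiply three matrices by hand, the computation can be checked in a few lines. This is a sketch assuming NumPy; since every entry is an integer, the arithmetic is exact.

```python
import numpy as np

A = np.array([[-13, -8, -4],
              [ 12,  7,  4],
              [ 24, 16,  7]])
S = np.array([[ 1,  1,  2],
              [-2, -1, -3],
              [ 1, -2,  0]])
S_inv = np.array([[-6, -4, -1],
                  [-3, -2, -1],
                  [ 5,  3,  1]])

# S_inv really is the inverse of S: the product is the identity matrix.
print(S_inv @ S)

# The similarity transformation S^{-1} A S, computed in integer arithmetic.
print(S_inv @ A @ S)
```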
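Note how nice the form of the new matrix is!
Diagonal matrices are very easy to compute with. We can, for example, read the eigenvalues and the determinant directly off the diagonal, and we can compute products, powers, and inverses one diagonal entry at a time.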
Thus, if we can re-express a matrix as a diagonal matrix, it allows us to understand the original matrix more easily.
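Matrix powers are one place where this pays off: if \(B = SDS^{-1}\), then \(B^k = SD^kS^{-1}\), and \(D^k\) only requires raising each diagonal entry to the \(k^{\text{th}}\) power. Here is a small sketch, assuming NumPy, that uses the \(2\times 2\) matrix \(B\) and its eigenvectors from earlier in these notes; the exponent `k = 10` is an arbitrary choice.

```python
import numpy as np

B = np.array([[1.0, 1.0],
              [0.5, 1.5]])
S = np.array([[1.0, -2.0],
              [1.0,  1.0]])      # columns are eigenvectors of B
d = np.array([2.0, 0.5])         # matching eigenvalues, i.e. the diagonal of D

k = 10
# D^k is diagonal with entries d_i**k, so no repeated matrix products are needed.
B_power = S @ np.diag(d**k) @ np.linalg.inv(S)

# Compare against NumPy's built-in matrix power.
print(np.allclose(B_power, np.linalg.matrix_power(B, k)))
```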
Let’s be explicit about one of the claims on the previous slide:
Theorem: If \(A\) and \(B\) are similar matrices, then they have the same set of eigenvalues.
Proof: Suppose that \(A\) and \(B\) are similar via \(S\), so \(AS=SB\). Let \(\vect{x}\neq0\) be an eigenvector of \(B\) for the eigenvalue \(\lambda\). Then, \[ A\left(S\vect{x}\right) = \left(AS\right)\vect{x} = \left(SB\right)\vect{x} = S\left(B\vect{x}\right) = S\left(\lambda\vect{x}\right) = \lambda\left(S\vect{x}\right) \] It’s worth mentioning that \(S\vect{x}\) is non-zero, since \(S\) is nonsingular. Thus, we’ve found a non-zero eigenvector \(S\vect{x}\) for \(A\) with the same eigenvalue \(\lambda\).
More is true. If \(A\) and \(B\) are similar with eigenvalue \(\lambda\), then \(\lambda\) has the same geometric multiplicity for both matrices.
That is \[\{\vect{x}\in\mathbb R^n: A\vect{x} = \lambda \vect{x}\} \: \text{ and } \: \{\vect{x}\in\mathbb R^n: B\vect{x} = \lambda \vect{x}\}\] have the same dimension.
More evidence that \(A\) and \(B\) behave in the same manner!
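We can see this numerically with the \(3\times 3\) matrix \(A\) from the earlier example and the diagonal matrix it is similar to. The sketch below assumes NumPy; `geometric_multiplicity` is a hypothetical helper that computes the dimension of the eigenspace as \(n\) minus the rank of \(M - \lambda I\).

```python
import numpy as np

def geometric_multiplicity(M, lam):
    """Dimension of the null space of M - lam*I, via rank-nullity."""
    n = M.shape[0]
    return int(n - np.linalg.matrix_rank(M - lam * np.eye(n)))

A = np.array([[-13.0, -8.0, -4.0],
              [ 12.0,  7.0,  4.0],
              [ 24.0, 16.0,  7.0]])
D = np.diag([-1.0, 3.0, -1.0])   # similar to A, as computed earlier

# For each eigenvalue, print its geometric multiplicity for A and for D.
for lam in (-1.0, 3.0):
    print(lam, geometric_multiplicity(A, lam), geometric_multiplicity(D, lam))
```

Keep in mind that `matrix_rank` decides the rank with a floating point tolerance, so this is a numerical check and not a proof.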
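Similarity of matrices forms an equivalence relation. That is, every square matrix \(A\) is similar to itself (reflexivity); if \(A\) is similar to \(B\), then \(B\) is similar to \(A\) (symmetry); and if \(A\) is similar to \(B\) and \(B\) is similar to \(C\), then \(A\) is similar to \(C\) (transitivity).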
Suppose we have a matrix \(A\) of size \(n\) and we suspect that it is similar to a diagonal matrix. How can we find that similarity relationship?
More precisely, how can we find a nonsingular matrix \(S\) and a diagonal matrix \(D\) (both of size \(n\)) such that
\[D = \inverse{S}AS?\]
This process is called diagonalization.
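Recall the following two facts: similar matrices have the same eigenvalues, and the eigenvalues of a diagonal matrix are exactly its diagonal entries.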
Thus, if we suspect that \(A\) is diagonalizable, then the resulting diagonal matrix must have exactly the eigenvalues of \(A\) on the diagonal!
If the eigenvalues of \(A\) play a role in this process by forming the diagonal of \(D\), then it makes sense that the eigenvectors of \(A\) should somehow play a role as well. In fact they form the columns of \(S\).
Theorem: Suppose that \(A\) is a square matrix of size \(n\) and that the set \[\{\vect{x}_1, \vect{x}_2,\ldots,\vect{x}_n\}\] forms a linearly independent set of eigenvectors of \(A\) with eigenvalues \(\{\lambda_1,\lambda_2,\ldots,\lambda_n\}\). Let \[S = \begin{bmatrix}\vect{x}_1 & \vect{x}_2 & \cdots & \vect{x}_n\end{bmatrix}\] be the matrix whose columns are the eigenvectors of \(A\) and let \[D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}.\] Then \(D = \inverse{S}AS\).
Proof: Let \(\vect{e}_j\) denote the \(j^{\text{th}}\) standard unit basis vector of \(\mathbb R^n\) and recall that a matrix is entirely defined by its action on those basis vectors. In fact, \[A\vect{e}_j\] is exactly the \(j^{\text{th}}\) column of \(A\). In particular, \(S\vect{e}_j = \vect{x}_j\), so \(\inverse{S}\vect{x}_j = \vect{e}_j\); note that \(S\) is invertible because its columns are linearly independent. So, let’s show that \[\inverse{S}AS\vect{e}_j = D\vect{e}_j.\] Well, \[ \inverse{S}AS\vect{e}_j = \inverse{S}A\vect{x}_j = \inverse{S}(\lambda_j \vect{x}_j) = \lambda_j \inverse{S}\vect{x}_j = \lambda_j \vect{e}_j = D\vect{e}_j. \]
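The theorem translates directly into a recipe: compute eigenvectors, stack them as the columns of \(S\), and put the eigenvalues on the diagonal of \(D\). Here is a rough sketch of that recipe, assuming NumPy. The function name `diagonalize` is made up for illustration, and the rank test is only a numerical heuristic for linear independence.

```python
import numpy as np

def diagonalize(A):
    """Return (S, D) with D = S^{-1} A S, built from computed eigenvectors.

    Raises ValueError if the computed eigenvectors do not appear to be
    linearly independent (a floating point heuristic, not a proof).
    """
    A = np.asarray(A, dtype=float)
    eigenvalues, S = np.linalg.eig(A)    # columns of S are eigenvectors
    if np.linalg.matrix_rank(S) < A.shape[0]:
        raise ValueError("no linearly independent set of n eigenvectors found")
    return S, np.diag(eigenvalues)

B = np.array([[1.0, 1.0],
              [0.5, 1.5]])
S, D = diagonalize(B)

# Check the conclusion of the theorem: S^{-1} B S should equal D.
print(np.allclose(np.linalg.inv(S) @ B @ S, D))
```

NumPy normalizes each eigenvector to unit length, so the columns of `S` will generally be scalar multiples of the vectors you would pick by hand. For a real matrix with complex eigenvalues, `S` and `D` come back complex.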
Recall that \[ B = \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix} \] satisfies \[ \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix} \begin{bmatrix}1\\1\end{bmatrix} = 2\begin{bmatrix}1\\1\end{bmatrix} \: \text{ and } \: \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix} \begin{bmatrix}-2\\1\end{bmatrix} = \frac{1}{2}\begin{bmatrix}-2\\1\end{bmatrix}. \] Thus, we must have \[ \begin{bmatrix}1 & -2 \\ 1 & 1\end{bmatrix}^{-1} \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix} \begin{bmatrix}1 & -2 \\ 1 & 1\end{bmatrix} = \begin{bmatrix}2 & 0 \\ 0 & 1/2\end{bmatrix}. \]
Suppose we’d like to diagonalize \[A = \left( \begin{array}{ccc} 3 & -2 & 0 \\ 4 & -6 & -3 \\ -4 & 8 & 5 \\ \end{array} \right) \] First, let’s find the eigenvalues and eigenvectors:
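The call that produced the output below isn’t shown, so here is a minimal sketch, assuming NumPy, that computes the same quantities. The complex-typed eigenvalues in the output suggest that `scipy.linalg.eig` was used; `numpy.linalg.eig` reports the same values as real numbers when they are all real. The order of the eigenvalues and the signs of the eigenvectors may differ from what is printed below.

```python
import numpy as np

A = np.array([[ 3.0, -2.0,  0.0],
              [ 4.0, -6.0, -3.0],
              [-4.0,  8.0,  5.0]])

# eig returns the eigenvalues together with a matrix whose columns are
# the corresponding unit-length eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)
print(eigenvectors)
```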
(array([-1.+0.j, 1.+0.j, 2.+0.j]),
array([[ 3.33333333e-01, 5.77350269e-01, 8.94427191e-01],
[ 6.66666667e-01, 5.77350269e-01, 4.47213595e-01],
[-6.66666667e-01, -5.77350269e-01, 1.26390005e-15]]))
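It looks to me like we can take \(S\) and \(D\) to be \[ S = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 1 & 1 \\ -2 & -1 & 0 \end{bmatrix} \: \text{ and } \: D = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}, \] where each column of \(S\) is a rescaled column of the eigenvector matrix above.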
Let’s find out!
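Here is one way to run the check, sketched with NumPy. It assumes we take the columns of \(S\) to be the integer vectors \((1,2,-2)\), \((1,1,-1)\), and \((2,1,0)\), which are rescaled versions of the computed eigenvectors for \(-1\), \(1\), and \(2\).

```python
import numpy as np

A = np.array([[ 3.0, -2.0,  0.0],
              [ 4.0, -6.0, -3.0],
              [-4.0,  8.0,  5.0]])

# Columns: integer multiples of the computed eigenvectors for -1, 1, and 2.
S = np.array([[ 1.0,  1.0, 2.0],
              [ 2.0,  1.0, 1.0],
              [-2.0, -1.0, 0.0]])
D = np.diag([-1.0, 1.0, 2.0])

# S^{-1} A S, rounded to hide floating point noise in the off-diagonal entries.
print(np.round(np.linalg.inv(S) @ A @ S, 10))
print(np.allclose(np.linalg.inv(S) @ A @ S, D))
```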
Here’s another fun example on Math.StackExchange, though kinda backwards:
https://math.stackexchange.com/q/1119668