Fri, Mar 20, 2026
We’ve pretty much finished up linear and logistic regression. We’ll start on our last major supervised learning algorithm (namely, neural networks) in a couple weeks.
Before we get to that, though, we should discuss a really powerful dimensionality reduction algorithm called Principal Component Analysis or PCA. There’s no way to understand that, though, without one more topic in linear algebra, which is hugely important in its own right, namely Eigenvalues and Eigenvectors!
When we study a linear transformation \(T:\mathbb R^n \to \mathbb R^n\), there are often many invariant subspaces. We can simplify the transformation by studying how it acts on those smaller, invariant spaces.
Eigenvectors and their associated eigenvalues are the algebraic tools for formulating this process. Furthermore, there’s a very geometric way to view these concepts as well.
Applications of these tools will come quickly as we move to Google PageRank next time and PCA later.
\[ \newcommand{\vect}[1]{\mathbf{#1}} \]
Throughout this presentation \(T\) will denote a linear transformation mapping \(\mathbb R^n \to \mathbb R^n\). We’ll generally suppose that \(T\) has the matrix representation \[T(\vect{x}) = A\vect{x}.\]
We say that a non-zero vector \(\vect{x}\) is an eigenvector of \(T\) with eigenvalue \(\lambda\) if \[T(\vect{x}) = A\vect{x} = \lambda \vect{x}.\]
As we’ll see, there are good reasons to consider the possibility that the scalar \(\lambda\) might be real or complex.
In the case where \(\lambda\) is real, the equation \[T(\vect{x}) = A\vect{x} = \lambda \vect{x}\] immediately implies that \(\text{span}(\{\vect{x}\})\), the one-dimensional subspace of \(\mathbb R^n\) spanned by \(\vect{x}\), is invariant under the action of \(T\).
The sign and magnitude of \(\lambda\) dictate the geometric properties of the transformation on that invariant space:

- if \(|\lambda| > 1\), vectors in the invariant space are stretched,
- if \(|\lambda| < 1\), they are compressed, and
- if \(\lambda < 0\), they are flipped as well.
Eigenvalues can be complex numbers, meaning that they have the form \[\lambda = a + b i,\] where \(a,b\in\mathbb R\) and \(i\) is the imaginary unit satisfying \(i^2 = -1\).
We might talk about this a bit after our exam but our main applications involve real eigenvalues.
We’re going to start with several examples in \(\mathbb R^2\).
The action of \[M = \begin{bmatrix}2&0\\0&1/2\end{bmatrix}\] preserves the subspace of \(\mathbb R^2\) spanned by \([ 1,0 ]^T\) and the subspace spanned by \([ 0,1 ]^T\).
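For a diagonal matrix, the eigenpairs can be read right off the diagonal. A quick NumPy check (a sketch, not part of the original notes):

```python
import numpy as np

M = np.array([[2, 0], [0, 1/2]])

# The standard basis vectors are eigenvectors, with the diagonal entries
# as their eigenvalues: M stretches the x-axis and compresses the y-axis.
assert np.allclose(M @ np.array([1.0, 0.0]), 2 * np.array([1.0, 0.0]))
assert np.allclose(M @ np.array([0.0, 1.0]), 0.5 * np.array([0.0, 1.0]))
```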
The action of \[M = \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix}\] preserves the subspace of \(\mathbb R^2\) spanned by \([ 1,1 ]^T\) and the subspace spanned by \([ -2,1 ]^T\).
The first has eigenvalue \(2\) and the second has eigenvalue \(1/2\).
Checking to see if a real number \(\lambda\) and a vector \(\vect{x}\) form an eigenvalue/eigenvector pair is generally easy. Just compute. For example, \[ \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix} \begin{bmatrix}1\\1\end{bmatrix} = \begin{bmatrix}1+1\\1/2+3/2\end{bmatrix} = \begin{bmatrix}2\\2\end{bmatrix} = 2\begin{bmatrix}1\\1\end{bmatrix} \] and \[ \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix} \begin{bmatrix}-2\\1\end{bmatrix} = \begin{bmatrix}-2+1\\-1+3/2\end{bmatrix} = \begin{bmatrix}-1\\1/2\end{bmatrix} = \frac{1}{2}\begin{bmatrix}-2\\1\end{bmatrix} \]
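In code, this check is a single matrix-vector product. A minimal sketch using NumPy, with the matrix and candidate eigenpairs taken from the example above:

```python
import numpy as np

M = np.array([[1, 1], [1/2, 3/2]])

# Candidate (eigenvalue, eigenvector) pairs from the example
pairs = [(2.0, np.array([1.0, 1.0])),
         (0.5, np.array([-2.0, 1.0]))]

for lam, x in pairs:
    # x is an eigenvector with eigenvalue lam exactly when M @ x == lam * x
    assert np.allclose(M @ x, lam * x)
```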
The action of \[M = \begin{bmatrix}1&1\\0&1\end{bmatrix}\] preserves the subspace of \(\mathbb R^2\) spanned by \([ 1,0 ]^T\) and that’s it.
Thus \([ 1,0 ]^T\) is, up to scalar multiples, the only (non-zero) eigenvector and its eigenvalue is \(1\).
The action of \[M = \begin{bmatrix}1&3\\3&1\end{bmatrix}\] preserves and stretches the subspace of \(\mathbb R^2\) spanned by \([ 1,1 ]^T\) and preserves but flips (and stretches) the subspace spanned by \([ 1,-1 ]^T\).
The first has eigenvalue \(4\) and the second has eigenvalue \(-2\).
The action of \[M = \begin{bmatrix}4/5&2/5\\2/5&1/5\end{bmatrix}\] projects \(\mathbb R^2\) onto the subspace spanned by \([ 2,1 ]^T\).
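A projection fixes vectors in its target subspace and kills vectors perpendicular to it, so its eigenvalues are exactly \(1\) and \(0\). A quick numerical check of this for the matrix above (a sketch, not part of the original notes):

```python
import numpy as np

M = np.array([[4/5, 2/5], [2/5, 1/5]])

# [2, 1] spans the target of the projection, so it is fixed: eigenvalue 1
assert np.allclose(M @ np.array([2.0, 1.0]), np.array([2.0, 1.0]))

# The eigenvalues of a projection are 0 and 1
assert np.allclose(sorted(np.linalg.eigvals(M)), [0.0, 1.0])
```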
Rewriting \(A\vect{x} = \lambda\vect{x}\) as \[\left(A-\lambda I\right)\vect{x} = \vect{0},\] note that, since an eigenvector \(\vect{x}\) is non-zero, this implies that \(A-\lambda I\) is singular. Of course, there’s a simple test for singularity, namely \[\det(A-\lambda I) = 0.\] The expression on the left is generally a polynomial in the variable \(\lambda\); it is called the characteristic polynomial of \(A\).
The characteristic polynomial gives us a simple algebraic criterion on \(\lambda\) that we can use to solve for \(\lambda\); we simply find its roots.
Recall that \[M = \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix}\] has the eigenvalues \(\lambda=2\) and \(\lambda =1/2\). Let’s show how we can use the characteristic polynomial to find those eigenvalues.
The first step will be to find the matrix \(M-\lambda I\), which can be obtained by simply subtracting \(\lambda\) off the diagonal of \(M\). Thus, \[M-\lambda I = \begin{bmatrix}1-\lambda&1\\1/2&3/2-\lambda\end{bmatrix}.\]
Computing the determinant, we find \[\det(M-\lambda I) = (1-\lambda)\left(\frac{3}{2}-\lambda\right) - \frac{1}{2} = \lambda^2 - \frac{5}{2}\lambda + 1 = (\lambda - 2)\left(\lambda - \frac{1}{2}\right).\] From the factored form, we can see easily that the eigenvalues are \[\lambda=2 \text{ and } \lambda=1/2.\]
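We can sanity-check this numerically, either by finding the roots of the characteristic polynomial \(\lambda^2 - \frac{5}{2}\lambda + 1\) directly or by asking NumPy for the eigenvalues of \(M\) itself (a sketch, not part of the original notes):

```python
import numpy as np

M = np.array([[1, 1], [1/2, 3/2]])

# Roots of the characteristic polynomial lambda^2 - (5/2)lambda + 1
roots = np.roots([1, -5/2, 1])
assert np.allclose(sorted(roots), [0.5, 2.0])

# np.linalg.eigvals computes the same eigenvalues from the matrix itself
assert np.allclose(sorted(np.linalg.eigvals(M)), [0.5, 2.0])
```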
Once you’ve got an eigenvalue, finding the corresponding eigenvectors boils down to solving the system \[A\vect{x} = \lambda \vect{x},\] where \(\lambda\) is now known. Equivalently, we find the null space of \(A-\lambda I\).
Let’s continue with the example \[M = \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix}\] with eigenvalues \(\lambda=2\) and \(\lambda =1/2\).
For \(\lambda = 2\), we form the matrix \[M - 2I = \begin{bmatrix}1-2&1\\\frac{1}{2}&\frac{3}{2}-2\end{bmatrix} = \begin{bmatrix}-1&1\\1/2&-1/2\end{bmatrix}.\] It’s pretty easy to see that’s singular and that \([1 \quad 1]^T\) is in the null space so that the null space is just the span of that vector.
For \(\lambda = 1/2\), we form the matrix \[M - \frac{1}{2}I = \begin{bmatrix}1-\frac{1}{2}&1\\\frac{1}{2}&\frac{3}{2}-\frac{1}{2}\end{bmatrix} = \begin{bmatrix}1/2&1\\1/2&1\end{bmatrix}.\] It is again pretty easy to see that’s singular and that now \([-2 \quad 1]^T\) is in the null space so that the null space is just the span of that vector.
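NumPy's `eig` bundles both steps, returning the eigenvalues together with unit-length eigenvectors in the corresponding columns. Since eigenvectors are only determined up to scale, we compare directions rather than entries (a sketch using the matrix from the running example):

```python
import numpy as np

M = np.array([[1, 1], [1/2, 3/2]])
lams, vecs = np.linalg.eig(M)

for lam, v in zip(lams, vecs.T):
    # Each column of vecs is a (unit) eigenvector for the matching eigenvalue
    assert np.allclose(M @ v, lam * v)
    # It should be parallel to [1, 1] (lam = 2) or [-2, 1] (lam = 1/2)
    target = np.array([1.0, 1.0]) if np.isclose(lam, 2) else np.array([-2.0, 1.0])
    # Parallel vectors in the plane have zero 2-D cross product
    assert np.isclose(v[0] * target[1] - v[1] * target[0], 0)
```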
We now turn to the question of interpreting eigenvalues with a focus on real eigenvalues for the time being. A summary might look like:

- \(\lambda > 1\): vectors in the eigenspace are stretched,
- \(\lambda = 1\): vectors in the eigenspace are fixed,
- \(0 < \lambda < 1\): vectors in the eigenspace are compressed,
- \(\lambda = 0\): the eigenspace is collapsed to the origin, as in a projection,
- \(\lambda < 0\): vectors in the eigenspace are flipped, in addition to being stretched or compressed by \(|\lambda|\).
While the intuition gained from our examples in \(\mathbb R^2\) should help a lot, it’s important to realize that this is all happening in \(\mathbb R^n\), where \(n\) is potentially large.
Suppose that \(\vect{x}\) and \(\vect{y}\) are both eigenvectors of \(A\) with the same eigenvalue \(\lambda\). Then \[A(\alpha\vect{x} + \beta\vect{y}) = \alpha A\vect{x} + \beta A\vect{y} = \alpha\lambda\vect{x} + \beta\lambda\vect{y} = \lambda(\alpha\vect{x} + \beta\vect{y}).\] Thus, \(\alpha\vect{x} + \beta\vect{y}\) is also an eigenvector with eigenvalue \(\lambda\) (or the zero vector); the eigenvectors for \(\lambda\), together with \(\vect{0}\), form a subspace called the eigenspace of \(\lambda\).
Suppose that \(\vect{x}\) and \(\vect{y}\) are eigenvectors for the eigenvalues \(\lambda_1\) and \(\lambda_2\). Suppose also that one of the eigenvectors is a constant multiple of the other, say \[\vect{x} = c\vect{y}.\] Then, \[\lambda_1 (c\vect{y}) = A (c\vect{y}) = c(A\vect{y}) = c(\lambda_2\vect{y}) = \lambda_2(c\vect{y}).\] So, we must have \(\lambda_1 = \lambda_2\).
Consider the matrices \[ A = \begin{bmatrix}1&0\\0&1\end{bmatrix} \quad B = \begin{bmatrix}1&1\\0&1\end{bmatrix}. \] Both \(A\) and \(B\) have the characteristic polynomial \((\lambda-1)^2\); we say that \(\lambda=1\) is a repeated eigenvalue. It’s easy to see that the corresponding eigenspace of \(A\) is all of \(\mathbb R^2\). For \(B\), though \[\begin{bmatrix}1&1\\0&1\end{bmatrix} \begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}x+y\\y\end{bmatrix}. \] Thus, \([x \quad y]^T\) is an eigenvector only when \(y=0\).
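The difference between \(A\) and \(B\) shows up numerically in the rank of \(A - \lambda I\): the eigenspace for \(\lambda\) is the null space of \(A - \lambda I\), whose dimension is \(n\) minus that rank. A quick check (a sketch, not part of the original notes):

```python
import numpy as np

A = np.array([[1, 0], [0, 1]])
B = np.array([[1, 1], [0, 1]])
I = np.eye(2)

# Both have the repeated eigenvalue 1, but the eigenspaces differ in size:
# dim(eigenspace for lambda) = n - rank(A - lambda*I)
assert 2 - np.linalg.matrix_rank(A - 1 * I) == 2   # all of R^2
assert 2 - np.linalg.matrix_rank(B - 1 * I) == 1   # just span([1, 0])
```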
It’s worth mentioning that we’ve already explored the geometric action of \(B\).