Eigenvalues and eigenvectors

Mon, Mar 03, 2025

Eigenvalues and eigenvectors

When we study a linear transformation \(T:\mathbb R^n \to \mathbb R^n\), there are often many invariant subspaces. We can simplify the transformation by studying how it acts on those smaller, invariant spaces.

Eigenvectors and their associated eigenvalues are the algebraic tools for formulating this process. There’s also a very geometric way to view these concepts.

Applications of these tools will come quickly as we move to Google’s PageRank next time.

\[ \newcommand{\vect}[1]{\mathbf{#1}} \]

Basic definitions

Throughout this presentation \(T\) will denote a linear transformation mapping \(\mathbb R^n \to \mathbb R^n\). We’ll generally suppose that \(T\) has the matrix representation \[T(\vect{x}) = A\vect{x}.\]

We say that a nonzero vector \(\vect{x}\) is an eigenvector of \(T\) with eigenvalue \(\lambda\) if \[T(\vect{x}) = A\vect{x} = \lambda \vect{x}.\]

Real eigenvalues

As we’ll see, there are good reasons to consider the possibilities that the scalar \(\lambda\) might be real or complex.

In the case where \(\lambda\) is real, the equation \[T(\vect{x}) = A\vect{x} = \lambda \vect{x}\] immediately implies that the one-dimensional subspace \(\text{span}(\{\vect{x}\})\) of \(\mathbb R^n\) is invariant under the action of \(T\).

The sign and magnitude of \(\lambda\) dictate the geometric properties of the transformation on that invariant subspace:

  • \(|\lambda|<1\) implies that \(T\) compresses \(\text{span}(\{\vect{x}\})\),
  • \(|\lambda|>1\) implies that \(T\) stretches out \(\text{span}(\{\vect{x}\})\),
  • \(\lambda<0\) implies that \(T\) reflects \(\text{span}(\{\vect{x}\})\).

Complex eigenvalues

When we say that \(\lambda\) is complex, we mean that \(\lambda\) has the form \[\lambda = a + b i,\] where \(a,b\in\mathbb R\) and \(i\) is the imaginary unit satisfying \(i^2 = -1\).

We’ll see how complex eigenvalues naturally arise from the computations involved in finding real eigenvalues. We’ll also see that they have a natural geometric interpretation involving rotation in the vector space \(\mathbb R^n\).

Examples

We’re going to start with several examples in \(\mathbb R^2\).

Example 1

The action of \[M = \begin{bmatrix}2&0\\0&1/2\end{bmatrix}\] preserves the subspace of \(\mathbb R^2\) spanned by \([ 1,0 ]^T\) and the subspace spanned by \([ 0,1 ]^T\).

  • \([ 1,0 ]^T\) is an eigenvector with eigenvalue \(2\) and
  • \([ 0,1 ]^T\) is an eigenvector with eigenvalue \(1/2\).

Example 2

The action of \[M = \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix}\] preserves the subspace of \(\mathbb R^2\) spanned by \([ 1,1 ]^T\) and the subspace spanned by \([ -2,1 ]^T\).

The first has eigenvalue \(2\) and the second has eigenvalue \(1/2\).

Double checking example 2

Checking to see whether a real number \(\lambda\) and a vector \(\vect{x}\) form an eigenvalue/eigenvector pair is generally easy. Just compute. For example, \[ \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix} \begin{bmatrix}1\\1\end{bmatrix} = \begin{bmatrix}1+1\\1/2+3/2\end{bmatrix} = \begin{bmatrix}2\\2\end{bmatrix} = 2\begin{bmatrix}1\\1\end{bmatrix} \] and \[ \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix} \begin{bmatrix}-2\\1\end{bmatrix} = \begin{bmatrix}-2+1\\-1+3/2\end{bmatrix} = \begin{bmatrix}-1\\1/2\end{bmatrix} = \frac{1}{2}\begin{bmatrix}-2\\1\end{bmatrix} \]
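
If you’d rather let the computer do the arithmetic, numpy can perform the same check. Here’s a minimal sketch, assuming the matrix and candidate eigenpairs above:

import numpy as np

M = np.array([[1, 1],
              [1/2, 3/2]])

# For an eigenpair, M @ v should equal lam * v.
for lam, v in [(2, np.array([1, 1])), (1/2, np.array([-2, 1]))]:
    print(np.allclose(M @ v, lam * v))  # True, True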

Example 3

The action of \[M = \begin{bmatrix}1&1\\0&1\end{bmatrix}\] preserves the subspace of \(\mathbb R^2\) spanned by \([ 1,0 ]^T\) and that’s it.

I guess \([ 1,0 ]^T\) (together with its nonzero scalar multiples) is the only eigenvector, and its eigenvalue is \(1\).

Example 4

The action of \[M = \begin{bmatrix}1&3\\3&1\end{bmatrix}\] preserves and stretches the subspace of \(\mathbb R^2\) spanned by \([ 1,1 ]^T\) and preserves, flips, and stretches the subspace spanned by \([ -1,1 ]^T\).

The first has eigenvalue \(4\) and the second has eigenvalue \(-2\); indeed, \(M[1,1]^T = [4,4]^T = 4[1,1]^T\) and \(M[-1,1]^T = [2,-2]^T = -2[-1,1]^T\).

Example 5

The action of \[M = \begin{bmatrix}4/5&2/5\\2/5&1/5\end{bmatrix}\] projects \(\mathbb R^2\) onto the subspace spanned by \([ 2,1 ]^T\). Thus \([ 2,1 ]^T\) is an eigenvector with eigenvalue \(1\), while any vector perpendicular to it, such as \([ -1,2 ]^T\), is an eigenvector with eigenvalue \(0\).
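
A quick numerical check of those claims, in the same numpy style we’ll use later:

import numpy as np

# The projection matrix onto the span of [2, 1]^T.
M = np.array([[4/5, 2/5],
              [2/5, 1/5]])

print(M @ np.array([2, 1]))   # [2. 1.]  -> eigenvalue 1
print(M @ np.array([-1, 2]))  # [0. 0.]  -> eigenvalue 0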

Finding eigenvalues

There’s a surprisingly easy way to find eigenvalues. Let’s begin by supposing that \[A\vect{x} = \lambda \vect{x}.\] Let’s bring both terms to one side to obtain \[A\vect{x} - \lambda \vect{x} = \vect{0}.\] We could rewrite this as \[\begin{aligned} A\vect{x} - \lambda \vect{x} &= A\vect{x} - \lambda I\vect{x} \\ &= \left(A-\lambda I\right)\vect{x} = \vect{0}. \end{aligned}\]

Finding eigenvalues (cont)

Once we’ve arrived at \[\left(A-\lambda I\right)\vect{x} = \vect{0},\] note that, since an eigenvector \(\vect{x}\) is nonzero, this implies that \(A-\lambda I\) is singular. Of course, there’s a simple test for singularity, namely \[\det(A-\lambda I) = 0.\] The expression on the left is generally a polynomial in the variable \(\lambda\); it is called the characteristic polynomial of \(A\).

The characteristic polynomial gives us a simple algebraic criterion on \(\lambda\) that we can use to solve for \(\lambda\); we simply find its roots.

Example

Recall that \[M = \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix}\] has the eigenvalues \(\lambda=2\) and \(\lambda =1/2\). Let’s show how we can use the characteristic polynomial to find those eigenvalues.

The first step will be to find the matrix \(M-\lambda I\), which can be obtained by simply subtracting \(\lambda\) from each diagonal entry of \(M\). Thus, \[M-\lambda I = \begin{bmatrix}1-\lambda&1\\1/2&3/2-\lambda\end{bmatrix}.\]

Example (cont)

We now find the determinant \[\begin{aligned} \left|M-\lambda I\right| &= \begin{vmatrix}1-\lambda&1\\1/2&3/2-\lambda\end{vmatrix} = (1-\lambda)\left(\frac{3}{2}-\lambda\right) - \frac{1}{2} \\ &=\lambda^2-\frac{5\lambda}{2}+\frac{3}{2} - \frac{1}{2} = \lambda^2-\frac{5\lambda}{2}+1 \\ &= \frac{1}{2}\left(2\lambda^2 - 5\lambda + 2\right) = \frac{1}{2}(\lambda-2)(2\lambda-1). \end{aligned}\]

From the factored form, we can see easily that the eigenvalues are \[\lambda=2 \text{ and } \lambda=1/2.\]
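
If we’d rather not factor by hand, we can find the roots numerically. Here’s a minimal sketch with numpy; np.roots takes the coefficients from the highest degree down, while np.linalg.eigvals skips the characteristic polynomial entirely:

import numpy as np

# Roots of the characteristic polynomial lambda^2 - (5/2)lambda + 1.
print(np.roots([1, -5/2, 1]))  # [2.  0.5]

# Or ask for the eigenvalues of M directly.
M = np.array([[1, 1], [1/2, 3/2]])
print(np.linalg.eigvals(M))    # [2.  0.5] (order may vary)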

Finding eigenvectors

Once you’ve got an eigenvalue, finding the corresponding eigenvectors boils down to solving the system \[A\vect{x} = \lambda \vect{x},\] where \(\lambda\) is now known. Equivalently, we find the null space of \(A-\lambda I\).

Example

Let’s continue with the example \[M = \begin{bmatrix}1&1\\1/2&3/2\end{bmatrix}\] with eigenvalues \(\lambda=2\) and \(\lambda =1/2\).

Example (cont)

For \(\lambda = 2\), we form the matrix \[M - 2I = \begin{bmatrix}1-2&1\\\frac{1}{2}&\frac{3}{2}-2\end{bmatrix} = \begin{bmatrix}-1&1\\1/2&-1/2\end{bmatrix}.\] It’s pretty easy to see that this matrix is singular, that \([1 \quad 1]^T\) lies in its null space, and that the null space is exactly the span of that vector.

Example (cont 2)

For \(\lambda = 1/2\), we form the matrix \[M - \frac{1}{2}I = \begin{bmatrix}1-\frac{1}{2}&1\\\frac{1}{2}&\frac{3}{2}-\frac{1}{2}\end{bmatrix} = \begin{bmatrix}1/2&1\\1/2&1\end{bmatrix}.\] It is again pretty easy to see that this matrix is singular and that \([-2 \quad 1]^T\) lies in its null space, so the null space is exactly the span of that vector.
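
We can confirm both null spaces numerically with scipy’s null_space, just as we’ll do for a \(3\times3\) matrix below:

import numpy as np
from scipy.linalg import null_space

M = np.array([[1, 1], [1/2, 3/2]])
I = np.identity(2)

# Each null space is one-dimensional; the returned columns are unit
# vectors proportional (up to sign) to [1, 1]^T and [-2, 1]^T.
print(null_space(M - 2*I))
print(null_space(M - I/2))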

Interpreting real eigenvalues

We now turn to the question of interpreting eigenvalues with a focus on real eigenvalues for the time being. A summary might look like:

  • The set of all eigenvectors for a single eigenvalue (together with \(\vect{0}\)) forms a subspace,
  • Eigenvectors for distinct eigenvalues are linearly independent,
  • The dimension of the eigenspace for an eigenvalue cannot exceed the algebraic multiplicity of that eigenvalue, but repeated eigenvalues can be a bit tricky.

While the intuition gained from our examples in \(\mathbb R^2\) should help a lot, it’s important to realize that this is all happening in \(\mathbb R^n\), where \(n\) is potentially large.

Eigenspaces

It’s pretty easy to show that the set of all eigenvectors for a single eigenvalue, together with \(\vect{0}\), forms a subspace. If \(\vect{x}\) and \(\vect{y}\) are eigenvectors for the eigenvalue \(\lambda\) and \(\alpha\) and \(\beta\) are scalars, then \[\begin{aligned} A(\alpha\vect{x} + \beta\vect{y}) &= \alpha A\vect{x} + \beta A\vect{y} \\ &= \alpha \lambda \vect{x} + \beta\lambda \vect{y} \\ &= \lambda (\alpha\vect{x} + \beta\vect{y}). \end{aligned}\]

Thus, \(\alpha\vect{x} + \beta\vect{y}\) is also an eigenvector with eigenvalue \(\lambda\).

Linear independence for distinct eigenvalues

Suppose that \(\vect{x}\) and \(\vect{y}\) are eigenvectors for the eigenvalues \(\lambda_1\) and \(\lambda_2\), respectively. Suppose also that one of the eigenvectors is a constant multiple of the other, say \[\vect{x} = c\vect{y}.\] Then, \[\lambda_1 (c\vect{y}) = A (c\vect{y}) = c(A\vect{y}) = c(\lambda_2\vect{y}) = \lambda_2(c\vect{y}).\] Since \(c\vect{y} = \vect{x} \neq \vect{0}\), we must have \(\lambda_1 = \lambda_2\). Put another way, eigenvectors for distinct eigenvalues cannot be multiples of one another.

Repeated eigenvalues

Consider the matrices \[ A = \begin{bmatrix}1&0\\0&1\end{bmatrix} \quad B = \begin{bmatrix}1&1\\0&1\end{bmatrix}. \] Both \(A\) and \(B\) have the characteristic polynomial \((\lambda-1)^2\); we say that \(\lambda=1\) is a repeated eigenvalue. It’s easy to see that the corresponding eigenspace of \(A\) is all of \(\mathbb R^2\). For \(B\), though, \[\begin{bmatrix}1&1\\0&1\end{bmatrix} \begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}x+y\\y\end{bmatrix}. \] Thus, \([x \quad y]^T\) is an eigenvector only when \(y=0\).

It’s worth mentioning that we’ve already explored the geometric action of \(B\).
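
A quick numerical confirmation of the difference, assuming the \(A\) and \(B\) above:

import numpy as np
from scipy.linalg import null_space

A = np.identity(2)
B = np.array([[1, 1], [0, 1]])
I = np.identity(2)

# The eigenspace for lambda = 1 is the null space of the matrix minus I.
print(null_space(A - I).shape)  # (2, 2): the eigenspace is all of R^2
print(null_space(B - I).shape)  # (2, 1): the eigenspace is one-dimensional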

Complex eigenvalues

Let’s use our method for finding eigenvalues to see how complex eigenvalues might arise. We’ll then try to figure out how to interpret them.

A simple example

We’ll start with the simplest example along these lines, namely \[ A = \begin{bmatrix}0&-1\\1&0\end{bmatrix}. \] Then \[ \det(A-\lambda I) = \begin{vmatrix}-\lambda&-1\\1&-\lambda\end{vmatrix} = \lambda^2 + 1. \] I guess we need \(\lambda^2 = -1\), i.e. \(\lambda=\pm i\).
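
numpy is happy to report complex eigenvalues, so we can double check:

import numpy as np

A = np.array([[0, -1], [1, 0]])
print(np.linalg.eigvals(A))  # [0.+1.j 0.-1.j]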

Rotate \(90^{\circ}\)

The geometric action of \[A = \begin{bmatrix}0&-1\\1&0\end{bmatrix}\] is to rotate \(\mathbb R^2\) through the angle \(\pi/2\).

Rotate \(\theta\)

Let \(R(\theta)\) denote the matrix \[ R(\theta) = \begin{bmatrix} \cos(\theta)&-\sin(\theta) \\ \sin(\theta)&\cos(\theta) \end{bmatrix}.\]

Then, \(R(\theta)\vec{\imath}\) and \(R(\theta)\vec{\jmath}\) simply extract out the columns of \(R(\theta)\), just as they do for any matrix.

Notice, though, that the first column is exactly \(\vec{\imath}\) rotated through the angle \(\theta\). This follows from the very definition of the sine and cosine.

Similarly, the second column is exactly \(\vec{\jmath}\) rotated through the angle \(\theta\). You could convince yourself of that using the fact that the second column is perpendicular to the first.

It follows by linearity that \(R(\theta)\) rotates every vector in the plane through the angle \(\theta\)! The matrix \(R(\theta)\) is aptly called a rotation matrix.
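
Here’s a small numerical sketch of that action; the angle \(\theta = \pi/6\) is chosen arbitrarily:

import numpy as np

def rotation(theta):
    # The rotation matrix R(theta) defined above.
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

v = np.array([1.0, 0.0])
w = rotation(np.pi/6) @ v

print(w)                                   # [0.8660254 0.5]
print(np.degrees(np.arctan2(w[1], w[0])))  # 30.0: w is v rotated by 30 degrees
print(np.linalg.norm(w))                   # 1.0: rotation preserves length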

Eigenvalues of \(R(\theta)\)

Let’s compute the eigenvalues of \(R(\theta)\). First, we find the characteristic polynomial: \[\begin{aligned} \det(R(\theta) - \lambda I) &= \begin{vmatrix} \cos(\theta) - \lambda & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) - \lambda \end{vmatrix} = (\cos(\theta) - \lambda)^2 + \sin^2(\theta) \\ &= \lambda^2 - 2\cos(\theta)\lambda + \sin^2(\theta) + \cos^2(\theta) = \lambda^2 - 2\cos(\theta)\lambda + 1. \end{aligned}\] We find the roots of the characteristic polynomial using the quadratic formula: \[\begin{aligned} \lambda &= \frac{2\cos(\theta) \pm \sqrt{4\cos^2(\theta) - 4}}{2} \\ &= \frac{2\cos(\theta)}{2} \pm \frac{2i\sqrt{1-\cos^2(\theta)}}{2} = \cos(\theta) \pm i\sin(\theta). \end{aligned}\]

Thus, the eigenvalues of the rotation matrix are complex, and the angle of rotation can be read off directly from the eigenvalue \(\lambda = \cos(\theta) \pm i\sin(\theta)\).
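
A quick numerical confirmation at an arbitrarily chosen angle:

import numpy as np

theta = 1.0  # an arbitrary angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# The eigenvalues should be cos(theta) +/- i sin(theta).
print(np.linalg.eigvals(R))
print(np.cos(theta) + 1j*np.sin(theta))  # matches one of them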

Rotation and scaling

Some matrices include both scaling and rotation. To achieve this, simply define \(A = r\,R(\theta)\), where \(r\in\mathbb R\). The eigenvalues should then be \(r(\cos(\theta)\pm i\sin(\theta))\).

For example,

\[\begin{aligned} \sqrt{2}R(\pi/4) &= \sqrt{2}\begin{bmatrix}\cos(\pi/4)&-\sin(\pi/4)\\ \sin(\pi/4)&\cos(\pi/4)\end{bmatrix} \\ &= \sqrt{2}\begin{bmatrix}1/\sqrt{2}&-1/\sqrt{2}\\ 1/\sqrt{2}&1/\sqrt{2}\end{bmatrix} = \begin{bmatrix}1&-1\\1&1\end{bmatrix} \end{aligned}\]

The eigenvalues should be \[\sqrt{2}(\cos(\pi/4) \pm i\sin(\pi/4)) = \sqrt{2} \left(\frac{1}{\sqrt{2}} \pm i\frac{1}{\sqrt{2}}\right) = 1\pm i.\]

Double check

Let’s double check those eigenvalues:

\[ \begin{vmatrix}1-\lambda & -1 \\ 1 & 1 - \lambda\end{vmatrix} = (1-\lambda)^2 + 1 = \lambda^2 - 2\lambda + 2. \]

Applying the quadratic formula to find the roots, we get \[ \frac{2\pm\sqrt{4-8}}{2} = 1\pm i, \] as expected.
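
numpy agrees as well:

import numpy as np

M = np.array([[1, -1], [1, 1]])
print(np.linalg.eigvals(M))  # [1.+1.j 1.-1.j]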

Rotation and scaling illustrated

The action of \[M = \begin{bmatrix}1&-1\\1&1\end{bmatrix}\] rotates \(\mathbb R^2\) through the angle \(\pi/4\) and expands by the factor \(\sqrt{2}\).

Three dimensional matrices

Let’s play with the matrix \[ M = \begin{bmatrix} 0 & 2 & 0 \\ -1 & 2 & 1 \\ 0 & 0 & 2 \end{bmatrix} \]

The eigenvalues

We now find the characteristic polynomial and then the eigenvalues.

\[\begin{aligned} \det(M-\lambda I) &= \begin{vmatrix} -\lambda & 2 & 0 \\ -1 & 2-\lambda & 1 \\ 0 & 0 & 2-\lambda \end{vmatrix} \\ &= (2-\lambda) \begin{vmatrix} -\lambda & 2 \\ -1 & 2-\lambda \end{vmatrix} = (2-\lambda)(\lambda^2 - 2\lambda + 2). \end{aligned}\]

The roots of that polynomial are exactly \(\lambda = 2\) and \(\lambda = 1\pm i\).

In this situation, we expect there to be a one-dimensional eigenspace that’s stretched out by the factor \(2\) and a two-dimensional invariant subspace that’s rotated through the angle \(45^{\circ}\) and expanded by the factor \(\sqrt{2}\).

Real eigenspace

Let’s use Python to find the eigenspaces. Here’s the real eigenspace:

import numpy as np
from scipy.linalg import null_space

# The 3x3 matrix from the previous slide.
M = np.array([
    [0, 2, 0],
    [-1, 2, 1],
    [0, 0, 2]])
I = np.identity(3)

# The eigenspace for lambda = 2 is the null space of M - 2I.
null_space(M - 2*I)
array([[0.57735027],
       [0.57735027],
       [0.57735027]])

The real eigenspace is spanned by \([1\quad1\quad1]^T\).

Complex eigenspace

And the complex eigenspace:

# The eigenspace for lambda = 1 + i is the null space of M - (1+i)I.
null_space(M - (1+1j)*I)
array([[8.16496581e-01-0.00000000e+00j],
       [4.08248290e-01+4.08248290e-01j],
       [2.77555756e-17-1.66967135e-17j]])

Cleaning up the numerical output, the complex eigenspace is spanned by \[ [2\quad 1+i\quad 0]^T = [2\quad1\quad0]^T + i\,[0\quad1\quad0]^T. \] The real and imaginary parts, \([2\quad1\quad0]^T\) and \([0\quad1\quad0]^T\), span the two-dimensional invariant subspace of \(\mathbb R^3\) on which \(M\) acts by rotation and scaling.
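
We can check directly that \(M\) maps that plane back into itself. Writing \(u = [2\quad1\quad0]^T\) and \(w = [0\quad1\quad0]^T\), a short computation shows \(Mu = u - w\) and \(Mw = u + w\):

import numpy as np

M = np.array([[0, 2, 0],
              [-1, 2, 1],
              [0, 0, 2]])
u = np.array([2, 1, 0])  # real part of the eigenvector
w = np.array([0, 1, 0])  # imaginary part of the eigenvector

# M u = u - w and M w = u + w, so the plane span{u, w} is invariant;
# in the basis {u, w}, M acts as a rotation combined with scaling by sqrt(2).
print(M @ u, u - w)  # [2 0 0] [2 0 0]
print(M @ w, u + w)  # [2 2 0] [2 2 0]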

Still to come

  • Power iteration to find the dominant eigenvector
  • Diagonalization