Constructing a Basis for the Null Space of a Matrix

On page 101 of our textbook, we see a theorem that looks something like the following:

Theorem BNS: Basis for Null Spaces
Suppose that $A$ is an $m\times n$ matrix and the reduced row echelon form $B = (b_{i,j})$ of $A$ has $r$ pivot columns. Let $$D = \{d_1, d_2,\ldots, d_r\}$$ be the indices of the pivot columns and let $$F = \{f_1, f_2,\ldots, f_{nāˆ’r}\}$$ be the indices of the non-pivot (or free) columns. For $1 \leq j \leq n āˆ’ r$, construct the $n āˆ’ r$ vectors $\vec{z}_j \in \mathbb R^n$, defined by $$ (\vec{z}_j)_i = \begin{cases} 1 & \text{if } i \in F, i = f_j \\ 0 & \text{if } i \in F, i \neq f_j \\ -b_{k,f_j} & \text{if } i \in D, i=d_k \end{cases} $$ Then, $S=\{\vec{z}_1,\vec{z}_2,\ldots,\vec{z}_{n-r}\}$ satisfies
  1. $N(A)=\langle S \rangle$ and
  2. $S$ is a linearly independent set.

A set of vectors like this that spans a space and is linearly independence is called a basis.

The theorem is a bit hard to wrap one's head around but, often, the description of an algorithm like this is easier to understand if you program it. In fact, you really can't tell a computer how to do something until you understand it pretty well yourself! Let's give it a try.

Example

Here's an example to play with lifted right off of page 89 from our textbook. Of course, this is live code so you can feel free to change it to explore other examples. Once the matrix is defined, we'll go head and reduce it to row echelon form using the rref method.

Implementing the algorithm

First, let's identify the sets $D$ and $F$ that index the pivot columns and free columns respectively.

This might not be quite what we expected because Python (and therefore Sage) uses a zero-based index, rather than a one-based index. Thus, if there are 6 columns, they are indexed 0, 1, 2, 3, 4, 5. Regardless, our working has three free columns; thus, the null space should have dimension three.

Now, here is the definition of z taken straight from the theorem and translated to Python.

Seems to work, at least!

Dissecting the algorithm

Let's try to pick this apart a bit to get a better understanding. We'll begin by focusing on the case $i\in F$ where $(\vec{z}_j)_i$ is zero or one, effectively ignoring the last part. $$ (\vec{z_1}_j)_i = \begin{cases} 1 & \text{if } i \in F, i = f_j \\ 0 & \text{if } i \in F, i \neq f_j \\ \color{#eee}-b_{k,f_j} & \color{#eee}\text{if } i \in D, i=d_k \end{cases} $$ We might denote this first step by $\vec{z_1}$ and we can implement it by just suppressing the last part as follows:

We can now see a nice way to interpret this. For each vector, we've got to determine one of the free index locations to place a 1; we'll place a 0 in the other free index locations. For the $j^{\text{th}}$ vector, find the free index in the $j^{\text{th}}$ location and place the 1 there.

In our working example, the free indices (counting from one, like a normal person) are 3, 5, and 6. Thus, our first vector has a 1 in the $3^{\text{rd}}$ spot, our second vector has a 1 in the $5^{\text{th}}$ spot, and our third vector has a 1 in the $6^{\text{th}}$ spot.

Note that this portion alone guarantees linear independence of the vectors.

Span

Finally, we'd like to get a grip on why this set of vectors spans the null space. To do so, let's first consider how we find the null space by translating the reduced row echelon form of the matrix to a linear system. That is, we take our example matrix: $$ \left(\begin{array}{rrrrrr} 1 & 0 & 2 & 0 & -1 & 2 \\ 0 & 1 & 1 & 0 & 3 & -1 \\ 0 & 0 & 0 & 1 & 4 & -2 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right) $$ and translate it to the homeogenous system $$ \begin{align} x_1 + 2x_3 - x_5 + 2x_6 &= 0 \\ x_2 + x_3 + 3x_5 - x_6 &= 0 \\ x_4 + 4x_5 - 2x_6 &= 0 \end{align}. $$ Solving for the pivot variables $x_1$, $x_2$, and $x_4$ in terms of the free variables $x_3$, $x_5$, and $x_6$, we find $$ \begin{align} x_1 &= -2x_3 + x_5 - 2x_6 \\ x_2 &= -x_3 - 3x_5 + x_6 \\ x_4 &= -4x_5 + 2x_6 \end{align}. $$ From there, it's pretty easy to see the general form of a solution: $$ x_3 \left(\begin{matrix} -2 \\ -1 \\ \color{lightgray}1 \\ 0 \\ \color{lightgray}0 \\ \color{lightgray}0 \end{matrix}\right) + x_5 \left(\begin{matrix} 1 \\ -3 \\ \color{lightgray}0 \\ -4 \\ \color{lightgray}1 \\ \color{lightgray}0 \end{matrix}\right) + x_6 \left(\begin{matrix} -2 \\ 1 \\ \color{lightgray}0 \\ 2 \\ \color{lightgray}0 \\ \color{lightgray}1 \end{matrix}\right) $$ Note that each vector corresponds naturally to the free variable that appears as its scalar multiple - just as in the formula for $(\vec{z}_j)_i$ in our algorithm. Also note that entries corresponding to the free indices have been lightly grayed. This should allow us to distinguish between the free indices and the pivot indices fairly easily.

For the free indices, we have $$ (\vec{z}_j)_i = \begin{cases} 1 & \text{if } i \in F, i = f_j \\ 0 & \text{if } i \in F, i \neq f_j \\ \color{#ddd}-b_{k,f_j} & \color{#ddd}\text{if } i \in D, i=d_k \end{cases}. $$ Note that $j$ tells us which spanning vector we are constructing; we want the $i^{\text{th}}$ elment to be $1$ precisely when $i$ is the $j^{\text{th}}$ free index. For example, when $j=3$ we should examine our $3^{\text{rd}}$ element of $F=\{3,5,6\}$, which is 6. Thus, the $6^{\text{th}}$ element of the third vector should be 1. The other free index positions (3 and 5) should contain zeros.

For the pivot indices, we have $$ (\vec{z}_j)_i = \begin{cases} \color{#ddd}1 & \color{#ddd}\text{if } i \in F, i = f_j \\ \color{#ddd}0 & \color{#ddd}\text{if } i \in F, i \neq f_j \\ -b_{k,f_j} & \text{if } i \in D, i=d_k \end{cases}. $$ Again, $j$ indicates which spanning vector we are constructing so it makes sense that we should look in the column corresponding to the $j^{\text{th}}$ free variable. Now, however, we are solving an equation to find the contribution from the $j^{\text{th}}$ free variable to one of the pivot variables. That's where the $f_j$ comes from in the $-b_{k,f_j}$. To contribute to the $k^{\text{th}}$ pivot variable, we should be working in the $k^{\text{th}}$ row of the reduced row echelon form of the matrix. That's where the $k$ comes from in the $-b_{k,f_j}$.

In our working example, the second vector corresponds to the second free index, which is 5. Thus, the terms we see in that vector should come from the $5^{\text{th}}$ column of our matrix. If we are interested in, say, the third pivot variable, which is 4, we should look in the third row of the matrix. That's why we see the $-4$ in the fourth position of the second vector.