
Section 4.1 Another look at the Cantor set

In the last chapter we met (in an as yet uncompleted section) a strange set called the Cantor set. As it turns out, this is the prototypical self-similar set, so let's take a closer look.

Cantor constructed his set in the 1880s to help him understand a problem in Fourier series. While the set seemed unnatural to mathematicians of the time, it has become a central example in real analysis. Cantor’s construction is as follows. Start with the unit interval \(I = [0,1],\) the set of all real numbers between 0 and 1 inclusive. Remove the open middle third \(\left(\frac{1}{3},\frac{2}{3}\right)\) of the interval \(I\) to obtain the two intervals \(I_1 = \left[0,\frac{1}{3}\right]\) and \(I_2 = \left[\frac{2}{3},1\right].\) Then remove the open middle thirds of the intervals \(I_1\) and \(I_2\) to obtain the intervals \(I_{1,1} = \left[0,\frac{1}{9}\right],\) \(I_{1,2} = \left[\frac{2}{9},\frac{1}{3}\right],\) \(I_{2,1} = \left[\frac{2}{3},\frac{7}{9}\right],\) and \(I_{2,2} = \left[\frac{8}{9},1\right].\) Repeating this process inductively, we obtain \(2^n\) intervals of length \(1/3^n\) at the \(n^{\text{th}}\) stage. The Cantor set \(C\) consists of all those points in \(I\) which are never removed at any stage. More precisely, if \(C_n\) denotes the union of all of the intervals left after the \(n^{\text{th}}\) stage of the construction, then

\begin{equation*} C = \bigcap_{n=1}^{\infty} C_n. \end{equation*}

This process is illustrated in Figure 4.1.1.

Figure 4.1.1. Construction of the Cantor set
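The construction can also be carried out numerically. The following is a minimal Python sketch, not part of the original text: the function name `cantor_stage` is an illustrative choice, and exact rational arithmetic (`fractions.Fraction`) is assumed so that the endpoints come out as the same fractions that appear above.

```python
from fractions import Fraction


def cantor_stage(n):
    """Return the closed intervals that make up C_n as (left, right) pairs.

    Stage 0 is the unit interval; each later stage removes the open
    middle third of every interval from the previous stage.
    """
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            next_intervals.append((left, left + third))    # keep the left third
            next_intervals.append((right - third, right))  # keep the right third
        intervals = next_intervals
    return intervals


# The second stage reproduces the four intervals listed in the text.
for left, right in cantor_stage(2):
    print(f"[{left}, {right}]")
```

Calling `cantor_stage(n)` for larger `n` returns \(2^n\) intervals, each of length \(1/3^n\text{,}\) in left-to-right order.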

It’s clear that \(C\) should be self-similar, since the effect of the construction on the intervals \(I_1\) and \(I_2\) is the same as the effect on the whole interval \(I,\) but on a smaller scale. Thus \(C\) consists of two copies of itself scaled by the factor \(1/3.\)

The Cantor set has many non-intuitive properties. In some sense, it seems very small; if we were to assign a “length” to it, that length would have to be zero. Indeed, by its very construction it is contained, for every \(n\text{,}\) in \(2^n\) intervals of length \(1/3^n.\) Thus the length of \(C_n\) is \(2^n/3^n,\) which tends to zero as \(n\rightarrow \infty .\) Since \(C\) is contained in \(C_n\) for all \(n\text{,}\) the length of \(C\) must be zero. It might even appear that there is nothing left in \(C\) after tossing so much out of the original interval \(I.\) In reality, the Cantor set is a very rich set with infinitely many points. Recall that only open intervals are removed during the construction. Thus all of the infinitely many endpoints remain. For example, \(1/3,\) \(2/3,\) and \(80/81\) are all in \(C.\) There are still many more points in \(C,\) however.
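To get a feel for how quickly the length \(2^n/3^n\) shrinks, here is a small Python sketch. It is only an illustration: the helper names `stage_length` and `first_stage_below` are hypothetical and do not come from the text.

```python
from fractions import Fraction


def stage_length(n):
    """Total length of C_n: 2^n intervals, each of length 1/3^n."""
    return Fraction(2, 3) ** n


def first_stage_below(eps):
    """Smallest n for which the total length of C_n is less than eps."""
    n = 0
    while stage_length(n) >= eps:
        n += 1
    return n


for n in (1, 2, 5, 10):
    print(n, stage_length(n), float(stage_length(n)))

# Stage at which less than one thousandth of the unit interval remains.
print(first_stage_below(Fraction(1, 1000)))
```

Because the lengths decay geometrically, any positive threshold is eventually undercut, which is exactly why the length of \(C\) itself must be zero.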

There is a general technique for finding points of the Cantor set. The first stage in the construction consists of two intervals; for addressing purposes, relabel them \(I_0\) on the left and \(I_2\) on the right (these are the intervals called \(I_1\) and \(I_2\) above). Choose one and discard the other. Now the interval we chose, say \(I_0\) for concreteness, contains two disjoint intervals, \(I_{0,0}\) and \(I_{0,2},\) in the next stage of the construction. Choose one of those and discard the other. If we continue this process inductively, we obtain a nested sequence of closed intervals whose lengths \(1/3^n\) shrink to zero, so the sequence collapses down to a single point in the Cantor set. In this way, each sequence of zeros and twos forms an address for a point in the Cantor set; there is a one-to-one correspondence between the points and the addresses.

The addressing scheme for the Cantor set allows us to prove a number of interesting properties. To see that there are points in the Cantor set that are not endpoints of any of the removed intervals, simply note that those endpoints correspond exactly to addresses that end in all zeros or all twos, but there are many other possible addresses; \((0,2,0,2,0,2,\ldots)\) comes to mind.
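The choose-and-discard procedure can be sketched in Python as follows. This is an illustration under stated assumptions rather than anything from the text: the function name `address_interval` is hypothetical, an address is represented as a finite tuple of the digits 0 and 2, and the function returns the closed interval reached after following that finite prefix.

```python
from fractions import Fraction


def address_interval(address):
    """Follow a finite address of 0s and 2s and return the interval reached.

    A 0 selects the left third of the current interval and a 2 selects
    the right third, so each digit shrinks the interval by a factor of 3.
    """
    left, length = Fraction(0), Fraction(1)
    for digit in address:
        if digit not in (0, 2):
            raise ValueError("addresses use only the digits 0 and 2")
        length /= 3
        left += digit * length  # digit 0 stays put, digit 2 skips two thirds
    return left, left + length


# Longer and longer prefixes of (0, 2, 0, 2, ...) give nested intervals.
for k in range(1, 5):
    lo, hi = address_interval((0, 2) * k)
    print(f"[{lo}, {hi}]  length {hi - lo}")
```

Each printed interval sits inside the previous one, and the lengths shrink by a factor of 9 per line, so the intervals close in on a single point.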

From a more advanced perspective, the addressing scheme allows us to show that the Cantor set is uncountable, since the set of addresses certainly is.
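One standard way to justify the uncountability of the set of addresses, offered here as a sketch and not necessarily the argument the text has in mind, is Cantor's diagonal argument. Suppose the addresses could be listed as \(a^{(1)}, a^{(2)}, a^{(3)}, \ldots\text{,}\) where \(a^{(k)}_n\) denotes the \(n^{\text{th}}\) digit of the \(k^{\text{th}}\) address. Define a new address \(b\) by swapping each diagonal digit:

\begin{equation*} b_n = 2 - a^{(n)}_n, \qquad n = 1, 2, 3, \ldots \end{equation*}

Then \(b\) is a sequence of zeros and twos that differs from \(a^{(n)}\) in its \(n^{\text{th}}\) digit, so \(b\) appears nowhere in the list. Hence no list of addresses can be complete.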

To find specific points in the Cantor set, we might regard the addresses as expansions in base 3, also called ternary expansions. To see this, note that everything in the first third of the unit interval can be written with a leading \(0\) while everything in the last third of the unit interval can be written with a leading \(2\text{.}\) Each of those breaks up in a similar manner, as shown in Figure 4.1.2.

Figure 4.1.2. Addresses for the Cantor set

Thus, we now see that the Cantor set is exactly the set of all numbers in the unit interval that have a ternary expansion using only zeros and twos. Furthermore, one particular point in the Cantor set that is not one of the removed endpoints is

\begin{align*} 0_{\dot 3}\overline{02} \amp= \sum_{n=1}^{\infty} \frac{2}{3^{2n}} = 2\sum_{n=1}^{\infty} \left(\frac{1}{9}\right)^n\\ \amp= 2\frac{1/9}{1-1/9} = 2\frac{1/9}{8/9} = 2\times\frac{1}{8} = 1/4. \end{align*}
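The computation above can be checked with a short Python sketch. Both function names are hypothetical. `repeating_ternary` sums the geometric series for a repeating block of ternary digits. `in_cantor_set` is a membership test for rational numbers that relies on the self-similarity noted earlier: a point of \(C\) must lie in the left or right third, and rescaling that third back to the unit interval must again give a point of \(C\text{.}\) For a rational input the rescaled values can only take finitely many values, so the loop stops once a value repeats.

```python
from fractions import Fraction


def repeating_ternary(block):
    """Value of the ternary expansion 0.(block)(block)(block)... as a fraction."""
    numerator = 0
    for digit in block:
        numerator = 3 * numerator + digit  # read the block as a base-3 integer
    # Geometric series: m/3^k + m/3^(2k) + ... = m / (3^k - 1)
    return Fraction(numerator, 3 ** len(block) - 1)


def in_cantor_set(x):
    """Decide whether a rational number x belongs to the Cantor set."""
    x = Fraction(x)
    seen = set()
    while x not in seen:
        seen.add(x)
        if 0 <= x <= Fraction(1, 3):
            x = 3 * x        # zoom in on the left third
        elif Fraction(2, 3) <= x <= 1:
            x = 3 * x - 2    # zoom in on the right third
        else:
            return False     # x was removed at some stage
    return True              # the orbit cycles without ever being removed


quarter = repeating_ternary((0, 2))
print(quarter, in_cantor_set(quarter))
print(in_cantor_set(Fraction(1, 2)))
```

The same helper accepts repeating blocks of any length, so it can be used to evaluate other addresses once they have been chosen.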

Use repeating blocks of length three to find three different numbers in the Cantor set that are not endpoints of any of the removed intervals.

A major question that we will address later in the book asks, “What is the dimension of the Cantor set?” Certainly, it is too small to be considered a one-dimensional set; it is just a scattering of points along the unit interval with length zero. It is uncountable, however, so perhaps it is too large to be considered zero-dimensional. We will develop a notion of “fractal dimension” that quantitatively captures this in-betweenness.