11 January 2022

Infinity and infinite sets are some of the most captivating ideas in mathematics. They stretch our minds and challenge us to imagine things with no parallel in reality.

Unfortunately, this otherworldliness also means it can be hard to get an accurate intuition of what they are like. For example, the nature of the real numbers and their uncountable size are concepts which are frequently misunderstood—even famous YouTube channels Numberphile and Vsauce have made mistakes in their explanations.

So what did they get wrong about the reals? And is there a better way to think about them? What makes the reals really real?

Let’s start with some definitions. We say a set is countably infinite if its elements can be arranged in an infinitely long list; it has a first item, a second item, and so on ad infinitum. Mathematically, we define this as having a one-to-one pairing with the set of whole numbers \(\mathbb{N} = \{1, 2, 3, ...\}\); each element can be paired with one unique whole number, just like the numbering in an (infinite) list.

We say a set is uncountably infinite, or uncountable, if it is “bigger” than \(\mathbb{N}\). By bigger here I mean that there are too many elements in the set to pair uniquely with whole numbers. In other words, any infinite list of its elements will be incomplete—it’s simply too infinite!

I won’t show it here, but we can prove that the set of real numbers, \(\mathbb{R}\) (containing “all” numbers: fractions, decimals, irrationals, etc.), is uncountably infinite.1

If we think about what real/decimal numbers are like, then this is where we might be tempted to make a misguided explanation to understand this. It goes like this:

“Of course we can’t list out the real numbers. Let’s take just the interval \([0, 1]\). Our first number is 0; but what’s our next one? Is it 0.001? 0.000001? We can always find something smaller than what we chose before, so we can’t list the numbers starting from 0.”

This is the kind of explanation given in two popular videos about infinity: Numberphile’s Infinity is bigger than you think; and Vsauce’s The Banach–Tarski Paradox. It’s how I understood it for a while, but unfortunately, it’s misleading.

This property of always being able to find numbers between other ones, is called being “densely ordered.” \(\mathbb{R}\) is densely ordered—but this is not unique to it. There are smaller sets which can also be densely ordered.

Take the set of rational numbers, called \(\mathbb{Q}\), which is the set of numbers which can be written as a fraction of whole numbers. Given two different rationals \(p, q\), we can always find another rational number between them:

\[p \lt \frac{p+q}{2} \lt q.\]

This should give us the same problem as with the reals: we can’t list them out in order, so they must be uncountable.

But actually, they’re not: we can list them, with a clever trick of writing the rationals out in a grid where going right increases the bottom number, and going down increases the top. If we take a zigzag path through the numbers, skipping any duplicates, this gives us an infinite list which includes every rational.

Hence, \(\mathbb{Q}\) is a countably infinite set, while also being densely ordered.

(If we wanted, we could also make a non-dense ordering on \(\mathbb{Q}\). If we invent some other idea of order based on the list, so that a number is “less than” another if it is earlier in the list, then this order is not dense: there aren’t any numbers between two adjacent elements in the list. This works with any countably infinite set.)

So, if the existence of a dense ordering of \(\mathbb{R}\) isn’t unique to it, what is?

To answer this question, we have to look at least upper bounds. Probably one of the most self-explanatory terms in mathemathics, the least upper bound for a set is a number which:

  • is an upper bound, i.e. is greater or equal to all elements in the set
  • and is the least of them, i.e. any smaller number is no longer an upper bound.

For example, the real interval \([0,1]\) (inclusive) has a LUB of 1:

  • 1 is greater or equal to all numbers in the interval
  • but we can’t go any lower than 1 without 1 being greater than it.

In this case, the LUB is the same as the set’s maximum. But the motivation for least upper bounds is that they solve some tricky problems with finding the maximum of some sets.

To see this, let’s remove 1 to make the open-ended interval \([0,1)\). Despite not going off to infinity, this set has no maximum element. It would be 1 if it were included, but since it isn’t, we run into a similar problem to before: there is no number just before 1 we could choose as the maximum. Fortunately, the LUB does exist, and is still 1. It’s outside the set, but that’s fine for a LUB.

Now for what we’ve been waiting for: the major property which distinguishes \(\mathbb{R}\) from smaller sets like \(\mathbb{Q}\) is that all subsets of \(\mathbb{R}\), where they have an upper bound, always have a least upper bound.

For a counter-example, let’s take \(\mathbb{Q}\) and define a subset

\[L = \{3, 3.1, 3.14, 3.141, ...\},\]

made up of all finite decimal fragments of \(\pi\). It has upper bounds, of course; any number greater than \(\pi\) is an upper bound. However, it does not have a least upper bound in \(\mathbb{Q}\)! For any upper bound, we can always find a smaller one by picking a rounded up truncation of \(\pi\) which goes below it; i.e. a number from the set

\[U = \{4, 3.2, 3.15, 3.142, ...\}.\]

The “missing” LUB of \(L\) is \(\pi\) itself, which cannot be written as a fraction and is hence not in \(\mathbb{Q}\). The way this set is defined reveals a kind of “hole” or “gap” in \(\mathbb{Q}\), where we wish some extra number existed.2

There are no such holes in \(\mathbb{R}\). All subsets like \(L\) will always have a real least upper bound. This is what we call “the completeness of the reals,” and it is essential to all mathematics which deals closely with real numbers.

So there we have it. We learned that dense orderings, despite being seemingly unlistable, can be made on any countably infinite set via pairing with \(\mathbb{Q}\). What makes the reals really real is the absense of these strange holes that we can find at the boundaries of sets.

  1. We prove this by assuming the reals can be listed, and then constructing a new number which we can show isn’t in the list. See this video by rootski for the details. 

  2. This method of finding gaps in \(\mathbb Q\)—taking two sets which squeeze a single real number between them—is so good that real numbers can actually defined by these pairs of sets, called Dedekind cuts. In this framework, the number \(\pi\) is uniquely identified by the pair of sets \((L, U)\). The set of real numbers is the “closure” of the rationals under Dedekind cuts: the set we end up with when we add in all the gaps we find between cuts.