Arick Shao 邵崇哲

What Are the Foundations of Mathematics?

In mathematics, when one proves some statement to be true, one makes a certain set of assumptions and formally argues in favor of some conclusion. For example, if one is given a triangle, whose angles are given by \(a\), \(b\), and \(c\), then one can formally argue that \(a + b + c = 180^\circ\). In terms of using mathematics to model the real world, or applying mathematics toward the real world, these assumptions we make should hence be based on things we consider as "reasonable" or "apparent" by our real-world intuitions.

For example, the previous article, [Why Prove Things?], contained a simple proof of the fact that the sum of two even numbers are even. Note, however, that this proof took as implicit assumptions the basic properties of the number system, in particular the integers. For instance, we accepted without question the distributive property: given two integers \(x\) and \(y\), the sum \(2x + 2y\) is equal to \( 2 ( x + y ) \). In fact, from the beginning of primary school all the way through much of the undergraduate mathematics curriculum, we have taken the integers themselves (as well as the real numbers, etc.) for granted. Indeed, we view this informal description of numbers, their basic properties, and how basic algebraic operations act on them form the very foundations of elementary mathematics.

In many ways, these assumptions about the number system are reasonable. Such numbers accurately reflect many facets of everyday life, including counting objects and financial transactions. Most people work with numbers, as well as with expressions such as "\(2\)" or "\(8 + 7\)", without knowing, or needing to know, precisely what these objects are. In this sense, how numbers and operations on numbers work is accepted as matters of faith.

The Building Blocks of Mathematics

Although one may be happy to just believe in the number system, this needs not be the end of the story. What if, we instead assume, or believe in, something more basic than numbers, and we construct the number system from these more basic assumptions? Philosophically speaking, this would result in an even stronger and more plausible mathematical universe. It would also lend additional credence to the numbers as objects that naturally arise from this larger mathematical universe.

In advanced undergraduate-level mathematics, one may come across the Peano axioms. These are a small set of assumptions modeling the nature of counting, from which one can logically derive the natural numbers. Then, from these natural numbers, along with other more foundational assumptions, one can in fact logically construct the entire number system—the integers, rational numbers, and real numbers—as well as define all their basic algebraic operations. Furthermore, one can then prove all the basic properties of these numbers, such as the aforementioned distributive property, from these more basic assumptions.

You may wonder just how far down the rabbit hole one can go. Is there some most fundamental set of assumptions—the foundations of mathematics—from which all of mathematics as we know it can be reconstructed?

Since the late 1800s, several mathematicians attempted to create these foundations. One particularly notable person in this story is Georg Cantor, who in 1874 pioneered set theory. In this formulation, the most basic objects in the mathematical universe are sets, which informally represent "collections of things". For example, both the integers and the real number line can be viewed as sets (of numbers), as can all other objects found within mathematics. Although counterintuitive, any number itself could be viewed as a set (though it is not clear what elements the set \(2.5\) should contain), as can the \(+\) operation!

While sets provide a formal framework encapsulating all the objects of the mathematical universe, the notion of sets itself is treated informally, like the number system was in elementary mathematics. Though it was understood what sets intuitively represent, what sets formally and precisely are was left vague. In practice, one constructs a set by informally producing criteria for what belonged in this set (e.g., the set of all fruits except for oranges, the set of all even numbers). Because of its informal nature, Cantor's set theory is often called a naive set theory.

For future reference, we introduce just a bit of set notation. One of the most basic relations in set theory is that of set membership, whether one object (which is also a set) lies in another set. If a set \(x\) lies in another set \(y\) (i.e., \(x\) is an element of \(y\)), then we write \(x \in y\). For example, the number \(1\) lies in the set \(\mathbb{N}\) of all natural numbers, so we write \(1 \in \mathbb{N}\).

Russell's Paradox

For most practical purposes, this naive set theory, with its objects based on informally defined sets, seems to provide an adequate foundation for mathematics. Indeed, most of undergraduate and postgraduate mathematics can be thought of as being built upon this foundation. However, even before 1900, mathematicians knew that mathematics built on a naive set theory had serious issues.

Perhaps the simplest and most well-known issue is what we now call Russell's Paradox, named after Bertrand Russell. (Although, this is not historically the first issue with naive set theory.) The idea is rather simple, even if it is quite confusing at first glance. We consider the set, which we call \(A\) here, defined as containing precisely every set \(x\) that is not an element of itself (i.e., \(x \not\in x\)). In formal set-theoretic notation, this can be written as \[ A = \{ x \mid x \not\in x \} \text{.} \] Every statement in the above description is certainly valid, so there is no problem with defining such a set \(A\).

So far, so good. Now, we ask whether \(A\) itself is an element of \(A\). This is also a sensible question, with the answer being either "yes" or "no", so let us explore both possibilities.

  1. Suppose, on one hand, that \(A\) is an element of itself. Because \(A \in A\), then \(A\) must satisfy the criterion we used to define \(A\). In other words, \(A \not\in A\). As this conclusion directly contradicts what we assumed, we conclude that \(A\) could not have possibly been an element of itself to begin with.
  2. Suppose, on the other hand, that \(A\) is not an element of itself. Because \(A \not\in A\), then \(A\) must now violate the criterion we used to define \(A\). In other words, since it is not the case that \(A \not\in A\), we must have that \(A \in A\) instead. Again, this contradicts what we assumed, hence the case that our initial assumption \(A \not\in A\) holds is also not possible.

At this point, we have proved that the only two possibilities for our question, "Is \(A\) an element of itself?", result in contradictions and hence are impossible. In other words, we are completely out of options. The only conclusion is that all of mathematics, even at this most basic level, is one giant contradiction! We have, in less than four paragraphs, destroyed all of mathematics.

Let us now sit and fully appreciate just how devastating Russell's paradox really is. Note that none of this argument involves any background from calculus, trigonometry, or algebra—or, more generally, any knowledge from university or secondary school mathematics. Even more, nothing in the argument even involved numbers or arithmetic—no primary school background either. Indeed, the only background we used in Russell's paradox was a few of the most basic notions involving sets (which are quite intuitive in their own right). Thus, with the minimal possible background knowledge, and with an argument lasting less than five minutes, we have completely dismantled mathematics as we know it. Devastating indeed!

Formal Set Theories

As of now, mathematics is still around, and mathematicians still have jobs. Thus, Russell's paradox must have been resolved between the late 1800s and now. How were mathematicians able to do away with what is apparently a crushing contradiction?

The main point is that our informal characterization of sets was too lax, and we were able to define all kinds of sets without worries. In terms of Russell's paradox, we were already doomed as soon as we were able to define the set \(A\), since the contradiction followed logically from that point. (In fact, we would still be doomed if we were even able to define the set of all sets!) Thus, the answer to Russell's paradox is that we must be much more careful with how we construct sets, so that we do not shoot ourselves, as well as all of mathematics, in the foot.

All of this means that this naive set theory, with its informal treatment of sets, had to go. In its place, one had to conceive of a formal set theory, with very careful rules and axioms regarding what is and is not a set. There are multiple ways to do this, with the two most popular approaches being the following:

  1. Zermelo–Fraenkel set theory: This approach maintains the view that sets are the basic building blocks of all mathematical objects. For this, one specifies, using around nine axioms, what objects are actually sets. Objects that are not allowed to be sets through the Zermelo–Fraenkel axioms then do not exist in this universe and hence cannot be considered under this theory.
  2. Von Neumann–Bernays–Gödel set theory: In this alternative approach, one considers instead a larger universe of objects. However, not every possible object (called a class) in this expanded universe can be a set. Again, a small number of foundational axioms in this theory determine what classes are also sets.

At the end of the day, regardless of which approach one prefers (the two approaches are in fact functionally equivalent), the end result is the same. The logical argument that previously led us to Russell's paradox is now, under either theory, a theorem which states that \(A\) is not a set. Disaster averted!

The moral of the story is just how essential it is to have a very formal and rigorous description of what we do in mathematics, even down to the most basic foundations. By being just a bit lax with sets, the most basic objects around, we caused enough trouble to end mathematics.

Berry's Paradox

Finally, we discuss another apparent paradox of a different nature that again threatens all of mathematics. This Berry's paradox, named after a librarian at Oxford, can be described as follows.

Let \(B\) be the set of all numbers than can be described using fifty or less English words. Since there are only finitely many English words, there must only be finitely many ways one can write fifty or less English words. As a result, this set \(B\) must be finite. Moreover, as a finite set, \(B\) must have a largest number, which we call \(n\). However, we can now describe \(n+1\) as "one greater than the largest number that can be described using fifty or less English words". Since the above description certainly uses less than fifty English words, \(n + 1 \in B\), contradicting that \(n\) is the largest number in \(B\). Yet again, mathematics faces imminent destruction!

How can mathematics escape from this new existential threat? It turns out that the culprit is the vague use of language, in particular the imprecise nature of the English language. While the phrase "can be described" is perfectly viable for everyday conversation and prose, it is, on the other hand, ambiguous and devoid of meaning in the context of formal logic. As such, its use leaves us susceptible to nonsense.

The resolution to Berry's paradox, then, is that we must be much more precise and formal with the language we use in mathematics (whereas to resolve Russell's paradox, we have to exercise care in how we define sets). In particular, in our arguments, we must remain within the confines of formal logic—precise constructs such as "and", "or", "if/then", "for all"—and avoid ambiguous notions such as "can be described".

Why We Need Formal Foundations

When students initially encounter formal mathematics, with its proofs and insistence on rigor, then often (rightly) question why mathematicians adhere so strictly to formalities, especially when intuitions already make a solution seem obvious. While there are many reasons for being rigorous with proofs (see the discussion in [Why Prove Things?], for instance), Russell's and Berry's paradoxes add yet another compelling argument. If we fail to be careful and precise with even the most basic language or building blocks of mathematics, then we can very easily create contradictions that render all of mathematics meaningless. Thus, as a discipline based on formal logical reasoning, mathematics demands that we remain precise and rigorous, even at its most foundational levels.

Valid XHTML 1.0 Strict!Valid CSS!