The axiom of choice

This post is about the axiom of choice, arguably the most famous and the most controversial of the axioms underpinning the set-theoretic foundations of mathematics today. The axiom of choice is about making choices: choosing one element from each member of an arbitrary collection of nonempty sets. The axiom says that this can always be done, although it provides no recipe for doing so.

Before formally presenting the axiom we state some definitions. The treatment is essentially taken from here.

Definition: A function I from a set \Lambda onto a set X is said to index the set X by \Lambda. The set \Lambda is called the index set and X is the indexed set. For \lambda\in\Lambda we write x_\lambda for I(\lambda).

Definition: Let X be a nonempty set indexed by a set \Lambda. The cartesian product of X is defined to be the set \Pi _{\lambda\in\Lambda}x_\lambda of all functions f with domain \Lambda and codomain \cup X=\cup_{x\in X}x, satisfying the condition f(\lambda)\in x_\lambda.

Let us consider an example before we go further. We let X=\{\{a,b\},\{c,d\}\} be indexed by \Lambda=\{1,2\} where 1\to \{a,b\}, 2\to \{c,d\}. Now there are four functions possible from \Lambda to \{a,b,c,d\} which satisfy the definition:

The function f_1 for which f_1(1)=a, f_1(2)=c.
The function f_2 for which f_2(1)=b, f_2(2)=c.
The function f_3 for which f_3(1)=a, f_3(2)=d.
The function f_4 for which f_4(1)=b, f_4(2)=d.

The collection \{f_1,f_2,f_3,f_4\} is the cartesian product. Intuitively, the behavior of the function f_1 can be captured by simply writing (a,c): the first coordinate a and the second coordinate c record the function 1\to a, 2\to c. Similarly, the functions f_2,f_3,f_4 are captured by (b,c),(a,d),(b,d) respectively. Each ordered pair corresponds to exactly one function f_i, so the cartesian product may be represented simply as \{(a,c),(b,c),(a,d),(b,d)\}.
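The example can be reproduced mechanically. The following is a minimal sketch for finite families only; the names cartesian_product and as_tuple are made up for this illustration. It lists every choice function of the indexed family as a dictionary and then writes each one in coordinate form.

```python
from itertools import product

def cartesian_product(indexed_family):
    """Return every choice function of a finite indexed family as a dict.

    indexed_family maps each index (lambda) to a finite set x_lambda.
    """
    indices = sorted(indexed_family)
    # Sort each member set so the enumeration order is deterministic.
    member_lists = [sorted(indexed_family[i]) for i in indices]
    # Each tuple of picks selects one element per index.
    return [dict(zip(indices, picks)) for picks in product(*member_lists)]

def as_tuple(f):
    """Write a choice function in coordinate form, ordered by index."""
    return tuple(f[i] for i in sorted(f))

family = {1: {"a", "b"}, 2: {"c", "d"}}
functions = cartesian_product(family)
tuples = [as_tuple(f) for f in functions]
```

Here functions holds four dictionaries, one per f_i, and tuples holds the corresponding ordered pairs. If any member set were empty, the list would come out empty, since no function could satisfy f(\lambda)\in x_\lambda at that index.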

In fact, if \Lambda is at most countable, we can always take it to be a set of the form \{1,2,3,\cdots\} (which may or may not terminate) and index accordingly. The cartesian product can then be thought of equivalently as a collection of n-tuples or infinite sequences, where the ith coordinate is the value of the function at i and so lies in the set indexed by i. For example, if X=\{\{0,1\}\} and \Lambda=\mathbb{N}, with every index mapped to \{0,1\}, then the zero-one sequences correspond exactly to the requisite functions, and their collection forms the cartesian product.

Definition: Any function f which satisfies the conditions required to make it a member of a cartesian product is called a choice function.

We now give three formulations of the axiom of choice:

Axiom of choice:

1. For every nonempty set whose elements are nonempty sets and are indexed by a set I there exists a choice function.

2. If \{a_i\} is a family of nonempty sets, indexed by a nonempty set I, then there exists a family \{x_i\} with i\in I such that x_i\in a_i for each i\in I. (This corresponds roughly to our informal comment about choices in the first paragraph.)

3. The cartesian product of a nonempty collection of nonempty sets is nonempty.

Theorem: The three formulations are equivalent.

Proof: 1\Rightarrow 2: Let \{a_i\}_{i\in I} be a family of nonempty sets, indexed by the nonempty set I. Let S=\{a_i\mid i\in I\}. (Note that the elements of S are themselves sets.) We index S by itself, and consider the cartesian product of S. Since each a_i is nonempty and I\ne \emptyset, the set S is a nonempty set of nonempty sets. Hence by (1) there exists a choice function, i.e. a function f\colon S\to \bigcup S such that f(x)\in x for all x \in S. For i\in I let x_i=f(a_i). Then \{x_i\}_{i\in I} is a family with the required properties.

2\Rightarrow 1: Let S=\{a_i:i\in I\} be a nonempty set of nonempty sets, indexed by I. Since S is nonempty, so is I, and hence (2) gives a family \{x_i\} with x_i\in a_i for each i\in I. We define our choice function to be f:I\to\cup S where f(i)=x_i.

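1\Leftrightarrow 3: This is immediate from the definitions: the cartesian product is by definition the set of all choice functions, so it is nonempty exactly when a choice function exists. \Box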
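The step 1\Rightarrow 2 can be traced on a small finite family. This is only a sketch under stated assumptions: the function name and the sample data are invented for illustration, and the choice function f is given as an explicit table on S, which is possible here only because S is finite.

```python
def family_from_choice_function(a, f):
    """Build the family x_i = f(a_i) from a choice function f on S.

    a maps each index i to a nonempty set a_i.
    f maps each element of S = {a_i : i in I} to one of its members.
    """
    # S is a set of sets; frozenset makes the members hashable.
    S = {frozenset(a_i) for a_i in a.values()}
    assert all(f[s] in s for s in S), "f must be a choice function on S"
    return {i: f[frozenset(a_i)] for i, a_i in a.items()}

# Indices "i" and "k" name the same set, so S has only two elements.
a = {"i": {3, 5}, "j": {2, 9}, "k": {5, 3}}
f = {frozenset({3, 5}): 5, frozenset({2, 9}): 2}
x = family_from_choice_function(a, f)
```

Because S is indexed by itself, two indices that name the same set necessarily receive the same chosen element, as with "i" and "k" here.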

Due to this result, any of these three statements may be referred to as the axiom of choice. In fact, there are many other equivalent statements. The term ZFC is usually used to denote the Zermelo-Fraenkel axioms together with the axiom of choice, and these axioms form the basis of most of the mathematics done today.

Why is this axiom so relevant? Firstly, if one refuses to accept this axiom, one loses a lot of interesting results. Secondly, assuming the axiom of choice often makes proofs much easier. Many results such as the Cantor-Bernstein-Schroeder theorem were originally proved using the axiom of choice, although a “choice-free” proof was discovered later. Similarly, the axiom of choice makes undergraduate analysis easier by enabling one to say that f(x) is continuous at x=a if f(a_n) tends to f(a) for each sequence (a_n) that tends to a.
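To see where a choice hides in the sequence criterion for continuity, here is a sketch of the usual argument for a function f from the reals to the reals. The sets A_n and the number \varepsilon are notation introduced only for this sketch. Note that only countably many choices are made.

```latex
Suppose $f$ is not continuous at $a$. Then there is an $\varepsilon > 0$
such that, for every $n \ge 1$, the set
\[
  A_n = \{\, x \in \mathbb{R} : |x - a| < 1/n
        \ \text{and}\ |f(x) - f(a)| \ge \varepsilon \,\}
\]
is nonempty. A choice function for the family $\{A_n\}_{n \ge 1}$ gives a
sequence $(a_n)$ with $a_n \in A_n$ for every $n$. Then $a_n \to a$ because
$|a_n - a| < 1/n$, but $|f(a_n) - f(a)| \ge \varepsilon$ for all $n$, so
$f(a_n)$ does not tend to $f(a)$.
```

Read contrapositively: if f(a_n) tends to f(a) for every sequence (a_n) tending to a, then f is continuous at a.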

Are there any negative issues with the axiom of choice? Well, firstly it is not constructive and doesn’t actually tell us the choices to make. It just guarantees that some choice exists. In one formulation it tells us that the real numbers can be well ordered, but it gives us no recipe for doing so. In other words, it doesn’t actually construct a well order of the reals for us. Secondly, some counterintuitive results such as the Banach-Tarski paradox are true in ZFC, and many mathematicians find this troubling and “fishy”.
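By contrast, when every set in the family already carries a known well order, a recipe is available and no axiom is needed. A minimal sketch, assuming the sets are finite nonempty sets of natural numbers (the function name and the sample family are hypothetical): the rule “take the least element” is an explicit choice function.

```python
def least_element_choice(family):
    """Choose the least element of each set in an indexed family.

    family maps each index to a nonempty set of natural numbers.
    """
    return {index: min(members) for index, members in family.items()}

family = {"evens": {4, 2, 8}, "primes": {7, 3, 5}, "single": {10}}
choice = least_element_choice(family)
```

The same rule works for any family of nonempty sets of natural numbers, finite or not, because the natural numbers are well ordered. It is exactly this kind of uniform rule that is missing for arbitrary sets of reals.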

Thirdly, and probably most importantly, there is history. After it was introduced (in another form) in 1904 by Zermelo, many mathematicians wondered whether or not the axiom was consistent with ZF. In other words, people asked whether accepting it would lead to contradictions. Decades passed before Gödel showed in 1938 that ZFC is consistent provided ZF itself is, so the axiom cannot be disproved from ZF. Another doubt was whether the axiom of choice was independent of ZF. Only in 1963 did Cohen prove that the axiom of choice cannot be derived from ZF (again assuming ZF is consistent), and together with Gödel’s result this implied that the axiom of choice is independent of ZF. During this long period from 1904 to 1963, many bizarre results such as the Banach-Tarski paradox and the existence of non-Lebesgue measurable sets were found. All this sowed a suspicion in people’s minds as to whether this axiom was “false” at some fundamental level. As a result, mathematicians began to pay close attention to results in which the axiom was used, and to try to devise proofs which avoided the axiom as much as possible. This habit lessened somewhat after the independence results, but it has not been dropped by everyone even today.

The fact of the matter is that while rejecting the axiom of choice leads to many weird results, so does accepting it. See the chapters entitled “Disasters without Choice” and “Disasters with Choice” in this book for details. However, in the long run the advantages of accepting the axiom outweigh the advantages of rejecting it, primarily because the sheer quantity of beautiful results flowing from the axiom of choice provides substance to mathematics as a whole.



