Monthly Archives: September 2012

The Cantor Set

This is a post regarding the basic properties of the Cantor set.

We start with the definition: The Cantor set is obtained by first constructing a sequence (C_n) of closed sets and then taking the intersection of the sets in this sequence. The sequence is constructed as follows:

1. Start with [0,1] and remove the middle third, i.e. the interval (\frac{1}{3},\frac{2}{3}) to obtain the set C_1=[0,\frac{1}{3}]\cup[\frac{2}{3},1]. So C_1 contains 2 disjoint closed intervals each of length \frac{1}{3}.

2. Next remove the middle third of each of these two intervals leaving C_2=[0,\frac{1}{9}]\cup[\frac{2}{9},\frac{1}{3}]\cup[\frac{2}{3},\frac{7}{9}]\cup[\frac{8}{9},1] consisting of 2^2 disjoint closed intervals each of length \frac{1}{3^2}.

3. Assuming C_n has been constructed and consists of 2^n disjoint closed intervals each of length \frac{1}{3^n}, remove the middle thirds of all these intervals to obtain C_{n+1} consisting of 2^{n+1} disjoint closed intervals each of length \frac{1}{3^{n+1}}.

By induction we have our sequence (C_n).

Now, we define the Cantor set C as C=\cap_{n=1}^\infty C_n.

We now describe the properties of the Cantor set:

1. In the usual topology on \mathbb{R} the Cantor set is closed (being an intersection of closed sets). Moreover since it is bounded so by the Heine Borel theorem it is compact. What’s extraordinary is that every compact metric space is a continuous image of the Cantor set! The proof may be found here.

2. The Cantor set is uncountable. We prove this by considering the ternary expansion of all numbers in [0,1]. All such numbers may have the form 0.a_1a_2\cdots with a_i\in\{0,1,2\} (including 1=0.22\cdots). We claim that x\in C iff x has a ternary expansion of the form 0.a_1a_2\cdots with a_i\in\{0,2\}.

To prove this consider the construction of C through the ternary lens: In ternary the construction of C_1 involves removing all numbers in (0.011\cdots,0.2) from [0.00\cdots,0.22\cdots]. This shows that in C_1 every number has a ternary expansion of the form 0.a_1a_2\cdots with a_1\in\{0,2\} and a_i\in\{0,1,2\}\forall i>1. Conversely every number of this form is in C_1. Likewise the construction of C_2 involves removing all numbers in (0.0011\cdots,0.02) and in (0.2011\cdots,0.22) from C_1. This shows that in C_2 every number has a ternary expansion of the form 0.a_1a_2\cdots with a_1,a_2\in\{0,2\} and a_i\in\{0,1,2\}\forall i>2. Again conversely every number of this form is in C_2. Continuing in this way we can conclude that x\in C_n iff x=0.a_1\cdots a_na_{n+1}a_{n+2}\cdots with a_1\cdots a_n\in\{0,2\} and a_i\in\{0,1,2\}\forall i>n. Hence by definition of C our claim is established.

Now assume that C is countable and has been enumerated in the list described below:

0.d_1^{(1)}d_2^{(1)}d_3^{(1)}\cdots
0.d_1^{(2)}d_2^{(2)}d_3^{(2)}\cdots
0.d_1^{(3)}d_2^{(3)}d_3^{(3)}\cdots
\cdots

Here each d_i^{(j)}\in \{0,2\}.

Now let x=0.a_1a_2a_3\cdots where a_i=0 if d_i^{(i)}=2 and a_i=2 otherwise. Then x\notin C despite being of the requisite ternary form. Hence C is uncountable.

3. The Cantor set has Lebesgue measure zero. We define a set A to be of Lebesgue measure zero if \forall \epsilon>0\exists a sequence of intervals (I_n) such that A\subset\cup_{n=1}^\infty I_n and \sum_{n=1}^\infty l(I_n)<\epsilon. Here if the end points of I_n are a_n and b_n (with a_n\le b_n) then l(I_n)=b_n-a_n. (The proof that this definition is consistent with the Lebesgue measure may be found in this book.) Now given \epsilon>0 choose n so that (\frac{2}{3})^n<\epsilon. Since C\subset C_n and the total length of C_n is (\frac{2}{3})^n, so it is clear that C is a set of Lebesgue measure zero. Together with point 2, we see that this establishes the existence of uncountable sets of zero measure!

Advertisements

Leave a comment

Filed under Measure Theory, Real Analysis, Set theory

The axiom of choice

This post is about the axiom of choice, arguably the most famous and the most controversial of the axioms underpinning the set theoretic foundations of mathematics today. The axiom of choice is about making choices: choosing an element each from an arbitrary collection of nonempty sets. The axiom basically says that this can always be done, although it provides no recipe for doing so.

Before formally presenting the axiom we state some definitions. The treatment is essentially taken from here.

Definition: A function I from a set \Lambda onto a set X is said to index the set X by \Lambda. The set \Lambda is called the index and X is the indexed set. If I(\lambda)=x, we write x_\lambda for I(\lambda).

Definition: Let X be a nonempty set indexed by a set \Lambda. The cartesian product of X is defined to be the set \Pi _{\lambda\in\Lambda}x_\lambda of all functions f with domain \Lambda and codomain \cup X=\cup_{x\in X}x, satisfying the condition f(\lambda)\in x_\lambda.

Let us consider an example before we go further. We let X=\{\{a,b\},\{c,d\}\} be indexed by \Lambda=\{1,2\} where 1\to \{a,b\}, 2\to \{c,d\}. Now there are four functions possible from \Lambda to \{a,b,c,d\} which satisfy the definition:

The function f_1 for which f(1)=a, f(2)=c.
The function f_2 for which f(1)=b, f(2)=c.
The function f_3 for which f(1)=a, f(2)=d.
The function f_4 for which f(1)=b, f(2)=d.

The collection \{f_1,f_2,f_3,f_4\} is the cartesian product. Now intuitively the behavior of the function f_1 can be captured by simply writing (a,c) since the fact that the first coordinate corresponds to a and the second to c indicates the function 1\to a,2\to c. Similarly the behavior of the function f_2,f_3,f_4 are also captured by (b,c),(a,d),(b,d) respectively. Note that a particular ordered pair always corresponds to a unique function f_i. So the cartesian product may be simply represented as \{(a,c),(b,c),(a,d),(b,d)\}.

In fact if \Lambda has at most countable number of elements within it, we can always consider it as an subset of the form \{1,2,3\cdots\} (which may or may not terminate), and index appropriately. Then the cartesian product obtained can be equivalently thought of as a collection of n-tuples or infinite sequences where the ith coordinate corresponds to the image of i in the indexing. For example if X=\{\{0,1\}\} and \Lambda=\mathbb{N} then all zero-one sequences correspond to the requisite functions and their collection forms the cartesian product.

Definition: Any function f which satisfies the conditions required to make it a member of a cartesian product is called a choice function.

We now give three formulations of the axiom of choice:

Axiom of choice:

1. For every nonempty set whose elements are nonempty sets and are indexed by a set I there exists a choice function.

2. If \{a_i\} is a family of nonempty sets, indexed by a nonempty set I, then there exists a family \{x_i\} with i\in I such that x_i\in a_i for each i\in I. (This corresponds roughly to our informal comment about choices in the first paragraph.)

3. The cartesian product of a nonempty collection of nonempty sets is nonempty.

Theorem: The three formulations are equivalent.

Proof: 1\Rightarrow 2: Let \{a_i\}_{i\in I} be a family of sets, indexed by the nonempty set I. Let S=\{a_i\mid i\in I\}. (Note that the elements of S are themselves sets.) We index S by itself, and consider the cartesian product of S. By the assumption and by I\ne \emptyset, the set S is a nonempty set of nonempty sets. Hence owing to (1) there exists a choice function, i.e. a function f\colon S\to \bigcup S such that f(x)\in x for all x \in S. For i\in I let x_i=f(a_i). Then \{x_i\}_{i\in I} is a family with the required properties.

2\Rightarrow 1: Let S=\{a_i:i\in I\} be a nonempty set whose elements are themselves nonempty where S is indexed by I. By (2) we have a family x_i with x_i\in a_i. We define our choice function to be f:I\to\cup S where f(i)=x_i.

1\Leftrightarrow 3: This is obvious by the definitions.\Box

Due to this result any of these three statements may be referred to as the axiom of choice. In fact, there are many other equivalent statements. The term ZFC is usually used to denote the Zermelo Fraenkel axioms together with the axiom of choice, and these axioms form a basis of most of the mathematics done today.

Why is this axiom so relevant? Firstly if one refuses to accept this axiom, one loses a lot of interesting results. Secondly, many times assuming the axiom of choice is so much easier. Many results such as the Cantor-Berstein-Schroeder theorem were proved originally using the axiom of choice, although later a “choice-free” proof was discovered. Similarly the axiom of choice makes undergraduate analysis easier by enabling one to say that f(x) is continuous at x=a if f(a_n) tends to f(a) for each sequence (a_n) that tends to a.

Are there any negative issues with the axiom of choice? Well, firstly it is not constructive and doesn’t actually tell us the choices to make. It just guarantees that some choice exists. In one formulation it tells us that the real numbers can be well ordered but it gives us no recipe of doing so. In other words, it doesn’t actually construct a well order of the reals for us. Secondly, some counter intuitive results such as the Banach Tarsky paradox are true in ZFC and many mathematicians find this troubling and “fishy”.

Thirdly, and probably most importantly, there is history. After it was introduced (in another form) in 1904 by Zermelo, many mathematicians wondered whether or not the axiom was consistent with ZF. In other words people asked whether accepting it would lead to contradictions. Many decades passed before in 1938 Godel showed that ZFC is indeed consistent. Another doubt was whether the axiom of choice was independent of ZF. It was only by 1963 that Cohen proved that the axiom of choice cannot derived from ZF and so together with Godel’s result this implied that the axiom of choice is independent of ZF. During this long period from 1904 to 1963, many bizzare results such as the Banach Tarsky paradox, existence of non-Lebesgue measurable sets etc were found. All this sowed a suspicion in people’s minds as to whether this axiom was “false” at some fundamental level. As a result mathematicians began to pay close attention to results in which the axiom was used, and to try and device proofs which avoided the axiom as much as possible. This habit somewhat lessened after the independence results, but it has not been dropped by everyone even today.

The fact of the matter is that while rejecting the axiom of choice leads to many weird results, so does accepting it. See the chapters entitled “Disasters without Choice” and “Disasters with Choice” in this book for details. However in the long run, the advantages of accepting the axiom outweigh the advantages of rejecting it primarily because the sheer quantity of beautiful and aesthetic results flowing from the axiom of choice provide substance to mathematics as a whole.

3 Comments

Filed under Miscellaneous, Set theory

The Zermelo Fraenkel axioms

In this post we plan to discuss the Zermelo Fraenkel axioms of set theory (a term which we will abbreviate as ZF). These axioms are used by most mathematicians as the pillars on which theorems and lemmas are build. Although the axioms are named after the mathematicians Zermelo and Fraenkel, contributions from Skolem too helped in evolving the consensus on what the basic axioms ought to be. A detailed history of the evolution of these axioms and of set theory in general may be found here.

We will take a naive approach towards the basic underlying principles of logic used to illustrate the axioms. Our overall goal is to give an essential idea, and not describe a rigorous formal system. Note that in ZF terms always denote sets. There is no consensus on the exact wordings and the sequence of the axioms. We are following the treatment given in this book.

1. Axiom of extensionality: If a and b have the same elements then a=b.

It is easy to prove using the underlying logic that equal sets have the same elements: Suppose x=y. If t\in x then by substitution t\in y. So all elements of x are in y. Similarly all elements of y are in x. Hence both x and y have the same elements. The above axiom is the converse. Therefore we can conclude that a=b iff a and b have the same elements. This also leads to the following theorem:

Theorem: a=b\iff (a\subset b\land b\subset a).
Proof: Suppose that a=b. Now for x\in a since it follows that x\in a (this is not a misprint) so a\subset a. By substitution we have a\subset b. Similarly we have b\subset a. Conversely if a\subset b and b\subset a then clearly \forall x, x\in a\iff x\in b and so a=b.\Box

2. Axiom of the empty set: A set \emptyset having no elements exists.

This axiom may be thought of as redundant in certain versions of ZF, where it may be deduced from other axioms. By a little piece of vacuous reasoning it follows that the empty set is a subset of every set.

3. Axiom of pairing: For every x and y, the pair set \{x,y\} exists.

We can prove the following theorem with the aid of this axiom:

Theorem: For every x, the set \{x\} exists.

Proof: Substituting x for y in the above axiom yields a set which by the axiom of extensionality equals \{x\}.\Box

It is also now clear that if y\subset\{x\} then y=\emptyset or y=\{x\}. For if y\ne \emptyset and t\in y then t\in \{x\} and as \{x\} is a singleton so t=x. But then x\in y and so \{x\}\subset y following which y=\{x\}.

4. Axiom of union: The elements of \cup X are the elements of the elements of X.

Hence the elements of \cup \{x,y\} are not x and y. Instead they are the elements of x and y. In other words \cup\{x,y\} is the familiar set x\cup y.

5. Axiom of power set: The elements of \mathcal{P}(X) are precisely the subsets of X.

We do not define other set functions, like intersection and Cartesian product since their existence follows from the next axiom.

6. Axiom of separation: If \psi(x) is a formula and X is a set then the set \{x\in X: \psi(x)\} exists.

It is now an easy matter to define intersection of the sets in X as Y=\cap X where Y=\{x\in\cup X:\forall t(t\in X\rightarrow x\in t)\}, provided X\ne \emptyset. To define the Cartesian product note that we define the ordered pair (a,b) as \{\{a\},\{a,b\}\} and then X\times Y=\{(x,y):x\in X\land y\in Y\}. We cannot however define complement because of the following theorem. It says that there is no universe (a set which has everything).

Theorem: There is no set U such that \forall x(x\in U).

Proof: Suppose that there exists such a set U. Now, by the axiom of separation we have the set X=\{z\in U:z\notin z\}. Now either we have X\in X or X\notin X. In case X\in X then by definition of X we have X\notin X. In case X\notin X then again by definition of X we have X\in X. Both cases lead to contradictions.\Box

In naive set theory where any definable collection (such as \{z:z\notin z\}) was understood as a set there is no way to clear the anomaly which follows from the above proof. One of the reasons for the abandonment of naive set theory in favor of ZF was in fact this paradox. In ZF the set \{z:z\notin z\} does not make sense as the axiom of separation specifically asks for a set X from which to pick the element z from. As we don’t have a universal set in ZF so the paradox is entirely sidestepped. The fact that there is no universal set in ZF has appeared intuitively undesirable to some, and there exist set theories which include a universal set.

It is interesting to note that although the paradox is attributed to Russell, it was Zermelo who had originally discovered it. Unfortunately he didn’t publish this idea and Russell independently discovered the paradox a year later and published it.

7. Axiom of infinity There is a set Z such that:
(i) \emptyset\in Z.
(ii) If x\in Z then \{x\}\in Z.

This axiom guarantees the existence of an infinite set. It is present to ensure that ZF is powerful enough to contain the collection of all numbers (a concept which we do not wish to make precise at this juncture). Once we have an infinite set we can also make precise the idea of a smallest infinite set as given by this axiom. This is illustrated by the following theorem:

Theorem: Let Z be the set provided by the above axiom. Let X be the set of subsets of Z that satisfy properties (i) and (ii) above. Let Z_0=\cap X. Then:
(a) X exists.
(b) Z_0 exists.
(c) If any set T satisfies properties (i) and (ii) of the axiom of infinity then Z_0\subset T.

Proof: (a) The set \{x\in \mathcal{P}(Z): x satisfies (i) and (ii) \} exists by the axiom of separation and equals X.
(b) The intersection is well defined if X\ne \emptyset which it is as Z\in X.
(c) Note that Z_0\cap T satisfies (i) and (ii) of the axiom of infinity and is in X so by definition of intersection Z_0\subset Z_0\cap T. If we suppose Z_0\nsubseteq T then \exists z_0\in Z_0 such that z_0\notin T following which z_0\notin Z_0\cap T. This contradicts Z_0\subset Z_0\cap T.\Box

We may define Z_0 as the set of whole numbers and show that the members of this set match our intuitive understanding of the members of \{0,1,2,3...\}. However such a discussion is presently out of our scope.

We have used the notion of the empty set in the statement of the axiom of infinity. The axiom of infinity can be rephrased so as not to actually assume that the empty set exists. We could have stated (i) as \forall y(\exists no x\in y\rightarrow y\in Z). Then the axiom of infinity a priori only asserts the existence of a set meeting this description, without actually asserting that it contains anything. Once we had got the set Z we could have used the axiom of separation to define a set \emptyset=\{x\in Z:x\ne x\}. Hence in this formulation the axiom of the empty set would have ceased to be an axiom and would have become a theorem of ZF. However, only a superficial difference exists in these two versions of ZF and we find no significant advantage in choosing one over the other.

8. Axiom of replacement: Informally this axiom states that ranges of functions restricted to sets exists. More formally, if \psi(x,y) is a formula such that \forall x\forall y\forall z(\psi(x,y)\land \psi(x,z)\rightarrow y=z) then for every set D there is a set R such that \forall y(y\in R\leftrightarrow \exists x(x\in D\land \psi(x,y))). The formula \psi(x,y) can now used to define the function governed by the relation f(x)=y as per the usual definition.

This axiom was not part of Zermelo’s original axioms and was added later, since it was found to aid immensely in establishing certain intuitively supported results in set theory. It is rarely used directly though, (one example is here), and within set theory its use is primarily in proving that certain large sets (such as those which are uncountable) theoretically exist in ZF.

9. Axiom of Regularity: This axiom states that every non empty set x contains an element y such that x\cap y=\emptyset.

This axiom disallows a set being a member of itself because of the following result:

Theorem: For all x, x\notin x.

Proof: By way of contradiction assume x\in x. By the theorem proved in the discussion on the axiom of pairing the set \{x\} exists. By the axiom of regularity there is an element y\in \{x\} such that y\cap\{x\}=\emptyset. Since \{x\} is a singleton so x=y and so x\cap\{x\}=\emptyset. This is a contradiction as by our hypothesis x\in x\cap\{x\}.\Box

Just like the previous axiom it is also a not obvious why this axiom must be true a priory. It is arguably the least useful ingredient of ZF, since most results in the branches of mathematics based on set theory hold even in the absence of regularity. However within mathematics we hardly ever encounter sets which contain themselves as a member, and this axiom is essential in outlawing such behavior.

That ends the list of the Zermelo Fraenkel axioms. One more axiom, probably the most famous of all, called the axiom of choice is usually added to ZF, and the resulting list of axioms is then referred to as ZFC.

2 Comments

Filed under Set theory

On unabridged versions of Ramsey’s theorem

In continuation of our discussions on Ramsey theory in this post we plan to prove the unabridged versions of Ramsey’s theorem. While the abridged versions dealt with graphs, unabridged versions deals with hypergraphs. A hypergraph, is a set of vertices V together with a set of subsets of V of a fixed order n. If n=2 then we get back the ordinary definition of a graph.

An example of a hypergraph of order 3 is given below:

The vertex set V=\{a,b,c,d,e\} and the hyperedges are given by: \{a,b,c\}, \{b,d,e\} and \{a,b,d\}.

The infinite unabridged version of Ramsey’s theorem may now be described as follows: Given a complete infinite hypergraph of order n with vertex set \mathbb{N}, and a r-coloring of its hyperedges we are guaranteed a infinite complete monochromatic subgraph.

Let us try to express this more succinctly. We will extend the previously defined arrows notation as follows: The notation \alpha\rightarrow (\beta)^n_r will mean that every r-colored complete hypergraph of order n on \alpha vertices contains a monochromatic complete subgraph with \beta vertices. Since hypergraphs do not seem to be as picturesque as graphs, so maybe the idea will be better expressed in terms of sets: \alpha\rightarrow (\beta)^n_r equivalently means that for any assignment of r-colors to the n-subsets of \alpha, there is a particular color (say red) and a subset X of \alpha of size \beta such that all n-subsets of X are red. (Note that \alpha,\beta etc can be any cardinal numbers.)

With the notation that \omega stands for the cardinality of \mathbb{N} the infinite version of Ramsey’s theorem now is:

Theorem: \forall n,r\in\mathbb{N}, \omega\rightarrow(\omega)^n_r.

The proof is essentially on the same lines as in the abridged case, with the addition that we induct on n. (n may intuitively be thought of as the “hyper-ness” of our hypergraph.)

Proof: By induction on n. If n=1 we just get the infinite pigeonhole principle: Any finite partitioning of \mathbb{N} contains an infinite part. Indeed, both the infinite and finite Ramsey theorems may be thought of as gigantic generalizations of the pigeonhole principle.

Now suppose that for a fixed n,r we have \omega\rightarrow(\omega)^n_r. Now consider any r-coloring f of the n+1-subsets of \omega. Just like in the proof of the abridged case we build a sequence (w_k). In the abridged case we applied the infinite pigeonhole principle repeatedly at this point. Here we will use the induction hypothesis repeatedly. Let w_1=1. Consider the r-coloring f_{w_1} of all n-subsets of \omega-\{1\} given by f_{w_1}(X)=f(\{1\}\cup X). By the induction hypothesis there will be a countable subset of \omega whose all n-subsets will be monochromatic. Call that subset V_1. Let the least element of V_1 be designated as w_2.

The construction of (w_k) follows by induction: If the set V_k and its least element w_{k+1} have been found, consider the r-coloring f_{w_{k+1}} of the n-subsets of V_k given by f_{w_{k+1}}(X)=f(\{w_{k+1}\}\cup X). By the induction hypothesis it contains a countable set all of whose n-subsets are monochromatic. Call that subset V_{k+1}. Its least element is designated as w_{k+2}.

The sequence (w_k) has the property that given any w_i the union of \{w_i\} with any n-order subset containing elements from w_{i+1},w_{i+2},w_{i+3}\cdots has the same color (with respect to the the original coloring f). Let us call w_i as c_j-based if this color is c_j. The fact that all the members of the sequence (w_k) are based in any of r-ways induces a partitioning of the sequence (w_k). By the infinite pigeonhole principle we are guaranteed an infinite part. Now, by ordering the terms appropriately in this infinite part we have a subsequence (w_{k_j}) of (w_k) in which all vertices are the same color base, say c_1. It is now easy to observe that all the n+1-subsets of this infinite part are colored c_1, which proves the theorem.\Box

We now turn to the finite version. In our arrows notation, the natural generalization from the abridged analogue reads:

Theorem: \forall l,n,r\in\mathbb{N}, \exists m\in\mathbb{N} such that m\rightarrow (l)_r^n.

Proof: The proof is identical to the abridged case with cosmetic changes. By way of contradiction assume that there is an l such that for all m\in\mathbb{N} we have m\nrightarrow (l)_r^n. We will use the notation \hat{K_i} for a hypergraph on i vertices where all possible n-subsets of the vertices are the hyperedges. Also let G be a hypergraph with vertices V=\{v_i:i\in\mathbb{N}\} and let the hyperedges of G be enumerated by E=\{E_i:E_i\subset\mathbb{N}, |E_i|=n\}. We construct a (rooted) tree T as follows:

1. First introduce a root vertex rt.

2. Each vertex is allowed to have at most r children which correspond to the r-colors, subject to it satisfying the criteria below. A child is always labeled by one among the r-colors. (Call the colors c_1,c_2\cdots c_r for convenience).

3. A child c_i is permitted if and only if its introduction creates a path of some finite length k starting from the root, so that if the hyperedges E_1,E_2\cdots E_k are colored by the colors used in the path in the same order, then the corresponding subgraph in G does not contain a monochromatic \hat{K_l}. For example if the introduction of a child c_i creates the k length path rt,c_a,c_b\cdots ,c_i and the hyperedges E_1,E_2,\cdots E_k when colored c_a,c_b\cdots c_i don’t contain a monochromatic \hat{K_l} the child c_i is permitted to be added to T.

Note that for all m, there always exists a coloring of \hat{K_m} such that no monochromatic \hat{K_l} exists within. So the situation that a child cannot be added to any vertex at a given level k cannot arise. For we can always take a coloring of \hat{K_{k+n}} containing no monochromatic \hat{K_l}. Since any k hyperedges in it would yield a sequence of colors already existing in T, we know which vertex to add the child to. We give the child the color corresponding to any other edge. Hence we can forever keep adding children and so T is infinite. It is also obvious that each level k of T has at most r^k vertices and so each level is finite.

Now by Konig’s lemma there will be an infinite path in T. This infinite path provides a r-coloring of G that contains no monochromatic \hat{K_i} and hence no monochromatic infinite hypergraph which contradicts the infinite Ramsey theorem proved above. This contradiction proves the theorem.\Box

The astute reader may have noted that we have used weak versions of the axiom of choice occasionally in the proof. Ramsey himself was aware of this and remarked so in his original paper. In fact, it can be proved that Ramsey’s theorem cannot ever be proved without using some form of the axiom of choice. Moreover it can be shown that a weak form of the axiom of choice is equivalent to Ramsey’s theorem. Hence Ramsey’s theorem may be interpreted as a choice axiom.

1 Comment

Filed under Combinatorics, Graph theory, Ramsey theory

Arrows, Ramsey numbers and the Party Problem

Here come the arrows! The notation \alpha\rightarrow (\beta)^2_r means that every r-colored complete graph on \alpha vertices contains a monochromatic complete subgraph with \beta vertices. Using the notation \omega for the cardinality of the natural numbers the abridged Ramsey theorems state that \omega\rightarrow (\omega)^2_r and that \forall n\exists m,m\rightarrow (n)^2_r.

Now let us take a particular case: n=3,r=2. Our objective here is to prove that 6\rightarrow (3)^2_2. In other words, if the edges of K_6 are 2-colored then there exists a monochromatic K_3. Fix a vertex v in K_6. From the 5 edges emanating from v, at least three must share a color: call it c. Now if all the edges in triangle made by those three vertices are not colored c, we are done for we have found a monochromatic K_3 in the other color. If even one edge, say xy among the triangle made by those three vertices is of color c then we have a monochromatic K_3 colored c with vertices x,y and v.

The fact that 6\rightarrow (3)^2_2 is also referred to as the solution of the party problem: Prove that in any gathering of six people there are three mutual friends, or three mutual strangers. By representing the people as vertices and their relationships via the colors on the edges connecting them we easily see that a monochromatic triangle is guaranteed and therefore the result holds. It is also obvious that the gathering may contain more then six people, a monochromatic triangle is still guaranteed within (in fact, it will only become more likely with more people). In other words: m\rightarrow (3)^2_2\forall m\ge 6.

Is six the least such number? Indeed it is as the following graph demonstrates:

We note that there is no monochromatic triangle in the above 2-colored K_5.

For a given n the least such number m,m\rightarrow (n)^2_r is called the diagonal Ramsey number R(n,n). While such numbers are guaranteed to exist by Ramsey’s theorem, actually finding them is hard indeed. It is trivial to see that R(1,1)=1 and R(2,2)=2. Our proof above illustrates that R(3,3)=6. It has also been proven that R(4,4)=18. However all efforts to find out the exact value of any other diagonal Ramsey number have failed. It is one of those problems, which are easily stated but extremely difficult to attack.

Regarding R(5,5), Joel Spencer comments in his book, Ten lectures on the Probabilistic Method:

Erdős asks us to imagine an alien force, vastly more powerful than us, landing on Earth and demanding the value of R(5,5) or they will destroy our planet. In that case, he claims, we should marshal all our computers and all our mathematicians and attempt to find the value. But suppose, instead, that they ask for R(6,6). In that case, he believes, we should attempt to destroy the aliens.

In closing it may be remarked that the Ramsey number R(m,n) is defined to be the least positive integer such that any 2-coloring of K_{R(m,n)} in two colors, say red and blue, contains either a red K_m or a blue K_n. This definition also has the natural generalization for more then two colors. Although the existence of all Ramsey numbers is guaranteed by Ramsey’s theorem, only few have been discovered so far.

1 Comment

Filed under Combinatorics, Miscellaneous, Ramsey theory

On abridged versions of Ramsey’s theorem

This post will aim to prove two abridged versions of Ramsey’s theorem, the result which gave its name to the area of mathematics now known as Ramsey theory. Ramsey theory may be understood as ramifications of a philosophical principal. The principle may be summarized as: “Complete disorder is an impossibility. Any structure will necessarily contain an orderly substructure.” Ramsey’s theorem is an example of this principle. Van der Waerden’s theorem is another example.

First a bit of history regarding this theorem. Frank Ramsey, a 25 year old British mathematician, proved this result in 1928 as a part of his investigations in mathematical logic. Unfortunately he passed away in 1930 (even before his theorem was formally published) and did not prove anything else related to this. In 1933, Skolem published another proof of the theorem, and then in 1934, Erdos and Szekeres found yet another. It was the Erdos and Szekeres paper which gave Ramsey’s theorem a wide audience. Further work by Erdos in this area led mathematicians to consider other Ramsey type theorems.

Here we consider abridged versions of Ramsey’s theorem which can be easily understood in the context of graphs. We will imitate Ramsey’s original approach (though not his proof) to first prove an infinite version (which deals with infinite graphs) and then proceed towards the finite analogue.

So what is the infinite version. Briefly it says that if we have an infinite complete graph on countable vertices and if we use r colors c_1,c_2\cdots c_r to arbitrarily color the edges then we are guaranteed an infinite monochromatic subgraph. Here our usage of coloring an edge is just a fanciful way to describe partitioning the edge set into finitely many parts. More formally, by a r-coloring of the edges of a graph G we mean any function with domain E(G) and range \{1,2\cdots r\} which induces such a partitioning. The colors are the labels that we may use for this partitioning.

Theorem: Let G be the complete infinite graph with vertices V=\{v_i:i\in\mathbb{N}\}. Given any r-coloring of the edges, G will contain an infinite complete monochromatic subgraph.

Ramsey’s original formulation was more general. He considered the set of vertices to be any infinite set (and not just countable). That is not much gained because a coloring of an uncountable vertex graph induces a coloring of a countable vertex subgraph. Within this countable subgraph we can find a monochromatic infinite complete graph, which is also a subgraph of the original. However he also generalized in another sense: Instead of coloring edges he considered colorings of hyperedges of a fixed order (an edge is a special case: it is a hyperedge of order 2). This too can be overcome by induction on the “hyper-ness” of the graph. We choose to postpone such discussions.

Proof: Suppose the edges of G have been colored using the colors c_1,c_2,\cdots c_r. We will build an infinite subsequence of the vertices in V by applying the infinite pigeonhole principle. Let w_1=v_1. Now w_1 is connected with every other vertex in G and the edges making this connection are colored in any of r-ways. We partition the set of all these vertices in this fashion: If v is connected to w_1 via an edge colored c_i we consider it to be a member of the set P_i. Clearly, \{P_i:i=1,2\cdots r\} is a partitioning of V-\{w_1\}. Now, since \cup P_i is infinite so one of the P_i must also be infinite. (Here we are using the infinite pigeonhole principle.) Note that there may be more then one infinite P_i. At any rate choose the “smallest” vertex among the infinite parts (or the vertex with the least i in \{v_i:i\in P_j, P_j is infinite \}. Rename that particular P_i as V_1 and that vertex as w_2.

At this point the construction of the sequence becomes obvious. If for some n we have already found a w_n and and a V_n then we can partition the vertices in V_n connected to w_n into r-parts depending on how they are connected with w_n. Choosing the smallest vertex among the infinite parts yields w_{n+1} and the part it belongs to yields V_{n+1}. Since V_{n+1} is infinite so we have no fear of running out of vertices.

Now note an important property of the sequence (w_n). Given any w_i its connections with w_{i+1},w_{i+2},w_{i+3}\cdots are all of the same color by definition. Let us call w_i as c_j-based if the edges w_iw_{i+1},w_iw_{i+2},w_iw_{i+3}\cdots are all colored c_j. The fact that all the members of the sequence (w_n) are based in any of r-ways induces a partitioning of the sequence (w_n). As before by the pigeonhole principle we are guaranteed an infinite part. Now, by ordering the terms appropriately in this infinite part we have a subsequence (w_{n_k}) of (w_n) in which all vertices are the same color base, say c_1. It is now easy to observe that the subgraph induced by these vertices has all edges colored c_1, which proves the theorem.\Box

We now turn our attention to the finite analogue. The infinite version guarantees an infinite monochromatic complete graph within an arbitrarily colored infinite countable graph (with countable vertices). The finite version guarantees a finite monochromatic complete graph of desired size within an arbitrarily colored finite countable graph (with suitable number of vertices). More formally the theorem states:

Theorem: Let r,l_1,l_2,\cdots l_r be natural numbers. Then there exists an m\in \mathbb{N} such that any r-coloring of K_m contains a monochromatic K_{l_i} for some i entirely colored in color i.

We will prove the theorem with l_1=l_2=\cdots =l_r=n. This is because no loss of generality occurs by assuming this. To see this suppose r=2,l_1=5 and l_2=7. Now if we have proved the theorem for r=2,l_1=l_2=7 we have also guaranteed the existence of an m such that any 2-coloring of K_m contains a K_7 colored in either of the two colors. Clearly the same m works for l_1=5,l_1=7 as a 1-colored K_7 would contain a 1-colored K_5.

Proof: Our modified claim is that for all r,n\in\mathbb{N} there is an m\in\mathbb{N} such that any r-coloring of K_m contains a monochromatic K_n. By way of contradiction assume that there is an n\in\mathbb{N} such that for every m\in \mathbb{N} there is a r-coloring of K_m containing no monochromatic K_n. Now let G be a complete graph with vertices V=\{v_i:i\in\mathbb{N}\} and let the edges of G be enumerated by E=\{e_i:i\in\mathbb{N}\}. We construct a (rooted) tree T as follows:

1. First introduce a root vertex rt.

2. Each vertex is allowed to have at most r children (a child of a vertex x is defined to be an adjacent vertex y so that d(rt,x)+1=d(rt,y)) which correspond to the r-colors, subject to it satisfying the criteria below. A child is always labeled by one among the r-colors. (Call the colors c_1,c_2\cdots c_r for convenience).

3. A child c_i is permitted if and only if its introduction creates a path of some finite length k starting from the root, so that if the edges e_1,e_2\cdots e_k are colored by the colors used in the path in the same order, then the corresponding subgraph in G does not contain a monochromatic K_n. For example if the introduction of a child c_i creates the k length path rt,c_a,c_b\cdots ,c_i and the edges e_1,e_2,\cdots e_k when colored c_a,c_b\cdots c_i don’t contain a monochromatic K_n the child c_i is permitted to be added to T.

Note that for all m, there always exists a coloring of K_m such that no monochromatic K_n exists within. So the situation that a child cannot be added to any vertex at a given level k cannot arise. For we can always take a coloring of K_k containing no monochromatic K_n. Since any k edges in it would yield a sequence of colors already existing in T, we know which vertex to add the child to. We give the child the color corresponding to any other edge. Hence we can forever keep adding children and so T is infinite. It is also obvious that each level k of T has at most r^k vertices and so each level is finite.

Now by Konig’s lemma there will be an infinite path in T. This infinite path provides a r-coloring of G that contains no monochromatic complete graph and hence no monochromatic infinite complete graph which contradicts the infinite Ramsey theorem proved above. This contradiction proves the theorem.\Box

3 Comments

Filed under Combinatorics, Graph theory, Ramsey theory

A ball game and Konig’s lemma

In this post we will examine an application of Konig’s lemma to investigate a game proposed by Raymond Smullyan.

Imagine there is an unlimited supply of balls with us, with positive integers painted on each. It is very much possible that a particular positive integer is repeated many times on various balls. We do not know anything else about the distribution of numbers on the balls. Further imagine that there is a box with a ball initially within it.

To play the game we may do either of two things:

(1) We may remove a ball from the box and replace it with any finite number of balls, arbitrarily chosen, but constrained to have numbers painted on them which are less then the number of the ball removed. For example, if we have a number 1000 ball we may remove it and replace it by 6000 balls: 3000 balls on which 657 is painted, and 3000 balls on which 999 is painted.

(2) We may remove a ball from the box and not do any replacement.

After playing the game once we can continue playing provided there are balls left in the box.

The question is: Is it possible to play the ball game indefinitely? Can we ensure that the box will never be empty?

Smullyan himself answered this problem in the negative by using Konig’s lemma.

Theorem: The Smullyan game terminates after a finite number of steps.

Proof: For any play of the Smullyan game define a (rooted) tree as follows:
1. The vertices are the painted balls used during the game.
2. The root is the painted ball intially within the box.
3. The edges are given by the following criteria: There is an edge between the vertices x and y iff during some move of the game, the ball y is among the balls which replaces the ball x (or vice-versa).

By the definition of the game this is a rooted tree where each level is finite. Can there be an infinite path in the tree? No there can’t. This is because, by definition an infinite path would correspond to a infinite strictly decreasing sequence of positive integers which is absurd. By the contrapositive form of Konig’s lemma we can therefore conclude that the tree is finite. In other words only finitely many balls were used and so the game terminated after finitely many steps.

An alternate proof without using Konig’s lemma is available here.

Leave a comment

Filed under Game Theory, Graph theory, Miscellaneous