# Monthly Archives: September 2012

## The Cantor Set

This is a post regarding the basic properties of the Cantor set.

We start with the definition: The Cantor set is obtained by first constructing a sequence $(C_n)$ of closed sets and then taking the intersection of the sets in this sequence. The sequence is constructed as follows:

1. Start with $[0,1]$ and remove the middle third, i.e. the interval $(\frac{1}{3},\frac{2}{3})$ to obtain the set $C_1=[0,\frac{1}{3}]\cup[\frac{2}{3},1]$. So $C_1$ contains $2$ disjoint closed intervals each of length $\frac{1}{3}$.

2. Next remove the middle third of each of these two intervals leaving $C_2=[0,\frac{1}{9}]\cup[\frac{2}{9},\frac{1}{3}]\cup[\frac{2}{3},\frac{7}{9}]\cup[\frac{8}{9},1]$ consisting of $2^2$ disjoint closed intervals each of length $\frac{1}{3^2}$.

3. Assuming $C_n$ has been constructed and consists of $2^n$ disjoint closed intervals each of length $\frac{1}{3^n}$, remove the middle thirds of all these intervals to obtain $C_{n+1}$ consisting of $2^{n+1}$ disjoint closed intervals each of length $\frac{1}{3^{n+1}}$.

By induction we have our sequence $(C_n)$.

Now, we define the Cantor set $C$ as $C=\cap_{n=1}^\infty C_n$.

We now describe the properties of the Cantor set:

1. In the usual topology on $\mathbb{R}$ the Cantor set is closed (being an intersection of closed sets). Moreover since it is bounded so by the Heine Borel theorem it is compact. What’s extraordinary is that every compact metric space is a continuous image of the Cantor set! The proof may be found here.

2. The Cantor set is uncountable. We prove this by considering the ternary expansion of all numbers in $[0,1]$. All such numbers may have the form $0.a_1a_2\cdots$ with $a_i\in\{0,1,2\}$ (including $1=0.22\cdots$). We claim that $x\in C$ iff $x$ has a ternary expansion of the form $0.a_1a_2\cdots$ with $a_i\in\{0,2\}$.

To prove this consider the construction of $C$ through the ternary lens: In ternary the construction of $C_1$ involves removing all numbers in $(0.011\cdots,0.2)$ from $[0.00\cdots,0.22\cdots]$. This shows that in $C_1$ every number has a ternary expansion of the form $0.a_1a_2\cdots$ with $a_1\in\{0,2\}$ and $a_i\in\{0,1,2\}\forall i>1$. Conversely every number of this form is in $C_1$. Likewise the construction of $C_2$ involves removing all numbers in $(0.0011\cdots,0.02)$ and in $(0.2011\cdots,0.22)$ from $C_1$. This shows that in $C_2$ every number has a ternary expansion of the form $0.a_1a_2\cdots$ with $a_1,a_2\in\{0,2\}$ and $a_i\in\{0,1,2\}\forall i>2$. Again conversely every number of this form is in $C_2$. Continuing in this way we can conclude that $x\in C_n$ iff $x=0.a_1\cdots a_na_{n+1}a_{n+2}\cdots$ with $a_1\cdots a_n\in\{0,2\}$ and $a_i\in\{0,1,2\}\forall i>n$. Hence by definition of $C$ our claim is established.

Now assume that $C$ is countable and has been enumerated in the list described below:

$0.d_1^{(1)}d_2^{(1)}d_3^{(1)}\cdots$
$0.d_1^{(2)}d_2^{(2)}d_3^{(2)}\cdots$
$0.d_1^{(3)}d_2^{(3)}d_3^{(3)}\cdots$
$\cdots$

Here each $d_i^{(j)}\in \{0,2\}$.

Now let $x=0.a_1a_2a_3\cdots$ where $a_i=0$ if $d_i^{(i)}=2$ and $a_i=2$ otherwise. Then $x\notin C$ despite being of the requisite ternary form. Hence $C$ is uncountable.

3. The Cantor set has Lebesgue measure zero. We define a set $A$ to be of Lebesgue measure zero if $\forall \epsilon>0\exists$ a sequence of intervals $(I_n)$ such that $A\subset\cup_{n=1}^\infty I_n$ and $\sum_{n=1}^\infty l(I_n)<\epsilon$. Here if the end points of $I_n$ are $a_n$ and $b_n$ (with $a_n\le b_n)$ then $l(I_n)=b_n-a_n$. (The proof that this definition is consistent with the Lebesgue measure may be found in this book.) Now given $\epsilon>0$ choose $n$ so that $(\frac{2}{3})^n<\epsilon$. Since $C\subset C_n$ and the total length of $C_n$ is $(\frac{2}{3})^n$, so it is clear that $C$ is a set of Lebesgue measure zero. Together with point 2, we see that this establishes the existence of uncountable sets of zero measure!

Filed under Measure Theory, Real Analysis, Set theory

## The axiom of choice

This post is about the axiom of choice, arguably the most famous and the most controversial of the axioms underpinning the set theoretic foundations of mathematics today. The axiom of choice is about making choices: choosing an element each from an arbitrary collection of nonempty sets. The axiom basically says that this can always be done, although it provides no recipe for doing so.

Before formally presenting the axiom we state some definitions. The treatment is essentially taken from here.

Definition: A function $I$ from a set $\Lambda$ onto a set $X$ is said to index the set $X$ by $\Lambda$. The set $\Lambda$ is called the index and $X$ is the indexed set. If $I(\lambda)=x$, we write $x_\lambda$ for $I(\lambda)$.

Definition: Let $X$ be a nonempty set indexed by a set $\Lambda$. The cartesian product of $X$ is defined to be the set $\Pi _{\lambda\in\Lambda}x_\lambda$ of all functions $f$ with domain $\Lambda$ and codomain $\cup X=\cup_{x\in X}x$, satisfying the condition $f(\lambda)\in x_\lambda$.

Let us consider an example before we go further. We let $X=\{\{a,b\},\{c,d\}\}$ be indexed by $\Lambda=\{1,2\}$ where $1\to \{a,b\}, 2\to \{c,d\}$. Now there are four functions possible from $\Lambda$ to $\{a,b,c,d\}$ which satisfy the definition:

The function $f_1$ for which $f(1)=a, f(2)=c$.
The function $f_2$ for which $f(1)=b, f(2)=c$.
The function $f_3$ for which $f(1)=a, f(2)=d$.
The function $f_4$ for which $f(1)=b, f(2)=d$.

The collection $\{f_1,f_2,f_3,f_4\}$ is the cartesian product. Now intuitively the behavior of the function $f_1$ can be captured by simply writing $(a,c)$ since the fact that the first coordinate corresponds to $a$ and the second to $c$ indicates the function $1\to a,2\to c$. Similarly the behavior of the function $f_2,f_3,f_4$ are also captured by $(b,c),(a,d),(b,d)$ respectively. Note that a particular ordered pair always corresponds to a unique function $f_i$. So the cartesian product may be simply represented as $\{(a,c),(b,c),(a,d),(b,d)\}$.

In fact if $\Lambda$ has at most countable number of elements within it, we can always consider it as an subset of the form $\{1,2,3\cdots\}$ (which may or may not terminate), and index appropriately. Then the cartesian product obtained can be equivalently thought of as a collection of n-tuples or infinite sequences where the ith coordinate corresponds to the image of $i$ in the indexing. For example if $X=\{\{0,1\}\}$ and $\Lambda=\mathbb{N}$ then all zero-one sequences correspond to the requisite functions and their collection forms the cartesian product.

Definition: Any function $f$ which satisfies the conditions required to make it a member of a cartesian product is called a choice function.

We now give three formulations of the axiom of choice:

Axiom of choice:

1. For every nonempty set whose elements are nonempty sets and are indexed by a set $I$ there exists a choice function.

2. If $\{a_i\}$ is a family of nonempty sets, indexed by a nonempty set $I$, then there exists a family $\{x_i\}$ with $i\in I$ such that $x_i\in a_i$ for each $i\in I$. (This corresponds roughly to our informal comment about choices in the first paragraph.)

3. The cartesian product of a nonempty collection of nonempty sets is nonempty.

Theorem: The three formulations are equivalent.

Proof: $1\Rightarrow 2$: Let $\{a_i\}_{i\in I}$ be a family of sets, indexed by the nonempty set $I$. Let $S=\{a_i\mid i\in I\}$. (Note that the elements of $S$ are themselves sets.) We index $S$ by itself, and consider the cartesian product of $S$. By the assumption and by $I\ne \emptyset$, the set $S$ is a nonempty set of nonempty sets. Hence owing to (1) there exists a choice function, i.e. a function $f\colon S\to \bigcup S$ such that $f(x)\in x$ for all $x \in S$. For $i\in I$ let $x_i=f(a_i)$. Then $\{x_i\}_{i\in I}$ is a family with the required properties.

$2\Rightarrow 1$: Let $S=\{a_i:i\in I\}$ be a nonempty set whose elements are themselves nonempty where $S$ is indexed by $I$. By (2) we have a family $x_i$ with $x_i\in a_i$. We define our choice function to be $f:I\to\cup S$ where $f(i)=x_i$.

$1\Leftrightarrow 3$: This is obvious by the definitions.$\Box$

Due to this result any of these three statements may be referred to as the axiom of choice. In fact, there are many other equivalent statements. The term ZFC is usually used to denote the Zermelo Fraenkel axioms together with the axiom of choice, and these axioms form a basis of most of the mathematics done today.

Why is this axiom so relevant? Firstly if one refuses to accept this axiom, one loses a lot of interesting results. Secondly, many times assuming the axiom of choice is so much easier. Many results such as the Cantor-Berstein-Schroeder theorem were proved originally using the axiom of choice, although later a “choice-free” proof was discovered. Similarly the axiom of choice makes undergraduate analysis easier by enabling one to say that $f(x)$ is continuous at $x=a$ if $f(a_n)$ tends to $f(a)$ for each sequence $(a_n)$ that tends to $a$.

Are there any negative issues with the axiom of choice? Well, firstly it is not constructive and doesn’t actually tell us the choices to make. It just guarantees that some choice exists. In one formulation it tells us that the real numbers can be well ordered but it gives us no recipe of doing so. In other words, it doesn’t actually construct a well order of the reals for us. Secondly, some counter intuitive results such as the Banach Tarsky paradox are true in ZFC and many mathematicians find this troubling and “fishy”.

Thirdly, and probably most importantly, there is history. After it was introduced (in another form) in 1904 by Zermelo, many mathematicians wondered whether or not the axiom was consistent with ZF. In other words people asked whether accepting it would lead to contradictions. Many decades passed before in 1938 Godel showed that ZFC is indeed consistent. Another doubt was whether the axiom of choice was independent of ZF. It was only by 1963 that Cohen proved that the axiom of choice cannot derived from ZF and so together with Godel’s result this implied that the axiom of choice is independent of ZF. During this long period from 1904 to 1963, many bizzare results such as the Banach Tarsky paradox, existence of non-Lebesgue measurable sets etc were found. All this sowed a suspicion in people’s minds as to whether this axiom was “false” at some fundamental level. As a result mathematicians began to pay close attention to results in which the axiom was used, and to try and device proofs which avoided the axiom as much as possible. This habit somewhat lessened after the independence results, but it has not been dropped by everyone even today.

The fact of the matter is that while rejecting the axiom of choice leads to many weird results, so does accepting it. See the chapters entitled “Disasters without Choice” and “Disasters with Choice” in this book for details. However in the long run, the advantages of accepting the axiom outweigh the advantages of rejecting it primarily because the sheer quantity of beautiful and aesthetic results flowing from the axiom of choice provide substance to mathematics as a whole.

Filed under Miscellaneous, Set theory

## The Zermelo Fraenkel axioms

In this post we plan to discuss the Zermelo Fraenkel axioms of set theory (a term which we will abbreviate as ZF). These axioms are used by most mathematicians as the pillars on which theorems and lemmas are build. Although the axioms are named after the mathematicians Zermelo and Fraenkel, contributions from Skolem too helped in evolving the consensus on what the basic axioms ought to be. A detailed history of the evolution of these axioms and of set theory in general may be found here.

We will take a naive approach towards the basic underlying principles of logic used to illustrate the axioms. Our overall goal is to give an essential idea, and not describe a rigorous formal system. Note that in ZF terms always denote sets. There is no consensus on the exact wordings and the sequence of the axioms. We are following the treatment given in this book.

1. Axiom of extensionality: If $a$ and $b$ have the same elements then $a=b$.

It is easy to prove using the underlying logic that equal sets have the same elements: Suppose $x=y$. If $t\in x$ then by substitution $t\in y$. So all elements of $x$ are in $y$. Similarly all elements of $y$ are in $x$. Hence both $x$ and $y$ have the same elements. The above axiom is the converse. Therefore we can conclude that $a=b$ iff $a$ and $b$ have the same elements. This also leads to the following theorem:

Theorem: $a=b\iff (a\subset b\land b\subset a)$.
Proof: Suppose that $a=b$. Now for $x\in a$ since it follows that $x\in a$ (this is not a misprint) so $a\subset a$. By substitution we have $a\subset b$. Similarly we have $b\subset a$. Conversely if $a\subset b$ and $b\subset a$ then clearly $\forall x, x\in a\iff x\in b$ and so $a=b$.$\Box$

2. Axiom of the empty set: A set $\emptyset$ having no elements exists.

This axiom may be thought of as redundant in certain versions of ZF, where it may be deduced from other axioms. By a little piece of vacuous reasoning it follows that the empty set is a subset of every set.

3. Axiom of pairing: For every $x$ and $y$, the pair set $\{x,y\}$ exists.

We can prove the following theorem with the aid of this axiom:

Theorem: For every $x$, the set $\{x\}$ exists.

Proof: Substituting $x$ for $y$ in the above axiom yields a set which by the axiom of extensionality equals $\{x\}$.$\Box$

It is also now clear that if $y\subset\{x\}$ then $y=\emptyset$ or $y=\{x\}$. For if $y\ne \emptyset$ and $t\in y$ then $t\in \{x\}$ and as $\{x\}$ is a singleton so $t=x$. But then $x\in y$ and so $\{x\}\subset y$ following which $y=\{x\}$.

4. Axiom of union: The elements of $\cup X$ are the elements of the elements of $X$.

Hence the elements of $\cup \{x,y\}$ are not $x$ and $y$. Instead they are the elements of $x$ and $y$. In other words $\cup\{x,y\}$ is the familiar set $x\cup y$.

5. Axiom of power set: The elements of $\mathcal{P}(X)$ are precisely the subsets of $X$.

We do not define other set functions, like intersection and Cartesian product since their existence follows from the next axiom.

6. Axiom of separation: If $\psi(x)$ is a formula and $X$ is a set then the set $\{x\in X: \psi(x)\}$ exists.

It is now an easy matter to define intersection of the sets in $X$ as $Y=\cap X$ where $Y=\{x\in\cup X:\forall t(t\in X\rightarrow x\in t)\}$, provided $X\ne \emptyset$. To define the Cartesian product note that we define the ordered pair $(a,b)$ as $\{\{a\},\{a,b\}\}$ and then $X\times Y=\{(x,y):x\in X\land y\in Y\}$. We cannot however define complement because of the following theorem. It says that there is no universe (a set which has everything).

Theorem: There is no set $U$ such that $\forall x(x\in U)$.

Proof: Suppose that there exists such a set $U$. Now, by the axiom of separation we have the set $X=\{z\in U:z\notin z\}$. Now either we have $X\in X$ or $X\notin X$. In case $X\in X$ then by definition of $X$ we have $X\notin X$. In case $X\notin X$ then again by definition of $X$ we have $X\in X$. Both cases lead to contradictions.$\Box$

In naive set theory where any definable collection (such as $\{z:z\notin z\}$) was understood as a set there is no way to clear the anomaly which follows from the above proof. One of the reasons for the abandonment of naive set theory in favor of ZF was in fact this paradox. In ZF the set $\{z:z\notin z\}$ does not make sense as the axiom of separation specifically asks for a set $X$ from which to pick the element $z$ from. As we don’t have a universal set in ZF so the paradox is entirely sidestepped. The fact that there is no universal set in ZF has appeared intuitively undesirable to some, and there exist set theories which include a universal set.

It is interesting to note that although the paradox is attributed to Russell, it was Zermelo who had originally discovered it. Unfortunately he didn’t publish this idea and Russell independently discovered the paradox a year later and published it.

7. Axiom of infinity There is a set $Z$ such that:
(i) $\emptyset\in Z$.
(ii) If $x\in Z$ then $\{x\}\in Z$.

This axiom guarantees the existence of an infinite set. It is present to ensure that ZF is powerful enough to contain the collection of all numbers (a concept which we do not wish to make precise at this juncture). Once we have an infinite set we can also make precise the idea of a smallest infinite set as given by this axiom. This is illustrated by the following theorem:

Theorem: Let $Z$ be the set provided by the above axiom. Let $X$ be the set of subsets of $Z$ that satisfy properties (i) and (ii) above. Let $Z_0=\cap X$. Then:
(a) $X$ exists.
(b) $Z_0$ exists.
(c) If any set $T$ satisfies properties (i) and (ii) of the axiom of infinity then $Z_0\subset T$.

Proof: (a) The set $\{x\in \mathcal{P}(Z): x$ satisfies (i) and (ii) $\}$ exists by the axiom of separation and equals $X$.
(b) The intersection is well defined if $X\ne \emptyset$ which it is as $Z\in X$.
(c) Note that $Z_0\cap T$ satisfies (i) and (ii) of the axiom of infinity and is in $X$ so by definition of intersection $Z_0\subset Z_0\cap T$. If we suppose $Z_0\nsubseteq T$ then $\exists z_0\in Z_0$ such that $z_0\notin T$ following which $z_0\notin Z_0\cap T$. This contradicts $Z_0\subset Z_0\cap T$.$\Box$

We may define $Z_0$ as the set of whole numbers and show that the members of this set match our intuitive understanding of the members of $\{0,1,2,3...\}$. However such a discussion is presently out of our scope.

We have used the notion of the empty set in the statement of the axiom of infinity. The axiom of infinity can be rephrased so as not to actually assume that the empty set exists. We could have stated (i) as $\forall y(\exists$ no $x\in y\rightarrow y\in Z)$. Then the axiom of infinity a priori only asserts the existence of a set meeting this description, without actually asserting that it contains anything. Once we had got the set $Z$ we could have used the axiom of separation to define a set $\emptyset=\{x\in Z:x\ne x\}$. Hence in this formulation the axiom of the empty set would have ceased to be an axiom and would have become a theorem of ZF. However, only a superficial difference exists in these two versions of ZF and we find no significant advantage in choosing one over the other.

8. Axiom of replacement: Informally this axiom states that ranges of functions restricted to sets exists. More formally, if $\psi(x,y)$ is a formula such that $\forall x\forall y\forall z(\psi(x,y)\land \psi(x,z)\rightarrow y=z)$ then for every set $D$ there is a set $R$ such that $\forall y(y\in R\leftrightarrow \exists x(x\in D\land \psi(x,y)))$. The formula $\psi(x,y)$ can now used to define the function governed by the relation $f(x)=y$ as per the usual definition.

This axiom was not part of Zermelo’s original axioms and was added later, since it was found to aid immensely in establishing certain intuitively supported results in set theory. It is rarely used directly though, (one example is here), and within set theory its use is primarily in proving that certain large sets (such as those which are uncountable) theoretically exist in ZF.

9. Axiom of Regularity: This axiom states that every non empty set $x$ contains an element $y$ such that $x\cap y=\emptyset$.

This axiom disallows a set being a member of itself because of the following result:

Theorem: For all $x$, $x\notin x$.

Proof: By way of contradiction assume $x\in x$. By the theorem proved in the discussion on the axiom of pairing the set $\{x\}$ exists. By the axiom of regularity there is an element $y\in \{x\}$ such that $y\cap\{x\}=\emptyset$. Since $\{x\}$ is a singleton so $x=y$ and so $x\cap\{x\}=\emptyset$. This is a contradiction as by our hypothesis $x\in x\cap\{x\}$.$\Box$

Just like the previous axiom it is also a not obvious why this axiom must be true a priory. It is arguably the least useful ingredient of ZF, since most results in the branches of mathematics based on set theory hold even in the absence of regularity. However within mathematics we hardly ever encounter sets which contain themselves as a member, and this axiom is essential in outlawing such behavior.

That ends the list of the Zermelo Fraenkel axioms. One more axiom, probably the most famous of all, called the axiom of choice is usually added to ZF, and the resulting list of axioms is then referred to as ZFC.

Filed under Set theory

## On unabridged versions of Ramsey’s theorem

In continuation of our discussions on Ramsey theory in this post we plan to prove the unabridged versions of Ramsey’s theorem. While the abridged versions dealt with graphs, unabridged versions deals with hypergraphs. A hypergraph, is a set of vertices $V$ together with a set of subsets of $V$ of a fixed order $n$. If $n=2$ then we get back the ordinary definition of a graph.

An example of a hypergraph of order 3 is given below:

The vertex set $V=\{a,b,c,d,e\}$ and the hyperedges are given by: $\{a,b,c\}$, $\{b,d,e\}$ and $\{a,b,d\}$.

The infinite unabridged version of Ramsey’s theorem may now be described as follows: Given a complete infinite hypergraph of order $n$ with vertex set $\mathbb{N}$, and a $r$-coloring of its hyperedges we are guaranteed a infinite complete monochromatic subgraph.

Let us try to express this more succinctly. We will extend the previously defined arrows notation as follows: The notation $\alpha\rightarrow (\beta)^n_r$ will mean that every $r$-colored complete hypergraph of order $n$ on $\alpha$ vertices contains a monochromatic complete subgraph with $\beta$ vertices. Since hypergraphs do not seem to be as picturesque as graphs, so maybe the idea will be better expressed in terms of sets: $\alpha\rightarrow (\beta)^n_r$ equivalently means that for any assignment of $r$-colors to the $n$-subsets of $\alpha$, there is a particular color (say red) and a subset $X$ of $\alpha$ of size $\beta$ such that all $n$-subsets of $X$ are red. (Note that $\alpha,\beta$ etc can be any cardinal numbers.)

With the notation that $\omega$ stands for the cardinality of $\mathbb{N}$ the infinite version of Ramsey’s theorem now is:

Theorem: $\forall n,r\in\mathbb{N}, \omega\rightarrow(\omega)^n_r$.

The proof is essentially on the same lines as in the abridged case, with the addition that we induct on $n$. ($n$ may intuitively be thought of as the “hyper-ness” of our hypergraph.)

Proof: By induction on $n$. If $n=1$ we just get the infinite pigeonhole principle: Any finite partitioning of $\mathbb{N}$ contains an infinite part. Indeed, both the infinite and finite Ramsey theorems may be thought of as gigantic generalizations of the pigeonhole principle.

Now suppose that for a fixed $n,r$ we have $\omega\rightarrow(\omega)^n_r$. Now consider any $r$-coloring $f$ of the $n+1$-subsets of $\omega$. Just like in the proof of the abridged case we build a sequence $(w_k)$. In the abridged case we applied the infinite pigeonhole principle repeatedly at this point. Here we will use the induction hypothesis repeatedly. Let $w_1=1$. Consider the $r$-coloring $f_{w_1}$ of all $n$-subsets of $\omega-\{1\}$ given by $f_{w_1}(X)=f(\{1\}\cup X)$. By the induction hypothesis there will be a countable subset of $\omega$ whose all $n$-subsets will be monochromatic. Call that subset $V_1$. Let the least element of $V_1$ be designated as $w_2$.

The construction of $(w_k)$ follows by induction: If the set $V_k$ and its least element $w_{k+1}$ have been found, consider the $r$-coloring $f_{w_{k+1}}$ of the $n$-subsets of $V_k$ given by $f_{w_{k+1}}(X)=f(\{w_{k+1}\}\cup X)$. By the induction hypothesis it contains a countable set all of whose $n$-subsets are monochromatic. Call that subset $V_{k+1}$. Its least element is designated as $w_{k+2}$.

The sequence $(w_k)$ has the property that given any $w_i$ the union of $\{w_i\}$ with any $n$-order subset containing elements from $w_{i+1},w_{i+2},w_{i+3}\cdots$ has the same color (with respect to the the original coloring $f$). Let us call $w_i$ as $c_j$-based if this color is $c_j$. The fact that all the members of the sequence $(w_k)$ are based in any of $r$-ways induces a partitioning of the sequence $(w_k)$. By the infinite pigeonhole principle we are guaranteed an infinite part. Now, by ordering the terms appropriately in this infinite part we have a subsequence $(w_{k_j})$ of $(w_k)$ in which all vertices are the same color base, say $c_1$. It is now easy to observe that all the $n+1$-subsets of this infinite part are colored $c_1$, which proves the theorem.$\Box$

We now turn to the finite version. In our arrows notation, the natural generalization from the abridged analogue reads:

Theorem: $\forall l,n,r\in\mathbb{N}, \exists m\in\mathbb{N}$ such that $m\rightarrow (l)_r^n$.

Proof: The proof is identical to the abridged case with cosmetic changes. By way of contradiction assume that there is an $l$ such that for all $m\in\mathbb{N}$ we have $m\nrightarrow (l)_r^n$. We will use the notation $\hat{K_i}$ for a hypergraph on $i$ vertices where all possible $n$-subsets of the vertices are the hyperedges. Also let $G$ be a hypergraph with vertices $V=\{v_i:i\in\mathbb{N}\}$ and let the hyperedges of $G$ be enumerated by $E=\{E_i:E_i\subset\mathbb{N}, |E_i|=n\}$. We construct a (rooted) tree $T$ as follows:

1. First introduce a root vertex $rt$.

2. Each vertex is allowed to have at most $r$ children which correspond to the $r$-colors, subject to it satisfying the criteria below. A child is always labeled by one among the $r$-colors. (Call the colors $c_1,c_2\cdots c_r$ for convenience).

3. A child $c_i$ is permitted if and only if its introduction creates a path of some finite length $k$ starting from the root, so that if the hyperedges $E_1,E_2\cdots E_k$ are colored by the colors used in the path in the same order, then the corresponding subgraph in $G$ does not contain a monochromatic $\hat{K_l}$. For example if the introduction of a child $c_i$ creates the $k$ length path $rt,c_a,c_b\cdots ,c_i$ and the hyperedges $E_1,E_2,\cdots E_k$ when colored $c_a,c_b\cdots c_i$ don’t contain a monochromatic $\hat{K_l}$ the child $c_i$ is permitted to be added to $T$.

Note that for all $m$, there always exists a coloring of $\hat{K_m}$ such that no monochromatic $\hat{K_l}$ exists within. So the situation that a child cannot be added to any vertex at a given level $k$ cannot arise. For we can always take a coloring of $\hat{K_{k+n}}$ containing no monochromatic $\hat{K_l}$. Since any $k$ hyperedges in it would yield a sequence of colors already existing in $T$, we know which vertex to add the child to. We give the child the color corresponding to any other edge. Hence we can forever keep adding children and so $T$ is infinite. It is also obvious that each level $k$ of $T$ has at most $r^k$ vertices and so each level is finite.

Now by Konig’s lemma there will be an infinite path in $T$. This infinite path provides a $r$-coloring of $G$ that contains no monochromatic $\hat{K_i}$ and hence no monochromatic infinite hypergraph which contradicts the infinite Ramsey theorem proved above. This contradiction proves the theorem.$\Box$

The astute reader may have noted that we have used weak versions of the axiom of choice occasionally in the proof. Ramsey himself was aware of this and remarked so in his original paper. In fact, it can be proved that Ramsey’s theorem cannot ever be proved without using some form of the axiom of choice. Moreover it can be shown that a weak form of the axiom of choice is equivalent to Ramsey’s theorem. Hence Ramsey’s theorem may be interpreted as a choice axiom.

1 Comment

Filed under Combinatorics, Graph theory, Ramsey theory

## Arrows, Ramsey numbers and the Party Problem

Here come the arrows! The notation $\alpha\rightarrow (\beta)^2_r$ means that every $r$-colored complete graph on $\alpha$ vertices contains a monochromatic complete subgraph with $\beta$ vertices. Using the notation $\omega$ for the cardinality of the natural numbers the abridged Ramsey theorems state that $\omega\rightarrow (\omega)^2_r$ and that $\forall n\exists m,m\rightarrow (n)^2_r$.

Now let us take a particular case: $n=3,r=2$. Our objective here is to prove that $6\rightarrow (3)^2_2$. In other words, if the edges of $K_6$ are $2$-colored then there exists a monochromatic $K_3$. Fix a vertex $v$ in $K_6$. From the $5$ edges emanating from $v$, at least three must share a color: call it $c$. Now if all the edges in triangle made by those three vertices are not colored $c$, we are done for we have found a monochromatic $K_3$ in the other color. If even one edge, say $xy$ among the triangle made by those three vertices is of color $c$ then we have a monochromatic $K_3$ colored $c$ with vertices $x,y$ and $v$.

The fact that $6\rightarrow (3)^2_2$ is also referred to as the solution of the party problem: Prove that in any gathering of six people there are three mutual friends, or three mutual strangers. By representing the people as vertices and their relationships via the colors on the edges connecting them we easily see that a monochromatic triangle is guaranteed and therefore the result holds. It is also obvious that the gathering may contain more then six people, a monochromatic triangle is still guaranteed within (in fact, it will only become more likely with more people). In other words: $m\rightarrow (3)^2_2\forall m\ge 6$.

Is six the least such number? Indeed it is as the following graph demonstrates:

We note that there is no monochromatic triangle in the above $2$-colored $K_5$.

For a given $n$ the least such number $m,m\rightarrow (n)^2_r$ is called the diagonal Ramsey number $R(n,n)$. While such numbers are guaranteed to exist by Ramsey’s theorem, actually finding them is hard indeed. It is trivial to see that $R(1,1)=1$ and $R(2,2)=2$. Our proof above illustrates that $R(3,3)=6$. It has also been proven that $R(4,4)=18$. However all efforts to find out the exact value of any other diagonal Ramsey number have failed. It is one of those problems, which are easily stated but extremely difficult to attack.

Regarding $R(5,5)$, Joel Spencer comments in his book, Ten lectures on the Probabilistic Method:

ErdÅ‘s asks us to imagine an alien force, vastly more powerful than us, landing on Earth and demanding the value of $R(5,5)$ or they will destroy our planet. In that case, he claims, we should marshal all our computers and all our mathematicians and attempt to find the value. But suppose, instead, that they ask for $R(6,6)$. In that case, he believes, we should attempt to destroy the aliens.

In closing it may be remarked that the Ramsey number $R(m,n)$ is defined to be the least positive integer such that any $2$-coloring of $K_{R(m,n)}$ in two colors, say red and blue, contains either a red $K_m$ or a blue $K_n$. This definition also has the natural generalization for more then two colors. Although the existence of all Ramsey numbers is guaranteed by Ramsey’s theorem, only few have been discovered so far.

1 Comment

Filed under Combinatorics, Miscellaneous, Ramsey theory

## On abridged versions of Ramsey’s theorem

This post will aim to prove two abridged versions of Ramsey’s theorem, the result which gave its name to the area of mathematics now known as Ramsey theory. Ramsey theory may be understood as ramifications of a philosophical principal. The principle may be summarized as: “Complete disorder is an impossibility. Any structure will necessarily contain an orderly substructure.” Ramsey’s theorem is an example of this principle. Van der Waerden’s theorem is another example.

First a bit of history regarding this theorem. Frank Ramsey, a 25 year old British mathematician, proved this result in 1928 as a part of his investigations in mathematical logic. Unfortunately he passed away in 1930 (even before his theorem was formally published) and did not prove anything else related to this. In 1933, Skolem published another proof of the theorem, and then in 1934, Erdos and Szekeres found yet another. It was the Erdos and Szekeres paper which gave Ramsey’s theorem a wide audience. Further work by Erdos in this area led mathematicians to consider other Ramsey type theorems.

Here we consider abridged versions of Ramsey’s theorem which can be easily understood in the context of graphs. We will imitate Ramsey’s original approach (though not his proof) to first prove an infinite version (which deals with infinite graphs) and then proceed towards the finite analogue.

So what is the infinite version. Briefly it says that if we have an infinite complete graph on countable vertices and if we use $r$ colors $c_1,c_2\cdots c_r$ to arbitrarily color the edges then we are guaranteed an infinite monochromatic subgraph. Here our usage of coloring an edge is just a fanciful way to describe partitioning the edge set into finitely many parts. More formally, by a $r$-coloring of the edges of a graph $G$ we mean any function with domain $E(G)$ and range $\{1,2\cdots r\}$ which induces such a partitioning. The colors are the labels that we may use for this partitioning.

Theorem: Let $G$ be the complete infinite graph with vertices $V=\{v_i:i\in\mathbb{N}\}$. Given any $r$-coloring of the edges, $G$ will contain an infinite complete monochromatic subgraph.

Ramsey’s original formulation was more general. He considered the set of vertices to be any infinite set (and not just countable). That is not much gained because a coloring of an uncountable vertex graph induces a coloring of a countable vertex subgraph. Within this countable subgraph we can find a monochromatic infinite complete graph, which is also a subgraph of the original. However he also generalized in another sense: Instead of coloring edges he considered colorings of hyperedges of a fixed order (an edge is a special case: it is a hyperedge of order 2). This too can be overcome by induction on the “hyper-ness” of the graph. We choose to postpone such discussions.

Proof: Suppose the edges of $G$ have been colored using the colors $c_1,c_2,\cdots c_r$. We will build an infinite subsequence of the vertices in $V$ by applying the infinite pigeonhole principle. Let $w_1=v_1$. Now $w_1$ is connected with every other vertex in $G$ and the edges making this connection are colored in any of $r$-ways. We partition the set of all these vertices in this fashion: If $v$ is connected to $w_1$ via an edge colored $c_i$ we consider it to be a member of the set $P_i$. Clearly, $\{P_i:i=1,2\cdots r\}$ is a partitioning of $V-\{w_1\}$. Now, since $\cup P_i$ is infinite so one of the $P_i$ must also be infinite. (Here we are using the infinite pigeonhole principle.) Note that there may be more then one infinite $P_i$. At any rate choose the “smallest” vertex among the infinite parts (or the vertex with the least $i$ in $\{v_i:i\in P_j, P_j$ is infinite $\}$. Rename that particular $P_i$ as $V_1$ and that vertex as $w_2$.

At this point the construction of the sequence becomes obvious. If for some $n$ we have already found a $w_n$ and and a $V_n$ then we can partition the vertices in $V_n$ connected to $w_n$ into $r$-parts depending on how they are connected with $w_n$. Choosing the smallest vertex among the infinite parts yields $w_{n+1}$ and the part it belongs to yields $V_{n+1}$. Since $V_{n+1}$ is infinite so we have no fear of running out of vertices.

Now note an important property of the sequence $(w_n)$. Given any $w_i$ its connections with $w_{i+1},w_{i+2},w_{i+3}\cdots$ are all of the same color by definition. Let us call $w_i$ as $c_j$-based if the edges $w_iw_{i+1},w_iw_{i+2},w_iw_{i+3}\cdots$ are all colored $c_j$. The fact that all the members of the sequence $(w_n)$ are based in any of $r$-ways induces a partitioning of the sequence $(w_n)$. As before by the pigeonhole principle we are guaranteed an infinite part. Now, by ordering the terms appropriately in this infinite part we have a subsequence $(w_{n_k})$ of $(w_n)$ in which all vertices are the same color base, say $c_1$. It is now easy to observe that the subgraph induced by these vertices has all edges colored $c_1$, which proves the theorem.$\Box$

We now turn our attention to the finite analogue. The infinite version guarantees an infinite monochromatic complete graph within an arbitrarily colored infinite countable graph (with countable vertices). The finite version guarantees a finite monochromatic complete graph of desired size within an arbitrarily colored finite countable graph (with suitable number of vertices). More formally the theorem states:

Theorem: Let $r,l_1,l_2,\cdots l_r$ be natural numbers. Then there exists an $m\in \mathbb{N}$ such that any $r$-coloring of $K_m$ contains a monochromatic $K_{l_i}$ for some $i$ entirely colored in color $i$.

We will prove the theorem with $l_1=l_2=\cdots =l_r=n$. This is because no loss of generality occurs by assuming this. To see this suppose $r=2,l_1=5$ and $l_2=7$. Now if we have proved the theorem for $r=2,l_1=l_2=7$ we have also guaranteed the existence of an $m$ such that any $2$-coloring of $K_m$ contains a $K_7$ colored in either of the two colors. Clearly the same $m$ works for $l_1=5,l_1=7$ as a $1$-colored $K_7$ would contain a $1$-colored $K_5$.

Proof: Our modified claim is that for all $r,n\in\mathbb{N}$ there is an $m\in\mathbb{N}$ such that any $r$-coloring of $K_m$ contains a monochromatic $K_n$. By way of contradiction assume that there is an $n\in\mathbb{N}$ such that for every $m\in \mathbb{N}$ there is a $r$-coloring of $K_m$ containing no monochromatic $K_n$. Now let $G$ be a complete graph with vertices $V=\{v_i:i\in\mathbb{N}\}$ and let the edges of $G$ be enumerated by $E=\{e_i:i\in\mathbb{N}\}$. We construct a (rooted) tree $T$ as follows:

1. First introduce a root vertex $rt$.

2. Each vertex is allowed to have at most $r$ children (a child of a vertex $x$ is defined to be an adjacent vertex $y$ so that $d(rt,x)+1=d(rt,y)$) which correspond to the $r$-colors, subject to it satisfying the criteria below. A child is always labeled by one among the $r$-colors. (Call the colors $c_1,c_2\cdots c_r$ for convenience).

3. A child $c_i$ is permitted if and only if its introduction creates a path of some finite length $k$ starting from the root, so that if the edges $e_1,e_2\cdots e_k$ are colored by the colors used in the path in the same order, then the corresponding subgraph in $G$ does not contain a monochromatic $K_n$. For example if the introduction of a child $c_i$ creates the $k$ length path $rt,c_a,c_b\cdots ,c_i$ and the edges $e_1,e_2,\cdots e_k$ when colored $c_a,c_b\cdots c_i$ don’t contain a monochromatic $K_n$ the child $c_i$ is permitted to be added to $T$.

Note that for all $m$, there always exists a coloring of $K_m$ such that no monochromatic $K_n$ exists within. So the situation that a child cannot be added to any vertex at a given level $k$ cannot arise. For we can always take a coloring of $K_k$ containing no monochromatic $K_n$. Since any $k$ edges in it would yield a sequence of colors already existing in $T$, we know which vertex to add the child to. We give the child the color corresponding to any other edge. Hence we can forever keep adding children and so $T$ is infinite. It is also obvious that each level $k$ of $T$ has at most $r^k$ vertices and so each level is finite.

Now by Konig’s lemma there will be an infinite path in $T$. This infinite path provides a $r$-coloring of $G$ that contains no monochromatic complete graph and hence no monochromatic infinite complete graph which contradicts the infinite Ramsey theorem proved above. This contradiction proves the theorem.$\Box$

Filed under Combinatorics, Graph theory, Ramsey theory

## A ball game and Konig’s lemma

In this post we will examine an application of Konig’s lemma to investigate a game proposed by Raymond Smullyan.

Imagine there is an unlimited supply of balls with us, with positive integers painted on each. It is very much possible that a particular positive integer is repeated many times on various balls. We do not know anything else about the distribution of numbers on the balls. Further imagine that there is a box with a ball initially within it.

To play the game we may do either of two things:

(1) We may remove a ball from the box and replace it with any finite number of balls, arbitrarily chosen, but constrained to have numbers painted on them which are less then the number of the ball removed. For example, if we have a number 1000 ball we may remove it and replace it by 6000 balls: 3000 balls on which 657 is painted, and 3000 balls on which 999 is painted.

(2) We may remove a ball from the box and not do any replacement.

After playing the game once we can continue playing provided there are balls left in the box.

The question is: Is it possible to play the ball game indefinitely? Can we ensure that the box will never be empty?

Smullyan himself answered this problem in the negative by using Konig’s lemma.

Theorem: The Smullyan game terminates after a finite number of steps.

Proof: For any play of the Smullyan game define a (rooted) tree as follows:
1. The vertices are the painted balls used during the game.
2. The root is the painted ball intially within the box.
3. The edges are given by the following criteria: There is an edge between the vertices $x$ and $y$ iff during some move of the game, the ball $y$ is among the balls which replaces the ball $x$ (or vice-versa).

By the definition of the game this is a rooted tree where each level is finite. Can there be an infinite path in the tree? No there can’t. This is because, by definition an infinite path would correspond to a infinite strictly decreasing sequence of positive integers which is absurd. By the contrapositive form of Konig’s lemma we can therefore conclude that the tree is finite. In other words only finitely many balls were used and so the game terminated after finitely many steps.

An alternate proof without using Konig’s lemma is available here.