Generating Functions

In this post we will discuss generating functions. Generating functions are a powerful tool that can be used to solve many problems in both combinatorics and analysis.

Let \mathbb{R}^\infty denote the set of all sequences of real numbers. For all i\ge 1 we let x^i denote the sequence which has value 1 for its (i+1)th term and whose other terms are all zero. The symbol 1 stands for the sequence (1,0,0,\cdots). Also for any real number \alpha we define the product of \alpha and a sequence (a_0,a_1,\cdots) as \alpha(a_0,a_1,\cdots)=(\alpha a_0,\alpha a_1,\cdots). We let two sequences {a_n} and {b_n} be equal if a_i=b_i for all i. We define the sum of {a_n} and {b_n} by the sequence {a_n+b_n} and the product by the sequence {c_n} where c_i=\sum_{j=0}^ia_jb_{i-j}. Clearly the sequence (a_0,a_1,\cdots ) is equal to the sequence a_0+a_1x+a_2x^2+\cdots which we will also denote simply by \sum a_ix^i. Note that a_0 here stands for the sequence obtained as a product of a_0 and 1, i.e. a_0(1,0,0,\cdots). Algebraically speaking \mathbb {R}^\infty, equipped with these operations, is an \mathbb {R}-algebra.
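
To make the product concrete, here is a minimal Python sketch that multiplies two sequences truncated to finitely many terms, with the sum for c_i taken over j from 0 to i. The function name cauchy_product and the truncation to finite lists are assumptions made purely for illustration; the definition above is for infinite sequences.

```python
def cauchy_product(a, b):
    """Return the first min(len(a), len(b)) terms of the product sequence.

    Term i of the product is c_i = sum over j = 0..i of a_j * b_{i-j}.
    """
    n = min(len(a), len(b))
    return [sum(a[j] * b[i - j] for j in range(i + 1)) for i in range(n)]


# The constant sequence of ones times the sequence 1 - x = (1, -1, 0, ...)
# gives back the sequence 1 = (1, 0, 0, ...).
ones = [1] * 6
one_minus_x = [1, -1, 0, 0, 0, 0]
print(cauchy_product(ones, one_minus_x))  # [1, 0, 0, 0, 0, 0]
```

Only the first few terms are computed, which is enough because c_i depends only on a_0,\cdots,a_i and b_0,\cdots,b_i.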

More importantly there is also an analytic viewpoint of \mathbb {R}^\infty. Readers who are familiar with the theory of power series can consider the elements of \mathbb {R}^\infty to be power series, i.e. each element is basically a function whose domain is an open interval (or simply \{0\}). By standard theorems in analysis, if \sum a_ix^i and \sum b_ix^i both converge for |x|<r, where r>0, then \sum a_ix^i=\sum b_ix^i for all such x if and only if a_i=b_i for all i. Hence the approach of considering \sum a_ix^i as a purely formal object may be considered equivalent to considering it as a power series.

However, we will soon see why convergence issues do not play any role in our context as long as the power series converges for at least one nonzero x, and why there is more value in interpreting \sum a_ix^i as simply an element of \mathbb {R}^\infty.

Let (a_n) be a real sequence such that the power series \sum a_ix^i converges for at least one nonzero x. Then the function f which sends such an x to the value of the power series \sum a_ix^i is called the generating function of the sequence. We frequently abuse notation and refer to the value of the generating function at some nonzero point x as the generating function (which is akin to referring to the function f as f(x)).

Let (a_n) be the constant sequence of ones. It is well known that for any real number x with |x|<1 the series 1+x+x^2+\cdots converges to 1/(1-x). So the generating function of (a_n) is f where f(x)=1/(1-x) for all x at which the series converges.

The reason for requiring convergence at a nonzero point is as follows. As soon as we have convergence at a nonzero x, by a theorem of analysis it follows that there is convergence in an open interval around 0. Now, f has a unique power series expansion in that interval and so we are guaranteed that there is a one-one correspondence between the purely discrete object (1,1,\cdots) thought of as an element of \mathbb{R}^\infty and the generating function f. This can be exploited in the reverse direction, for if we wish to recover our sequence from the function f, then since f is defined by f(x)=1/(1-x)=1+x+x^2+\cdots there is absolutely no ambiguity, and we cannot get back any other sequence. In fact, we may say that our sequence has been encoded within the definition of f, as the closed form expression 1/(1-x).

If convergence were only given at 0, then such a one-one correspondence would not be possible, since any closed form analytic function f which is a_0 at 0 would become the generating function of the sequence (a_0,a_1,a_2,\cdots). So for any sequence (a_0,a_1,\cdots) we will consider its generating function to be defined by a_0+a_1x+a_2x^2+\cdots as long as there is a nonzero x for which there is convergence, and once we have done that we will not bother about any convergence issues at all.

The reader may be wondering what the point of giving an algebraic approach initially was, for a generating function really seems to have more to do with a power series. Furthermore, in the notation of the first paragraph, when we were considering a sequence as an element of \mathbb{R}^\infty we found that in our algebra (a_0,a_1,\cdots) is nothing but a_0+a_1x+a_2x^2+\cdots. This was not a power series but simply our notation for a sequence. It may appear confusing to have the same notation for two different objects, but it has been deliberately adopted for a very good reason. During our computations, we often manipulate our series so that we may no longer be sure whether convergence of a given series at a nonzero point is guaranteed. This poses a mathematical problem of the validity of our operations. However our ambiguous notation comes to our rescue at that instant, for what we really are doing at that point, without explicitly saying so, is not dealing with the series a_0+a_1x+a_2x^2+\cdots. Instead we are manipulating the sequence a_0+a_1x+a_2x^2+\cdots=(a_0,a_1,a_2,\cdots) to which no concept of convergence is attached. Of course, if we need closed form expressions or have to establish some other analytic properties, we need convergence so that we can use the one-one correspondence between the sequence and the generating function and dive into the world of analysis. In this way, a constant interplay between the discrete world of sequences and the analytic world of functions brings out the real power of generating functions.


Filed under Algebra, Combinatorics, Real Analysis

Automorphic Numbers

In this post we discuss automorphic numbers. Note that all numbers involved in this post are non negative integers.

Definition: Let x have n digits in its base 10 representation. Then x is said to be automorphic if x^2\equiv x\pmod {10^n}.

In our base 10 representation this means that the last n digits of x^2 are precisely x. In other words, the digits of x are appended to x^2. Note that since the digits of x are appended to x^2 they are also appended to x^m for any m\ge 1.

A list of some automorphic numbers can be found here. Our aim here is to characterize all such numbers in base 10.

Theorem: The automorphic numbers in base 10 are given by \{0,1\}\cup\{5^{2^m}\pmod {10^m}:m\ge 1\}\cup\{16^{5^m}\pmod {10^m}:m\ge 1\}.

Proof: Suppose x is an automorphic number of n digits. Then,

x(x-1)\equiv 0\pmod {10^n}

which is equivalent to

\Big(x\equiv 0\text{ or }1\pmod{2^n}\Big) \text{ and } \Big(x\equiv 0\text{ or }1\pmod{5^n}\Big)

which is equivalent to the occurrence of exactly one of the following:

Case 1: x\equiv 0\pmod {2^n} \text{ and } x\equiv 0\pmod {5^n}

Case 2: x\equiv 1\pmod {2^n} \text{ and } x\equiv 1\pmod {5^n}

Case 3: x\equiv 1\pmod {2^n} \text{ and } x\equiv 0\pmod {5^n}

Case 4: x\equiv 0\pmod {2^n} \text{ and } x\equiv 1\pmod {5^n}

Now we consider all these cases one by one. Case 1 is equivalent to x\equiv 0\pmod {10^n}. Since x has n digits, this means x=0. Similarly x=1 in case 2.

Now suppose case 3 holds. By the Chinese Remainder theorem there exists a unique solution to these two congruences modulo 10^n. We show that 5^{2^n} is a solution to them. By uniqueness it will then follow that case 3 is equivalent to saying that x\equiv 5^{2^n}\pmod {10^n}. Since x has n digits and 5^{2^n}\pmod {10^n} has at most n digits, we therefore have x equal to 5^{2^n}\pmod {10^n}. Hence x\in\{5^{2^m}\pmod {10^m}:m\ge 1\}.

We first show that 5^{2^n}\equiv 1\pmod {2^n}. Proceed by induction on n. For n=1 note that 5^{2}\equiv 1\pmod 2.

Now assume the result for k. Then 5^{2^k}=q2^k+1 and squaring yields 5^{2^{k+1}}=q^2 2^{2k}+q2^{k+1}+1=(q^2 2^{k-1}+q)2^{k+1}+1 and so 5^{2^{k+1}}\equiv 1\pmod {2^{k+1}}, thereby completing the induction step. Next observe that since n<2^n, the congruence 5^{2^n}\equiv 0\pmod {5^n} follows immediately, thereby completing case 3.

Note that our argument also implies that any number of the form 5^{2^m}\pmod{10^m} is automorphic. Indeed, if a=5^{2^m}\pmod{10^m} had t\le m digits then we have shown that a^2\equiv a\pmod {10^m}. Now as 10^t divides 10^m so a^2\equiv a\pmod {10^t}, i.e. a is automorphic.

To finish the proof we now consider case 4. We show that 16^{5^n} is a solution to the congruences of case 4. Clearly 16^{5^n}=2^{4\cdot 5^n}\equiv 0\pmod {2^n} as n<4\cdot 5^n. To show 16^{5^n}\equiv 1\pmod {5^n} we proceed by induction as before. The base case may be explicitly worked out or be shown to hold by invoking Fermat’s theorem: m^p\equiv m\pmod p, for any prime p and integer m, so that 16^5\equiv 16\equiv 1\pmod 5. Next assume the result for k, so that 16^{5^k}=q5^k+1. Raising both sides to the fifth power and expanding by the binomial theorem gives 16^{5^{k+1}}=1+5\cdot q5^k+\cdots, where every term after the leading 1 is divisible by 5^{k+1}; this yields the result for k+1. The rest of the argument is similar.\Box
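
The theorem is easy to check by machine for small cases. The following is a rough Python sketch, not part of the proof: it compares a brute force search with the two families 5^{2^m}\pmod{10^m} and 16^{5^m}\pmod{10^m}. The function names are made up for this illustration.

```python
def is_automorphic(x):
    """Check x^2 = x (mod 10^n), where n is the number of decimal digits of x."""
    n = len(str(x))
    return (x * x - x) % 10**n == 0


def automorphic_by_search(limit):
    """All automorphic numbers below `limit`, found by brute force."""
    return {x for x in range(limit) if is_automorphic(x)}


def automorphic_by_formula(max_m):
    """The set from the theorem, restricted to 1 <= m <= max_m."""
    found = {0, 1}
    for m in range(1, max_m + 1):
        found.add(pow(5, 2**m, 10**m))    # three-argument pow is modular power
        found.add(pow(16, 5**m, 10**m))
    return found


print(sorted(automorphic_by_search(10**4)))
print(sorted(automorphic_by_formula(4)))
```

Both lines print the same list. Note that a residue such as 5^{16}\pmod{10^4} has fewer than four digits, which is why the sets (and not the individual values of m) are compared.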

Filed under Miscellaneous, Number theory, Recreational Mathematics

The Rank Nullity Theorem

One of the important results in linear algebra is the rank nullity theorem. Here I am going to present a proof of it which is slightly less known. The reason I like this proof is because it ties together many concepts and results quite nicely, and also because I independently thought of it.

The theorem (as is well known) says that if V,W are vector spaces with n=\dim V<\infty and T:V\to W a linear map then \text{rank} (T)+\text{nullity}(T)=n.

In this proof I will further assume that W is finite dimensional with dimension m. A more general proof can be found on Wikipedia.

We start by fixing bases of V and W and obtain an m\times n matrix A=\begin{pmatrix}  r_1\\  r_2\\  \vdots\\  r_m  \end{pmatrix} of T relative to these bases. (Each r_i is a 1\times n row matrix.) Then our theorem basically translates to \text{rank} (A)+\text{nullity}(A)=n. We let \text{Row Space} (A)=R,\text{Null Space} (A)=N and claim that R^\perp=N.

Clearly if x\in N then Ax=0 and so \begin{pmatrix}  r_1\\  r_2\\  \vdots\\  r_m  \end{pmatrix}x=\begin{pmatrix}  r_1x\\  r_2x\\  \vdots\\  r_mx  \end{pmatrix}=\begin{pmatrix}  0\\  0\\  \vdots\\  0  \end{pmatrix} so that each r_i is orthogonal to x. Hence x\in R^\perp. Conversely if x\in R^\perp then x^Tr_i^T=0 so that x^TA^T=0, i.e. Ax=0 following which x\in N.

Now it only remains to invoke the result \dim U+\dim U^\perp=\dim V for any subspace U of an inner product space V to conclude that \dim R+\dim N=\dim \mathbb{R}^n. In other words \text{rank} (A)+\text{nullity}(A)=n.\Box
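
As a sanity check, here is a small Python sketch that computes the rank and a null space basis of a sample matrix with exact rational arithmetic, and confirms that every null space vector is orthogonal to every row. The matrix A and the helper names rref and null_space_basis are assumptions chosen for illustration only.

```python
from fractions import Fraction


def rref(rows):
    """Return (reduced row echelon form, list of pivot columns) of a matrix."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots = []
    r = 0
    for c in range(len(m[0])):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [v / m[r][c] for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [vi - factor * vr for vi, vr in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots


def null_space_basis(rows):
    """Basis of {x : Ax = 0}, one vector per free (non-pivot) column."""
    reduced, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        x = [Fraction(0)] * n
        x[free] = Fraction(1)
        for row_index, pivot_col in enumerate(pivots):
            x[pivot_col] = -reduced[row_index][free]
        basis.append(x)
    return basis


A = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [1, 0, 1, 0]]
rank = len(rref(A)[1])
N = null_space_basis(A)
# Every null space vector is orthogonal to every row of A.
assert all(sum(a * x for a, x in zip(row, v)) == 0 for row in A for v in N)
print(rank, len(N), rank + len(N) == len(A[0]))
```

For this matrix the second row is twice the first, so the rank is 2 and the nullity is 2, which add up to n=4.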


Filed under Algebra, Linear Algebra, Miscellaneous

The Inclusion Exclusion Principle

The Inclusion Exclusion Principle is one of the most fundamental results in combinatorics. It deals with counting through a sieve technique: i.e. we first overcount the quantity to be enumerated, then we try to balance out the overcounting by subtracting the extra. We may or may not subtract more than what is needed and so we count again the extra bits. Continuing in this way we “converge” at the correct count.

An easy example is |A\cup B\cup C|=|A|+|B|+|C|-|A\cap B|-|A\cap C|-|B\cap C|+|A\cap B\cap C| where A,B,C are three finite sets.

Here we start off by summing the number of elements in A,B,C separately. The overcount is compensated by subtracting the number of elements in A\cap B, A\cap C and B\cap C. We actually compensate for a bit more than needed and so to arrive at the correct count we must add |A\cap B\cap C|.

Our goal here is to prove the inclusion-exclusion principle and then to look at a few corollaries. We first establish two lemmas.

Lemma 1: If X is any set containing n elements and k is any field, then the set of all functions f:X\to k is an n-dimensional vector space V over k with the naturally induced operations.

Proof: Let X=\{x_1,\cdots,x_n\}. It is easy to see that V together with the operations f+g, defined by (f+g)(x_i)=f(x_i)+g(x_i) and (\alpha f)(x_i)=\alpha (f(x_i)) for all x_i\in X is a vector space.

We now exhibit a basis for V consisting of n elements. For all x_i\in X let f_{x_i} be the function defined as f_{x_i}(y)=\begin{cases}         \hfill 1    \hfill & \text{ if }y=x_i \\        \hfill 0 \hfill & \text{ otherwise} \\    \end{cases}. Now, we claim that B=\{f_{x_i}:x_i\in X\} is a basis. Clearly it is spanning as for any f\in V we have f=\sum_{x_i\in X}f(x_i)f_{x_i}. It is linearly independent as \sum_{x_i\in X} \alpha_i f_{x_i}=0 means that for any j, with 1\le j\le n, we have \sum_{x_i\in X} \alpha_i f_{x_i}(x_j)=0, i.e. \alpha_j=0. This completes the proof.\Box

Lemma 2: If m\ge 0 then \sum_{i \mathop = 0}^m \left({-1}\right)^i \binom m i = \delta_{0m}.

Proof: If m>0 put x=1, y=-1 in the binomial theorem (x+y)^m=\sum_{i=0}^m\binom m i x^{m-i}y^i. If m=0, then the sum on the left reduces to only one term: (-1)^0\binom 0 0. This is clearly 1.\Box

We now prove the Inclusion Exclusion principle. This theorem in its purest form, is simply a formula for an inverse of a linear operator. The theorem is as follows:

Theorem 3: Let S be a set with n elements. Let V be the 2^n dimensional vector space of all functions f:\mathcal{P}(S)\to k, where k is some field and \mathcal{P}(S) is the power set of S. Let \phi:V\to V be the linear operator defined by

\displaystyle\phi f(T)=\sum_{Y\supseteq T}f(Y)\text{ for all } T\subseteq S.

Then \phi^{-1} exists and is given by:

\displaystyle\phi^{-1} f(T)=\sum_{Y\supseteq T}(-1)^{|Y-T|}f(Y)\text{ for all } T\subseteq S.

Proof: To show \phi^{-1} as given above is indeed the inverse it suffices to show that \phi^{-1}\phi(f)=f for all f\in V.

Let f\in V. Then,

\displaystyle\phi^{-1}\phi f(T)=\sum_{Y\supseteq T}(-1)^{|Y-T|}\phi f(Y)=\sum_{Y\supseteq T}(-1)^{|Y-T|}\sum_{Z\supseteq Y}f(Z)=\sum_{Z\supseteq T}\Big(\sum_{Z\supseteq Y\supseteq T}(-1)^{|Y-T|}\Big)f(Z)

Now fix Z,T and let m=|Z-T|. Consider \displaystyle\sum_{Z\supseteq Y\supseteq T}(-1)^{|Y-T|}. Any Y is obtained by choosing some elements out of Z-T, which has m elements, and taking the union of such elements with T. So for every i, with 0\le i\le m, there are exactly \binom m i ways of choosing a Y which has i+|T| elements. Any such Y also has i elements in Y-T. So \displaystyle\sum_{Z\supseteq Y\supseteq T}(-1)^{|Y-T|}=\sum_{i=0}^m(-1)^i\binom m i=\delta_{0m}. This, when substituted in the expression for \phi^{-1}\phi f(T), shows that \phi^{-1}\phi f(T)=f(T), which proves the theorem.\Box
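
The inversion formula can also be tested numerically. The sketch below, in Python, takes k to be the integers sitting inside \mathbb{Q}, stores a function f on \mathcal{P}(S) as a dictionary keyed by frozensets, and checks that applying \phi and then \phi^{-1} returns f. The names subsets, phi and phi_inverse, the choice of S and the random values of f are all assumptions made for the illustration.

```python
from itertools import combinations
import random


def subsets(s):
    """All subsets of the iterable s, as frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def phi(f, S):
    """(phi f)(T) = sum of f(Y) over all Y with T a subset of Y."""
    P = subsets(S)
    return {T: sum(f[Y] for Y in P if T <= Y) for T in P}


def phi_inverse(g, S):
    """(phi^{-1} g)(T) = sum of (-1)^{|Y - T|} g(Y) over all Y containing T."""
    P = subsets(S)
    return {T: sum((-1) ** len(Y - T) * g[Y] for Y in P if T <= Y) for T in P}


S = {1, 2, 3, 4}
random.seed(0)  # fixed seed so the run is reproducible
f = {T: random.randint(-10, 10) for T in subsets(S)}
print(phi_inverse(phi(f, S), S) == f)
```

A check on one random f is of course no proof, but it is a quick way to catch a sign or index slip in the formula.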

We now discuss some corollaries of the Inclusion Exclusion principle. Let S be a set of properties that elements of a given set A may or may not have. For any T\subseteq S, let A'_T\subseteq A be those elements which have exactly the properties in T and no others. We define a function f_=:\mathcal{P}(S)\to \mathbb{R} such that f_=(T)=|A'_T|. Similarly, for any T\subseteq S, let A''_T\subseteq A be those elements which have at least the properties in T. We define a function f_\ge:\mathcal{P}(S)\to \mathbb{R} such that f_\ge(T)=|A''_T|. It is clear that \displaystyle f_\ge(T)=\sum_{Y\supseteq T}f_=(Y) for any T\subseteq S, and so by the Inclusion Exclusion principle we conclude that

Corollary 4: For any T\subseteq S, we have \displaystyle f_=(T)=\sum_{Y\supseteq T}(-1)^{|Y-T|}f_\ge (Y).

In particular we have \displaystyle f_=(\emptyset)=\sum_{Y\subseteq S}(-1)^{|Y|}f_\ge (Y), which gives us a formula for the number of elements having none of the properties.\Box

In the above corollary we think of f_\ge(T) as the first approximation to f_=(T). So we “include” that much “quantity” in our initial count. From this we subtract all terms of the type f_\ge(Y) where Y has just one more element than T. Thus we “exclude” that much “quantity” from our count. This gives a better approximation. Next we add all terms of the type f_\ge(Y) where Y has two more elements than T, and so on. This is the reason behind the terminology inclusion-exclusion.

We now discuss another corollary. Let A be a finite set and let A_1,\cdots, A_n be some of its subsets. We define a set of properties S=\{P_1,\cdots,P_n\} which elements of A may or may not enjoy as follows: For any i, with 1\le i\le n, x\in A satisfies the property P_i if and only if x\in A_i. Also for any \emptyset\ne I\subseteq [n], let A_I=\cap_{i\in I}A_i be the set of elements which have at least the properties P_i for i\in I. Define A_\emptyset to be A. By Corollary 4, \displaystyle f_=(\emptyset)=\sum_{I'\subseteq S}(-1)^{|I'|}f_\ge (I')=\sum_{I\subseteq [n]}(-1)^{|I|}|A_I| where in the second equality we identify each subset of properties with the corresponding subset of [n]. We summarize this as

Corollary 5: Let A be a finite set and let A_1,\cdots A_n be some of its subsets. For any \emptyset\ne I\subseteq [n], let A_I=\cap_{i\in I}A_i and let A_\emptyset=A. The number of elements in A-\cup_{i=1}^n A_i is given by \displaystyle\sum_{I\subseteq [n]}(-1)^{|I|}|A_I|=|A|+\sum_{\emptyset\ne I\subseteq [n]}(-1)^{|I|}|A_I|.\Box

A further special case is obtained by considering any finite sets A_1,A_2,\cdots,A_n and letting A=\cup_{i=1}^nA_i. Then the above corollary translates to \sum_{I\subseteq [n]}(-1)^{|I|}|A_I|=0. Considering the case of I=\emptyset separately, we see that |A|+\displaystyle\sum_{\emptyset\ne I\subseteq [n]}(-1)^{|I|}|\cap_{i\in I}A_i|=0. This easily yields the following corollary.

Corollary 6: Let A_1,A_2,\cdots,A_n be any finite sets. For any \emptyset\ne I\subseteq [n], let A_I=\cap_{i\in I}A_i. Then \displaystyle|A_1\cup A_2\cup\cdots\cup A_n|=\displaystyle\sum_{\emptyset\ne I\subseteq [n]}(-1)^{|I|-1}|A_I|.\Box

Now by grouping terms involving the same size of I we can restate both Corollary 5 and 6 as follows.

Corollary 7: Let A be a finite set and let A_1,\cdots A_n be some of its subsets. The number of elements in A-\cup_{i=1}^n A_i is given by
\displaystyle |A|+\sum_{k=1}^n(-1)^{k}\sum_{1\le i_1 < i_2 < \cdots <i_k\le n}|A_{i_1}\cap A_{i_2}\cap\cdots\cap A_{i_k}|.\Box

Corollary 8: If A_1,A_2\cdots,A_n are any finite sets then

\displaystyle|A_1\cup A_2\cup\cdots\cup A_n|=\sum_{k=1}^n(-1)^{k-1}\sum_{1\le i_1< i_2<\cdots<i_k\le n}|A_{i_1}\cap A_{i_2}\cap\cdots\cap A_{i_k}|.\Box

Corollaries 5 to 8 are often referred to as the principle of inclusion-exclusion themselves as in combinatorial settings they are the ones most often used. A further simplified version can also be derived from them when the intersection of any k distinct sets A_i always has the same cardinality N_k. In that case we only need to multiply N_k with the number of such ways to select the k sets to get the value of the inner sums in Corollaries 7 and 8.

Corollary 9: Let A be a finite set and let A_1,\cdots, A_n be some of its subsets. Suppose that for any k with 1\le k\le n there exists a natural number N_k so that for any k distinct indices i_1,\cdots, i_k\in [n] we have |A_{i_1}\cap\cdots\cap A_{i_k}|=N_k. Then the number of elements in A-\cup_{i=1}^n A_i is given by |A|+\sum_{k=1}^n(-1)^k\binom{n}{k}N_k and |A_1\cup A_2\cup\cdots\cup A_n|=\sum_{k=1}^n(-1)^{k-1}\binom{n}{k}N_k.\Box
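
Corollaries 8 and 9 translate directly into code. The Python sketch below evaluates the alternating sum of Corollary 8 for three small sets, and uses Corollary 9 to count derangements (permutations without fixed points), comparing the result with a brute force count. The derangement example is an illustration chosen here, not something taken from the discussion above: A is the set of all permutations of n points, A_i is the set of permutations fixing i, and so N_k=(n-k)!.

```python
from itertools import combinations, permutations
from math import comb, factorial


def union_size(sets):
    """|A_1 u ... u A_n| via the alternating sum of Corollary 8."""
    n = len(sets)
    total = 0
    for k in range(1, n + 1):
        for chosen in combinations(sets, k):
            total += (-1) ** (k - 1) * len(set.intersection(*chosen))
    return total


def derangements(n):
    """Permutations of n points with no fixed point, via Corollary 9.

    Any k of the sets A_i meet in N_k = (n - k)! permutations.
    """
    return factorial(n) + sum((-1) ** k * comb(n, k) * factorial(n - k)
                              for k in range(1, n + 1))


sets = [{1, 2, 3, 4}, {3, 4, 5}, {1, 5, 6, 7}]
print(union_size(sets), len(set().union(*sets)))
brute = sum(all(p[i] != i for i in range(5)) for p in permutations(range(5)))
print(derangements(5), brute)
```

In each printed pair the two numbers agree: the union has 7 elements, and there are 44 derangements of 5 points.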

Filed under Combinatorics, Set theory

Graph Automorphisms

This post concerns automorphisms of graphs, which quantify the symmetry existing within the graph structure. Given two graphs G and H, a bijection f:V(G)\to V(H) which maintains adjacency, i.e. xy\in E(G)\Leftrightarrow f(x)f(y)\in E(H), is called an isomorphism and the graphs G and H are called isomorphic. Clearly isomorphic graphs are essentially the same, with the superficial difference between them on account of different notation used in defining the vertex set. An isomorphism from the graph G to itself is called an automorphism. It is easy to see that the set of all automorphisms of a graph G together with the operation of composition of functions forms a group. This group is called the automorphism group of the graph, and is denoted by A(G).

In the remainder of this post we investigate some well known graphs and find out their automorphism groups.

The first graph we take up is the complete graph K_n. Any permutation of its n vertices is in fact an automorphism for adjacency is never lost. Its automorphism group is therefore S_n.

The next graph is the complete bipartite graph K_{m,n}. First consider the case m\not= n. The m vertices in the first partite set can be permuted in m! ways and similarly n! ways for the second partite set. Corresponding to each of these m!n! limited permutations we get automorphisms because adjacency is never disturbed. On the other hand, no automorphism can result from swapping a vertex from the first partite set and the second partite set because unless such a swap is done in its entirety (i.e. all the vertices from the first partite set swap places with the vertices in the second partite set), adjacency will be lost. A swap can be done in entirety only if m=n which is not the case we are considering. Hence no further automorphisms can result. Moreover by the multiplication rule it is simple to observe that the automorphism group would be isomorphic to S_m\times S_n.

In the case of m=n, we first pair off the vertices in the two partite sets against each other. Swapping each vertex with its pair is an automorphism, say f. Now for each of the m!m! ways of permuting vertices within partite sets, an additional automorphism arises. It is obtained in this fashion: after permuting the vertices within the partite sets in the particular way, we swap each vertex with its pair in the other partite set. Clearly this yields 2m!m! automorphisms and furthermore no more are possible. The m!^2 automorphisms that preserve each partite set form a subgroup isomorphic to S_m\times S_m, which is normal, being of index 2. Since every element of A(K_{m,m}) can be written uniquely as a product of an element of this subgroup and an element of the subgroup \{Id,f\}, we see that the automorphism group is (S_m\times S_m)\rtimes \mathbb{Z}_2.

The next graph we take up is the cycle graph C_n. Firstly note that any automorphism can be obtained in this way: A given vertex v may be mapped to any of the n vertices available (including itself). As soon as that is done, an adjacent vertex to v has only two choices left: it can either be in the counter clockwise direction to v or in the clockwise direction to v. Once that choice is also made, no other choices are required. Hence we get 2n automorphisms this way and there can be no others. Also, it is clear that two kinds of automorphisms suffice to generate this group: rotation, and swapping the notion of clockwise and counter clockwise (assuming we draw the cycle graph as equally spaced points on the unit circle; there is no loss of generality in doing that). But both these automorphisms also generate the dihedral group D_n which also has 2n elements. It follows that A(C_n)=D_n.
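
The orders of these automorphism groups can be confirmed by brute force for small cases. The following Python sketch tries every permutation of the vertices and keeps those that map the edge set onto itself. The helper names and the particular graphs tested are assumptions for illustration, and the approach is only feasible for a handful of vertices.

```python
from itertools import permutations
from math import factorial


def count_automorphisms(n, edges):
    """Count permutations of range(n) that map the edge set onto itself."""
    edge_set = {frozenset(e) for e in edges}
    count = 0
    for p in permutations(range(n)):
        if {frozenset((p[u], p[v])) for u, v in edges} == edge_set:
            count += 1
    return count


def complete(n):
    return [(u, v) for u in range(n) for v in range(u + 1, n)]


def complete_bipartite(m, n):
    # Vertices 0..m-1 form one partite set, m..m+n-1 the other.
    return [(u, m + v) for u in range(m) for v in range(n)]


def cycle(n):
    return [(i, (i + 1) % n) for i in range(n)]


print(count_automorphisms(4, complete(4)), factorial(4))   # K_4
print(count_automorphisms(5, complete_bipartite(2, 3)))    # K_{2,3}
print(count_automorphisms(6, complete_bipartite(3, 3)))    # K_{3,3}
print(count_automorphisms(6, cycle(6)))                    # C_6
```

The counts match n!, m!n!, 2m!m! and 2n respectively. The check only confirms the orders of the groups, not their structure.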

The final graph we take up is the well known Petersen graph. Instead of directly considering what possible functions are there in its automorphism group (although such an approach is possible) we approach the problem through the concept of line graphs.

Definition: A line graph L(G) of a graph G is the graph whose vertices are in one to one correspondence with the edges of G, two vertices of L(G) being adjacent if and only if the corresponding edges of G are adjacent.

Lemma 1: L(K_5) is the complement of the Petersen graph.
Proof: It is clear that if the vertices of K_5 are labelled 1,2,\ldots,5 then its 10 edges are the {5 \choose 2} 2-subsets of \{1,\cdots,5\}. The line graph L(K_5) thus has 10 vertices, labeled by these 10 2-subsets \{i,j\}. Two vertices \{i,j\}, \{k,\ell\} are adjacent in L(K_5) iff the two 2-subsets have a nontrivial overlap. The complement of L(K_5) is the graph with the same 10 vertices, and with two vertices being adjacent iff the corresponding two 2-subsets are disjoint. But this is the very definition of the Petersen graph.\Box

Lemma 2: A(G) is equal to A(\bar{G}).
Proof: If \sigma\in A(G) then for any two vertices x,y we have xy\in E(G)\Leftrightarrow \sigma(x)\sigma(y)\in E(G), i.e. xy\not\in E(G)\Leftrightarrow \sigma(x)\sigma(y)\not\in E(G), i.e. xy\in E(\bar{G})\Leftrightarrow \sigma(x)\sigma(y)\in E(\bar{G}) so that \sigma\in A(\bar{G}). The reverse implication follows by replacing G by \bar{G}. \Box

Theorem 3: The automorphism group of the Petersen graph is S_5.
Proof: In view of Lemma 1 and 2 it suffices to find out A(L(K_5)) for the automorphism group of the Petersen graph is going to be the same. We let K_5 have the vertex set \{1,\cdots,5\} in the sequel.

Take any automorphism f of K_5. If we have two edges ab,cd\in E(K_5) with f(a)f(b)=f(c)f(d), then one of two cases arises. Either f(a)=f(c) or not. If f(a)=f(c) then obviously f(b)=f(d) and so by injectivity of f we have ab=cd. If f(a)\not=f(c) then it must be that f(a)=f(d). This means that f(b)=f(c) and again by injectivity we have ab=cd. What this means is that the function induced by f on E(K_5) in the natural way is injective. It is also surjective as for any xy\in E(K_5) clearly f\{f^{-1}xf^{-1}y\}=xy. Finally, this function is an automorphism since \{xy,xz\}\in E(L(K_5)) clearly implies and is implied by \{f(x)f(y),f(x)f(z)\}\in E(L(K_5)) as there is a common vertex. As our definition of the induced function is obtained in a definite way we have shown that every automorphism of K_5 induces a unique automorphism of L(K_5). Moreover, it is easy to see that if f_1,f_2 are two automorphisms then the automorphism induced by f_1\circ f_2 is the same as the automorphism induced by f_1 composed with the automorphism induced by f_2.

We now show that given an automorphism of L(K_5) we can obtain an automorphism of K_5 which induces it in the natural way. Let \pi\in A(L(K_5)). It is easy to see that the 4-cliques of L(K_5) originate from the stars K_{1,4} of K_5. So L(K_5) has exactly 5 4-cliques, say C_1,\cdots ,C_5 where C_i contains 4 vertices corresponding to the 4 edges in K_5 that are incident to a vertex i in K_5. Since \pi is an automorphism it sends 4-cliques to 4-cliques. Also, \pi must send two different 4-cliques C_i,C_j with i\ne j to different 4-cliques, because if it sends them to the same 4-clique then a collection of at least 5 vertices is mapped to a collection of 4 vertices, a contradiction to the injectivity of \pi. So \pi induces a permutation of the C_i's.

Now suppose \pi and \pi' are two different automorphisms in A(L(K_5)). Then they differ on at least one vertex in L(K_5), say on the vertex ij\in E(K_5). Now given any vertex xy in L(K_5) consider the intersection of the 4-cliques C_x and C_y. If pq is some vertex in C_x\cap C_y then pq as an edge in K_5 is part of stars with centers x and y, i.e. pq=xy. Hence the intersection contains only the vertex xy. Every vertex of L(K_5) arises in this way. So if \pi(ij)\ne \pi'(ij), then either \pi (C_i)\ne \pi'(C_i) or \pi(C_j)\ne \pi'(C_j) for otherwise \pi (C_i\cap C_j)=\pi'(C_i\cap C_j).

Hence every automorphism of L(K_5) induces a unique permutation of the C_i's. Moreover distinct automorphisms induce distinct permutations so that the automorphisms and the permutations can be put in one-one correspondence. Consider the permutation f of the vertices of K_5 (which is an automorphism of K_5) where f(i)=j if C_i\to C_j in the permutation corresponding to \pi. Now take a vertex \{x,y\}\in E(K_5) of L(K_5). This is also the intersection of the 4-cliques C_x and C_y and so \pi(\{x,y\})=\pi(C_x\cap C_y)=\pi(C_x)\cap\pi (C_y)=C_{f(x)}\cap C_{f(y)}=f(x)f(y). This shows that f induces \pi as an automorphism.

Hence we have shown that A(K_5)\cong A(L(K_5)). Since A(K_5)=S_5, the Petersen graph has the automorphism group S_5. \Box
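
As a numerical cross-check of the order of the group, the Python sketch below builds both the Petersen graph and L(K_5) on the ten 2-subsets of \{1,\cdots,5\} and counts their automorphisms with a simple backtracking search. The function name count_automorphisms and the search strategy are choices made for this illustration; the count says nothing about the group structure beyond its size.

```python
from itertools import combinations


def count_automorphisms(adj):
    """Count adjacency-preserving bijections by backtracking.

    adj is a list of neighbour sets, indexed by vertex 0..n-1.
    """
    n = len(adj)
    image = [None] * n
    used = [False] * n

    def extend(v):
        if v == n:
            return 1
        total = 0
        for w in range(n):
            if used[w]:
                continue
            # The partial map must agree on adjacency with all earlier vertices.
            if all((u in adj[v]) == (image[u] in adj[w]) for u in range(v)):
                image[v], used[w] = w, True
                total += extend(v + 1)
                used[w] = False
        return total

    return extend(0)


pairs = list(combinations(range(1, 6), 2))   # the ten 2-subsets of {1,...,5}
index = {p: i for i, p in enumerate(pairs)}

# Petersen graph: two 2-subsets are adjacent iff they are disjoint.
petersen = [{index[q] for q in pairs if not set(p) & set(q)} for p in pairs]
# L(K_5): two 2-subsets are adjacent iff they share exactly one element.
line_k5 = [{index[q] for q in pairs if len(set(p) & set(q)) == 1} for p in pairs]

print(count_automorphisms(petersen), count_automorphisms(line_k5))
```

Both counts come out as 120=|S_5|, in agreement with Lemma 2 and Theorem 3.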

Filed under Combinatorics, Graph theory, Group Theory

The infinitude of the primes

As we all know, a (natural) number p>1 is called prime if its positive divisors are only 1 and p. The fact that prime numbers are infinite in number was proved by Euclid (although it is not clear whether he discovered it) and the proof is a little gem of mathematical reasoning. We present his proof below:

Theorem: There are infinitely many prime numbers.

Proof: Consider any list of prime numbers, say p_1,p_2\cdots p_n. The number P=p_1p_2\cdots p_n+1 is clearly not on the list. If P is not prime, then by definition there is a prime number p which divides P. Clearly p\ne p_i for any i as otherwise p divides 1, a contradiction.  So p is a prime number not on the list. If, on the other hand, P is prime, then it is anyway a prime number not on the list. Hence our list is incomplete.\Box
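
Euclid's argument is constructive, and it can be run. Here is a minimal Python sketch that, given a finite list of primes, forms P and returns its smallest prime factor, which by the argument above cannot be on the list. The function names are made up for the illustration, and trial division is used only because it is the simplest choice.

```python
from math import prod


def smallest_prime_factor(n):
    """Smallest prime dividing n (n >= 2), by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def new_prime(primes):
    """A prime outside the given finite list, found as in Euclid's argument."""
    P = prod(primes) + 1
    return smallest_prime_factor(P)


print(new_prime([2, 3, 5]))              # 31, since 2*3*5 + 1 = 31 is prime
print(new_prime([2, 3, 5, 7, 11, 13]))   # 59, since 30031 = 59 * 509
```

The second call shows that P itself need not be prime: 2\cdot 3\cdot 5\cdot 7\cdot 11\cdot 13+1=30031=59\cdot 509.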

We now give another beautiful proof of the same theorem, owing to Euler. Although it does not strictly qualify as a valid mathematical proof by modern standards and is really nothing more than heuristic reasoning, its “wow factor” is so high that as an exception it is still referred to as a proof. It can be recast into a more rigorous form but we do not do so here so that its original taste is not disturbed. The essential argument here is that if the primes were finite in number then the product \prod_{\text{p prime}}\frac{1}{1-1/p} would be finite, which is not the case.

Theorem: There are infinitely many prime numbers.

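Proof: Let P denote the set of all primes. Now consider the product \prod_{p\in P}\frac{1}{1-1/p}, i.e.

\displaystyle\prod_{p\in P}\frac{1}{1-1/p}=\prod_{p\in P}\Big(1+\frac{1}{p}+\frac{1}{p^2}+\cdots\Big)

by the formula of the geometric series.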

Next we observe that if we multiply out the terms (assuming such a multiplication is valid) then the reciprocal of every positive integer will occur exactly once, by the uniqueness part of the fundamental theorem of arithmetic. If n=p_1^{\alpha_1}\cdots p_k^{\alpha_k} is the unique factorization of n then \frac{1}{n} would result in the multiplication only when \frac{1}{p_1^{\alpha_1}},\cdots, \frac{1}{p_k^{\alpha_k}} are multiplied. Hence the above product

\prod_{p\in P}\frac{1}{1-1/p}=\sum_{n=1}^\infty \frac{1}{n}. The latter harmonic series is well known to be divergent. Hence there must be infinitely many terms in P.\Box
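
A finite version of this identity can be observed numerically. Multiplying out the product over the primes p\le n produces the reciprocal of every integer up to n (and more), so the partial product is at least the partial harmonic sum 1+1/2+\cdots+1/n. The Python sketch below prints both quantities for a few values of n; the function names and the cutoffs are arbitrary choices made here, and floating point arithmetic is used, so the figures are approximate.

```python
def primes_up_to(n):
    """Primes <= n by the sieve of Eratosthenes (n >= 1)."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def euler_product(n):
    """Product of 1 / (1 - 1/p) over primes p <= n."""
    result = 1.0
    for p in primes_up_to(n):
        result *= 1.0 / (1.0 - 1.0 / p)
    return result


def harmonic(n):
    """Partial sum 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


for n in (10, 100, 1000, 10000):
    print(n, round(euler_product(n), 3), round(harmonic(n), 3))
```

Both columns keep growing with n, slowly but without bound, which is the numerical shadow of the divergence used in the proof.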

Filed under Algebra, Number theory

The algebraic numbers are countable

By definition an algebraic number is a complex number which satisfies some nonzero polynomial a_0+a_1x+\cdots+a_nx^n\in\mathbb{Z}[x]. Every rational number m/n is algebraic as m/n satisfies nx-m. Further, irrational numbers can also be algebraic: \sqrt{n}, for n\in\mathbb{N}, clearly satisfies x^2-n (and is irrational whenever n is not a perfect square). Similarly purely imaginary numbers can be algebraic: i obviously satisfies x^2+1.

Just as algebraic numbers are called so because they lie within the range of the “classical algebra” (by which we understand manipulation of the integers) to be “described entirely”, a non-algebraic number is called transcendental because one needs to transcend this “algebra” to describe it. There are only a few mathematically significant numbers which are known to be transcendental. For 2500 years a debate raged as to whether \pi was transcendental or not, before the question was settled in the affirmative by Lindemann in 1882. Similarly e is transcendental. (We do not know whether \pi+e is transcendental or not.)

Surprisingly, even though the transcendental numbers seem fewer than the algebraic numbers they actually exist in greater abundance. In fact, almost every real number is transcendental. This is because the set of algebraic numbers is countable and hence has Lebesgue measure zero. Our aim in this post is to formally prove the countable nature of the algebraic numbers.

Theorem: Let A be the set of all algebraic numbers. Then A is countable.

Proof: We define the height h of a nonzero polynomial a_0+a_1x+\cdots+a_nx^n\in\mathbb{Z}[x] of degree n as h=n+\sum_{i=0}^n|a_i|. Clearly for a fixed h there are only finitely many choices for n and a_i and so there are only finitely many polynomials of fixed height.

Now we make a list of all the algebraic numbers in the following way: Consider any height h\in\mathbb{N} and for all the finitely many polynomials of this height, write down all the finitely many roots of these polynomials in the list. Keep repeating for all possible heights. It is clear that no algebraic number will be missed out in this list. This proves that A is countable.\Box
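
The finiteness claim behind this listing can be made tangible. The Python sketch below enumerates all integer polynomials of a given height, represented as coefficient tuples (a_0,\cdots,a_n). It assumes that n is the degree, i.e. that a_n\ne 0, which is how the height is read here; the function name is invented for the illustration.

```python
from itertools import product


def polynomials_of_height(h):
    """All integer coefficient tuples (a_0, ..., a_n) with a_n != 0
    and n + |a_0| + ... + |a_n| == h."""
    found = []
    for n in range(h):                  # a_n != 0 forces |a_n| >= 1, so n < h
        budget = h - n                  # required value of |a_0| + ... + |a_n|
        for coeffs in product(range(-budget, budget + 1), repeat=n + 1):
            if coeffs[-1] != 0 and sum(abs(a) for a in coeffs) == budget:
                found.append(coeffs)
    return found


for h in range(1, 6):
    print(h, len(polynomials_of_height(h)))
```

For instance x^2-2, stored as (-2, 0, 1), appears at height 5, so \sqrt{2} is written down in the list by the time the fifth height is processed.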

Filed under Algebra, Real Analysis