Tag Archives: abstract algebra

Generating Functions

In this post we will discuss generating functions. Generating functions are a powerful tool which can be used to solve many problems both in combinatorics and also in analysis.

Let R^\infty denote the set of all sequences of real numbers. For all i\ge 1 we let x^i denote the sequence which has value 1 for its (i+1)th term and all its other terms are zero. The symbol 1 stands for the sequence (1,0,0,\cdots). Also for any real number \alpha we define the product of \alpha and a sequence \alpha(a_0,a_1,\cdots) as (\alpha a_0,\alpha a_1,\cdots). We let two sequences {a_n} and {b_n} be equal if a_i=b_i for all i. We define the sum of {a_n} and {b_n} by the sequence {a_n+b_n} and the product by the sequence {c_n} where c_i=\sum_{j=1}^ia_jb_{i-j}. Clearly the sequence (a_0,a_1,\cdots ) is equal to the sequence a_0+a_1x+a_2x^2\cdots which we will also denote simply by \sum a_ix^i. Note that a_0 here stands for the sequence obtained as a product of a_0 and 1, i.e. a_0(1,0,0\cdots). Algebraically speaking \mathbb {R}^\infty, equipped with these operations, is an \mathbb {R}-algebra.

More importantly there is an analytic viewpoint of \mathbb {R}^\infty also. Readers who are familiar with the theory of power series can consider the elements of \mathbb {R}^\infty to be power series, i.e. each element is basically a function with its domain an open interval (or simply \{0\}). By standard theorems in analysis, if \sum a_ix^i and \sum b_ix^i both converge for |x|0 then for all such x, \sum a_ix^i=\sum b_ix^i if and only if a_i=b_i for all $i$. Hence the approach of considering \sum a_ix^i as a purely formal object may be considered equivalent to considering it as a power series.

However, we will soon see as to why convergence issues do not play any role in our context as long as the power series converges for at least one non zero x and there is more value in interpreting \sum a_ix^i as simply an element of \mathbb {R}^\infty.

Let (a_n) be a real sequence such that the power series \sum a_ix^i converges for at least one non zero x. Then the function f which sends such an x to the power series \sum a_ix^i is called the generating function of the sequence. We frequently abuse notation and refer to the value of the generating function at some non-zero point x as the generating function (which is akin to referring the function f as f(x)).

Let (a_n) be the constant sequence of one’s. It is well known that for any real number x, if |x|<1 the series 1+x+x^2+\cdots converges to 1/(1-x). So the generating function of (a_n) is f where f(x)=1/(1-x) for all x at which the series converges.

The reason for requiring convergence at a non zero point is as follows. As soon as we have convergence at a non zero x, by a theorem of analysis it follows that there is convergence in an open interval around 0. Now, f has a unique power series expansion in that interval and so we are guaranteed that there is a one-one correspondence between the purely discrete object (1,1,\cdots) thought of as an element of \mathbb{R}^\infty and the generating function f. This can be exploited in the reverse direction, for if we wish to recover our sequence from the function f, then since f is defined by f(x)=1/(1-x)=1+x+x^2+\cdots x^n there is absolutely no ambiguity, and we cannot get back any other sequence. In fact, we may say that our sequence has been encoded within the definition of f, as a closed form expression 1/(1-x) .

If convergence was only given at 0, then such a one-one correspondence is not possible, since any closed form analytic function f, which is a_0 at 0 would become the generating function to the sequence (a_0,a_1,a_2,\cdots). So for any sequence (a_0,a_1,\cdots) we will consider its generating function to be defined by a_0+a_1x+a_2x^2\cdots as long as there is a non zero $x$ for which there is convergence, and once we have done that will not bother about any convergence issues at all.

The reader may be wondering what was the point of giving an algebraic approach initially, for a generating function really seems to have to do more with a power series. Furthermore in the notation of the first paragraph when we were considering a sequence as an element of \mathbb{R}^\infty we gave found that in our algebra (a_0,a_1,\cdots) is nothing but a_0+a_1x+a_2x^2+\cdots. This was not a power series but simply our notation for a sequence. It may appear confusing to have the same notation for two different objects, but it has been deliberately adopted for a very good reason. During our computations, we often manipulate our series so that we may no longer be sure whether convergence of a given series at a non-zero point is guaranteed. This poses a mathematical problem of the validity of our operations. However our ambiguous notation comes to our rescue at that instant, for what we really are doing at that point, without explicitly saying so, is not dealing with the series a_0+a_1x+a_2x^2+\cdots. Instead we are manipulating the sequence a_0+a_1x+a_2x^2+\cdots=(a_0,a_1,a_2,\cdots) with which there are absolutely no concepts of convergence attached. Of course if we need closed form expressions or some other analytic properties have to be established we need convergence so that one can use the one-one correspondence between the sequence and the generating function and dive in the world of analysis. In this way, a constant interplay between the discrete world of sequences and the analytic world of sequences brings out the real power of generating functions.


1 Comment

Filed under Algebra, Combinatorics, Real Analysis

The Rank Nullity Theorem

One of the important results in linear algebra is the rank nullity theorem. Here I am going to present a proof of it which is slightly less known. The reason I like this proof is because it ties together many concepts and results quite nicely, and also because I independently thought of it.

The theorem (as is well known) says that if V,W are vector spaces with n=\dim V<\infty and T:V\to W a linear map then \text{rank} (T)+\text{nullity}(T)=n.

In this proof I will further assume that W is finite dimensional with dimension m. A more general proof can be found on wikipedia.

We start by fixing two bases of V and W and obtain a m\times n matrix A=\begin{pmatrix}  r_1\\  r_2\\  \vdots\\  r_m  \end{pmatrix} of T relative to these bases. (Each r_i is a 1\times n row matrix). Then our theorem basically translates to \text{rank} (A)+\text{nullity}(A)=n. We let \text{Row Space} (A)=R,\text{Null Space} (A)=N and claim that R^\perp=N.

Clearly if x\in N then Ax=0 and so \begin{pmatrix}  r_1\\  r_2\\  \vdots\\  r_m  \end{pmatrix}x=\begin{pmatrix}  r_1x\\  r_2x\\  \vdots\\  r_mx  \end{pmatrix}=\begin{pmatrix}  0\\  0\\  \vdots\\  0  \end{pmatrix} so that each r_i is orthogonal to x. Hence x\in R^\perp. Conversely if x\in R^\perp then x^Tr_i^T=0 so that x^TA^T=0, i.e. Ax=0 following which x\in N.

Now it only remains to invoke the result \dim U+\dim U^\perp=\dim V for any subspace U of an inner product space V to conclude that \dim R+\dim N=\dim \mathbb{R}^n. In other words \text{rank} (A)+\text{nullity}(A)=n.\Box


Filed under Algebra, Linear Algebra, Miscellaneous

Graph Automorphisms

This post is concerning automorphisms of graphs, which quantify the symmetry existing within the graph structure. Given two graphs G and H, a bijection f:V(G)\to V(H) which maintains adjacency, i.e. xy\in E(G)\Leftrightarrow f(x)f(y)\in E(H), is called an isomorphism and the graphs G and H are called isomorphic. Clearly isomorphic graphs are essentially the same, with the superficial difference between them on account of different notation used in defining the vertex set. A isomorphism from the graph G to itself is called an automorphism. It is easy to see that the set of all automorphisms on a graph G together with the operation of composition of functions forms a group. This group is called the automorphism group of the graph, and is denoted by A(G).

In the remainder of this post we investigate some well known graphs and find out their automorphism groups.

The first graph we take up is the complete graph K_n. Any permutation of its n vertices is in fact an automorphism for adjacency is never lost. Its automorphism group is therefore S_n.

The next graph is the complete bipartite graph K_{m,n}. First consider the case m\not= n. The m vertices in the first partite set can be permuted in m! ways and similarly n! ways for the second partite set. Corresponding to each of these m!n! limited permutations we get automorphisms because adjacency is never disturbed. On the other hand, no automorphism can result from swapping a vertex from the first partite set and the second partite set because unless such a swap is done in its entirety (i.e. all the vertices from the first partite set swap places with the vertices in the second partite set), adjacency will be lost. A swap can be done in entirety only if m=n which is not the case we are considering. Hence no further automorphisms can result. Moreover by the multiplication rule it is simple to observe that the automorphism group would be isomorphic to S_m\times S_n.

In the case of m=n, we first pair off the vertices in the two partite sets against each other. This is also an automorphism, say f. Now for each of the m!m! ways of permuting vertices within partite sets, an additional automorphism arises. It is obtained in this fashion: After permuting the vertices within the partite sets by the particular way we swap each vertex with its pair in the other partite set. Clearly this yields 2m!m! automorphisms and furthermore no more are possible. Since every element of A(K_{m,m}) can be written as a unique product of an automorphism collection of the type covered in counting the first m!^2 ways (which is not hard to see is a normal subgroup, being of index 2) and of the subgroup \{Id,f\} so we see that the automorphism group is S_m\times S_m\rtimes \mathbb{Z}_2.

The next graph we take up is the cycle graph C_n. Firstly note that any automorphism can be obtained in this way: A given vertex v may be mapped to any of the n vertices available (including itself). As soon as that is done, an adjacent vertex to v has only two choices left: it can either be in the counter clockwise direction to v or in the clockwise direction to v. Once that choice is also made, no other choices are required. Hence we get 2n automorphisms this way and there can be no others. Also, it is clear that two kinds of automorphisms suffice to generate this group: rotation, and swapping the notion of clockwise and counter clockwise (assuming we draw the cycle graph as equally spaced points on the unit circle; there is no loss of generality in doing that). But both these automorphisms also generate the dihedral group D_n which also has 2n elements. It follows that A(C_n)=D_n.

The final graph we take up is the well known Petersen graph. Instead of directly considering what possible functions are there in its automorphism group (although such an approach is possible) we approach the problem through the concept of line graphs.

Definition: A line graph L(G) of a graph G is the graph whose vertices are in one to one correspondence with the edges of G, two vertices of L(G) being adjacent if and only if the corresponding edges of G are adjacent.

Lemma 1: L(K_5) is the complement of the Petersen graph.
Proof: It is clear that if the vertices of K_5 are labelled 1,2,\ldots,5 then its 10 edges are the {5 \choose 2} 2-subsets of \{1,\cdots,5\}. The line graph L(K_5) thus has 10 vertices, labeled by these 10 2-subsets \{i,j\}. Two vertices \{i,j\}, \{k,\ell\} are adjacent in L(K_5) iff the two 2-subsets have a nontrivial overlap. The complement of L(K_5) is the graph with the same 10 vertices, and with two vertices being adjacent iff the corresponding two 2-subsets are disjoint. But this is the very definition of the Petersen graph.\Box

Lemma 2: A(G) is equal to A(\bar{G}).
Proof: If \sigma\in A(G) then for any two vertices x,y we have xy\in E(G)\Leftrightarrow \sigma(x)\sigma(y)\in E(G), i.e. xy\not\in E(G)\Leftrightarrow \sigma(x)\sigma(y)\not\in E(G), i.e. xy\in E(\bar{G})\Leftrightarrow \sigma(x)\sigma(y)\in E(\bar{G}) so that \sigma\in A(\bar{G}). The reverse implication follows by replacing G by \bar{G}. \Box

Theorem 3: The automorphism group of the Petersen graph is S_5.
Proof: In view of Lemma 1 and 2 it suffices to find out A(L(K_5)) for the automorphism group of the Petersen graph is going to be the same. We let K_5 have the vertex set \{1,\cdots,5\} in the sequel.

Take any automorphism f of K_5. If we have two edges ab,cd\in K_5 with f(a)f(b)=f(c)f(d), then either of two cases arise. Either f(a)=f(c) or not. If f(a)=f(c) then obviously f(b)=f(d) and so by injectivity of f we have ab=cd. If f(a)\not=f(c) then it must be that f(a)=f(d). This means that f(b)=f(c) and again by injectivity we have ab=cd. What this means is that the function induced by f on E(K_5) in the natural way is injective. It is also surjective as for any xy\in E(K_5) clearly f\{f^{-1}xf^{-1}y\}=xy. Finally, this function is an automorphism since \{xy,xz\}\in E(L(K_5)) clearly implies and is implied by \{f(x)f(y),f(x)f(z)\}\in E(L(K_5)) as there is a common vertex. As our definition of the induced function is obtained in a definite way we have shown that every automorphism of K_5 induces a unique automorphism of L(K_5). Moreover, it is easy to see that if f_1,f_2 are two automorphisms then the automorphism induced by f_1\circ f_2 is the same as the automorphism induced by f_1 composed by f_2.

We now show that given an automorphism of L(K_5) we can obtain an automorphism of K_5 which induces it in the natural way. Let \pi\in A(L(K_5)). It is easy to see that the 4-cliques of L(K_5) originate from the stars K_{1,4} of K_5. So L(K_5) has exactly 5 4-cliques, say C_1,\cdots ,C_5 where C_i contains 4 vertices corresponding to the 4 edges in K_5 that are incident to a vertex i in K_5. Since \pi is an automorphism it sends 4-cliques to 4-cliques. Also, \pi must send two different 4-cliques C_i,C_j with i\ne j to different 4-cliques, because if it sends them to the same 4-clique then a collection of at least 5 vertices is mapped to a collection of 4 vertices, a contradiction to the injectivity of \pi. So \pi induces a permutation of the C_i‘s.

Now suppose \pi and \pi' are two different automorphisms in A(L(K_5)). Then they differ on at least vertex in L(K_5), say on the vertex ij\in E(K_5). Now given any vertex xy in L(K_5) consider the intersection of the 4-cliques C_x and C_y. If pq is some vertex in C_x\cap C_y then pq as an edge in K_5 is part of stars with centers x and y, i.e. pq=xy. Hence the intersection contains only the vertex xy. Every vertex of L(K_5) arises in this way. So if \pi(ij)\ne \pi'(ij), then either \pi (C_i)\ne \pi'(C_i) or \pi(C_j)\ne \pi'(C_j) for otherwise \pi (C_i\cap C_j)=\pi'(C_i\cap C_j).

Hence every automorphism of L(K_5) induces a unique permutation of the C_i‘s. Moreover distinct automorphisms induce distinct permutations so that the automorphisms and the permutations can be put in one-one correspondence. Consider an automorphism f of the vertices of K_5 where f(i)=j if C_i\to C_j in the permutation corresponding to \pi. Now a vertex \{x,y\}\in E(K_5) of L(K_5). This is also the intersection of the 4-cliques C_x and C_y and so \pi(\{x,y\})=\pi(C_x\cap C_y)=\pi(C_x)\cap\pi (C_y)=C_{f(x)}\cap C_{f(y)}=f(x)f(y). This shows that f induces \pi as an automorphism.

Hence we have shown that A(K_5)\equiv A(L(K_5)). So the Petersen graph has the automorphism group S_5. \Box

Leave a comment

Filed under Combinatorics, Graph theory, Group Theory

Mathematics and brainvita

In this post, I will analyse the single player game brainvita (also known as peg solitaire). The game is described by having a board with holes having the pattern:

Initially all holes except the central one are filled with pegs. The player may jump a peg horizontally or vertically over another peg into an empty hole. The jumped peg is then removed by the player. The objective is to finish the game with just one peg left.

The aim of this post is to show that there are (at most) 5 ending board positions in brainvita. In other words, assuming the player wins, his last peg left on the board can be only among 5 fixed board positions.

The proof uses abstract algebra. Suppose that the pegs correspond to coordinates in the plane in the natural fashion described below: The central peg is at (0,0), and all others have integer coordinates as well. For convenience adjacent pegs may be taken a unit distance apart. Some coordinates are described below:

Now consider all possible ways of placing pegs on the board. There are 2^{33} of them. Many of these ways are impossible to reach in a valid game. Regardless of this fact, let W be the set of all possible ways and for a given w\in W, let C_w denote the set of all the coordinates where there is a peg in w. Now define a function f:W\to \frac{\mathbb{Z}_2[x]}{(x^2+x+1)} by f(w)=\sum_{(m,n)\in C_w}x^{m+n} where \frac{\mathbb{Z}_2[x]}{(x^2+x+1)} is the finite field on 4 elements. (If there are no pegs on the board we define f to be 0.)

The function f has the property that for any given way w of placing pegs on the board, if a legal move is performed to yield another way w' then f(w)=f(w'). To see this, suppose that in the way w there were pegs at (m,n) and (m+1,n) positions and the hole at the (m+2,n) position was empty. A legal move may be performed by jumping the (m,n) peg over the (m+1,n) peg to yield a way w' where the holes at (m,n) and (m+1,n) are empty and there is a peg at (m+2,n). Clearly the only change between f(w) and f(w') occurs on account of these three coordinates and f(w)-f(w')=x^{m+n}+x^{m+1+n}-x^{m+2+n}. But as x^{m+n}+x^{m+1+n}-x^{m+2+n}=x^{m+n}(1+x-x^2)=x^{m+n}.0=0 in \frac{\mathbb{Z}_2[x]}{(x^2+x+1)} so f(w)-f(w')=0 following which f(w)=f(w'). The same holds if instead of moving a peg to the right we had moved it above, below or to the left. Moreover all this is regardless of the fact whether the way w can actually be reached by a valid sequence of moves in the game.

Now we define another function g:W\to \frac{\mathbb{Z}_2[x]}{(x^2+x+1)} by g(w)=\sum_{(m,n)\in C_w}x^{m-n} (the no pegs way means g is 0) and by a similar piece of reasoning conclude that g also has the same property enjoyed by f. It is also easy to observe that both f and g are onto functions and so f(W)\times g(W) has 16 elements.

Next we partition the set W as follows. Any w\in W corresponds with a pair (f(w),g(w)) and so with exactly one among the 16 elements of f(W)\times g(W). In this way all the 2^{33} elements of W are partitioned in 16 parts. Further each part has the curious property that for any w in that part, any sequence of legal moves will never change the value of (f(w),g(w)) and so the resultant w' will also be in the same part. It is now obvious that among these 16 parts one and only part consists of all the w's which can actually occur in a game of brainvita. All other w's correspond to ways which can never occur in a legal game.

What is the value of (f(w),g(w)) that this one special part corresponds to? The initial board (assuming it corresponds with i\in W) is set up so that f(i)=g(i)=1. So any way w that arises during the game also has f(w)=g(w)=1. Assume now that the game has been successfully finished with a single peg at (m,n), and that this corresponds to the way w^*. We therefore have f(w^*)=g(w^*)=1. But by definition f(w^*)=x^{m+n} and g(w^*)=x^{m-n}. So we must have x^{m+n}=1. Since the multiplicative group \{1,x,x^2\} is cyclic of order 3 so as a consequence of Lagrange’s theorem we must have 3|(m+n). Similarly we have 3|(m-n). Combining these facts together we have 3 dividing both m and n. The only five coordinates which satisfy this condition are (-3,0),(0,-3),(3,0),(0,3) and (0,0). So the player has to end up at one among these 5 positions. This completes the proof.

It should be noted that we have only found an upper bound on the number of positions the final peg can end up at. However by explicitly giving a sequence of moves for each of these 5 coordinates it is possible to show that all these endings are indeed possible.

1 Comment

Filed under Algebra, Game Theory