Category Archives: Real Analysis

Generating Functions

In this post we will discuss generating functions. Generating functions are a powerful tool which can be used to solve many problems both in combinatorics and also in analysis.

Let R^\infty denote the set of all sequences of real numbers. For all i\ge 1 we let x^i denote the sequence which has value 1 for its (i+1)th term and all its other terms are zero. The symbol 1 stands for the sequence (1,0,0,\cdots). Also for any real number \alpha we define the product of \alpha and a sequence \alpha(a_0,a_1,\cdots) as (\alpha a_0,\alpha a_1,\cdots). We let two sequences {a_n} and {b_n} be equal if a_i=b_i for all i. We define the sum of {a_n} and {b_n} by the sequence {a_n+b_n} and the product by the sequence {c_n} where c_i=\sum_{j=1}^ia_jb_{i-j}. Clearly the sequence (a_0,a_1,\cdots ) is equal to the sequence a_0+a_1x+a_2x^2\cdots which we will also denote simply by \sum a_ix^i. Note that a_0 here stands for the sequence obtained as a product of a_0 and 1, i.e. a_0(1,0,0\cdots). Algebraically speaking \mathbb {R}^\infty, equipped with these operations, is an \mathbb {R}-algebra.

More importantly there is an analytic viewpoint of \mathbb {R}^\infty also. Readers who are familiar with the theory of power series can consider the elements of \mathbb {R}^\infty to be power series, i.e. each element is basically a function with its domain an open interval (or simply \{0\}). By standard theorems in analysis, if \sum a_ix^i and \sum b_ix^i both converge for |x|0 then for all such x, \sum a_ix^i=\sum b_ix^i if and only if a_i=b_i for all $i$. Hence the approach of considering \sum a_ix^i as a purely formal object may be considered equivalent to considering it as a power series.

However, we will soon see as to why convergence issues do not play any role in our context as long as the power series converges for at least one non zero x and there is more value in interpreting \sum a_ix^i as simply an element of \mathbb {R}^\infty.

Let (a_n) be a real sequence such that the power series \sum a_ix^i converges for at least one non zero x. Then the function f which sends such an x to the power series \sum a_ix^i is called the generating function of the sequence. We frequently abuse notation and refer to the value of the generating function at some non-zero point x as the generating function (which is akin to referring the function f as f(x)).

Let (a_n) be the constant sequence of one’s. It is well known that for any real number x, if |x|<1 the series 1+x+x^2+\cdots converges to 1/(1-x). So the generating function of (a_n) is f where f(x)=1/(1-x) for all x at which the series converges.

The reason for requiring convergence at a non zero point is as follows. As soon as we have convergence at a non zero x, by a theorem of analysis it follows that there is convergence in an open interval around 0. Now, f has a unique power series expansion in that interval and so we are guaranteed that there is a one-one correspondence between the purely discrete object (1,1,\cdots) thought of as an element of \mathbb{R}^\infty and the generating function f. This can be exploited in the reverse direction, for if we wish to recover our sequence from the function f, then since f is defined by f(x)=1/(1-x)=1+x+x^2+\cdots x^n there is absolutely no ambiguity, and we cannot get back any other sequence. In fact, we may say that our sequence has been encoded within the definition of f, as a closed form expression 1/(1-x) .

If convergence was only given at 0, then such a one-one correspondence is not possible, since any closed form analytic function f, which is a_0 at 0 would become the generating function to the sequence (a_0,a_1,a_2,\cdots). So for any sequence (a_0,a_1,\cdots) we will consider its generating function to be defined by a_0+a_1x+a_2x^2\cdots as long as there is a non zero $x$ for which there is convergence, and once we have done that will not bother about any convergence issues at all.

The reader may be wondering what was the point of giving an algebraic approach initially, for a generating function really seems to have to do more with a power series. Furthermore in the notation of the first paragraph when we were considering a sequence as an element of \mathbb{R}^\infty we gave found that in our algebra (a_0,a_1,\cdots) is nothing but a_0+a_1x+a_2x^2+\cdots. This was not a power series but simply our notation for a sequence. It may appear confusing to have the same notation for two different objects, but it has been deliberately adopted for a very good reason. During our computations, we often manipulate our series so that we may no longer be sure whether convergence of a given series at a non-zero point is guaranteed. This poses a mathematical problem of the validity of our operations. However our ambiguous notation comes to our rescue at that instant, for what we really are doing at that point, without explicitly saying so, is not dealing with the series a_0+a_1x+a_2x^2+\cdots. Instead we are manipulating the sequence a_0+a_1x+a_2x^2+\cdots=(a_0,a_1,a_2,\cdots) with which there are absolutely no concepts of convergence attached. Of course if we need closed form expressions or some other analytic properties have to be established we need convergence so that one can use the one-one correspondence between the sequence and the generating function and dive in the world of analysis. In this way, a constant interplay between the discrete world of sequences and the analytic world of sequences brings out the real power of generating functions.


1 Comment

Filed under Algebra, Combinatorics, Real Analysis

The algebraic numbers are countable

By definition an algebraic number is a complex number which satisfies some polynomial a_0+a_1x+\cdots a_nx^n\in\mathbb{Z}[x]. Every rational number m/n is algebraic as m/n satisfies nx-m. Further irrational numbers can also be algebraic, \sqrt{n},n\in\mathbb{N} clearly satisfies x^2-n. Similarly purely imaginary numbers can be algebraic: i obviously satisfies x^2+1.

Just as algebraic numbers are called so because they lie within the range of the “classical algebra” (by which we understand manipulation of the integers) to be “described entirely”, a non-algebraic number is called transcendental because one needs to transcend this “algebra” to describe it. There are only a few mathematically significant numbers which are known to be transcendental. For 2500 years a debate ranged as to \pi was transcendental or not, before the question was settled in the affirmative by Lindemann in 1882. Similarly e is transcendental. (We do not know whether \pi+e is transcendental or not).

Surprisingly, even though the transcendental numbers seem fewer then the algebraic numbers they actually exist in greater abundance. In fact, almost every real number is transcendental. This is because the set of algebraic numbers is countable and hence has Lebesgue measure zero. Our aim in this post is to formally prove the countable nature of the algebraic numbers.

Theorem: Let A be the set of all algebraic numbers. Then A is countable.

Proof: We define a height h of a polynomial a_0+a_1x+\cdots a_nx^n\in\mathbb{Z}[x] as h=n+\sum_{i=0}^n|a_i|. Clearly for a fixed h there are only finitely many choices for n and a_i and so there are only finitely many polynomials of fixed height.

Now we make a list of all the algebraic numbers in the following way: Consider any height h\in\mathbb{N} and for all the finitely many polynomials of this height, write down all the finitely many roots of these polynomials in the list. Keep repeating for all possible heights. It is clear that no algebraic number will be missed out in this list. This proves that A is countable.\Box

Leave a comment

Filed under Algebra, Real Analysis

Real numbers are uncountable

One of the slickest proofs of all time is using Georg Cantor‘s 1891 diagonal argument in proving that the real numbers \mathbb{R}  constitute an uncountable set, that is, they cannot be put in a bijective correspondence with \mathbb{N}. Strictly speaking, Cantor original paper only proved that the set of all 0,1-sequences is uncountable. However since it is easy to argue that the set of all such sequences and \mathbb{R} have the same cardinality, indirectly we get a proof for the uncountable nature of the reals. Moreover, the argument given by Cantor in his proof can be easily modified to give a direct proof for the uncountability of the reals. This is what we do presently.

Theorem: \mathbb{R} is uncountable.

Proof: We will accept that every real number x has a decimal expansion, x=N.x_1x_2\cdots and it is uniquely determined by x if one agrees never to terminate the expansion with an infinite string of 9‘s. The proof is based on contradiction, and so we suppose that \mathbb{R} is not uncountable. Then it must be that \mathbb{R} is countable and we can make a list of all the elements of \mathbb{R} along with their decimal expansion as follows:


We now consider the real number 0.y_1y_2y_3\cdots which is defined by y_i=1 if x_{ii}\ne 1 and y_i=2 otherwise. It is clear that this number differs from every x_i in the list in at least the i‘th position. Hence this list is incomplete, a contradiction.\Box

Leave a comment

Filed under Real Analysis, Set theory

Niven’s proof for the irrationality of π

In this post I will present a beautiful proof owing to Ivan Niven on the irrationality of \pi. An analytical definition of \pi is that \pi is twice the smallest positive x for which \cos x equals 0. (This definition is better then the usual circumference/diameter  one since it is not dependent on Euclidean axioms).

Theorem (Niven): \pi is irrational.

Proof: Let, by way of contradiction \pi=a/b with a,b\in\mathbb{N}. Now let 

f(x)=\frac{x^n(a-bx)^n}{n!} and 


where we will specify n later. Now n!f(x)=a_0x^n+a_1x^{n+1}+\cdots a_nx^{2n} with a_i\in\mathbb{Z} and so f(0)=0. Not only this, f^{(j)}(0)=0 for 1\le j\le n-1. For n\le j\le 2n we have n!f^{(j)}(x)=n!b_0+b_1x+\cdots b_nx^n with b_i\in\mathbb{Z} and so f^{(j)}(0) is integral in this case as well. Also note that f(x)=f(\pi-x) and so for any j\ge 0,j\in \mathbb{Z} we have f^{(j)}(x)=(-1)^jf^{(j)}(\pi-x) following which f^{(j)}(\pi)=(-1)^jf^{(j)}(0) i.e. f^{(j)}(\pi) is also integral.

Now it follows by elementary calculus that \frac{d}{dx}(F'(x)\sin x-F(x)\cos x)=(F''(x)+F(x))\sin x=f(x)\sin x and so \int_0^\pi f(x)\sin x dx=(F'(x)\sin x-F(x)\cos x)_0^\pi=F(\pi)+F(0). Now F(\pi)+F(0) is integral because f^{(j)}(\pi) and f^{(j)}(0) are both integers for any j. It is also easy to see that for 0<x<\pi we have 0<f(x)<\frac{a^n\pi^n}{n!} and hence 0<f(x)\sin x<\frac{a^n\pi^n}{n!}. But now  0<\int_0^\pi f(x)\sin x dx\le\frac{\pi(a\pi)^n}{n!} and as n! grows faster then any exponential so this upper bound for the integral can be made arbitrarily small. Hence the value of the integral lies between 0 and 1. This is the contradiction which proves the result.\Box

One final word. Even though the proof is credited to Ivan Niven, it is very much possible that the idea was given by Hermite much earlier and all did Niven was to tweak Hermite’s idea to give this proof.

Leave a comment

Filed under Real Analysis

The Cantor Set

This is a post regarding the basic properties of the Cantor set.

We start with the definition: The Cantor set is obtained by first constructing a sequence (C_n) of closed sets and then taking the intersection of the sets in this sequence. The sequence is constructed as follows:

1. Start with [0,1] and remove the middle third, i.e. the interval (\frac{1}{3},\frac{2}{3}) to obtain the set C_1=[0,\frac{1}{3}]\cup[\frac{2}{3},1]. So C_1 contains 2 disjoint closed intervals each of length \frac{1}{3}.

2. Next remove the middle third of each of these two intervals leaving C_2=[0,\frac{1}{9}]\cup[\frac{2}{9},\frac{1}{3}]\cup[\frac{2}{3},\frac{7}{9}]\cup[\frac{8}{9},1] consisting of 2^2 disjoint closed intervals each of length \frac{1}{3^2}.

3. Assuming C_n has been constructed and consists of 2^n disjoint closed intervals each of length \frac{1}{3^n}, remove the middle thirds of all these intervals to obtain C_{n+1} consisting of 2^{n+1} disjoint closed intervals each of length \frac{1}{3^{n+1}}.

By induction we have our sequence (C_n).

Now, we define the Cantor set C as C=\cap_{n=1}^\infty C_n.

We now describe the properties of the Cantor set:

1. In the usual topology on \mathbb{R} the Cantor set is closed (being an intersection of closed sets). Moreover since it is bounded so by the Heine Borel theorem it is compact. What’s extraordinary is that every compact metric space is a continuous image of the Cantor set! The proof may be found here.

2. The Cantor set is uncountable. We prove this by considering the ternary expansion of all numbers in [0,1]. All such numbers may have the form 0.a_1a_2\cdots with a_i\in\{0,1,2\} (including 1=0.22\cdots). We claim that x\in C iff x has a ternary expansion of the form 0.a_1a_2\cdots with a_i\in\{0,2\}.

To prove this consider the construction of C through the ternary lens: In ternary the construction of C_1 involves removing all numbers in (0.011\cdots,0.2) from [0.00\cdots,0.22\cdots]. This shows that in C_1 every number has a ternary expansion of the form 0.a_1a_2\cdots with a_1\in\{0,2\} and a_i\in\{0,1,2\}\forall i>1. Conversely every number of this form is in C_1. Likewise the construction of C_2 involves removing all numbers in (0.0011\cdots,0.02) and in (0.2011\cdots,0.22) from C_1. This shows that in C_2 every number has a ternary expansion of the form 0.a_1a_2\cdots with a_1,a_2\in\{0,2\} and a_i\in\{0,1,2\}\forall i>2. Again conversely every number of this form is in C_2. Continuing in this way we can conclude that x\in C_n iff x=0.a_1\cdots a_na_{n+1}a_{n+2}\cdots with a_1\cdots a_n\in\{0,2\} and a_i\in\{0,1,2\}\forall i>n. Hence by definition of C our claim is established.

Now assume that C is countable and has been enumerated in the list described below:


Here each d_i^{(j)}\in \{0,2\}.

Now let x=0.a_1a_2a_3\cdots where a_i=0 if d_i^{(i)}=2 and a_i=2 otherwise. Then x\notin C despite being of the requisite ternary form. Hence C is uncountable.

3. The Cantor set has Lebesgue measure zero. We define a set A to be of Lebesgue measure zero if \forall \epsilon>0\exists a sequence of intervals (I_n) such that A\subset\cup_{n=1}^\infty I_n and \sum_{n=1}^\infty l(I_n)<\epsilon. Here if the end points of I_n are a_n and b_n (with a_n\le b_n) then l(I_n)=b_n-a_n. (The proof that this definition is consistent with the Lebesgue measure may be found in this book.) Now given \epsilon>0 choose n so that (\frac{2}{3})^n<\epsilon. Since C\subset C_n and the total length of C_n is (\frac{2}{3})^n, so it is clear that C is a set of Lebesgue measure zero. Together with point 2, we see that this establishes the existence of uncountable sets of zero measure!

Leave a comment

Filed under Measure Theory, Real Analysis, Set theory