Theory - Moment generating functions

In order to show why the CLT is true, we introduce the technique of moment generating functions. Recall that the $k$th moment of a distribution is simply $E[X^k]$. Write $\mu_k = E[X^k]$ for this value.

Recall the power series for $e^x$:

$$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$$

The function $e^x$ has the property of being a bijective differentiable map from $(-\infty, \infty)$ to $(0, \infty)$, and it converts addition to multiplication: $e^{x+y} = e^x e^y$.

Given a random variable $X$, we can compose $e^x$ with $X$ to obtain a new variable $e^{tX}$. Define the moment generating function of $X$ as follows:

$$M_X(t) = E\left[e^{tX}\right]$$

This is a function of $t$ and returns values in $[0, \infty]$. It is called the moment generating function because it contains the data of all the higher moments $E[X^k]$. They can be extracted by taking derivatives and evaluating at zero:

$$M_X^{(k)}(0) = E[X^k]$$

It is reasonable to consider $M_X(t)$ as a formal power series in the variable $t$ that has the higher moments for coefficients:

$$M_X(t) = \sum_{k=0}^{\infty} \frac{E[X^k]}{k!}\, t^k$$
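As a quick numerical sketch (our own illustration, not from the text), the identity $M_X^{(k)}(0) = E[X^k]$ can be checked for a fair six-sided die, whose MGF is a finite sum:

```python
import math

# MGF of a fair six-sided die (an illustrative choice): the average of
# e^{t*x} over the equally likely outcomes x = 1, ..., 6.
def mgf_die(t):
    return sum(math.exp(t * x) for x in range(1, 7)) / 6

# Approximate the k-th derivative of f at 0 by a k-th order central
# difference; applied to the MGF this should recover the moment E[X^k].
def kth_derivative_at_zero(f, k, h=1e-3):
    return sum((-1) ** j * math.comb(k, j) * f((k / 2 - j) * h)
               for j in range(k + 1)) / h ** k

mean = kth_derivative_at_zero(mgf_die, 1)            # E[X]   should be 3.5
second_moment = kth_derivative_at_zero(mgf_die, 2)   # E[X^2] should be 91/6
```

The die and the finite-difference step size are our choices; any variable with a convergent MGF would work the same way.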

Example - Moment generating function of a standard normal

We compute $M_Z(t)$ where $Z \sim N(0, 1)$. From the formula for the expected value of a function of a random variable, we have:

$$M_Z(t) = E\left[e^{tZ}\right] = \int_{-\infty}^{\infty} e^{tz}\, \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz$$

Complete the square in the exponent: $tz - \frac{z^2}{2} = -\frac{1}{2}(z - t)^2 + \frac{t^2}{2}$. Thus:

$$M_Z(t) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(z - t)^2}\, e^{t^2/2}\, dz$$

The last factor can be taken outside the integral:

$$M_Z(t) = e^{t^2/2} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(z - t)^2}\, dz = e^{t^2/2},$$

since the remaining integrand is the density of a $N(t, 1)$ variable and so integrates to $1$.
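The closed form $M_Z(t) = e^{t^2/2}$ can be sanity-checked by numerical integration (a sketch; the truncation range and step count below are our own choices):

```python
import math

def mgf_standard_normal(t, lo=-12.0, hi=12.0, n=20000):
    """Midpoint-rule estimate of E[e^{tZ}] = ∫ e^{tz} φ(z) dz for Z ~ N(0,1).
    The tails beyond |z| = 12 are negligible for moderate t."""
    dz = (hi - lo) / n
    total = 0.0
    for i in range(n):
        z = lo + (i + 0.5) * dz
        total += math.exp(t * z - z * z / 2) / math.sqrt(2 * math.pi) * dz
    return total
```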

Exercise - Moment generating function of an exponential variable

Compute $M_X(t)$ for $X \sim \mathrm{Exp}(\lambda)$.
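One way to check your answer numerically (the rate $\lambda$, truncation point, and step count below are our own choices; the integral only makes sense for $t < \lambda$):

```python
import math

def mgf_exponential_numeric(t, lam, hi=100.0, n=100000):
    """Midpoint-rule estimate of E[e^{tX}] = ∫_0^∞ e^{tx} · λe^{-λx} dx
    for X ~ Exp(λ), truncated at x = hi.  Diverges for t >= λ."""
    dx = hi / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += lam * math.exp((t - lam) * x) * dx
    return total
```

Compare this against your closed-form answer at a few values of $t$ below $\lambda$.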

Moment generating functions have the remarkable property of encoding the distribution itself:

Distributions determined by MGFs

Assume $M_X(t)$ and $M_Y(t)$ both converge on an interval around $0$. If $P(X \le a) = P(Y \le a)$ for all $a$, then $M_X(t) = M_Y(t)$.

Moreover, if $M_X(t) = M_Y(t)$ for any interval of values $t$, then $P(X \le a) = P(Y \le a)$ for all $a$, and $X$ and $Y$ have the same distribution.

Be careful about moments vs. generating functions!

Sometimes the moments all exist, but they grow so fast that the moment generating function does not converge. For example, the log-normal distribution $e^Z$ for $Z \sim N(0, 1)$ has this property: every moment $E[(e^Z)^k] = e^{k^2/2}$ is finite, yet $E[e^{t e^Z}]$ diverges for every $t > 0$.

The fact above does not apply when this happens.
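A numerical sketch of this pathology (the integration windows and step counts are our own choices): for $X = e^Z$, each moment is a convergent integral, while truncated estimates of $E[e^{tX}]$ blow up as the truncation window widens.

```python
import math

SQRT_2PI = math.sqrt(2 * math.pi)

def lognormal_moment(k, B=12.0, n=20000):
    """E[X^k] = E[e^{kZ}] = ∫ e^{kz} φ(z) dz for X = e^Z, Z ~ N(0,1).
    Finite for every k; the exact value is e^{k^2/2}."""
    dz = 2 * B / n
    total = 0.0
    for i in range(n):
        z = -B + (i + 0.5) * dz
        total += math.exp(k * z - z * z / 2) / SQRT_2PI * dz
    return total

def truncated_lognormal_mgf(t, B, n=20000):
    """∫_{-B}^{B} e^{t·e^z} φ(z) dz — a truncation of E[e^{tX}].  As B grows
    this increases without bound for t > 0: the MGF does not converge."""
    dz = 2 * B / n
    total = 0.0
    for i in range(n):
        z = -B + (i + 0.5) * dz
        total += math.exp(t * math.exp(z) - z * z / 2) / SQRT_2PI * dz
    return total

estimates = [truncated_lognormal_mgf(1.0, B) for B in (3.0, 4.5, 6.0)]
```

The widening truncations produce astronomically larger estimates, while the moment integrals stay pinned at $e^{k^2/2}$.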

When moment generating functions approximate each other, their corresponding distributions also approximate each other:

Distributions converge when MGFs converge

Suppose that $M_{X_n}(t) \to M_X(t)$ for all $t$ on some interval $(-\epsilon, \epsilon)$. (In particular, assume that $M_X(t)$ converges on some such interval.) Then for any $a$, we have:

$$\lim_{n \to \infty} P(X_n \le a) = P(X \le a)$$
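A sketch of this fact in action (the coin-flip example is our own choice): for $Z_n$ the standardized sum of $n$ fair $\pm 1$ flips, independence gives $M_{Z_n}(t) = \cosh(t/\sqrt{n})^n$, which approaches the standard normal MGF $e^{t^2/2}$.

```python
import math

# Each flip Y_i is ±1 with probability 1/2, so E[e^{sY_i}] = cosh(s) and,
# by independence, the standardized sum Z_n = (Y_1 + ... + Y_n)/√n has
# MGF cosh(t/√n)^n.
def mgf_standardized_coin_sum(t, n):
    return math.cosh(t / math.sqrt(n)) ** n

# The gap to the standard normal MGF e^{t^2/2} shrinks as n grows, so by
# the fact above P(Z_n <= a) approaches P(Z <= a) as well.
gaps = [abs(mgf_standardized_coin_sum(1.0, n) - math.exp(0.5))
        for n in (10, 100, 1000)]
```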

Exercise - Using an MGF

Suppose $X$ is nonnegative and $M_X(t) = \frac{1}{1 - t}$ when $t < 1$ and $M_X(t) = \infty$ when $t \ge 1$. Find a bound on $P(X \ge a)$ using (a) Markov’s Inequality, and (b) Chebyshev’s Inequality.
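Whatever the particular numbers in the exercise, the two bounds are easy to compare numerically. As an illustration (the distribution is our own choice, not necessarily the exercise’s), take $X \sim \mathrm{Exp}(1)$, which has mean $1$, variance $1$, and true tail $P(X \ge a) = e^{-a}$:

```python
import math

def markov_bound(a, mean=1.0):
    # Markov: P(X >= a) <= E[X]/a for any nonnegative X.
    return mean / a

def chebyshev_bound(a, mean=1.0, var=1.0):
    # Chebyshev: for a > mean,
    # P(X >= a) <= P(|X - mean| >= a - mean) <= Var(X)/(a - mean)^2.
    return var / (a - mean) ** 2

a = 10.0
bounds = (markov_bound(a), chebyshev_bound(a), math.exp(-a))  # exact tail last
```

For $a = 10$ Chebyshev ($1/81$) beats Markov ($1/10$), and both sit far above the true tail $e^{-10}$.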

Theory - Proof of CLT

The main role of moment generating functions in the proof of the CLT is to convert the sum $X_1 + \cdots + X_n$ into a product by putting the sum into an exponent: $e^{X_1 + \cdots + X_n} = e^{X_1} \cdots e^{X_n}$.

We have $S_n = X_1 + \cdots + X_n$, and recall $E[X_i] = \mu$ and $\mathrm{Var}(X_i) = \sigma^2$, so the standardized variables $Y_i = \frac{X_i - \mu}{\sigma}$ satisfy $E[Y_i] = 0$ and $E[Y_i^2] = 1$. First, compute the MGF of $Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}} = \frac{1}{\sqrt{n}}(Y_1 + \cdots + Y_n)$. We have:

$$M_{Z_n}(t) = E\left[e^{\frac{t}{\sqrt{n}}(Y_1 + \cdots + Y_n)}\right]$$

Exchange the sum in the exponent for a product of exponentials:

$$M_{Z_n}(t) = E\left[e^{tY_1/\sqrt{n}} \cdots e^{tY_n/\sqrt{n}}\right]$$

Now since the $X_i$ are independent, the factors $e^{tY_i/\sqrt{n}}$ are also independent of each other. Use the product rule $E[UV] = E[U]\,E[V]$ when $U, V$ are independent to obtain:

$$M_{Z_n}(t) = E\left[e^{tY_1/\sqrt{n}}\right] \cdots E\left[e^{tY_n/\sqrt{n}}\right]$$

Now expand the exponential in its Taylor series and use linearity of expectation:

$$E\left[e^{tY_i/\sqrt{n}}\right] = 1 + \frac{t}{\sqrt{n}}\, E[Y_i] + \frac{t^2}{2n}\, E[Y_i^2] + \cdots = 1 + \frac{t^2}{2n} + \cdots \approx 1 + \frac{t^2}{2n}$$

We don’t give a complete argument for the final approximation, but a few remarks are worthwhile. For fixed $t$, and assuming the moments $E[Y_i^k]$ have adequately bounded growth in $k$, the series in each factor converges. Using Taylor’s theorem we could write an error term as a shrinking function of $n$. The real trick of the analysis is to show that, in the product of $n$ factors, these error terms shrink fast enough that the limit value is not affected.

In any case, the factors of the last line are independent of $i$, so we have:

$$M_{Z_n}(t) \approx \left(1 + \frac{t^2}{2n}\right)^n \longrightarrow e^{t^2/2} \quad \text{as } n \to \infty$$

But $e^{t^2/2}$ is the MGF of $Z \sim N(0, 1)$. Therefore $M_{Z_n}(t) \to M_Z(t)$, so $P(Z_n \le a) \to P(Z \le a)$ for every $a$.
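The two limits in the proof can be checked numerically (a Monte Carlo sketch; the choice of uniform summands, sample sizes, and seed are our own):

```python
import math
import random

# The elementary limit used in the last step: (1 + t^2/2n)^n -> e^{t^2/2}.
product_gap = abs((1 + 0.5 ** 2 / (2 * 1000)) ** 1000 - math.exp(0.5 ** 2 / 2))

def standard_normal_cdf(a):
    return 0.5 * (1 + math.erf(a / math.sqrt(2)))

def estimate_cdf_of_standardized_sum(a, n=30, trials=20000, seed=0):
    """Monte Carlo estimate of P(Z_n <= a) for Z_n the standardized sum of
    n i.i.d. Uniform(0,1) variables (mu = 1/2, sigma^2 = 1/12)."""
    rng = random.Random(seed)
    mu, sigma = 0.5, math.sqrt(1 / 12)
    hits = 0
    for _ in range(trials):
        s = sum(rng.random() for _ in range(n))
        if (s - n * mu) / (sigma * math.sqrt(n)) <= a:
            hits += 1
    return hits / trials

cdf_gap = abs(estimate_cdf_of_standardized_sum(1.0) - standard_normal_cdf(1.0))
```

Even for $n = 30$ summands, the simulated distribution of $Z_n$ is already close to $N(0, 1)$.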