3.5 The Central Limit Theorem

As a final topic in probability, we briefly discuss why the Normal distribution is so important and so widely known. The reason is the Central Limit Theorem, perhaps the most important theorem in probability, which has far-reaching consequences in statistics.

Let’s first state the theorem. Suppose you have random variables \(X_1,\dots, X_n\) with the following properties:

they are all independent of each other;
they all have the same mean \(\mu\);
they all have the same variance \(\sigma^2\).

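Consider the random variable \[ \bar{X}_n= \frac{X_1+\cdots+X_n}{n}. \] Then, as \(n\rightarrow +\infty\), \[ \frac{\bar{X}_n-\mu}{\sigma/\sqrt{n}} \rightarrow Z \quad \text{in distribution}, \] where \(Z\) is a standard Normal random variable. That is, for every real number \(z\), \[ \lim_{n\rightarrow + \infty} P\left(\frac{\bar{X}_n-\mu}{\sigma/\sqrt{n}} \leq z\right) = P(Z \leq z). \]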

We can also state the theorem more informally: for large \(n\), the sample mean \(\bar{X}_n\) is approximately distributed as \(Y\), where \(Y\) is a Normal random variable with mean \(\mu\) and variance \(\sigma^2/n\).
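The statement can be checked numerically. The following is a minimal sketch, not part of the theorem itself: it assumes the \(X_i\) are Exponential(1) draws (so \(\mu = 1\) and \(\sigma = 1\)), and the function name `standardized_means`, the sample size, the number of repetitions and the seed are all illustrative choices. If the theorem holds, the standardized sample means should have mean close to 0, standard deviation close to 1, and roughly 95% of them should fall between -1.96 and 1.96.

```python
import math
import random
import statistics


def standardized_means(n, reps, seed=0):
    """Simulate `reps` standardized sample means, each from n Exponential(1) draws."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(1)
    z_values = []
    for _ in range(reps):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        # Standardize exactly as in the theorem: (xbar - mu) / (sigma / sqrt(n))
        z_values.append((xbar - mu) / (sigma / math.sqrt(n)))
    return z_values


z = standardized_means(n=100, reps=2000)
z_mean = statistics.fmean(z)
z_sd = statistics.stdev(z)
within = sum(abs(v) <= 1.96 for v in z) / len(z)

print(f"mean of standardized means: {z_mean:.3f}")
print(f"sd of standardized means:   {z_sd:.3f}")
print(f"fraction within +/- 1.96:   {within:.3f}")
```

The Exponential distribution is strongly skewed, so it is a reasonable test case: the individual draws look nothing like a Normal, yet the standardized means should behave approximately like \(Z\).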

The interpretation of the Central Limit Theorem is as follows: the sample mean \(\bar{X}_n\) of independent random variables with the same mean and variance can be approximated by a Normal distribution if the sample size \(n\) is large. Notice that, apart from the conditions listed above (and, in the classical version of the theorem, that the \(X_i\)’s share a common distribution with finite variance), we made no assumption about the shape of the distribution of the \(X_i\)’s, and yet we were able to deduce the approximate distribution of the sample mean.
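As an illustration of how this approximation might be used, the sketch below estimates \(P(\bar{X}_n \leq 0.45)\) for the mean of \(n = 30\) independent Uniform(0, 1) variables, for which \(\mu = 0.5\) and \(\sigma^2 = 1/12\). The distribution, threshold, sample size, repetition count and seed are assumptions chosen only for this example. The Normal approximation with mean \(\mu\) and standard deviation \(\sigma/\sqrt{n}\) is compared against a direct simulation.

```python
import math
import random
from statistics import NormalDist

n = 30                     # sample size (illustrative)
mu = 0.5                   # mean of Uniform(0, 1)
sigma = math.sqrt(1 / 12)  # standard deviation of Uniform(0, 1)
threshold = 0.45

# Normal approximation: the sample mean is roughly N(mu, sigma^2 / n)
clt_prob = NormalDist(mu=mu, sigma=sigma / math.sqrt(n)).cdf(threshold)

# Direct simulation of the same probability, for comparison
rng = random.Random(1)
reps = 5000
hits = 0
for _ in range(reps):
    xbar = sum(rng.random() for _ in range(n)) / n
    if xbar <= threshold:
        hits += 1
sim_prob = hits / reps

print(f"Normal approximation: {clt_prob:.3f}")
print(f"Simulated estimate:   {sim_prob:.3f}")
```

The two numbers should be close but not identical: the simulation has its own sampling error, and the Normal approximation is only exact in the limit.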

The existence of this theorem is the reason why you have so often used Normal probabilities to construct confidence intervals or to carry out hypothesis tests. As you continue to study statistics, you will see that the assumption of Normality is made very often and is frequently justified by the Central Limit Theorem.