3.3 CLT simulated example with Bernoulli distributed population

Let's now try this experiment again, but this time with a discrete distribution: the \(\text{BERN}(0.3)\) distribution, i.e. the Bernoulli distribution with parameter \(p = 0.3\). From the properties of the Bernoulli distribution, this means we have \(\mu = p = 0.3\) and \(\sigma^2 = p(1 - p) = 0.3 \times 0.7 = 0.21\).

For this simulation, we have generated 100 observations from the \(\text{BERN}(0.3)\) distribution. To understand what this distribution could represent, think of a biased coin that, when flipped, has a \(p = 0.3\) chance of landing 'Heads', and therefore a \(1 - p = 1 - 0.3 = 0.7\) chance of landing 'Tails'. So, if we flipped the coin 100 times, we would expect 'Heads' to show up about 30 times, and 'Tails' to show up about 70 times.

The random sample of 100 observations is represented in the green histogram below and, as expected, we can observe that about 70% of the observations are at \(x = 0\) (representing 'Tails'), and about 30% of the observations are at \(x = 1\) (representing 'Heads'). The blue line is a normal density curve. Not surprisingly, this normal density curve does not fit the histogram well at all.
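
A sample like the one in the green histogram can be generated in a few lines. This is only a minimal sketch: the use of NumPy, the seed, and the variable names are assumptions, since the text does not say how its sample was produced, so the exact counts will differ from those in the figure.

```python
import numpy as np

# Assumption: NumPy's generator and an arbitrary seed; the source does not
# specify how its 100 observations were drawn.
rng = np.random.default_rng(seed=1)
p = 0.3

# A Binomial(1, p) draw is a single Bernoulli(p) trial: 1 = 'Heads', 0 = 'Tails'.
sample = rng.binomial(1, p, size=100)

n_heads = int(sample.sum())
n_tails = sample.size - n_heads
print(f"Heads: {n_heads}, Tails: {n_tails}, sample mean: {sample.mean():.2f}")
```

Because every observation is either 0 or 1, the sample mean is simply the proportion of 'Heads' in the sample.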

The red histograms represent sample means obtained as follows. Consider, for example, the first histogram of means with \(n = 5\). For that particular example, we generated \(n = 5\) observations from the \(\text{BERN}(0.3)\) distribution, and calculated the sample mean, \(\bar{x}\), from that sample. We then repeated this a further 9,999 times so that we obtained 10,000 values of \(\bar{x}\). These 10,000 values are represented in the first red histogram below. The blue line is the normal density curve we obtain via the Central Limit Theorem. That is, it is the normal density curve with mean \(\mu = 0.3\) and variance \(\frac{\sigma^2}{n} = \frac{p(1 - p)}{n} = \frac{0.21}{5} = 0.042\). The same procedure has been followed for the remaining two red histograms, but with \(n = 30\) and \(n = 60\) respectively.
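
The repeated-sampling procedure just described can be sketched as follows. The function name `simulate_sample_means`, the seed, and the choice of NumPy are assumptions made for illustration and are not taken from the original simulation. For each \(n\), the sketch compares the variance of the simulated sample means with the value \(\frac{p(1 - p)}{n}\) used for the normal curve.

```python
import numpy as np

# Assumptions: NumPy, an arbitrary seed, and hypothetical names throughout.
rng = np.random.default_rng(seed=0)
p = 0.3
n_repeats = 10_000

def simulate_sample_means(n, n_repeats=n_repeats, p=p, rng=rng):
    """Draw n_repeats samples of size n from BERN(p); return one mean per sample."""
    # Each row is one sample of n Bernoulli(p) observations.
    draws = rng.binomial(1, p, size=(n_repeats, n))
    return draws.mean(axis=1)

for n in (5, 30, 60):
    means = simulate_sample_means(n)
    clt_var = p * (1 - p) / n  # variance of the normal curve for this n
    print(f"n={n:2d}  mean of x-bar={means.mean():.4f}  "
          f"variance of x-bar={means.var():.5f}  p(1-p)/n={clt_var:.5f}")
```

Plotting a histogram of `means` for each \(n\) would give pictures comparable to the red histograms, although the exact bar heights depend on the random draws.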

Of interest here is the following:

Since the underlying distribution is Bernoulli, which is a discrete distribution with only two possible outcomes, it is no surprise that the green histogram shows observations occurring at only two locations (\(x = 0\) and \(x = 1\)).
All four histograms are centered around \(p = 0.3\).
The data in the red histogram of means with \(n = 5\) are more symmetric than the data in the green histogram, but are still somewhat skewed to the right and do not fit the normal curve well.
As \(n\) increases, the red histograms look more and more like data from a normal distribution. Again, this is remarkable, particularly because our underlying distribution is not even continuous!
The red histograms display less variability as \(n\) increases.
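
These observations can also be checked without simulation. The sum of \(n\) independent Bernoulli(\(p\)) observations follows a Binomial(\(n, p\)) distribution, so \(\bar{x}\) can only take the values \(0, \frac{1}{n}, \ldots, 1\), each with a binomial probability. The sketch below, in which the function name `sample_mean_moments` is a hypothetical helper, uses this fact to compute the exact mean, variance and skewness of \(\bar{x}\) for each \(n\).

```python
from math import comb

def sample_mean_moments(n, p=0.3):
    """Exact mean, variance and skewness of the mean of n Bernoulli(p) draws."""
    # P(x-bar = k/n) = C(n, k) * p^k * (1 - p)^(n - k), for k = 0, ..., n
    pmf = {k / n: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}
    mean = sum(v * pr for v, pr in pmf.items())
    var = sum((v - mean) ** 2 * pr for v, pr in pmf.items())
    skew = sum((v - mean) ** 3 * pr for v, pr in pmf.items()) / var ** 1.5
    return mean, var, skew

for n in (5, 30, 60):
    mean, var, skew = sample_mean_moments(n)
    print(f"n={n:2d}  mean={mean:.3f}  variance={var:.5f}  skewness={skew:.3f}")
```

The mean stays at \(p = 0.3\) for every \(n\), while the variance and the positive skewness both shrink as \(n\) grows, which matches the narrowing and increasingly symmetric red histograms.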