28  Central Limit Theorem

Example 28.1 A random sample of \(n\) customers at a snack bar is selected, independently. Let \(S_n\) be the total dollar amount spent by the \(n\) customers in the sample, and let \(\bar{X}_n=S_n/n\) represent the sample mean dollar amount spent by the \(n\) customers in the sample.

Assume that

  • 40% of customers spend 5 dollars
  • 40% of customers spend 6 dollars
  • 20% of customers spend 7 dollars

Let \(X\) denote the amount spent by a single randomly selected customer.

  1. Find \(\text{E}(X)\). We’ll call this value the population mean \(\mu\); explain what this means.

  2. Find \(\text{SD}(X)\). We’ll call this value the population standard deviation \(\sigma\); explain what this means.

  3. Randomly select two customers, independently, and let \(X_1\) and \(X_2\) denote the amounts spent by the two people selected. Make a table of all possible \((X_1, X_2)\) pairs and their probabilities.

  4. Use the table from the previous part to find the distribution of \(\bar{X}_2\). Interpret, in words and in context, what this distribution represents.

  5. Compute and interpret \(\text{P}(\bar{X}_{2} > 6)\).

  6. Compute \(\text{E}(\bar{X}_2)\). How does it relate to \(\mu\)?

  7. There are 3 “means” in the previous part. What do all the means mean?

  8. Compute \(\text{Var}(\bar{X}_2)\) and \(\text{SD}(\bar{X}_2)\). How do these values relate to the population variance \(\sigma^2\) and the population standard deviation \(\sigma\)?

  9. Describe, in words and in context, what \(\text{SD}(\bar{X}_2)\) measures the variability of.
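The calculations in Example 28.1 can be checked by brute-force enumeration of all \((X_1, X_2)\) pairs. The following is a minimal sketch in Python using exact fractions; the variable names (`pmf`, `xbar2_pmf`, and so on) are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from itertools import product

# Population distribution from Example 28.1: amount spent -> probability.
pmf = {5: Fraction(2, 5), 6: Fraction(2, 5), 7: Fraction(1, 5)}

# Population mean and variance of a single customer's spending X.
mu = sum(x * p for x, p in pmf.items())
sigma2 = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Enumerate all (X1, X2) pairs; independence means probabilities multiply.
xbar2_pmf = {}
for (x1, p1), (x2, p2) in product(pmf.items(), repeat=2):
    xbar = Fraction(x1 + x2, 2)
    xbar2_pmf[xbar] = xbar2_pmf.get(xbar, 0) + p1 * p2

# Mean, variance, and a tail probability of the sample mean of size 2.
mean_xbar2 = sum(v * p for v, p in xbar2_pmf.items())
var_xbar2 = sum((v - mean_xbar2) ** 2 * p for v, p in xbar2_pmf.items())
p_above_6 = sum(p for v, p in xbar2_pmf.items() if v > 6)

print("E(X) =", mu, " Var(X) =", sigma2)
for v in sorted(xbar2_pmf):
    print(f"P(Xbar_2 = {float(v)}) = {xbar2_pmf[v]}")
print("E(Xbar_2) =", mean_xbar2, " Var(Xbar_2) =", var_xbar2)
print("P(Xbar_2 > 6) =", p_above_6)
```

Comparing `mean_xbar2` with `mu`, and `var_xbar2` with `sigma2`, shows the relationships that parts 6 and 8 ask about.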

Example 28.2 Continuing Example 28.1. Now suppose we take a random sample of 30 customers; consider \(n=30\) and \(\bar{X}_{30}\).

  1. Find \(\text{E}(\bar{X}_{30})\), \(\text{Var}(\bar{X}_{30})\), and \(\text{SD}(\bar{X}_{30})\).

  2. What does \(\text{SD}(\bar{X}_{30})\) measure the variability of?

  3. Use simulation to determine the approximate shape of the distribution of \(\bar{X}_{30}\).

  4. Simulation shows that \(\bar{X}_{30}\) has an approximate Normal distribution. Use this Normal distribution to approximate \(\text{P}(\bar{X}_{30} > 6)\), and interpret the probability.
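One possible way to carry out the simulation in Example 28.2 is sketched below. It assumes plain Python with the standard library rather than any particular simulation package; the seed, the number of repetitions `reps`, and the crude text histogram are arbitrary illustrative choices.

```python
import random
from collections import Counter
from math import sqrt
from statistics import NormalDist

random.seed(282)  # arbitrary seed, only for reproducibility

values, probs = [5, 6, 7], [0.4, 0.4, 0.2]
n = 30
reps = 20_000  # number of simulated samples (an arbitrary choice)

# Simulate many samples of size n and record each sample mean.
sim_means = [sum(random.choices(values, weights=probs, k=n)) / n
             for _ in range(reps)]

# Crude text histogram of the simulated sample means, binned to 0.1 dollars.
bins = Counter(round(m, 1) for m in sim_means)
for b in sorted(bins):
    print(f"{b:4.1f} {'#' * (bins[b] * 200 // reps)}")

# Simulated relative frequency of the event {sample mean > 6}.
sim_prob = sum(m > 6 for m in sim_means) / reps

# Normal approximation with mean mu and standard deviation sigma / sqrt(n).
mu = sum(v * p for v, p in zip(values, probs))
sigma = sqrt(sum((v - mu) ** 2 * p for v, p in zip(values, probs)))
normal_prob = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(6)

print(f"simulated P(mean > 6): {sim_prob:.3f}")
print(f"Normal approximation:  {normal_prob:.3f}")
```

The two probabilities need not agree closely: \(\bar{X}_{30}\) only takes values that are multiples of \(1/30\), while the Normal approximation treats it as continuous.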

Example 28.3 Continuing Example 28.1. Now suppose that each customer spends 5 dollars with probability 0.4, 6 with probability 0.4, 7 with probability 0.19, and 30 with probability 0.01 (maybe they treat a few friends).

  1. Use simulation to approximate the distribution of \(\bar{X}_{30}\); is it approximately Normal?

  2. Use simulation to approximate the distribution of \(\bar{X}_{100}\); is it approximately Normal?

  3. Use simulation to approximate the distribution of \(\bar{X}_{300}\); is it approximately Normal?
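For Example 28.3, the sketch below repeats the simulation for each sample size and summarizes the shape with the sample skewness (roughly 0 for a symmetric, Normal-looking distribution). Using skewness as the check is an assumption made here for brevity; a histogram would serve the same purpose. The helper `skewness`, the seed, and `reps` are illustrative choices.

```python
import random
from statistics import mean, pstdev

random.seed(283)  # arbitrary seed, only for reproducibility

values, probs = [5, 6, 7, 30], [0.4, 0.4, 0.19, 0.01]
reps = 5_000  # simulated samples per sample size (an arbitrary choice)

def skewness(data):
    """Sample skewness: the average cubed z-score (about 0 if symmetric)."""
    m, s = mean(data), pstdev(data)
    return sum(((x - m) / s) ** 3 for x in data) / len(data)

skews = {}
for n in (30, 100, 300):
    # Simulate reps samples of size n and record each sample mean.
    sim_means = [sum(random.choices(values, weights=probs, k=n)) / n
                 for _ in range(reps)]
    skews[n] = skewness(sim_means)
    print(f"n = {n:3d}: skewness of simulated sample means = {skews[n]:.2f}")
```

If the skewness is still clearly positive at a given \(n\), the rare 30-dollar customers are still pulling the distribution of the sample mean to the right, and the Normal approximation is questionable at that sample size.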

Example 28.4 Ten resistors are connected in series. Each has a resistance Uniformly distributed between 215 and 225 ohms, independently of the others. Approximate the probability that the mean resistance of the ten resistors is between 219 and 221 ohms.
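A minimal sketch of the CLT calculation for Example 28.4, with a simulation as a sanity check. It uses the standard facts that a Uniform(\(a\), \(b\)) distribution has mean \((a+b)/2\) and standard deviation \((b-a)/\sqrt{12}\); the seed and number of repetitions are arbitrary.

```python
import random
from math import sqrt
from statistics import NormalDist

a, b, n = 215, 225, 10

# Mean and SD of a single Uniform(a, b) resistance.
mu = (a + b) / 2
sigma = (b - a) / sqrt(12)

# CLT: the sample mean is approximately Normal(mu, sigma / sqrt(n)).
xbar_dist = NormalDist(mu, sigma / sqrt(n))
clt_prob = xbar_dist.cdf(221) - xbar_dist.cdf(219)

# Simulation check of the same probability.
random.seed(284)  # arbitrary seed, only for reproducibility
reps = 50_000
hits = 0
for _ in range(reps):
    xbar = sum(random.uniform(a, b) for _ in range(n)) / n
    hits += 219 < xbar < 221
sim_prob = hits / reps

print(f"CLT approximation: {clt_prob:.3f}")
print(f"simulation:        {sim_prob:.3f}")
```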

Example 28.5 A fair six-sided die is rolled until the total sum of all rolls exceeds 300. Approximate the probability that at least 80 rolls are necessary.
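One way to set up Example 28.5 is to note that at least 80 rolls are necessary exactly when the sum of the first 79 rolls is at most 300, and then apply the CLT to that sum. The sketch below does this without a continuity correction (a simplifying assumption) and compares against a direct simulation; the function name `rolls_needed`, the seed, and `reps` are illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist

# At least 80 rolls are needed exactly when the first 79 rolls total <= 300.
k = 79
die_mean = 3.5
die_var = 35 / 12  # variance of a single fair six-sided die roll

# CLT: the sum of k rolls is approximately Normal(k * mean, sqrt(k * var)).
sum_dist = NormalDist(k * die_mean, sqrt(k * die_var))
clt_prob = sum_dist.cdf(300)

random.seed(285)  # arbitrary seed, only for reproducibility
reps = 10_000

def rolls_needed(target=300):
    """Roll a fair die until the running total exceeds target; count rolls."""
    total = rolls = 0
    while total <= target:
        total += random.randint(1, 6)
        rolls += 1
    return rolls

sim_prob = sum(rolls_needed() >= 80 for _ in range(reps)) / reps

print(f"CLT approximation: {clt_prob:.3f}")
print(f"simulation:        {sim_prob:.3f}")
```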

Example 28.6 Suppose that average annual income for U.S. households is about $100,000, and that the standard deviation of income is about $230,000.

  1. Based on this information alone, what can you say about the probability that a single randomly selected household has an income above $120,000?

  2. Donny Don’t says: “Since a sample size of one million is extremely large, the CLT says that the incomes in a sample of one million households should follow a Normal distribution”. Do you agree? If not, explain to Donny what the CLT does say.

  3. Use the CLT to approximate the probability that in a sample of 1000 households the sample mean income is above $120,000.

  4. Donny says: “Wait, the distribution of household income is highly skewed to the right. Wouldn’t it be more appropriate to use the sample median rather than the sample mean?” Do you agree? Explain.
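The numerical parts of Example 28.6 can be sketched as follows. For part 1, the code uses Markov's inequality, which applies only under the added assumption that household income cannot be negative; that assumption is not stated in the example. For part 3, it applies the CLT to the sample mean of \(n=1000\) households.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 100_000, 230_000  # population mean and SD from the example
threshold = 120_000

# Part 1: Markov's inequality gives P(X >= t) <= mu / t.
# ASSUMPTION: incomes are nonnegative; otherwise this bound does not apply.
markov_bound = mu / threshold

# Part 3: by the CLT, the sample mean of n households is approximately
# Normal with mean mu and standard deviation sigma / sqrt(n).
n = 1000
xbar_dist = NormalDist(mu, sigma / sqrt(n))
clt_prob = 1 - xbar_dist.cdf(threshold)

print(f"Markov bound for one household: at most {markov_bound:.3f}")
print(f"CLT approximation for the mean of {n}: {clt_prob:.4f}")
```

The contrast between the two numbers is the point: the mean and standard deviation alone say very little about a single household, but quite a lot about the mean of a large sample.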