Homework 8

Problem 1

In roulette, a bet on a single number has a 1/38 probability of success and pays 35-to-1. That is, if you bet 1 dollar, your net winnings are -1 with probability 37/38 and +35 with probability 1/38. Consider betting on a single number on each of \(n\) spins of a roulette wheel. Let \(\bar{X}_n\) be your average net winnings per bet.

For each of the values \(n = 10\), \(n = 100\), \(n = 1000\):

  1. Compute \(\text{E}(\bar{X}_n)\)
  2. Compute \(\text{SD}(\bar{X}_n)\)
  3. Run a simulation to determine if the distribution of \(\bar{X}_n\) is approximately Normal
  4. Use simulation to approximate \(\text{P}(\bar{X}_n >0)\), the probability that you come out ahead after \(n\) bets
  5. For \(n=1000\) only, use the Central Limit Theorem to approximate \(\text{P}(\bar{X}_{1000} >0)\), the probability that you come out ahead after 1000 bets.
  6. The casino wants to determine how many bets on a single number are needed before it has (at least) a 99% probability of making a profit. (Remember, the casino profits if you lose; that is, if \(\bar{X}_n <0\).) Use the Central Limit Theorem to determine the minimum number of bets (keeping in mind that \(n\) must be an integer). You can assume that whatever \(n\) is, it is large enough for the CLT to apply.
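
For the simulation parts, one possible starting point is the base R sketch below. The helper name `simulate_xbar`, the 10,000 repetitions, and the seed are illustrative assumptions, not requirements of the assignment. It relies on the fact that the number of wins in \(n\) bets is Binomial with success probability 1/38, so each simulated value of \(\bar{X}_n\) can be built from a single binomial draw.

```r
# Hypothetical helper (name and defaults are assumptions): simulate the
# average net winnings per bet over n single-number bets, repeated `reps` times.
simulate_xbar <- function(n, reps = 10000) {
  # Each bet nets +35 with probability 1/38 and -1 with probability 37/38,
  # so the number of winning bets out of n is Binomial(n, 1/38).
  wins <- rbinom(reps, size = n, prob = 1 / 38)
  (35 * wins - (n - wins)) / n
}

set.seed(8)  # arbitrary seed, used only for reproducibility
for (n in c(10, 100, 1000)) {
  xbar <- simulate_xbar(n)
  # Histogram to judge whether the shape looks approximately Normal
  hist(xbar, main = paste("n =", n), xlab = "Average net winnings per bet")
  cat("n =", n,
      " simulated mean:", mean(xbar),
      " simulated SD:", sd(xbar),
      " proportion ahead:", mean(xbar > 0), "\n")
}
```

The simulated mean and SD can be compared with your hand computations from parts 1 and 2 as a check.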

Problem 2

The standard measurement of the alcohol content of drinks is alcohol by volume (ABV), which is given as the volume of ethanol as a percent of the total volume of the drink. A report states that on average the ABV for beer is 4.5%. Is this true? In a sample of 67 brands of beer, the mean ABV is 4.61 percent and the standard deviation of ABV is 0.754 percent. Note: the data set and some R summaries are provided in Canvas so you can check your answers. But you should answer these questions by hand, using only the information provided here.

  1. Use R to summarize the sample data. Then describe the main features in a sentence or two.
  2. Compute and interpret a 95% confidence interval for the appropriate population mean.
  3. Write a clearly worded sentence reporting your confidence interval in context.
  4. One of the brands of beer in the sample is O’Doul’s, a non-alcoholic beer. The ABV for O’Doul’s is 0.4 percent (it has a bit of alcohol). Suppose O’Doul’s is removed from the data set. Compute the sample mean ABV of the remaining 66 brands. (You can do this with filter in tidyverse; see how I filtered the 2-week babies in the feeding length data.)
  5. Continuing the previous part, compare the sample mean ABV of the remaining 66 brands to the original sample mean; which mean is larger, with or without O’Doul’s? Explain briefly.
  6. The sample SD of ABV of the remaining 66 brands is 0.55 percent. Compare to the original sample SD; which SD is larger — with or without O’Doul’s? Explain briefly.
  7. Compute the 95% confidence interval based on the sample with O’Doul’s removed. Compare to the original interval, in terms of both its center and its width. Explain briefly.
  8. According to the interval computed from the sample with O’Doul’s removed, is 4.5% a plausible value of the parameter? Explain briefly.
  9. Which of the analyses is more appropriate: with or without O’Doul’s? Explain your reasoning.
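
To check hand computations for the interval parts, a small R sketch like the following may help. The function `t_interval` is a hypothetical helper, not something provided with the assignment, and it assumes the usual one-sample t interval (sample mean plus or minus a t critical value times the standard error). If your hand calculation uses a rounded multiplier such as 2, the results will differ slightly.

```r
# Hypothetical helper (not from the assignment): t confidence interval for a
# population mean computed from summary statistics alone.
t_interval <- function(xbar, s, n, conf_level = 0.95) {
  se <- s / sqrt(n)                                    # standard error of the mean
  t_star <- qt(1 - (1 - conf_level) / 2, df = n - 1)   # t critical value
  c(lower = xbar - t_star * se, upper = xbar + t_star * se)
}

# Full sample of 67 brands, using the summaries given in the problem
t_interval(xbar = 4.61, s = 0.754, n = 67)

# Mean of the remaining 66 brands after removing O'Doul's (ABV 0.4):
# subtract its value from the sample total, then divide by 66
mean_without <- (67 * 4.61 - 0.4) / 66
t_interval(xbar = mean_without, s = 0.55, n = 66)
```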

Problem 4

(I had much longer instructions written, but I think they weren’t helpful. If you have questions about how the applet is working, don’t hesitate to ask.)

You are going to use a simulation applet that randomly generates confidence intervals for a population mean to help you understand some ideas. The applet calculates a confidence interval for each set of randomly generated data.

  • In the box for “Statistic” select “Means”.
  • In the box next to “Method” select “t”.
  • Set “Number of intervals” to 1 at first to see what is happening, then increase it to a large number.

Experiment with different simulations; be sure to change

  • Distribution
  • Population Mean
  • Population SD
  • Sample size
  • Confidence level

Then write a paragraph or two or some bullet points of your main observations. Based on this applet, what can you say about how confidence intervals work? How does changing the inputs affect the confidence intervals and the Results?
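
If you would like to reproduce the same kind of experiment outside the applet, the R sketch below is one way to do it. The applet's internals are not described here, so this is only an assumption about what it does: draw many samples from a Normal population, compute a t interval for each, and record the proportion of intervals that contain the true mean. The function name `coverage` and all default values are hypothetical.

```r
# Hypothetical helper: estimate the proportion of t intervals that cover the
# true mean when sampling from a Normal population (all inputs are assumptions).
coverage <- function(n, mu = 0, sigma = 1, conf_level = 0.95, reps = 10000) {
  covered <- replicate(reps, {
    x <- rnorm(n, mean = mu, sd = sigma)               # one random sample
    ci <- t.test(x, conf.level = conf_level)$conf.int  # t interval for the mean
    ci[1] <= mu && mu <= ci[2]                         # does it contain mu?
  })
  mean(covered)
}

set.seed(4)  # arbitrary seed, used only for reproducibility
coverage(n = 20, conf_level = 0.95)
coverage(n = 20, conf_level = 0.80)
```

Swapping `rnorm` for another generator (for example `rexp`) and varying `n` mimics changing the applet's Distribution and Sample size inputs.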