Assume a Poisson($\mu$) model for the number of home runs hit (in total by both teams) in an MLB game. Let $X_1, \ldots, X_n$ be a random sample of home run counts for $n$ games.
Suppose we want to estimate $\theta = P(X = 1) = \mu e^{-\mu}$, the probability that any single game has exactly 1 HR. Consider two estimators of $\theta$: the plug-in estimator $\hat{\theta} = \bar{X}e^{-\bar{X}}$, and the sample proportion $\hat{p}$ of games in the sample with exactly 1 HR.
Compute the value of $\hat{\theta}$ based on the sample (3, 0, 1, 4, 0). Write a clearly worded sentence reporting in context this estimate of $\theta$.
Compute the value of $\hat{p}$ based on the sample (3, 0, 1, 4, 0). Write a clearly worded sentence reporting in context your estimate of $\theta$.
Which of these two estimators is the MLE of $\theta$ in this situation? Explain, without doing any calculations.
It can be shown that $\hat{p}$ is an unbiased estimator of $\theta$. Explain what this means.
Is $\hat{\theta}$ an unbiased estimator of $\theta$? Explain. (You don't have to derive anything; just apply a general principle.)
Suppose $\mu = 2.3$ and $n = 5$. Explain in full detail how you would use simulation to approximate the bias of $\hat{\theta}$.
Conduct the simulation from the previous part and approximate the bias of $\hat{\theta}$ when $\mu = 2.3$ and $n = 5$.
Explain in full detail how you would use simulation to approximate the bias function of $\hat{\theta}$ when $n = 5$.
Conduct the simulation from the previous part and plot the approximate bias function when $n = 5$. For what values of $\mu$ does $\hat{\theta}$ tend to overestimate $\theta$? Underestimate? For what values of $\mu$ is the bias the worst?
Solution
For the sample (3, 0, 1, 4, 0), the value of $\bar{x}$ is $8/5$ and the value of $\hat{\theta}$ is $(8/5)e^{-8/5} \approx 0.323$. Using this estimator, based on this sample we would estimate that 32.3 percent of baseball games have exactly 1 home run.
For the sample (3, 0, 1, 4, 0) there is 1 game with exactly 1 HR, so $\hat{p} = 1/5 = 0.20$. Using this estimator, based on this sample we would estimate that 20 percent of baseball games have exactly 1 home run.
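For example, both estimates can be checked quickly in R:

```r
x = c(3, 0, 1, 4, 0)

# theta-hat: plug the sample mean into theta = mu * exp(-mu)
mean(x) * exp(-mean(x))  # 0.323

# p-hat: proportion of sampled games with exactly 1 HR
mean(x == 1)             # 0.20
```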
We know that the MLE of $\mu$ for a Poisson distribution is $\bar{X}$, so by the invariance principle of MLEs, $\hat{\theta} = \bar{X}e^{-\bar{X}}$ is the MLE of $\theta = \mu e^{-\mu}$.
Unbiasedness means that, over many samples, $\hat{p}$ does not consistently over- or underestimate $\theta$: the long-run average value of $\hat{p}$ over many samples of size $n$ equals the true value $\theta$, that is, $E(\hat{p}) = \theta$.
No. $\bar{X}$ is an unbiased estimator of $\mu$, but nonlinear transformations do not preserve unbiasedness, so $\hat{\theta} = \bar{X}e^{-\bar{X}}$ is not an unbiased estimator of $\theta$.
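A minimal simulation sketch of this contrast, assuming $\mu = 2.3$ and $n = 5$ as in the next part (any values would do):

```r
n = 5
mu = 2.3
theta = mu * exp(-mu)  # true P(X = 1), about 0.2306
N_samples = 10000

p_hats = vector(length = N_samples)
theta_hats = vector(length = N_samples)

for (i in 1:N_samples) {
  x = rpois(n, mu)
  p_hats[i] = mean(x == 1)                 # sample proportion of games with exactly 1 HR
  theta_hats[i] = mean(x) * exp(-mean(x))  # plug-in estimator
}

mean(p_hats) - theta      # close to 0: p-hat is unbiased
mean(theta_hats) - theta  # nonzero (about 0.0035): theta-hat is biased
```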
For $\mu = 2.3$:
Simulate a sample of size $n = 5$ from a Poisson(2.3) distribution, say (3, 0, 1, 4, 0) for $(x_1, \ldots, x_5)$.
For the simulated sample compute the sample mean and the value of $\hat{\theta}$, say $8/5$ and $(8/5)e^{-8/5} \approx 0.323$.
Repeat many times, simulating many samples of size 5 and many values of $\hat{\theta}$ for the chosen $\mu = 2.3$.
Average the simulated values of $\hat{\theta}$ and subtract the true value $\theta = 2.3e^{-2.3} \approx 0.2306$ to approximate the bias when $\mu = 2.3$.
See below.
We carry out the process in the previous part for many values of $\mu$. Choose a grid of $\mu$ values, say 0.01, 0.02, $\ldots$, 10.
Choose a value of $\mu$ from the grid, say 2.3.
Simulate a sample of size $n = 5$ from a Poisson($\mu$) distribution, say (3, 0, 1, 4, 0) for $\mu = 2.3$.
For the simulated sample compute the sample mean and the value of $\hat{\theta}$, say $8/5$ and $(8/5)e^{-8/5} \approx 0.323$.
Repeat many times, simulating many samples of size 5 and many values of $\hat{\theta}$ for the chosen $\mu$.
Average the values of $\hat{\theta}$ and subtract the true value $\theta = \mu e^{-\mu}$ to approximate the bias for the chosen $\mu$.
Repeat the above process for all the values of $\mu$ in the grid and plot the approximate bias as a function of $\mu$ to approximate the bias function.
See below.
Problem 3
Continuing Problem 2.
It can be shown that $\text{Var}(\hat{p}) = \theta(1-\theta)/n$. Compute $\text{Var}(\hat{p})$ when $\mu = 2.3$ and $n = 5$. Then write a clearly worded sentence interpreting this value.
Suppose $\mu = 2.3$ and $n = 5$. Explain in full detail how you would use simulation to approximate the variance of $\hat{\theta}$.
Conduct the simulation from the previous part and approximate the variance of $\hat{\theta}$ when $\mu = 2.3$ and $n = 5$. Then write a clearly worded sentence interpreting this value.
Which estimator has smaller variance when $\mu = 2.3$ (and $n = 5$)? Answer, but then explain why this information alone is not really useful.
Explain in full detail how you would use simulation to approximate the variance function of $\hat{\theta}$ (if $n = 5$).
Conduct the simulation from the previous part and plot the approximate variance function. Compare to the variance function of $\hat{p}$. Based on variability alone, which estimator is preferred?
Solution
When $\mu = 2.3$, $\theta = 2.3e^{-2.3} \approx 0.2306$, so if $n = 5$, $\text{Var}(\hat{p}) = 0.2306(1 - 0.2306)/5 \approx 0.0355$. This number measures the sample-to-sample variance of sample proportions of games with exactly 1 HR over many samples of 5 games, assuming that $\mu = 2.3$.
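A quick check of this arithmetic in R:

```r
theta = 2.3 * exp(-2.3)  # about 0.2306
theta * (1 - theta) / 5  # about 0.0355
```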
For $\mu = 2.3$:
Simulate a sample of size $n = 5$ from a Poisson(2.3) distribution, say (3, 0, 1, 4, 0) for $(x_1, \ldots, x_5)$.
For the simulated sample compute the sample mean and the value of $\hat{\theta}$, say $8/5$ and $(8/5)e^{-8/5} \approx 0.323$.
Repeat many times, simulating many samples of size 5 and many values of $\hat{\theta}$ for the chosen $\mu = 2.3$.
Average the values of $\hat{\theta}$, for each value find its deviation from the average, and then average the squared deviations. This number approximates $\text{Var}(\hat{\theta})$ when $\mu = 2.3$.
See below. The value measures the sample-to-sample variance of values of $\hat{\theta}$ (estimates of the probability that a game has exactly 1 HR), assuming that $\mu = 2.3$ and $n = 5$.
$\hat{\theta}$ has smaller variance when $\mu = 2.3$ (roughly 0.0057, versus $\text{Var}(\hat{p}) \approx 0.0355$), but that by itself isn't useful because we don't know what $\mu$ is.
We carry out the process in the previous part for many values of $\mu$. Choose a grid of $\mu$ values, say 0.01, 0.02, $\ldots$, 10.
Choose a value of $\mu$ from the grid, say 2.3.
Simulate a sample of size $n = 5$ from a Poisson($\mu$) distribution, say (3, 0, 1, 4, 0) for $\mu = 2.3$.
For the simulated sample compute the sample mean and the value of $\hat{\theta}$, say $8/5$ and $(8/5)e^{-8/5} \approx 0.323$.
Repeat many times, simulating many samples of size 5 and many values of $\hat{\theta}$ for the chosen $\mu$.
Average the values of $\hat{\theta}$, for each value find its deviation from the average, and then average the squared deviations. This number approximates $\text{Var}(\hat{\theta})$ for the chosen $\mu$.
Repeat the above process for all the values of $\mu$ in the grid and plot the approximate variance as a function of $\mu$ to approximate the variance function.
See below. It seems that $\hat{\theta}$ has smaller variance regardless of $\mu$.
Problem 4
Continuing Problems 2 and 3.
Compute $\text{MSE}(\hat{p})$ when $\mu = 2.3$ and $n = 5$. (You can do the next part first if you want, but it helps to work with specific numbers first.)
Derive the MSE function of $\hat{p}$. (Hint: see facts from previous parts.)
Suppose $\mu = 2.3$ (and $n = 5$). Explain in full detail how you would use simulation to approximate the MSE of $\hat{\theta}$.
Conduct the simulation from the previous part and approximate the MSE of $\hat{\theta}$ when $\mu = 2.3$ (and $n = 5$).
Which estimator has smaller MSE when $\mu = 2.3$ (and $n = 5$)? Answer, but then explain why this information alone is not really useful.
Conduct the simulation from the previous part and plot the approximate MSE function. Compare to the MSE function of $\hat{p}$.
Compare the MSEs of the two estimators for $n = 5$ and a few other values of $n$. Is there a clear preference between these two estimators? Discuss.
Solution
Since $\hat{p}$ is unbiased, its MSE is just its variance: $\text{MSE}(\hat{p}) = \text{Var}(\hat{p}) \approx 0.0355$ when $\mu = 2.3$ and $n = 5$.
Since $\hat{p}$ is unbiased, its MSE function is just its variance function: $\text{MSE}(\hat{p}) = \theta(1-\theta)/n$, where $\theta = \mu e^{-\mu}$.
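Since the MSE function of $\hat{p}$ has a closed form, it can be coded directly; a small sketch (the function name is just for illustration):

```r
# MSE of p-hat: unbiased, so MSE = variance = theta * (1 - theta) / n
mse_phat = function(mu, n) {
  theta = mu * exp(-mu)
  theta * (1 - theta) / n
}

mse_phat(2.3, 5)  # about 0.0355
```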
For $\mu = 2.3$:
Simulate a sample of size $n = 5$ from a Poisson(2.3) distribution, say (3, 0, 1, 4, 0) for $(x_1, \ldots, x_5)$.
For the simulated sample compute the sample mean and the value of $\hat{\theta}$, say $8/5$ and $(8/5)e^{-8/5} \approx 0.323$.
Repeat many times, simulating many samples of size 5 and many values of $\hat{\theta}$ for the chosen $\mu = 2.3$.
For each value of $\hat{\theta}$ subtract the true value $\theta = 2.3e^{-2.3} \approx 0.2306$ and square the difference, then average these squared differences to approximate the MSE when $\mu = 2.3$.
See below. The squared bias is fairly small compared to the variance, so the MSE function is pretty close to the variance function.
$\hat{\theta}$ has smaller MSE when $\mu = 2.3$ (roughly 0.0057, versus $\text{MSE}(\hat{p}) \approx 0.0355$), but this isn't really useful because we don't know what $\mu$ is.
See below.
It seems that (at least for the values of $n$ that we considered) $\hat{\theta}$ has smaller MSE than $\hat{p}$ for all $\mu$. The sample proportion doesn't use the fact that the population distribution is Poisson, and it treats each value only as either 1 or not. $\hat{\theta}$ uses all the data and the fact that the population distribution is Poisson, so it's better tuned to this problem.
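To make the comparison concrete, here is a minimal sketch (the particular $\mu$ values are illustrative choices) comparing the simulated MSE of $\hat{\theta}$ with the exact MSE of $\hat{p}$, $\theta(1-\theta)/n$, when $n = 5$:

```r
n = 5
N_samples = 10000

for (mu in c(0.5, 1, 2.3, 4)) {
  theta = mu * exp(-mu)
  # simulate many values of theta-hat for this mu
  theta_hats = replicate(N_samples, {
    x = rpois(n, mu)
    mean(x) * exp(-mean(x))
  })
  mse_thetahat = mean((theta_hats - theta) ^ 2)  # simulated MSE of theta-hat
  mse_phat = theta * (1 - theta) / n             # exact MSE of p-hat (unbiased)
  cat("mu =", mu, ": MSE(theta-hat) =", round(mse_thetahat, 4),
      ", MSE(p-hat) =", round(mse_phat, 4), "\n")
}
```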
Simulation
For mu = 2.3
```r
n = 5
mu = 2.3
N_samples = 10000

theta_hats = vector(length = N_samples)

for (i in 1:N_samples) {
  # simulate a sample of size n from the Poisson(mu) population
  x = rpois(n, mu)
  # compute theta-hat
  theta_hats[i] = mean(x) * exp(-mean(x))
}

# Approximate the bias of theta-hat for this mu
mean(theta_hats) - mu * exp(-mu)
```
[1] 0.003518489
```r
# Approximate the variance of theta-hat for this mu
var(theta_hats)
```
[1] 0.005725971
```r
# Approximate the MSE of theta-hat for this mu
mean((theta_hats - mu * exp(-mu)) ^ 2)
```
[1] 0.005737778
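As a sanity check, the three simulated values above are consistent with the identity $\text{MSE} = \text{bias}^2 + \text{variance}$ (up to R's var() using an $N - 1$ denominator):

```r
# bias^2 + variance, using the values reported above
0.003518489 ^ 2 + 0.005725971  # approximately 0.005738, matching the MSE
```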
For all mu
```r
n = 5

# grid of mu values: 0.1, 0.2, 0.3, ..., 4.9, 5
mus = seq(0.1, 5, 0.1)

N_samples = 10000

# initialize vectors that will record bias, var, mse as functions of mu
bias_thetahat = vector()
var_thetahat = vector()
mse_thetahat = vector()

for (mu in mus) {
  # true value of theta for this mu
  theta = mu * exp(-mu)

  # for each mu simulate many samples and many values of theta-hat
  theta_hats = vector(length = N_samples)

  for (i in 1:N_samples) {
    # simulate a sample of size n from the Poisson(mu) population
    x = rpois(n, mu)
    # compute theta-hat
    theta_hats[i] = mean(x) * exp(-mean(x))
  }

  # Approximate bias of theta-hat for this mu and add to the list
  bias_thetahat = c(bias_thetahat, mean(theta_hats) - theta)

  # Approximate variance of theta-hat for this mu and add to the list
  var_thetahat = c(var_thetahat, var(theta_hats))

  # Approximate MSE of theta-hat for this mu and add to the list
  # Can just do bias^2 + var, but can also use definition of MSE
  mse_thetahat = c(mse_thetahat, mean((theta_hats - theta) ^ 2))
}
```
Bias function
Any “wiggles” are due to natural simulation margin of error (can make these go away by simulating even more samples)
```r
plot(mus, bias_thetahat,
     xlab = "mu",
     ylab = "Bias of thetahat")
```
Squared bias function
Any “wiggles” are due to natural simulation margin of error (can make these go away by simulating even more samples)
```r
plot(mus, bias_thetahat ^ 2,
     xlab = "mu",
     ylab = "Squared Bias of thetahat",
     ylim = c(0, 0.05)) # set common y-axis scale
```
Variance function
Any “wiggles” are due to natural simulation margin of error (can make these go away by simulating even more samples)
```r
plot(mus, var_thetahat,
     xlab = "mu",
     ylab = "Variance",
     ylim = c(0, 0.05), # set common y-axis scale
     main = "Variance functions of theta-hat (dots) and p-hat (dashed)")
curve(x * exp(-x) * (1 - x * exp(-x)) / n, add = TRUE, lty = 2)
```
MSE function
Any “wiggles” are due to natural simulation margin of error (can make these go away by simulating even more samples)
```r
plot(mus, mse_thetahat,
     xlab = "mu",
     ylab = "MSE",
     ylim = c(0, 0.05), # set common y-axis scale
     main = "MSE functions of theta-hat (dots) and p-hat (dashed)")
curve(x * exp(-x) * (1 - x * exp(-x)) / n, add = TRUE, lty = 2)
```
Different n
```r
n = 50

# grid of mu values: 0.1, 0.2, 0.3, ..., 4.9, 5
mus = seq(0.1, 5, 0.1)

N_samples = 10000

# initialize vectors that will record bias, var, mse as functions of mu
bias_thetahat = vector()
var_thetahat = vector()
mse_thetahat = vector()

for (mu in mus) {
  # true value of theta for this mu
  theta = mu * exp(-mu)

  # for each mu simulate many samples and many values of theta-hat
  theta_hats = vector(length = N_samples)

  for (i in 1:N_samples) {
    # simulate a sample of size n from the Poisson(mu) population
    x = rpois(n, mu)
    # compute theta-hat
    theta_hats[i] = mean(x) * exp(-mean(x))
  }

  # Approximate bias of theta-hat for this mu and add to the list
  bias_thetahat = c(bias_thetahat, mean(theta_hats) - theta)

  # Approximate variance of theta-hat for this mu and add to the list
  var_thetahat = c(var_thetahat, var(theta_hats))

  # Approximate MSE of theta-hat for this mu and add to the list
  # Can just do bias^2 + var, but can also use definition of MSE
  mse_thetahat = c(mse_thetahat, mean((theta_hats - theta) ^ 2))
}
```
```r
plot(mus, mse_thetahat,
     xlab = "mu",
     ylab = "MSE",
     ylim = c(0, 0.006), # set common y-axis scale
     main = "n = 50: MSE functions of theta-hat (dots) and p-hat (dashed)")
curve(x * exp(-x) * (1 - x * exp(-x)) / n, add = TRUE, lty = 2)
```
```r
n = 20

# grid of mu values: 0.1, 0.2, 0.3, ..., 4.9, 5
mus = seq(0.1, 5, 0.1)

N_samples = 10000

# initialize vectors that will record bias, var, mse as functions of mu
bias_thetahat = vector()
var_thetahat = vector()
mse_thetahat = vector()

for (mu in mus) {
  # true value of theta for this mu
  theta = mu * exp(-mu)

  # for each mu simulate many samples and many values of theta-hat
  theta_hats = vector(length = N_samples)

  for (i in 1:N_samples) {
    # simulate a sample of size n from the Poisson(mu) population
    x = rpois(n, mu)
    # compute theta-hat
    theta_hats[i] = mean(x) * exp(-mean(x))
  }

  # Approximate bias of theta-hat for this mu and add to the list
  bias_thetahat = c(bias_thetahat, mean(theta_hats) - theta)

  # Approximate variance of theta-hat for this mu and add to the list
  var_thetahat = c(var_thetahat, var(theta_hats))

  # Approximate MSE of theta-hat for this mu and add to the list
  # Can just do bias^2 + var, but can also use definition of MSE
  mse_thetahat = c(mse_thetahat, mean((theta_hats - theta) ^ 2))
}
```
```r
plot(mus, mse_thetahat,
     xlab = "mu",
     ylab = "MSE",
     ylim = c(0, 0.015), # set common y-axis scale
     main = "n = 20: MSE functions of theta-hat (dots) and p-hat (dashed)")
curve(x * exp(-x) * (1 - x * exp(-x)) / n, add = TRUE, lty = 2)
```