7.2 Finite Sample Properties of Estimators

Consider ˆθ as a random variable. In general, the pdf of ˆθ, f(ˆθ), depends on the pdfs of the random variables {Rt}Tt=1, and its exact form may be very complicated. Sometimes we can determine the exact form of f(ˆθ) by analytic calculation; this can be done, for example, when ˆθ=ˆμ. More often, however, the exact form of f(ˆθ) is too difficult to derive. When f(ˆθ) is too difficult to compute, we can often approximate it using either Monte Carlo simulation techniques or the Central Limit Theorem (CLT). In Monte Carlo simulation, we use the computer to simulate many different realizations of the random returns {Rt}Tt=1 and evaluate the estimator ˆθ on each simulated sample. The Monte Carlo approximation of f(ˆθ) is the empirical distribution of ˆθ over the different simulated samples. For a given sample size T, Monte Carlo simulation gives a very accurate approximation to f(ˆθ) if the number of simulated samples is very large. The CLT approximation of f(ˆθ) is a normal distribution that becomes more accurate as the sample size T gets large. The advantage of the CLT approximation is that it is often easy to compute; the disadvantage is that its accuracy depends on the estimator ˆθ and the sample size T.
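To make the Monte Carlo idea concrete, here is a minimal sketch in Python that approximates f(ˆμ) for the sample mean when returns follow a GWN model. The parameter values μ = 0.01 and σ = 0.05, the sample size T = 60, and the number of simulated samples are hypothetical choices for illustration only.

import numpy as np

# Hypothetical GWN parameters and sample size (illustration only)
mu, sigma, T = 0.01, 0.05, 60
n_sim = 10000  # number of simulated samples

rng = np.random.default_rng(123)

# Simulate n_sim samples of T returns and evaluate mu-hat on each sample
returns = rng.normal(mu, sigma, size=(n_sim, T))
mu_hat = returns.mean(axis=1)

# The empirical distribution of mu_hat over the simulated samples
# approximates f(mu_hat); its mean and standard deviation approximate
# E[mu_hat] and se(mu_hat)
print("mean of mu_hat:", mu_hat.mean())
print("sd of mu_hat:  ", mu_hat.std(ddof=1))

A histogram of the simulated values of mu_hat is the Monte Carlo approximation of f(ˆμ).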

For analysis purposes, we often focus on certain characteristics of f(ˆθ), like its expected value (center), variance and standard deviation (spread about the expected value). The expected value of an estimator is related to the concept of estimator bias, and the variance/standard deviation of an estimator is related to the concept of estimator precision. Different realizations of the random variables {Rt}Tt=1 will produce different values of ˆθ. Some values of ˆθ will be bigger than θ and some will be smaller. Intuitively, a good estimator of θ is one that is on average correct (unbiased) and rarely gets too far from θ (small variance). That is, a good estimator has small bias and high precision.

7.2.1 Bias

Bias concerns the location or center of f(ˆθ) in relation to θ. If f(ˆθ) is centered away from θ, then we say ˆθ is a biased estimator of θ. If f(ˆθ) is centered at θ, then we say that ˆθ is an unbiased estimator of θ. Formally, we have the following definitions:

Definition 7.1 The estimation error is the difference between the estimator and the parameter being estimated:

error(ˆθ,θ) = ˆθ − θ.

Definition 7.2 The bias of an estimator ˆθ of θ is the expected estimation error: bias(ˆθ,θ) = E[error(ˆθ,θ)] = E[ˆθ] − θ.

Definition 7.3 An estimator ˆθ of θ is unbiased if bias(ˆθ,θ) = 0; i.e., if E[ˆθ] = θ or E[error(ˆθ,θ)] = 0.
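For example, under the GWN model each return satisfies E[Rt] = μ, so the sample mean ˆμ = (R1 + ⋯ + RT)/T has E[ˆμ] = (E[R1] + ⋯ + E[RT])/T = Tμ/T = μ. Hence ˆμ is an unbiased estimator of μ.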

Unbiasedness is a desirable property of an estimator. It means that the estimator produces the correct answer “on average”, where “on average” means over many hypothetical realizations of the random variables {Rt}Tt=1. It is important to keep in mind that an unbiased estimator for θ may not be very close to θ for a particular sample, and that a biased estimator may actually be quite close to θ. For example, consider two estimators of θ, ˆθ1 and ˆθ2, whose pdfs are illustrated in Figure 7.1. ˆθ1 is an unbiased estimator of θ with a large variance, and ˆθ2 is a biased estimator of θ with a small variance. Consider first the pdf of ˆθ1. The center of the distribution is at the true value θ=0, E[ˆθ1]=0, but the distribution is very widely spread out about θ=0. That is, var(ˆθ1) is large. On average (over many hypothetical samples) the value of ˆθ1 will be close to θ, but in any given sample the value of ˆθ1 can be quite a bit above or below θ. Hence, unbiasedness by itself does not guarantee a good estimator of θ. Now consider the pdf of ˆθ2. The center of the pdf is slightly higher than θ=0, i.e., bias(ˆθ2,θ)>0, but the spread of the distribution is small. Although ˆθ2 is not equal to θ=0 on average, we might prefer it to ˆθ1 because its values are generally closer to θ=0 in any given sample.

While unbiasedness is a desirable property of an estimator ˆθ of θ, by itself it is not enough to determine whether ˆθ is a good estimator. Being correct on average does not mean that ˆθ is exactly correct in any given sample: in some samples ˆθ is less than θ, and in others it is greater than θ. More importantly, we need to know how far ˆθ typically is from θ. That is, we need to know the magnitude of the spread of the distribution of ˆθ about its average value. This tells us the precision of ˆθ.


Figure 7.1: Distributions of competing estimators for θ=0. ˆθ1 is unbiased but has high variance, and ˆθ2 is biased but has low variance.

7.2.2 Precision

An estimate is our best guess of the true (but unknown) value of θ. The guess will almost certainly be wrong, but we hope it will not be too far off. A precise estimate is one in which the variability of the estimation error is small. The variability of the estimation error is captured by the mean squared error.

Definition 7.4 The mean squared error of an estimator ˆθ of θ is given by: mse(ˆθ,θ) = E[(ˆθ − θ)²] = E[error(ˆθ,θ)²].

The mean squared error measures the expected squared deviation of ˆθ from θ. If this expected deviation is small, then we know that ˆθ will almost always be close to θ. Alternatively, if the mean squared error is large then it is possible to see samples for which ˆθ is quite far from θ. A useful decomposition of mse(ˆθ,θ) is:

mse(ˆθ,θ) = E[(ˆθ − E[ˆθ])²] + (E[ˆθ] − θ)² = var(ˆθ) + bias(ˆθ,θ)².

The derivation of this result is straightforward. Use the add-and-subtract trick and write ˆθ − θ = (ˆθ − E[ˆθ]) + (E[ˆθ] − θ). Then square both sides, giving

(ˆθ − θ)² = (ˆθ − E[ˆθ])² + 2(ˆθ − E[ˆθ])(E[ˆθ] − θ) + (E[ˆθ] − θ)².

Taking expectations of both sides yields,

mse(ˆθ,θ) = E[(ˆθ − E[ˆθ])²] + 2(E[ˆθ] − E[ˆθ])(E[ˆθ] − θ) + (E[ˆθ] − θ)²
          = E[(ˆθ − E[ˆθ])²] + (E[ˆθ] − θ)²   (since E[ˆθ] − E[ˆθ] = 0)
          = var(ˆθ) + bias(ˆθ,θ)².

The result states that for any estimator ˆθ of θ, mse(ˆθ,θ) can be split into a variance component, var(ˆθ), and a squared bias component, bias(ˆθ,θ)². Clearly, mse(ˆθ,θ) will be small only if both components are small. If an estimator is unbiased then mse(ˆθ,θ) = var(ˆθ) = E[(ˆθ − θ)²] is just the expected squared deviation of ˆθ about θ. Hence, an unbiased estimator ˆθ of θ is good if it has a small variance.
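To illustrate the decomposition with hypothetical numbers, suppose ˆθ1 is unbiased with var(ˆθ1) = 0.50, while ˆθ2 has bias(ˆθ2,θ) = 0.10 and var(ˆθ2) = 0.05. Then mse(ˆθ1,θ) = 0.50 + 0² = 0.50 and mse(ˆθ2,θ) = 0.05 + (0.10)² = 0.06, so the biased but precise estimator ˆθ2 has the much smaller mean squared error. This is the trade-off illustrated in Figure 7.1.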

The mse(ˆθ,θ) and var(ˆθ) are based on squared deviations and so are not in the same units of measurement as θ. Measures of precision that are in the same units as θ are the root mean square error and the standard error.

Definition 7.5 The root mean square error and the standard error of an estimator ˆθ are:

rmse(ˆθ,θ) = √mse(ˆθ,θ),

se(ˆθ) = √var(ˆθ).
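Continuing the hypothetical numbers used above, for ˆθ2 with var(ˆθ2) = 0.05 and bias(ˆθ2,θ) = 0.10 we have rmse(ˆθ2,θ) = √0.06 ≈ 0.245 and se(ˆθ2) = √0.05 ≈ 0.224, both of which are in the same units as θ.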

If bias(ˆθ,θ) ≈ 0, then the precision of ˆθ is typically measured by se(ˆθ).

With the concepts of bias and precision in hand, we can state what defines a good estimator.

Definition 7.6 A good estimator ˆθ of θ has a small bias (Definition 7.2) and a small standard error (Definition 7.5).

7.2.3 Estimated standard errors

Standard error formulas se(ˆθ) that we will derive often depend on the unknown values of the GWN model parameters. For example, later on we will show that se(ˆμ) = √(σ²/T) = σ/√T, which depends on the GWN parameter σ², which is unknown. Hence, this formula for se(ˆμ) is a practically useless quantity: we cannot compute its value because we do not know the value of σ². Fortunately, we can create a practically useful formula by replacing the unknown quantity σ² with a good estimate ˆσ² that we compute from the data. This gives rise to the estimated standard error ^se(ˆμ) = √(ˆσ²/T) = ˆσ/√T.
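As a minimal sketch of this plug-in idea, the snippet below computes ˆμ, ˆσ², and ^se(ˆμ) = ˆσ/√T in Python. The returns are simulated from a hypothetical GWN model purely as a stand-in for real return data.

import numpy as np

# Hypothetical data: T = 60 simulated GWN returns (stand-in for real returns)
rng = np.random.default_rng(42)
T = 60
returns = rng.normal(0.01, 0.05, size=T)

mu_hat = returns.mean()              # estimate of mu
sigma2_hat = returns.var(ddof=1)     # estimate of sigma^2 (sample variance)
se_mu_hat = np.sqrt(sigma2_hat / T)  # estimated standard error of mu-hat

print("mu_hat:", mu_hat, "se(mu_hat):", se_mu_hat)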