Chapter 9 Parameter Estimation
9.1 Introduction
In this section, we consider the general definition of a statistic as a summary of a random sample. Statistics are used as estimators of population quantities with an estimate denoting a given realisation of an estimator. We explore key properties that we wish estimators to have such as unbiasedness, efficiency and consistency. We study the properties of the sample mean and sample variance as estimators of the population mean and variance, respectively.
9.2 Preliminaries
Statistic
A statistic, $T(\mathbf{X})$, is any function of the random sample $\mathbf{X} = (X_1, X_2, \ldots, X_n)$.
Note that since $T(\mathbf{X})$ is a function of random variables, it is itself a random variable. It therefore has all the properties of a random variable; most importantly, it has a distribution.
Estimator
A statistic that is used for the purpose of estimating an unknown population parameter is called an estimator.
Estimate
A realised value of an estimator, $T(\mathbf{x})$, that is, the value of $T(\mathbf{X})$ evaluated at a particular outcome of the random sample, is called an estimate.
That is, if we let $Y = T(\mathbf{X})$, then $Y$ is a random variable and $y = T(\mathbf{x})$ is a realisation of $Y$ based on the sample $\mathbf{x} = (x_1, x_2, \ldots, x_n)$. The properties of the estimator $T(\mathbf{X})$ will typically depend upon $n$, the number of observations in the random sample.
Average Income
Suppose that we want to estimate the average annual income in the U.K. Let $X_1, X_2, \ldots, X_n$ be a random sample of annual incomes. Possible estimators might include:
- $T_1(\mathbf{X}) = \frac{X_1 + X_2 + \cdots + X_n}{n}$;
- $T_2(\mathbf{X}) = \min\{X_1, X_2, \ldots, X_n\}$;
- $T_3(\mathbf{X}) = X_1$.
Which of these is the best choice of estimator?
9.3 Judging estimators
Let $\theta$ be a population parameter we wish to estimate. Since any function of the sample data is a potential estimator of $\theta$, how should we determine whether an estimator is good or not? What qualities should our estimator have?
Quality 1: Unbiasedness
Unbiased
The estimator $T(\mathbf{X})$ is an unbiased estimator of $\theta$ if $E[T(\mathbf{X})] = \theta$. Otherwise, we say that the estimator $T(\mathbf{X})$ is biased, and we define $B(T) = E[T(\mathbf{X})] - \theta$ to be the bias of $T$.
Asymptotically unbiased
If $B(T) \to 0$ as the sample size $n \to \infty$, then we say that $T(\mathbf{X})$ is asymptotically unbiased for $\theta$.
Quality 2: Small variance
Efficiency
If two estimators $T_1(\mathbf{X})$ and $T_2(\mathbf{X})$ are both unbiased for $\theta$, then $T_1(\mathbf{X})$ is said to be more efficient than $T_2(\mathbf{X})$ if
$$\mathrm{var}(T_1(\mathbf{X})) < \mathrm{var}(T_2(\mathbf{X})).$$
We would ideally like an estimator that is unbiased with a small variance. So given multiple unbiased estimators, we choose the most efficient estimator (the estimator with the smallest variance).
To compare estimators when at least one of them is biased, we can use the mean-square error to quantify the trade-off between bias and variance:
Mean-square error
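The mean-square error of an estimator $T(\mathbf{X})$ of $\theta$ is defined by
$$\mathrm{MSE}(T) = E\left[(T(\mathbf{X}) - \theta)^2\right].$$
Prove that $\mathrm{MSE}(T) = \mathrm{var}(T) + (B(T))^2$.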
Watch Video 16 for the proof of Example 9.3.5.
Video 16: Derivation of MSE
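Proof of Example 9.3.5.
Write $T = T(\mathbf{X})$, so that $\mathrm{MSE}(T) = E[(T - \theta)^2]$. Since $E[T] - \theta = B(T)$ is a constant and $E[T - E[T]] = 0$,
$$\begin{aligned}
\mathrm{MSE}(T) &= E\left[\big((T - E[T]) + (E[T] - \theta)\big)^2\right] \\
&= E\left[(T - E[T])^2\right] + 2(E[T] - \theta)\,E[T - E[T]] + (E[T] - \theta)^2 \\
&= \mathrm{var}(T) + (B(T))^2,
\end{aligned}$$
as required.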
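The decomposition can also be checked numerically. The following is a minimal sketch, assuming a Uniform(0, 1) population (so $\theta = 0.5$ is the population mean) and using the sample minimum as a deliberately biased estimator; the population, sample size and number of replications are illustrative choices, not part of the notes. Applied to the simulated values, the identity holds up to floating-point rounding, because the same algebra applies to the empirical distribution of the estimates.

```python
import random
import statistics

random.seed(1)
theta = 0.5          # true mean of Uniform(0, 1), the assumed population
n, reps = 5, 20000   # sample size and number of simulated samples

# Simulated values of the (biased) estimator T(X) = min(X_1, ..., X_n).
estimates = [min(random.random() for _ in range(n)) for _ in range(reps)]

mse = statistics.fmean([(t - theta) ** 2 for t in estimates])
variance = statistics.pvariance(estimates)   # divides by reps, not reps - 1
bias = statistics.fmean(estimates) - theta

print(f"MSE          = {mse:.6f}")
print(f"var + bias^2 = {variance + bias ** 2:.6f}")
print(f"bias         = {bias:.4f}")
```

The population variance (`pvariance`) is used rather than the sample variance so that the two printed quantities agree exactly rather than approximately.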
Quality 3: Consistency
Consistency
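An estimator $T(\mathbf{X})$ is said to be a consistent estimator for $\theta$ if, for every $\epsilon > 0$,
$$P\left(|T(\mathbf{X}) - \theta| > \epsilon\right) \to 0 \quad \text{as } n \to \infty.$$
That is, as $n$ becomes large, the probability that $T(\mathbf{X})$ differs from $\theta$ by more than $\epsilon$, for any positive $\epsilon$, becomes small and goes to 0 as $n \to \infty$.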
This third desirable property can sometimes be established using the following theorem:
Consistency Theorem
If $E[T(\mathbf{X})] \to \theta$ and $\mathrm{var}(T(\mathbf{X})) \to 0$ as $n \to \infty$, then $T(\mathbf{X})$ is a consistent estimator for $\theta$.
Note that the Consistency Theorem gives sufficient but not necessary conditions for consistency. Since by Example 9.3.5 $\mathrm{MSE}(T) = \mathrm{var}(T) + (B(T))^2$, the Consistency Theorem implies that if $\mathrm{MSE}(T) \to 0$ as $n \to \infty$, then $T(\mathbf{X})$ is a consistent estimator for $\theta$.
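One way to see why a vanishing mean-square error forces consistency is the standard Markov (Chebyshev-type) inequality. The sketch below is offered as an illustration of that argument and is not necessarily the proof intended in these notes; it writes $T$ for $T(\mathbf{X})$ and fixes an arbitrary $\epsilon > 0$.

```latex
\begin{aligned}
P\left(|T - \theta| > \epsilon\right)
  &\le P\left((T - \theta)^2 \ge \epsilon^2\right) \\
  &\le \frac{E\left[(T - \theta)^2\right]}{\epsilon^2}
     && \text{(Markov's inequality)} \\
  &= \frac{\mathrm{MSE}(T)}{\epsilon^2}
   = \frac{\mathrm{var}(T) + (B(T))^2}{\epsilon^2}
   \;\to\; 0 \quad \text{as } n \to \infty .
\end{aligned}
```

The bound goes to zero exactly when both the variance and the bias go to zero, which are the two conditions of the Consistency Theorem.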
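Suppose $X_1, X_2, \ldots, X_n$ is a random sample from any population with mean $\mu$ and variance $\sigma^2$. The sample mean is $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ and is an estimator of $\mu$. What are the properties of $\bar{X}$?
By linearity of expectation, $E[\bar{X}] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \mu$, so $\bar{X}$ is unbiased. Since the $X_i$ are independent, $\mathrm{var}(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{var}(X_i) = \frac{\sigma^2}{n}$.
Given that $\bar{X}$ is an unbiased estimator, the mean-square error of $\bar{X}$ is equal to $\mathrm{var}(\bar{X}) = \frac{\sigma^2}{n}$.
Since $E[\bar{X}] = \mu$ for every $n$ and $\mathrm{var}(\bar{X}) \to 0$ as $n \to \infty$, it follows from the Consistency Theorem that $\bar{X}$ is a consistent estimator for $\mu$.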
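A short simulation can illustrate how the variance of the sample mean shrinks like $\sigma^2/n$. This is a sketch under assumed settings: the population is taken to be Uniform(0, 1), for which $\mu = 1/2$ and $\sigma^2 = 1/12$, and the helper name `simulate_sample_means` is hypothetical.

```python
import random
import statistics

random.seed(2)
mu, sigma2 = 0.5, 1.0 / 12.0   # mean and variance of Uniform(0, 1), the assumed population
reps = 2000                     # number of simulated samples per sample size

def simulate_sample_means(n, reps):
    """Draw `reps` independent samples of size n and return their sample means."""
    return [statistics.fmean([random.random() for _ in range(n)])
            for _ in range(reps)]

results = {}
for n in (10, 100, 1000):
    means = simulate_sample_means(n, reps)
    results[n] = (statistics.fmean(means), statistics.pvariance(means))
    print(f"n={n:5d}  mean of Xbar={results[n][0]:.4f}  "
          f"var of Xbar={results[n][1]:.6f}  sigma^2/n={sigma2 / n:.6f}")
```

The average of the simulated sample means should stay close to $\mu$ at every $n$, while their spread should fall by roughly a factor of ten each time $n$ is multiplied by ten.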
We return to the Average Income example concerning the average annual income in the UK.
It follows from Example 9.3.8 that $T_1(\mathbf{X}) = \bar{X}$ is an unbiased and consistent estimator of the mean annual income.
Let $L$ denote the lowest annual income in the UK. Then, except in the case $n = 1$, the mean of $T_2(\mathbf{X})$ will be below the mean annual income (the exact value will depend on the distribution of annual incomes) and will become smaller as $n$ increases, with limit $L$ as $n \to \infty$. Hence $T_2(\mathbf{X})$ is a biased estimator of the mean annual income.
The final estimator $T_3(\mathbf{X}) = X_1$ is unbiased, as $E[X_1]$ is the average annual income. However, for all $n = 1, 2, \ldots$, $\mathrm{var}(T_3(\mathbf{X})) = \mathrm{var}(X_1)$ and, unless the annual income is constant, $\mathrm{var}(X_1) > 0$. Therefore $T_3(\mathbf{X})$ is not a consistent estimator, since the estimator, and hence its variance, does not change as we increase the sample size.
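The behaviour of the three estimators can be illustrated by simulation. The sketch below assumes, purely for illustration, that annual incomes follow a lognormal distribution with the parameters `LOG_MU` and `LOG_SIGMA` shown; these values and the helper `summarise` are hypothetical and are not taken from any real income data.

```python
import math
import random
import statistics

random.seed(3)
LOG_MU, LOG_SIGMA = 10.0, 0.5                       # assumed lognormal parameters (illustrative only)
true_mean = math.exp(LOG_MU + LOG_SIGMA ** 2 / 2)   # mean of the assumed income distribution

estimators = {
    "T1 (sample mean)": lambda xs: sum(xs) / len(xs),
    "T2 (minimum)": min,
    "T3 (first value)": lambda xs: xs[0],
}

def summarise(n, reps=2000):
    """Return {name: (bias, variance)} estimated from `reps` simulated samples of size n."""
    values = {name: [] for name in estimators}
    for _ in range(reps):
        xs = [random.lognormvariate(LOG_MU, LOG_SIGMA) for _ in range(n)]
        for name, estimator in estimators.items():
            values[name].append(estimator(xs))
    return {name: (statistics.fmean(v) - true_mean, statistics.pvariance(v))
            for name, v in values.items()}

summary = {n: summarise(n) for n in (5, 50, 500)}
for n, rows in summary.items():
    for name, (bias, variance) in rows.items():
        print(f"n={n:4d}  {name:18s} bias={bias:12.1f}  var={variance:16.1f}")
```

Under these assumptions one would expect the simulated bias of $T_1$ to stay near zero while its variance shrinks, the bias of $T_2$ to become more negative as $n$ grows, and the variance of $T_3$ to stay roughly the same at every sample size.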
9.4 Sample Variance
Variance Estimator
Suppose $X_1, X_2, \ldots, X_n$ is a random sample from any population with mean $\mu$ and variance $\sigma^2$. Consider the estimator
$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2.$$
Before considering the estimator $\hat{\sigma}^2$ in Example 9.4.1, we prove Lemma 9.4.2, which is useful in manipulating sums of squares.
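Splitting square
For any constant $\mu$,
$$\sum_{i=1}^{n} (X_i - \mu)^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 + n(\bar{X} - \mu)^2.$$
Proof. Writing $X_i - \mu = (X_i - \bar{X}) + (\bar{X} - \mu)$ and expanding the square gives
$$\sum_{i=1}^{n} (X_i - \mu)^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 + 2(\bar{X} - \mu)\sum_{i=1}^{n} (X_i - \bar{X}) + n(\bar{X} - \mu)^2.$$
Since $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$, the cross term vanishes
and the Lemma follows.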
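We check whether the variance estimator $\hat{\sigma}^2$ is biased or unbiased. Using Lemma 9.4.2 to write $n\hat{\sigma}^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} (X_i - \mu)^2 - n(\bar{X} - \mu)^2$, together with $E[(X_i - \mu)^2] = \sigma^2$ and $E[(\bar{X} - \mu)^2] = \mathrm{var}(\bar{X}) = \frac{\sigma^2}{n}$, we obtain
$$E[\hat{\sigma}^2] = \frac{1}{n}\left(\sum_{i=1}^{n} E[(X_i - \mu)^2] - n E[(\bar{X} - \mu)^2]\right) = \frac{1}{n}\left(n\sigma^2 - \sigma^2\right) = \frac{n-1}{n}\sigma^2.$$
Hence $E[\hat{\sigma}^2] \neq \sigma^2 = \mathrm{var}(X_i)$, and so $\hat{\sigma}^2$ is a biased, although asymptotically unbiased, estimator for $\sigma^2$. Under weak additional conditions, such as $E[X_1^4] < \infty$, it can be shown that $\hat{\sigma}^2$ is a consistent estimator.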
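The bias factor $(n-1)/n$ can be verified exactly on a small example. The sketch below assumes a toy population, a fair six-sided die with $\sigma^2 = 35/12$, and enumerates every equally likely sample of size $n$ using exact fractions; the function name `expected_variance_estimator` is hypothetical.

```python
from fractions import Fraction
from itertools import product

faces = [Fraction(k) for k in range(1, 7)]        # fair die: an assumed toy population
mu = sum(faces) / 6
sigma2 = sum((x - mu) ** 2 for x in faces) / 6    # population variance

def expected_variance_estimator(n):
    """Exact E[(1/n) * sum (X_i - Xbar)^2] over all 6**n equally likely samples."""
    total = Fraction(0)
    for sample in product(faces, repeat=n):
        xbar = sum(sample) / n
        total += sum((x - xbar) ** 2 for x in sample) / n
    return total / 6 ** n

for n in (2, 3, 4):
    print(n, expected_variance_estimator(n), Fraction(n - 1, n) * sigma2)
```

Because the arithmetic is exact, the two printed values on each line should agree exactly rather than only approximately, which a floating-point simulation could not show.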
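It follows from the Variance Estimator example that, given a random sample $X_1, X_2, \ldots, X_n$, the quantity
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{n}{n-1}\hat{\sigma}^2$$
is an unbiased estimator of $\sigma^2$, since $E[s^2] = \frac{n}{n-1} \cdot \frac{n-1}{n}\sigma^2 = \sigma^2$. This is the definition of the sample variance that we gave in Section 2.3.
It can be shown that
$$s^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}\right) = \frac{1}{n-1}\left(\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right).$$
Given observed data $x_1, x_2, \ldots, x_n$, we define the sample variance by
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2.$$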
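As a quick check that the definition and the computational form of the sample variance agree, the following sketch evaluates both on a small set of made-up observations and compares them with Python's standard-library `statistics.variance`, which also divides by $n - 1$. The function names and the data are illustrative assumptions.

```python
import statistics

def sample_variance(xs):
    """Definition: sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """Computational form: (sum of squares - n * xbar^2) / (n - 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    return (sum(x * x for x in xs) - n * xbar ** 2) / (n - 1)

data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]   # made-up observations
print(sample_variance(data), sample_variance_shortcut(data), statistics.variance(data))
```

The computational form is convenient by hand, but in floating-point arithmetic it subtracts two large, nearly equal numbers when the mean is large relative to the spread, so the definition is usually the safer choice in code.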
Student Exercises
Attempt the exercises below.
Suggest a reasonable statistical model for each of the following situations, and say which parameter or function of the parameter(s) in the model is likely to be of main interest:
- The number of reportable accidents that occur in the University in the month of October is ascertained, with a view to estimating the overall accident rate for the academic year;
- In a laboratory test the times to failure of 10 computer hard disk units are measured, to enable the manufacturer to quote for the mean time to failure in sales literature.
Of course in practice one needs to check whether the suggested models are reasonable, e.g. by examining a histogram.
Solution to Exercise 9.1.
- The number of October accidents could be $\mathrm{Po}(\theta)$ (if accidents occurred at random and independently).
  Parameter: $\theta$, the expected number of accidents per month.
  Function of parameter of interest is $12\theta$.
- Failure times $T_1, T_2, \ldots, T_{10}$ could be independent $\mathrm{Exp}(\theta)$ (if disk failures occurred at random and independently).
  Function of parameter of interest is the mean failure time, $1/\theta$.
Suppose that a surveyor is trying to determine the area of a rectangular field, in which the measured length $Y_1$ and the measured width $Y_2$ are independent random variables taking values according to the following distributions:
- $P(Y_1 = 8) = 0.25$, $P(Y_1 = 10) = 0.25$, $P(Y_1 = 11) = 0.5$;
- $P(Y_2 = 4) = 0.5$, $P(Y_2 = 6) = 0.5$.
The calculated area $A = Y_1 Y_2$ is also a random variable, and is used to estimate the true area. Suppose that the true length and width are 10 and 5, respectively.
- Is Y1 an unbiased estimator of the true length?
- Is Y2 an unbiased estimator of the true width?
- Is A an unbiased estimator of the true area?
Solution to Exercise 9.2.
- Yes, $Y_1$ is an unbiased estimator, since
  $E[Y_1] = 8 \times 0.25 + 10 \times 0.25 + 11 \times 0.5 = 10$.
- Yes, $Y_2$ is an unbiased estimator, since
  $E[Y_2] = 4 \times 0.5 + 6 \times 0.5 = 5$.
- Yes, $A$ is an unbiased estimator, since by independence
  $E[A] = E[Y_1 Y_2] = E[Y_1]E[Y_2]$, and therefore $E[A] = 10 \times 5 = 50$.
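The three expectations can be confirmed by direct enumeration. This is a small sketch using the values and probabilities that appear in the expectation calculations above; the dictionary names and the `expectation` helper are illustrative. The expected area is computed from the joint distribution, where independence means the probabilities multiply, rather than by assuming $E[Y_1 Y_2] = E[Y_1]E[Y_2]$ in advance.

```python
from fractions import Fraction
from itertools import product

# Values and probabilities read off from the expectation calculations in the solution.
length_dist = {8: Fraction(1, 4), 10: Fraction(1, 4), 11: Fraction(1, 2)}
width_dist = {4: Fraction(1, 2), 6: Fraction(1, 2)}

def expectation(dist):
    """Expected value of a discrete distribution given as {value: probability}."""
    return sum(value * prob for value, prob in dist.items())

# Joint distribution of (Y1, Y2) under independence: probabilities multiply.
expected_area = sum(
    y1 * y2 * p1 * p2
    for (y1, p1), (y2, p2) in product(length_dist.items(), width_dist.items())
)

print(expectation(length_dist), expectation(width_dist), expected_area)
```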