Chapter 13 Sufficiency and Unbiasedness (Lecture on 02/11/2020)
The main theorem in this part relates sufficient statistics to unbiased estimators.

Theorem 13.1 (Rao-Blackwell) Let $W$ be any unbiased estimator of $\tau(\theta)$, and let $T$ be a sufficient statistic for $\theta$. Define $\phi(T)=E(W|T)$. Then $E_{\theta}\phi(T)=\tau(\theta)$ and $\mathrm{Var}_{\theta}\phi(T)\leq \mathrm{Var}_{\theta}W$ for all $\theta$; that is, $\phi(T)$ is a uniformly better unbiased estimator of $\tau(\theta)$.

Proof. By the iterated expectation,
\begin{equation}
\tau(\theta)=E_{\theta}W=E_{\theta}[E(W|T)]=E_{\theta}\phi(T),
\tag{13.1}
\end{equation}
so $\phi(T)$ is unbiased for $\tau(\theta)$. Also,
\begin{equation}
\mathrm{Var}_{\theta}W=\mathrm{Var}_{\theta}[E(W|T)]+E_{\theta}[\mathrm{Var}(W|T)]=\mathrm{Var}_{\theta}\phi(T)+E_{\theta}[\mathrm{Var}(W|T)]\geq \mathrm{Var}_{\theta}\phi(T).
\tag{13.2}
\end{equation}
Hence $\phi(T)$ is uniformly better than $W$, and it only remains to show that $\phi(T)$ is indeed an estimator, that is, that $\phi(T)$ is a function of the sample alone and, in particular, does not depend on $\theta$. By the definition of sufficiency, and because $W$ is a function only of the sample, the distribution of $W|T$ does not depend on $\theta$, so $\phi(T)$ is a function of $T$ alone. Therefore $\phi(T)$ is a uniformly better unbiased estimator of $\tau(\theta)$.
Conditioning any unbiased estimator on a sufficient statistic results in a uniform improvement, so in searching for best unbiased estimators we need only consider statistics that are functions of a sufficient statistic.
- Conditioning an unbiased estimator on any statistic yields the same variance reduction, but the resulting quantity will in general depend on $\theta$ and therefore fail to be an estimator; conditioning on a sufficient statistic is what guarantees the result is free of $\theta$ (see the sketch below).
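As a quick sanity check of the Rao-Blackwell mechanism, the following Python sketch (a minimal simulation, not part of the lecture; the Poisson model, $n$, $\theta$, and the replication count are all arbitrary illustrative choices) compares the naive unbiased estimator $W=X_1$ of $\theta$ with its Rao-Blackwellization $\phi(T)=E(X_1|\sum_iX_i)=\bar{X}$ for i.i.d. Poisson($\theta$) data:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, reps = 3.0, 10, 200_000  # hypothetical illustrative settings

# reps samples of size n from Poisson(theta); theta is the estimation target.
X = rng.poisson(theta, size=(reps, n))

W = X[:, 0]            # naive unbiased estimator: W = X_1
phi = X.mean(axis=1)   # Rao-Blackwellization: E(X_1 | sum X_i) = X-bar by symmetry

print("mean of W  :", W.mean())    # ~ theta: W is unbiased
print("mean of phi:", phi.mean())  # ~ theta: phi is also unbiased
print("Var of W   :", W.var())     # ~ theta = 3.0
print("Var of phi :", phi.var())   # ~ theta/n = 0.3: uniform improvement
```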
Theorem 13.2 If $W$ is a best unbiased estimator of $\tau(\theta)$, then $W$ is unique.
Proof. Suppose $W'$ is another best unbiased estimator, and consider the estimator $W^*=\frac{1}{2}(W+W')$. Note that $E_{\theta}W^*=\tau(\theta)$ and
\begin{equation}
\begin{split}
\mathrm{Var}_{\theta}W^*&=\mathrm{Var}_{\theta}\left(\frac{1}{2}W+\frac{1}{2}W'\right)\\
&=\frac{1}{4}\mathrm{Var}_{\theta}W+\frac{1}{4}\mathrm{Var}_{\theta}W'+\frac{1}{2}\mathrm{Cov}_{\theta}(W,W')\\
&\leq\frac{1}{4}\mathrm{Var}_{\theta}W+\frac{1}{4}\mathrm{Var}_{\theta}W'+\frac{1}{2}\left[(\mathrm{Var}_{\theta}W)(\mathrm{Var}_{\theta}W')\right]^{1/2}\\
&=\mathrm{Var}_{\theta}W,
\end{split}
\tag{13.3}
\end{equation}
where the inequality is the Cauchy-Schwarz (covariance) inequality, and the last equality uses $\mathrm{Var}_{\theta}W'=\mathrm{Var}_{\theta}W$ since both estimators are best unbiased.
If the above inequality is strict, then the best unbiasedness of $W$ is contradicted, so we must have equality for all $\theta$. Equality in the covariance inequality holds only if $W'$ is a linear function of $W$, that is, $W'=a(\theta)W+b(\theta)$. Now using properties of covariance, we have
\begin{equation}
\mathrm{Cov}_{\theta}(W,W')=\mathrm{Cov}_{\theta}[W,a(\theta)W+b(\theta)]=\mathrm{Cov}_{\theta}[W,a(\theta)W]=a(\theta)\mathrm{Var}_{\theta}W.
\tag{13.4}
\end{equation}
But since (13.3) is now an equality, $\mathrm{Cov}_{\theta}(W,W')=\mathrm{Var}_{\theta}W$, so $a(\theta)=1$, and since $E_{\theta}W'=\tau(\theta)$, we must have $b(\theta)=0$. Thus $W=W'$ and we get uniqueness.
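The engine of this proof is that averaging two genuinely distinct unbiased estimators with equal variance strictly reduces variance. The Python sketch below (an illustrative simulation; the normal model, $\theta$, and replication count are arbitrary choices) shows the strict improvement, and that equality occurs only in the degenerate case $W'=W$:

```python
import numpy as np

rng = np.random.default_rng(1)
theta, reps = 5.0, 200_000  # hypothetical illustrative settings

# Two independent unbiased estimators of theta with the same variance.
X = rng.normal(theta, 1.0, size=(reps, 2))
W, Wp = X[:, 0], X[:, 1]
Wstar = 0.5 * (W + Wp)  # the averaged estimator W* from the proof

print("Var W :", W.var())      # ~ 1
print("Var W*:", Wstar.var())  # ~ 0.5: strictly better than W
# Equality holds only in the degenerate case W' = W:
print("Var of 0.5*(W + W):", (0.5 * (W + W)).var())  # ~ 1
```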
Theorem 13.3 If $E_{\theta}W=\tau(\theta)$, then $W$ is the best unbiased estimator of $\tau(\theta)$ if and only if $W$ is uncorrelated with all unbiased estimators of $0$.

Proof. If $W$ is best unbiased, the theorem states that $W$ must satisfy $\mathrm{Cov}_{\theta}(W,U)=0$ for all $\theta$ and any $U$ satisfying $E_{\theta}U=0$. To verify this, consider $W$ and $U$ as unbiased estimators of $\tau(\theta)$ and $0$, respectively; then the estimator $\phi_a=W+aU$, where $a$ is a constant, is also an unbiased estimator of $\tau(\theta)$. The variance of $\phi_a$ is
\begin{equation}
\mathrm{Var}_{\theta}\phi_a=\mathrm{Var}_{\theta}(W+aU)=\mathrm{Var}_{\theta}W+2a\mathrm{Cov}_{\theta}(W,U)+a^2\mathrm{Var}_{\theta}U.
\tag{13.5}
\end{equation}
Now if for some $\theta=\theta_0$ we have $\mathrm{Cov}_{\theta_0}(W,U)<0$, then we can make $2a\mathrm{Cov}_{\theta_0}(W,U)+a^2\mathrm{Var}_{\theta_0}U<0$ by choosing $a\in(0,-2\mathrm{Cov}_{\theta_0}(W,U)/\mathrm{Var}_{\theta_0}U)$, making $\phi_a$ better than $W$ at $\theta=\theta_0$, so $W$ cannot be best unbiased. Similarly, if $\mathrm{Cov}_{\theta}(W,U)>0$ for some $\theta=\theta_0^*$, we can choose $a\in(-2\mathrm{Cov}_{\theta_0^*}(W,U)/\mathrm{Var}_{\theta_0^*}U,0)$ to make $\phi_a$ better than $W$ at $\theta=\theta_0^*$, and again $W$ is not best unbiased. Therefore, if $W$ is best unbiased, we must have $\mathrm{Cov}_{\theta}(W,U)=0$ for all $\theta$ and any $U$ satisfying $E_{\theta}U=0$; the necessity is established.
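The intervals for $a$ come from the quadratic (13.5); minimizing it over $a$ (a small worked step, not in the original notes) makes the improvement explicit whenever the covariance is nonzero:
\begin{equation*}
a^*=-\frac{\mathrm{Cov}_{\theta_0}(W,U)}{\mathrm{Var}_{\theta_0}U},
\qquad
\mathrm{Var}_{\theta_0}\phi_{a^*}=\mathrm{Var}_{\theta_0}W-\frac{[\mathrm{Cov}_{\theta_0}(W,U)]^2}{\mathrm{Var}_{\theta_0}U}<\mathrm{Var}_{\theta_0}W.
\end{equation*}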
Now suppose we have an unbiased estimator $W$ that is uncorrelated with all unbiased estimators of $0$. Let $W'$ be any other unbiased estimator of $\tau(\theta)$; we proceed to show that $W$ is better than $W'$. Writing $W'=W+(W'-W)$, and noting that $W'-W$ is an unbiased estimator of $0$ and hence uncorrelated with $W$ by assumption, we have
\begin{equation}
\mathrm{Var}_{\theta}W'=\mathrm{Var}_{\theta}W+\mathrm{Var}_{\theta}(W'-W)+2\mathrm{Cov}_{\theta}(W,W'-W)=\mathrm{Var}_{\theta}W+\mathrm{Var}_{\theta}(W'-W)\geq\mathrm{Var}_{\theta}W.
\tag{13.6}
\end{equation}
Since $W'$ is arbitrary, it follows that $W$ is the best unbiased estimator of $\tau(\theta)$.

Example 13.1 (Unbiased Estimators of Zero) Let $X$ be an observation from $\mathrm{Unif}(\theta,\theta+1)$. Then
\begin{equation}
E_{\theta}X=\int_{\theta}^{\theta+1}x\,dx=\theta+\frac{1}{2},
\tag{13.7}
\end{equation}
so $X-\frac{1}{2}$ is an unbiased estimator of $\theta$, and $\mathrm{Var}_{\theta}X=\frac{1}{12}$.

Now we proceed to find unbiased estimators of zero. If a function $h(x)$ satisfies
\begin{equation}
\int_{\theta}^{\theta+1}h(x)\,dx=0\quad\forall\theta,
\tag{13.8}
\end{equation}
then
\begin{equation}
0=\frac{d}{d\theta}\int_{\theta}^{\theta+1}h(x)\,dx=h(\theta+1)-h(\theta)\quad\forall\theta,
\tag{13.9}
\end{equation}
so $h(x)$ must be periodic with period $1$. One such function is $h(x)=\sin(2\pi x)$. Now, since $E_{\theta}\sin(2\pi X)=0$,
\begin{equation}
\begin{split}
\mathrm{Cov}_{\theta}\left(X-\frac{1}{2},\sin(2\pi X)\right)&=\mathrm{Cov}_{\theta}(X,\sin(2\pi X))=\int_{\theta}^{\theta+1}x\sin(2\pi x)\,dx\\
&=\left.\frac{-x\cos(2\pi x)}{2\pi}\right|_{\theta}^{\theta+1}+\int_{\theta}^{\theta+1}\frac{\cos(2\pi x)}{2\pi}\,dx=\frac{-\cos(2\pi\theta)}{2\pi}.
\end{split}
\tag{13.10}
\end{equation}
Hence $X-\frac{1}{2}$ is correlated with an unbiased estimator of zero, and so it cannot be a best unbiased estimator of $\theta$.
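As a numerical check of (13.10) (a minimal sketch; the particular $\theta$ and replication count are arbitrary illustrative choices), the Python snippet below estimates $\mathrm{Cov}_{\theta}(X-\frac{1}{2},\sin(2\pi X))$ by Monte Carlo, compares it with the closed form $-\cos(2\pi\theta)/(2\pi)$, and shows that, as in the proof of Theorem 13.3, the nonzero covariance lets $W+aU$ beat $W$ at this $\theta$:

```python
import numpy as np

rng = np.random.default_rng(2)
theta, reps = 0.3, 1_000_000  # hypothetical illustrative settings

X = rng.uniform(theta, theta + 1.0, size=reps)
W = X - 0.5                # unbiased estimator of theta
U = np.sin(2 * np.pi * X)  # unbiased estimator of zero

cov_WU = np.cov(W, U)[0, 1]
print("empirical Cov(W, U):", cov_WU)
print("closed form        :", -np.cos(2 * np.pi * theta) / (2 * np.pi))

# Because the covariance is nonzero, W + a*U improves on W at this theta:
a = -cov_WU / U.var()
print("Var(W)     :", W.var())            # ~ 1/12
print("Var(W + aU):", (W + a * U).var())  # strictly smaller
```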
Example 13.3 (Binomial Best Unbiased Estimation) Let $X_1,\cdots,X_n$ be i.i.d. $Bin(k,\theta)$ with $k$ known. The problem is to estimate the probability of exactly one success from a $Bin(k,\theta)$, that is, to estimate
\begin{equation}
\tau(\theta)=P_{\theta}(X=1)=k\theta(1-\theta)^{k-1}.
\tag{13.13}
\end{equation}
Now $\sum_{i=1}^nX_i\sim Bin(kn,\theta)$ is a complete sufficient statistic, but no unbiased estimator based on it is immediately evident. When in this situation, try for the simplest solution. The simple-minded estimator
\begin{equation}
h(X_1)=\begin{cases}1 & X_1=1\\0 & \text{otherwise}\end{cases}
\tag{13.14}
\end{equation}
satisfies
\begin{equation}
E_{\theta}h(X_1)=\sum_{x_1=0}^kh(x_1){k \choose x_1}\theta^{x_1}(1-\theta)^{k-x_1}=k\theta(1-\theta)^{k-1}
\tag{13.15}
\end{equation}
and hence is an unbiased estimator of $k\theta(1-\theta)^{k-1}$. Since $\sum_{i=1}^nX_i$ is complete and sufficient,
\begin{equation}
\phi\left(\sum_{i=1}^nX_i\right)=E\left(h(X_1)\Big|\sum_{i=1}^nX_i\right)
\tag{13.16}
\end{equation}
is the best unbiased estimator of $k\theta(1-\theta)^{k-1}$. We are left with evaluating $\phi$. Suppose we observe $\sum_{i=1}^nX_i=t$; then
\begin{equation}
\begin{split}
\phi(t)&=E\left(h(X_1)\Big|\sum_{i=1}^nX_i=t\right)\\
&=P_{\theta}\left(X_1=1\Big|\sum_{i=1}^nX_i=t\right)\\
&=\frac{P_{\theta}(X_1=1,\sum_{i=1}^nX_i=t)}{P_{\theta}(\sum_{i=1}^nX_i=t)}\\
&=\frac{P_{\theta}(X_1=1,\sum_{i=2}^nX_i=t-1)}{P_{\theta}(\sum_{i=1}^nX_i=t)}\\
&=\frac{P_{\theta}(X_1=1)P_{\theta}(\sum_{i=2}^nX_i=t-1)}{P_{\theta}(\sum_{i=1}^nX_i=t)},
\end{split}
\tag{13.17}
\end{equation}
where the last equality uses the independence of $X_1$ and $\sum_{i=2}^nX_i$.
Now since $X_1\sim Bin(k,\theta)$, $\sum_{i=2}^nX_i\sim Bin(k(n-1),\theta)$, and $\sum_{i=1}^nX_i\sim Bin(kn,\theta)$, we have
\begin{equation} \phi(t)=k\frac{{{k(n-1)} \choose {t-1}}}{{{kn} \choose t}} \tag{13.18} \end{equation}
Note that all of the $\theta$s cancel, as they must, since $\sum_{i=1}^nX_i$ is sufficient. Hence, the best unbiased estimator of $k\theta(1-\theta)^{k-1}$ is \begin{equation} \phi\left(\sum_{i=1}^nX_i\right)=k\frac{{{k(n-1)} \choose {\sum_{i=1}^nX_i-1}}}{{{kn} \choose {\sum_{i=1}^nX_i}}}. \tag{13.19} \end{equation}
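To close the example, (13.19) can be verified numerically. The short Python check below (the settings $k$, $n$, $\theta$ are arbitrary illustrative choices) computes $E_{\theta}[\phi(\sum_iX_i)]$ exactly from the $Bin(kn,\theta)$ pmf and confirms it matches $k\theta(1-\theta)^{k-1}$ up to floating-point error:

```python
from math import comb

k, n, theta = 4, 6, 0.3  # hypothetical illustrative settings

def phi(t):
    """The estimator (13.19) evaluated at t = sum of the X_i."""
    if t == 0:
        return 0.0  # no successes at all, so P(X_1 = 1 | t = 0) = 0
    return k * comb(k * (n - 1), t - 1) / comb(k * n, t)

# sum X_i ~ Bin(kn, theta): compute E_theta[phi] exactly from the pmf.
N = k * n
expectation = sum(
    phi(t) * comb(N, t) * theta**t * (1 - theta) ** (N - t)
    for t in range(N + 1)
)

print("E_theta[phi(sum X_i)]    :", expectation)
print("k*theta*(1-theta)^(k-1)  :", k * theta * (1 - theta) ** (k - 1))
```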