Chapter 13 Sufficiency and Unbiasedness (Lecture on 02/11/2020)
The main theorem in this part, the Rao–Blackwell theorem, relates sufficient statistics to unbiased estimators: conditioning any unbiased estimator on a sufficient statistic yields a uniformly better unbiased estimator.
Proof. Let \(W\) be any unbiased estimator of \(\tau(\theta)\), let \(T\) be a sufficient statistic for \(\theta\), and define \(\phi(T)=E(W|T)\). Then \(E_{\theta}\phi(T)=E_{\theta}[E(W|T)]=E_{\theta}W=\tau(\theta)\), so \(\phi(T)\) is unbiased for \(\tau(\theta)\). Also, \[\begin{equation} \begin{split} Var_{\theta}W&=Var_{\theta}[E(W|T)]+E_{\theta}[Var(W|T)]\\ &=Var_{\theta}\phi(T)+E_{\theta}[Var(W|T)]\\ &\geq Var_{\theta}\phi(T) \end{split} \tag{13.2} \end{equation}\] Hence, \(\phi(T)\) is uniformly better than \(W\), and it only remains to show that \(\phi(T)\) is indeed an estimator, that is, that \(\phi(T)\) is a function of the sample alone and, in particular, does not depend on \(\theta\). By the definition of sufficiency, and because \(W\) is a function only of the sample, the distribution of \(W|T\) does not depend on \(\theta\), and hence neither does \(\phi(T)=E(W|T)\). Therefore, \(\phi(T)\) is a uniformly better unbiased estimator of \(\tau(\theta)\).
Conditioning any unbiased estimator on a sufficient statistic results in a uniform improvement, so in searching for best unbiased estimators we need only consider statistics that are functions of a sufficient statistic.
- Conditioning an unbiased estimator on any statistic yields the same variance reduction, but unless the conditioning statistic is sufficient, the resulting quantity may depend on \(\theta\) and therefore fail to be an estimator, as the simulation sketch below illustrates.
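To see the improvement concretely, here is a minimal Monte Carlo sketch (not from the lecture; the Bernoulli model, sample size, and seed are arbitrary illustrative choices). With \(X_1,\dots,X_n\) i.i.d. \(Bernoulli(\theta)\), \(W=X_1\) is unbiased for \(\theta\), \(T=\sum_{i=1}^nX_i\) is sufficient, and \(\phi(T)=E(X_1|T)=T/n=\bar{X}\); conditioning preserves the mean and shrinks the variance by a factor of \(n\).

```python
import numpy as np

# Minimal Monte Carlo sketch of Rao-Blackwellization (illustrative values only):
# X_1,...,X_n iid Bernoulli(theta), W = X_1 is unbiased for theta,
# T = sum(X_i) is sufficient, and phi(T) = E(X_1 | T) = T/n.
rng = np.random.default_rng(0)
theta, n, reps = 0.3, 20, 100_000

X = rng.binomial(1, theta, size=(reps, n))
W = X[:, 0]              # crude unbiased estimator: first observation only
phi_T = X.mean(axis=1)   # E(X_1 | sum X_i) = (sum X_i) / n

print("mean of W      :", W.mean())        # both means approx theta = 0.3
print("mean of phi(T) :", phi_T.mean())
print("var of W       :", W.var())         # approx theta(1-theta) = 0.21
print("var of phi(T)  :", phi_T.var())     # approx theta(1-theta)/n = 0.0105
```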
Theorem 13.2 If \(W\) is a best unbiased estimator of \(\tau(\theta)\), then \(W\) is unique.
Proof. Suppose \(W^{\prime}\) is another best unbiased estimator, and consider the estimator \(W^{*}=\frac{1}{2}(W+W^{\prime})\). Note that \(E_{\theta}W^*=\tau(\theta)\) and, using the Cauchy–Schwarz inequality and the fact that \(Var_{\theta}W^{\prime}=Var_{\theta}W\), \[\begin{equation} \begin{split} Var_{\theta}W^*&=Var_{\theta}(\frac{1}{2}W+\frac{1}{2}W^{\prime})\\ &=\frac{1}{4}Var_{\theta}W+\frac{1}{4}Var_{\theta}W^{\prime}+\frac{1}{2}Cov_{\theta}(W,W^{\prime})\\ &\leq \frac{1}{4}Var_{\theta}W+\frac{1}{4}Var_{\theta}W^{\prime}+\frac{1}{2}[(Var_{\theta}W)(Var_{\theta}W^{\prime})]^{1/2}\\ &=Var_{\theta}W \end{split} \tag{13.3} \end{equation}\]
If the above inequality were strict for some \(\theta\), the best unbiasedness of \(W\) would be contradicted, so we must have equality for all \(\theta\). Equality in the Cauchy–Schwarz inequality holds only if \(W^{\prime}\) is a linear function of \(W\), say \(W^{\prime}=a(\theta)W+b(\theta)\) with probability 1. Now, using properties of covariance, we have \[\begin{equation} \begin{split} Cov_{\theta}(W,W^{\prime})&=Cov_{\theta}[W,a(\theta)W+b(\theta)]\\ &=Cov_{\theta}[W,a(\theta)W]\\ &=a(\theta)Var_{\theta}W \end{split} \tag{13.4} \end{equation}\]
But equality in (13.3) requires \(Cov_{\theta}(W,W^{\prime})=Var_{\theta}W\), so comparing with (13.4) gives \(a(\theta)=1\); and since \(E_{\theta}W^{\prime}=\tau(\theta)=E_{\theta}W\), we must have \(b(\theta)=0\). Thus \(W^{\prime}=W\), and uniqueness follows.
Proof. The theorem being proved states that an unbiased estimator \(W\) of \(\tau(\theta)\) is the best unbiased estimator if and only if \(W\) is uncorrelated with every unbiased estimator of zero, that is, \(Cov_{\theta}(W,U)=0\) for all \(\theta\) and every \(U\) satisfying \(E_{\theta}U=0\). For necessity, suppose \(W\) is best unbiased and let \(U\) be any unbiased estimator of zero. For a constant \(a\), the estimator \[\begin{equation} \phi_a=W+aU \tag{13.5} \end{equation}\] is also an unbiased estimator of \(\tau(\theta)\). The variance of \(\phi_a\) is \[\begin{equation} Var_{\theta}\phi_a=Var_{\theta}(W+aU)=Var_{\theta}W+2aCov_{\theta}(W,U)+a^2Var_{\theta}U \tag{13.6} \end{equation}\]
Now if \(Cov_{\theta_0}(W,U)<0\) for some \(\theta=\theta_0\), then we can make \(2aCov_{\theta_0}(W,U)+a^2Var_{\theta_0}U<0\) by choosing \(a\in(0,-2Cov_{\theta_0}(W,U)/Var_{\theta_0}U)\), which makes \(\phi_a\) better than \(W\) at \(\theta=\theta_0\), so \(W\) cannot be best unbiased. Similarly, if \(Cov_{\theta_0^*}(W,U)>0\) for some \(\theta_0^*\), we can choose \(a\in(-2Cov_{\theta_0^*}(W,U)/Var_{\theta_0^*}U,0)\) to make \(\phi_a\) better than \(W\) at \(\theta=\theta_0^*\), and again \(W\) is not best unbiased. Therefore, if \(W\) is indeed best unbiased, we must have \(Cov_{\theta}(W,U)=0\) for all \(\theta\) and every \(U\) satisfying \(E_{\theta}U=0\), and necessity is established.
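As a quick numerical sanity check of the interval for \(a\) (the covariance and variance values below are arbitrary, hypothetical numbers, not tied to any model):

```python
# If Cov(W, U) = c < 0 and Var(U) = v, then Var(phi_a) - Var(W) = 2*a*c + a**2 * v,
# which is negative exactly for a in (0, -2c/v).
c, v = -0.4, 1.5                      # assumed Cov_theta0(W, U) and Var_theta0(U)

def excess_variance(a):               # Var(phi_a) - Var(W) as a function of a
    return 2 * a * c + a ** 2 * v

upper = -2 * c / v                    # right endpoint of the improving interval
for a in (0.1, 0.25, upper / 2, 0.9 * upper, 1.1 * upper):
    print(f"a = {a:.3f}, Var(phi_a) - Var(W) = {excess_variance(a):+.4f}")
# every a strictly inside (0, 0.533...) gives a negative excess, i.e. phi_a
# beats W at theta_0; an a just outside the interval does not.
```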
Now suppose we have an unbiased estimator \(W\) of \(\tau(\theta)\) that is uncorrelated with all unbiased estimators of zero, and let \(W^{\prime}\) be any other unbiased estimator of \(\tau(\theta)\); we proceed to show that \(W\) is better than \(W^{\prime}\). Write \[\begin{equation} W^{\prime}=W+(W^{\prime}-W) \tag{13.7} \end{equation}\] Since \(W^{\prime}-W\) is an unbiased estimator of zero, it is uncorrelated with \(W\) by assumption, so the variance of \(W^{\prime}\) is \[\begin{equation} \begin{split} Var_{\theta}W^{\prime}&=Var_{\theta}W+Var_{\theta}(W^{\prime}-W)+2Cov_{\theta}(W,W^{\prime}-W)\\ &=Var_{\theta}W+Var_{\theta}(W^{\prime}-W)\geq Var_{\theta}W \end{split} \tag{13.8} \end{equation}\] Since \(W^{\prime}\) is arbitrary, it follows that \(W\) is the best unbiased estimator of \(\tau(\theta)\).

Example 13.1 (Unbiased Estimators of Zero) Let \(X\) be an observation from \(Unif(\theta,\theta+1)\). Then \[\begin{equation} E_{\theta}X=\int_{\theta}^{\theta+1}xdx=\theta+\frac{1}{2} \tag{13.9} \end{equation}\] so \(X-\frac{1}{2}\) is an unbiased estimator of \(\theta\), with \(Var_{\theta}X=\frac{1}{12}\).
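A minimal simulation sketch of this example (the value of \(\theta\) and the number of replications are arbitrary choices, not from the lecture) confirms these two moments and, anticipating the computation in (13.12) below, the covariance between \(X-\frac{1}{2}\) and the unbiased estimator of zero \(\sin(2\pi X)\):

```python
import numpy as np

# Simulation check for Example 13.1: X ~ Unif(theta, theta + 1).
rng = np.random.default_rng(1)
theta, reps = 0.1, 1_000_000

X = rng.uniform(theta, theta + 1, size=reps)
print("mean of X - 1/2   :", (X - 0.5).mean())               # approx theta
print("var of X          :", X.var())                        # approx 1/12
print("mean of sin(2piX) :", np.sin(2 * np.pi * X).mean())   # approx 0
print("cov(X - 1/2, sin(2piX)) :",
      np.cov(X - 0.5, np.sin(2 * np.pi * X))[0, 1])
print("theory: -cos(2pi theta)/(2pi) =",
      -np.cos(2 * np.pi * theta) / (2 * np.pi))
```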
Now we look for unbiased estimators of zero. If a function \(h(x)\) satisfies \[\begin{equation} \int_{\theta}^{\theta+1}h(x)dx=0,\quad \forall\theta \tag{13.10} \end{equation}\] then \[\begin{equation} 0=\frac{d}{d\theta}\int_{\theta}^{\theta+1}h(x)dx=h(\theta+1)-h(\theta),\quad\forall \theta \tag{13.11} \end{equation}\] so any such \(h(x)\) is periodic with period 1. One such function is \(h(x)=\sin(2\pi x)\). Now, since \(E_{\theta}\sin(2\pi X)=0\), integration by parts gives \[\begin{equation} \begin{split} Cov_{\theta}(X-\frac{1}{2},\sin(2\pi X))&=Cov_{\theta}(X,\sin(2\pi X))\\ &=\int_{\theta}^{\theta+1}x\sin(2\pi x)dx\\ &=-\frac{x\cos(2\pi x)}{2\pi}\Big|_{\theta}^{\theta+1}+\int_{\theta}^{\theta+1}\frac{\cos(2\pi x)}{2\pi}dx\\ &=-\frac{\cos(2\pi\theta)}{2\pi} \end{split} \tag{13.12} \end{equation}\] Hence \(X-\frac{1}{2}\) is correlated with an unbiased estimator of zero and cannot be a best unbiased estimator of \(\theta\).

Example 13.3 (Binomial Best Unbiased Estimation) Let \(X_1,\cdots,X_n\) be i.i.d. \(Bin(k,\theta)\) with \(k\) known. The problem is to estimate the probability of exactly one success from a \(Bin(k,\theta)\), that is, to estimate \[\begin{equation} \tau(\theta)=P_{\theta}(X=1)=k\theta(1-\theta)^{k-1} \tag{13.13} \end{equation}\] Now \(\sum_{i=1}^nX_i\sim Bin(kn,\theta)\) is a complete sufficient statistic, but no unbiased estimator based on it is immediately evident. In this situation, try the simplest solution first. The simple-minded estimator \[\begin{equation} h(X_1)=\left\{\begin{aligned} 1 & \quad X_1=1\\ 0 & \quad o.w. \end{aligned} \right. \tag{13.14} \end{equation}\] satisfies \[\begin{equation} E_{\theta}h(X_1)=\sum_{x_1=0}^kh(x_1){k \choose x_1}\theta^{x_1}(1-\theta)^{k-x_1}=k\theta(1-\theta)^{k-1} \tag{13.15} \end{equation}\] and hence is an unbiased estimator of \(k\theta(1-\theta)^{k-1}\). Conditioning this unbiased estimator on the complete sufficient statistic \(\sum_{i=1}^nX_i\), we obtain \[\begin{equation} \phi(\sum_{i=1}^nX_i)=E(h(X_1)|\sum_{i=1}^nX_i) \tag{13.16} \end{equation}\] which is the best unbiased estimator of \(k\theta(1-\theta)^{k-1}\). It remains to evaluate \(\phi\). Suppose we observe \(\sum_{i=1}^nX_i=t\); then, using the independence of \(X_1\) and \(\sum_{i=2}^nX_i\), \[\begin{equation} \begin{split} \phi(t)&=E(h(X_1)|\sum_{i=1}^nX_i=t)\\ &=P_{\theta}(X_1=1|\sum_{i=1}^nX_i=t)\\ &=\frac{P_{\theta}(X_1=1,\sum_{i=1}^nX_i=t)}{P_{\theta}(\sum_{i=1}^nX_i=t)}\\ &=\frac{P_{\theta}(X_1=1,\sum_{i=2}^nX_i=t-1)}{P_{\theta}(\sum_{i=1}^nX_i=t)}\\ &=\frac{P_{\theta}(X_1=1)P_{\theta}(\sum_{i=2}^nX_i=t-1)}{P_{\theta}(\sum_{i=1}^nX_i=t)} \end{split} \tag{13.17} \end{equation}\]
Now since \(X_1\sim Bin(k,\theta)\), \(\sum_{i=2}^nX_i\sim Bin(k(n-1),\theta)\) and \(\sum_{i=1}^nX_i\sim Bin(kn,\theta)\), we have
\[\begin{equation} \phi(t)=k\frac{{{k(n-1)} \choose {t-1}}}{{{kn} \choose t}} \tag{13.18} \end{equation}\]
Note that all of the \(\theta\)s cancel as they must since \(\sum_{i=1}^nX_i\) is sufficient. Hence, the best unbiased estimator of \(k\theta(1-\theta)^{k-1}\) is \[\begin{equation} \phi(\sum_{i=1}^nX_i)=k\frac{{{k(n-1)} \choose {\sum_{i=1}^nX_i-1}}}{{{kn} \choose {\sum_{i=1}^nX_i}}} \tag{13.19} \end{equation}\]
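As a final check, here is a small Monte Carlo sketch (the values of \(k\), \(n\), \(\theta\), and the seed are arbitrary illustrative choices) verifying that \(\phi(\sum_{i=1}^nX_i)\) is unbiased for \(k\theta(1-\theta)^{k-1}\) and has much smaller variance than the simple-minded estimator \(h(X_1)\):

```python
import numpy as np
from math import comb

# Monte Carlo check of the final formula: phi(sum X_i) should be unbiased for
# tau(theta) = k * theta * (1 - theta)**(k - 1) and should have smaller
# variance than the crude estimator h(X_1) = 1{X_1 = 1}.
rng = np.random.default_rng(2)
k, n, theta, reps = 4, 10, 0.3, 200_000
tau = k * theta * (1 - theta) ** (k - 1)

def phi(t):
    # phi(t) = k * C(k(n-1), t-1) / C(kn, t); it equals zero when t = 0.
    return 0.0 if t == 0 else k * comb(k * (n - 1), t - 1) / comb(k * n, t)

X = rng.binomial(k, theta, size=(reps, n))
T = X.sum(axis=1)
best = np.array([phi(t) for t in T])      # best unbiased estimator
crude = (X[:, 0] == 1).astype(float)      # h(X_1) = 1{X_1 = 1}

print("tau(theta)     :", tau)
print("mean of phi(T) :", best.mean())    # approx tau
print("mean of h(X_1) :", crude.mean())   # approx tau as well
print("var  of phi(T) :", best.var())     # noticeably smaller ...
print("var  of h(X_1) :", crude.var())    # ... than this
```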