6.4 Asymptotic tests

As evidenced in Section 5.4, asymptotic results are another primary source of inferential tools beyond normal populations. Assume that we want to test one of the following pairs of hypotheses:

$H_0:\theta=\theta_0$ vs. $H_1:\theta>\theta_0;$
$H_0:\theta=\theta_0$ vs. $H_1:\theta<\theta_0;$
$H_0:\theta=\theta_0$ vs. $H_1:\theta\neq \theta_0.$

If we know a test statistic that, under $H_0,$ has an asymptotic normal distribution, that is

$\begin{align} Z=\frac{\hat{\theta}-\theta_0}{\hat{\sigma}(\theta_0)}\stackrel{d}{\longrightarrow}\mathcal{N}(0,1), \tag{6.4} \end{align}$

then the asymptotic critical regions at significance level $\alpha$ for the three tests above are, respectively,

$\begin{align*} C_a=\{Z>z_{\alpha}\}, \quad C_b=\{Z<-z_{\alpha}\}, \quad C_c=\{|Z|>z_{\alpha/2}\}. \end{align*}$

Note that despite not knowing the exact distribution of $Z$ under $H_0,$ $Z$ is a test statistic because its distribution is known asymptotically.
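As a minimal sketch of how the three critical regions translate into a decision rule, the following Python function takes an observed value of $Z$ and reports whether it falls in $C_a,$ $C_b,$ or $C_c.$ The function name `asymptotic_test` and the labels `"greater"`, `"less"`, and `"two-sided"` are illustrative choices, not notation from the text; $z_\alpha$ is taken as the upper $\alpha$-quantile of the $\mathcal{N}(0,1).$

```python
from statistics import NormalDist


def asymptotic_test(z, alpha=0.05, alternative="two-sided"):
    """Return True if H0 is rejected at asymptotic significance level alpha.

    z is the observed value of a statistic that is asymptotically N(0, 1)
    under H0. The alternative labels are hypothetical names for this sketch.
    """
    std_normal = NormalDist()  # standard normal N(0, 1)
    if alternative == "greater":  # H1: theta > theta0, region C_a
        return z > std_normal.inv_cdf(1 - alpha)
    if alternative == "less":  # H1: theta < theta0, region C_b
        return z < -std_normal.inv_cdf(1 - alpha)
    if alternative == "two-sided":  # H1: theta != theta0, region C_c
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
```

For instance, `asymptotic_test(2.0, alpha=0.05, alternative="greater")` checks whether $2.0>z_{0.05}.$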

An especially relevant instance of (6.4) is given by likelihood theory (Theorem 4.1 under $H_0$):

$\begin{align*} Z=\frac{\hat{\theta}_{\mathrm{MLE}}-\theta_0}{1\big/\sqrt{n\mathcal{I}(\theta_0)}}=\sqrt{n\mathcal{I}(\theta_0)}\left(\hat{\theta}_{\mathrm{MLE}}-\theta_0\right)\stackrel{d}{\longrightarrow}\mathcal{N}(0,1). \end{align*}$

Due to Corollary 4.3, another asymptotic pivot is

$\begin{align*} Z=\sqrt{n\hat{\mathcal{I}}(\theta_0)}\left(\hat{\theta}_{\mathrm{MLE}}-\theta_0\right)\stackrel{d}{\longrightarrow}\mathcal{N}(0,1),\quad \hat{\mathcal{I}}(\theta_0)=\frac{1}{n}\sum_{i=1}^n \left(\left.\frac{\partial\log f(X_i;\theta)}{\partial\theta}\right\vert_{\theta=\theta_0}\right)^2, \end{align*}$

which is always straightforward to compute from the srs $(X_1,\ldots,X_n)$ of $X\sim f(\cdot;\theta)$ (and analogously if $X$ is discrete, replacing the pdf by the pmf).
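The statistic based on $\hat{\mathcal{I}}(\theta_0)$ can be sketched in a few lines of Python, using that $n\hat{\mathcal{I}}(\theta_0)$ is the sum of squared scores evaluated at $\theta_0.$ The function `z_empirical_information` and its `score` argument are hypothetical names. The Poisson model and the data below are made up purely for illustration and do not come from the text; for a $\mathrm{Pois}(\lambda),$ the score is $x/\lambda-1$ and the MLE is $\bar{X}.$

```python
import math


def z_empirical_information(sample, theta_hat, theta0, score):
    """Compute sqrt(n * I_hat(theta0)) * (theta_hat - theta0).

    score(x, theta) must return d/dtheta log f(x; theta).
    """
    # n * I_hat(theta0) equals the sum of squared scores at theta0.
    n_info = sum(score(x, theta0) ** 2 for x in sample)
    return math.sqrt(n_info) * (theta_hat - theta0)


def poisson_score(x, lam):
    """Score of a Poisson(lam): d/dlam (x * log(lam) - lam - log(x!))."""
    return x / lam - 1.0


# Made-up counts, used only to exercise the function.
sample = [2, 4, 3, 5, 1, 3, 4, 2]
lam_hat = sum(sample) / len(sample)  # Poisson MLE is the sample mean
z = z_empirical_information(sample, lam_hat, 2.0, poisson_score)
print(z)
```

Passing the score as a function keeps the sketch model-agnostic: only `poisson_score` and the MLE would change for another parametric family.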

Other test statistics can be obtained from non-normal asymptotic distributions (see, e.g., the forthcoming Theorem 6.2).

Let us see some examples of asymptotic tests.

Example 6.12 Let $(X_1,\ldots,X_n)$ be a srs of a rv $X$ with mean $\mu$ and variance $\sigma^2,$ both unknown. We want to test:

$H_0:\mu=\mu_0$ vs. $H_1:\mu>\mu_0;$
$H_0:\mu=\mu_0$ vs. $H_1:\mu<\mu_0;$
$H_0:\mu=\mu_0$ vs. $H_1:\mu\neq\mu_0.$

For that, employing the CLT (Theorem 2.5) we know that under $H_0:\mu=\mu_0,$

$\begin{align*} Z=\frac{\bar{X}-\mu_0}{S'/\sqrt{n}}\stackrel{d}{\longrightarrow} \mathcal{N}(0,1). \end{align*}$

Therefore, $Z$ is a test statistic and $H_0$ is rejected if the observed value of $Z$ belongs to the corresponding critical region ($C_a,$ $C_b,$ or $C_c$).
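The statistic of Example 6.12 can be computed as in the following Python sketch. It assumes that $S'$ denotes the sample quasi-standard deviation (denominator $n-1$), which is what `statistics.stdev` returns; the function name `mean_z_statistic` is hypothetical.

```python
import math
from statistics import mean, stdev


def mean_z_statistic(sample, mu0):
    """Compute Z = (X_bar - mu0) / (S' / sqrt(n)) for H0: mu = mu0.

    Assumes S' is the standard deviation with denominator n - 1.
    """
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
```

The returned value is then compared with $z_\alpha,$ $-z_\alpha,$ or $z_{\alpha/2}$ depending on the alternative. Since the normality of $Z$ is only asymptotic, the comparison is reliable for large $n.$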

Example 6.13 Let $(X_1,\ldots,X_n)$ be a srs of a rv $\Gamma(k,1/\theta)$ with $k$ known and $\theta$ unknown. We want to test:

$H_0:\theta=\theta_0$ vs. $H_1:\theta>\theta_0;$
$H_0:\theta=\theta_0$ vs. $H_1:\theta<\theta_0;$
$H_0:\theta=\theta_0$ vs. $H_1:\theta\neq\theta_0.$

We use that, by Example 4.12, $\mathcal{I}(\theta)=k/\theta^2.$ Then, we know that under $H_0:\theta=\theta_0,$

$\begin{align*} Z=\frac{\hat{\theta}_{\mathrm{MLE}}-\theta_0}{\theta_0/\sqrt{nk}}\stackrel{d}{\longrightarrow} \mathcal{N}(0,1). \end{align*}$

Therefore, $Z$ is a test statistic and $H_0$ is rejected if the observed value of $Z$ belongs to the corresponding critical region ($C_a,$ $C_b,$ or $C_c$).
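A short Python sketch of the statistic in Example 6.13 follows. It takes the value of $\hat{\theta}_{\mathrm{MLE}}$ as an input, so it makes no assumption on how the MLE was obtained, and it only uses $\mathcal{I}(\theta_0)=k/\theta_0^2.$ The function name `gamma_z_statistic` is hypothetical.

```python
import math


def gamma_z_statistic(theta_hat, theta0, n, k):
    """Compute Z = (theta_hat - theta0) / (theta0 / sqrt(n * k)).

    Equivalent to sqrt(n * I(theta0)) * (theta_hat - theta0)
    with Fisher information I(theta0) = k / theta0**2.
    """
    fisher_info = k / theta0 ** 2
    return math.sqrt(n * fisher_info) * (theta_hat - theta0)
```

For example, with $n=50,$ $k=2,$ $\theta_0=2,$ and an estimate $\hat{\theta}_{\mathrm{MLE}}=2.2$ (illustrative values only), the standard error is $\theta_0/\sqrt{nk}=0.2$ and the statistic equals $1.$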

Example 6.14 A certain machine has to be repaired if more than $10\%$ of the items that it produces per day are defective. A srs of $n=100$ items of the daily production contains $15$ that are defective and the foreman decides that the machine has to be repaired. Does the sample support his decision at the significance level $\alpha=0.01$?

Let $Y$ be the number of defective items that were found. Then $Y\sim\mathrm{Bin}(n,p).$ We want to test

$\begin{align*} H_0: p=0.10\quad \text{vs.}\quad H_1:p>0.10. \end{align*}$

Because of the CLT (Theorem 2.5), $Y$ is asymptotically normal, so under $H_0:p=p_0,$ with $p_0=0.10$ and $\hat{p}=Y/n$ the sample proportion of defective items, it follows that

$\begin{align*} Z=\frac{\hat{p}-p_0}{\sqrt{p_0(1- p_0)/n}}\stackrel{d}{\longrightarrow}\mathcal{N}(0,1). \end{align*}$

Therefore, $Z$ is a test statistic with observed value

$\begin{align*} Z=\frac{0.15-0.10}{\sqrt{0.1\times 0.9/100}}=5/3. \end{align*}$

The rejection region is

$\begin{align*} C=\{Z>z_{0.01}\approx2.33\}. \end{align*}$

Since $Z=5/3\approx1.67<2.33,$ $H_0$ is not rejected: the sample does not provide enough evidence supporting the foreman's decision, that is, that the actual percentage of defective items the machine is producing is above $10\%.$ At this significance level, the data do not justify repairing the machine, and the larger proportion of defective items in the sample can be attributed to chance.
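The arithmetic of Example 6.14 can be checked with a few lines of Python. This is only a numerical sketch of the computations above; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, y = 100, 15          # sample size and number of defective items
p0, alpha = 0.10, 0.01  # null proportion and significance level

p_hat = y / n                                      # sample proportion
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)    # observed statistic
z_crit = NormalDist().inv_cdf(1 - alpha)           # upper alpha-quantile
reject = z > z_crit                                # one-sided region {Z > z_alpha}

print(f"Z = {z:.4f}, z_alpha = {z_crit:.4f}, reject H0: {reject}")
```

The observed statistic equals $5/3$ and stays below the critical value, in agreement with the conclusion of the example.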