6.4 Asymptotic tests

As evidenced in Section 5.4, asymptotic results are another primary source of inferential tools beyond normal populations. Assume that we want to test the hypotheses

  1. \(H_0:\theta=\theta_0\) vs. \(H_1:\theta>\theta_0;\)
  2. \(H_0:\theta=\theta_0\) vs. \(H_1:\theta<\theta_0;\)
  3. \(H_0:\theta=\theta_0\) vs. \(H_1:\theta\neq \theta_0.\)

If we know a test statistic that, under \(H_0,\) has an asymptotic normal distribution, that is

\[\begin{align} Z=\frac{\hat{\theta}-\theta_0}{\hat{\sigma}(\theta_0)}\stackrel{d}{\longrightarrow}\mathcal{N}(0,1), \tag{6.4} \end{align}\]

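then, denoting by \(z_{\alpha}\) the upper \(\alpha\)-quantile of a \(\mathcal{N}(0,1),\) the asymptotic critical regions for the three tests above are, respectively,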

\[\begin{align*} C_a=\{Z>z_{\alpha}\}, \quad C_b=\{Z<-z_{\alpha}\}, \quad C_c=\{|Z|>z_{\alpha/2}\}. \end{align*}\]
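The three critical regions can be sketched in code as a single decision rule. This is a minimal illustration, not part of the original notes: the function name `rejects_h0` and the labels `"greater"`, `"less"`, and `"two-sided"` are hypothetical, and \(z_{\alpha}\) is taken as the upper \(\alpha\)-quantile of a \(\mathcal{N}(0,1),\) that is, the \((1-\alpha)\)-quantile.

```python
from statistics import NormalDist


def rejects_h0(z, alpha, alternative):
    """Decide whether the observed statistic z falls in the asymptotic critical region.

    `alternative` is a hypothetical label: "greater" (C_a), "less" (C_b),
    or "two-sided" (C_c).
    """
    std_normal = NormalDist()  # N(0, 1)
    if alternative == "greater":
        # C_a = {Z > z_alpha}
        return z > std_normal.inv_cdf(1 - alpha)
    if alternative == "less":
        # C_b = {Z < -z_alpha}
        return z < -std_normal.inv_cdf(1 - alpha)
    if alternative == "two-sided":
        # C_c = {|Z| > z_{alpha/2}}
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
```

Since the rule only depends on the observed value of \(Z,\) the same function can be reused with any statistic satisfying (6.4).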

Note that despite not knowing the exact distribution of \(Z\) under \(H_0,\) \(Z\) is a test statistic because its distribution is known asymptotically.

An especially relevant instance of (6.4) is given by likelihood theory (Theorem 4.1 under \(H_0\)):

\[\begin{align*} Z=\frac{\hat{\theta}_{\mathrm{MLE}}-\theta_0}{1\big/\sqrt{n\mathcal{I}(\theta_0)}}=\sqrt{n\mathcal{I}(\theta_0)}\left(\hat{\theta}_{\mathrm{MLE}}-\theta_0\right)\stackrel{d}{\longrightarrow}\mathcal{N}(0,1). \end{align*}\]

Due to Corollary 4.3, another asymptotic pivot is

\[\begin{align*} Z=\sqrt{n\hat{\mathcal{I}}(\theta_0)}\left(\hat{\theta}_{\mathrm{MLE}}-\theta_0\right)\stackrel{d}{\longrightarrow}\mathcal{N}(0,1),\quad \hat{\mathcal{I}}(\theta_0)=\frac{1}{n}\sum_{i=1}^n \left(\left.\frac{\partial\log f(X_i;\theta)}{\partial\theta}\right\vert_{\theta=\theta_0}\right)^2, \end{align*}\]

which is always straightforward to compute from the srs \((X_1,\ldots,X_n)\) from \(X\sim f(\cdot;\theta)\) (analogously if \(X\) is discrete).
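To illustrate how \(\hat{\mathcal{I}}(\theta_0)\) can be computed from the sample, the following sketch averages the squared scores evaluated at \(\theta_0.\) The model is an assumption made only for this illustration: an exponential density \(f(x;\theta)=\theta e^{-\theta x},\) whose score is \(1/\theta-x\) and whose MLE is \(1/\bar{X}.\) The function names and the data are hypothetical.

```python
import math


def score_exponential(x, theta):
    """Score of the assumed model f(x; theta) = theta * exp(-theta * x)."""
    return 1.0 / theta - x


def estimated_information(sample, theta0, score):
    """Average of the squared scores evaluated at theta0."""
    return sum(score(x, theta0) ** 2 for x in sample) / len(sample)


def z_estimated_information(sample, theta0, theta_mle, score):
    """Z = sqrt(n * I_hat(theta0)) * (theta_mle - theta0)."""
    n = len(sample)
    info_hat = estimated_information(sample, theta0, score)
    return math.sqrt(n * info_hat) * (theta_mle - theta0)


# Made-up data, used only to show the calls
sample = [0.5, 1.5, 1.0, 2.0, 0.25, 0.75]
theta_mle = len(sample) / sum(sample)  # MLE of the rate: 1 / sample mean
z = z_estimated_information(sample, 0.5, theta_mle, score_exponential)
print(z)
```

Only the score function changes from one model to another, which is why the statistic is passed a `score` argument instead of hard-coding the density.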

Other test statistics can be obtained from non-normal asymptotic distributions (see, e.g., the forthcoming Theorem 6.2).

Let us see some examples of asymptotic tests.

Example 6.12 Let \((X_1,\ldots,X_n)\) be a srs of a rv \(X\) with mean \(\mu\) and variance \(\sigma^2,\) both unknown. We want to test:

  1. \(H_0:\mu=\mu_0\) vs. \(H_1:\mu>\mu_0;\)
  2. \(H_0:\mu=\mu_0\) vs. \(H_1:\mu<\mu_0;\)
  3. \(H_0:\mu=\mu_0\) vs. \(H_1:\mu\neq\mu_0.\)

For that, employing the CLT (Theorem 2.5) and the consistency of \(S'\) as an estimator of \(\sigma\) (combined through Slutsky's theorem), we know that under \(H_0:\mu=\mu_0,\)

\[\begin{align*} Z=\frac{\bar{X}-\mu_0}{S'/\sqrt{n}}\stackrel{d}{\longrightarrow} \mathcal{N}(0,1). \end{align*}\]

Therefore, \(Z\) is a test statistic and \(H_0\) is rejected if the observed value of \(Z\) belongs to the corresponding critical region (\(C_a,\) \(C_b,\) or \(C_c\)).
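The statistic of Example 6.12 can be computed along the following lines. This is a sketch under one assumption: \(S'\) is taken to be the sample quasi-standard deviation (denominator \(n-1\)), which is what `statistics.stdev` returns. The function name `z_mean` and the data are hypothetical.

```python
import math
from statistics import mean, stdev


def z_mean(sample, mu0):
    """Z = (sample mean - mu0) / (S' / sqrt(n)).

    S' is assumed to be the quasi-standard deviation (n - 1 denominator).
    """
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))


# Made-up data, used only to show the call
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_mean(sample, 4))
```

The observed value is then compared with \(z_{\alpha}\) or \(z_{\alpha/2},\) depending on the alternative. Since the normality of \(Z\) is only asymptotic, the approximation is expected to be reliable for large \(n\) only.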

Example 6.13 Let \((X_1,\ldots,X_n)\) be a srs of a rv \(X\sim\Gamma(k,1/\theta)\) with \(k\) known and \(\theta\) unknown. We want to test:

  1. \(H_0:\theta=\theta_0\) vs. \(H_1:\theta>\theta_0;\)
  2. \(H_0:\theta=\theta_0\) vs. \(H_1:\theta<\theta_0;\)
  3. \(H_0:\theta=\theta_0\) vs. \(H_1:\theta\neq\theta_0.\)

By Example 4.12, \(\mathcal{I}(\theta)=k/\theta^2,\) so \(1\big/\sqrt{n\mathcal{I}(\theta_0)}=\theta_0/\sqrt{nk}.\) Then, by likelihood theory, we know that under \(H_0:\theta=\theta_0,\)

\[\begin{align*} Z=\frac{\hat{\theta}_{\mathrm{MLE}}-\theta_0}{\theta_0/\sqrt{nk}}\stackrel{d}{\longrightarrow} \mathcal{N}(0,1). \end{align*}\]

Therefore, \(Z\) is a test statistic and \(H_0\) is rejected if the observed value of \(Z\) belongs to the corresponding critical region (\(C_a,\) \(C_b,\) or \(C_c\)).
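As a small sketch of Example 6.13, the statistic only requires the value of the MLE, \(\theta_0,\) \(n,\) and \(k.\) The function name `z_gamma` and the numbers below are hypothetical. The MLE is passed as an input because its expression depends on how \(\Gamma(k,1/\theta)\) is parametrized, and that choice is not assumed here.

```python
import math


def z_gamma(theta_mle, theta0, n, k):
    """Z = (theta_mle - theta0) / (theta0 / sqrt(n * k)), using I(theta) = k / theta^2."""
    standard_error = theta0 / math.sqrt(n * k)  # 1 / sqrt(n * I(theta0))
    return (theta_mle - theta0) / standard_error


# Made-up inputs, used only to show the call
print(z_gamma(theta_mle=2.2, theta0=2.0, n=50, k=2))
```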

Example 6.14 A certain machine has to be repaired if more than \(10\%\) of the items that it produces per day are defective. A srs of \(n=100\) items of the daily production contains \(15\) that are defective, and the foreman decides that the machine has to be repaired. Does the sample support his decision at a significance level \(\alpha=0.01\)?

Let \(Y\) be the number of defective items that were found. Then \(Y\sim\mathrm{Bin}(n,p).\) We want to test

\[\begin{align*} H_0: p=0.10\quad \text{vs.}\quad H_1:p>0.10. \end{align*}\]

Because of the CLT (Theorem 2.5), \(Y\) is asymptotically normal. Hence, denoting by \(\hat{p}=Y/n\) the sample proportion and setting \(p_0=0.10,\) under \(H_0:p=p_0\) it follows that

\[\begin{align*} Z=\frac{\hat{p}-p_0}{\sqrt{p_0(1- p_0)/n}}\stackrel{d}{\longrightarrow}\mathcal{N}(0,1). \end{align*}\]

Therefore, \(Z\) is a test statistic with observed value

\[\begin{align*} Z=\frac{0.15-0.10}{\sqrt{0.1\times 0.9/100}}=5/3. \end{align*}\]

The rejection region is

\[\begin{align*} C=\{Z>z_{0.01}\approx2.33\}. \end{align*}\]

Since \(Z=5/3\approx1.67<2.33,\) the sample does not provide enough evidence supporting the foreman's decision, that is, evidence that the actual percentage of defective items the machine is producing is above \(10\%.\) At the significance level \(\alpha=0.01,\) repairing the machine is not justified, and the larger proportion of defective items in the batch can be attributed to chance.
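The arithmetic of Example 6.14 can be checked numerically with the following sketch. The variable names are hypothetical, and the asymptotic \(p\)-value computed at the end is an addition of this sketch that the example itself does not report.

```python
import math
from statistics import NormalDist

n, defective = 100, 15
p0, alpha = 0.10, 0.01

p_hat = defective / n                               # sample proportion
z_obs = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)  # observed statistic, 5/3
z_crit = NormalDist().inv_cdf(1 - alpha)             # upper 0.01-quantile of N(0, 1)
reject = z_obs > z_crit                              # is z_obs in C = {Z > z_crit}?
p_value = 1 - NormalDist().cdf(z_obs)                # asymptotic p-value (not in the example)

print(f"z_obs = {z_obs:.4f}, z_crit = {z_crit:.4f}, reject H0: {reject}")
print(f"asymptotic p-value = {p_value:.4f}")
```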