19.2 Wald test

\[ W = (\hat{\theta}-\theta_0)'[cov(\hat{\theta})]^{-1}(\hat{\theta}-\theta_0) \\ W \sim \chi_q^2 \]

where \(cov(\hat{\theta})\) is given by the inverse Fisher information matrix evaluated at \(\hat{\theta}\), and q is the rank of \(cov(\hat{\theta})\), which is the number of non-redundant parameters in \(\theta\).
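As a rough illustration of the quadratic form above, the following sketch computes W by hand for two coefficients of a logistic regression. The model, the built-in `mtcars` data, and the null values are assumptions chosen only for illustration; `vcov()` on a fitted `glm` returns the estimated covariance matrix of the coefficients, which plays the role of \(cov(\hat{\theta})\) here.

```r
# Illustrative model (an assumption, not from the text): logistic regression on mtcars
fit <- glm(am ~ hp + wt, data = mtcars, family = binomial)

keep      <- c("hp", "wt")                # parameters under test
theta_hat <- coef(fit)[keep]              # estimates
theta_0   <- c(hp = 0, wt = 0)            # hypothesized values
V         <- vcov(fit)[keep, keep]        # estimated cov(theta_hat) for these parameters

d <- theta_hat - theta_0
W <- as.numeric(t(d) %*% solve(V) %*% d)  # Wald statistic
q <- qr(V)$rank                           # degrees of freedom = rank of cov(theta_hat)

p_value <- pchisq(W, df = q, lower.tail = FALSE)
c(W = W, df = q, p_value = p_value)
```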

Alternatively,

\[ t_W=\frac{(\hat{\theta}-\theta_0)^2}{I(\hat{\theta})^{-1}} \sim \chi^2_{(v)} \]

where v is the degrees of freedom (v = 1 when \(\theta\) is a single parameter).

Equivalently,

\[ s_W= \frac{\hat{\theta}-\theta_0}{\sqrt{I(\hat{\theta})^{-1}}} \sim Z \]

The statistic measures how far away, in terms of its sampling distribution, your sample estimate is from the hypothesized population parameter.

For a given null value, it answers: what is the probability that you would obtain a realization "more extreme" or "worse" than the estimate you actually obtained?
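The relationship between the scalar forms can be checked numerically. This is a minimal sketch with made-up numbers (the estimate, null value, and Fisher information are all hypothetical), and it assumes the same information value is used in both forms so that the chi-square statistic is exactly the square of the standard normal one.

```r
theta_hat <- 1.3   # hypothetical estimate
theta_0   <- 1.0   # hypothesized value
info      <- 25    # hypothetical Fisher information, so the variance is 1 / info

s_W <- (theta_hat - theta_0) / sqrt(1 / info)  # standard normal form
t_W <- (theta_hat - theta_0)^2 / (1 / info)    # chi-square form, 1 degree of freedom

# Two-sided p-value from the normal form and p-value from the chi-square form
p_normal <- 2 * pnorm(-abs(s_W))
p_chisq  <- pchisq(t_W, df = 1, lower.tail = FALSE)

all.equal(s_W^2, t_W)         # the two statistics agree
all.equal(p_normal, p_chisq)  # and so do the p-values
```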

Significance Level (\(\alpha\)) and Confidence Level (\(1-\alpha\))

  • The significance level is the benchmark probability: if a result at least this extreme would occur with probability below \(\alpha\) under the null, we reject the null
  • The confidence level is the probability that sets the bounds on how far away the realization of the estimator would have to be to reject the null.

Test Statistics

  • Standardize (transform) the estimator and null value into a test statistic that always has the same distribution
  • Test Statistic for the OLS estimator for a single hypothesis

\[ T = \frac{\sqrt{n}(\hat{\beta}_j-\beta_{j0})}{\sqrt{n}SE(\hat{\beta}_j)} \sim^a N(0,1) \]

Equivalently,

\[ T = \frac{(\hat{\beta}_j-\beta_{j0})}{SE(\hat{\beta}_j)} \sim^a N(0,1) \]

The test statistic is itself a random variable: it is a function of the data and the null hypothesis.

  • T denotes the random variable test statistic
  • t denotes the single realization of the test statistic
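To make the distinction between T and its realization t concrete, here is a small sketch that computes t by hand for one OLS coefficient. The model and the built-in `mtcars` data are assumptions for illustration only.

```r
# Illustrative OLS fit (an assumption, not from the text)
fit   <- lm(mpg ~ wt + hp, data = mtcars)
coefs <- summary(fit)$coefficients

beta_hat <- coefs["wt", "Estimate"]    # estimate of beta_j
se       <- coefs["wt", "Std. Error"]  # SE(beta_hat_j)
beta_0   <- 0                          # null value beta_j0

t_stat <- (beta_hat - beta_0) / se     # realization t of the test statistic T

# With a null value of 0 this reproduces the t value reported by summary()
all.equal(t_stat, coefs["wt", "t value"])

# Any other null value only shifts the numerator
t_stat_alt <- (beta_hat - (-3)) / se
```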

Evaluating the test statistic: determine whether we reject or fail to reject the null hypothesis at a given significance (or confidence) level

Three equivalent ways

  1. Critical Value

  2. P-value

  3. Confidence Interval

  1. Critical Value

For a given significance level, determine the critical value (c)

  • One-sided: \(H_0: \beta_j \ge \beta_{j0}\)

\[ P(T<c|H_0)=\alpha \]

Reject the null if \(t<c\)

  • One-sided: \(H_0: \beta_j \le \beta_{j0}\)

\[ P(T>c|H_0)=\alpha \]

Reject the null if \(t>c\)

  • Two-sided: \(H_0: \beta_j = \beta_{j0}\) against \(H_1: \beta_j \neq \beta_{j0}\)

\[ P(|T|>c|H_0)=\alpha \]

Reject the null if \(|t|>c\)
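The three critical-value rules can be sketched with the standard normal quantile function `qnorm()`. The significance level and the observed statistic below are hypothetical values chosen for illustration.

```r
alpha <- 0.05
t_obs <- -2.1   # hypothetical realization of the test statistic

c_left  <- qnorm(alpha)          # H0: beta_j >= beta_j0, P(T < c | H0) = alpha
c_right <- qnorm(1 - alpha)      # H0: beta_j <= beta_j0, P(T > c | H0) = alpha
c_two   <- qnorm(1 - alpha / 2)  # two-sided test,        P(|T| > c | H0) = alpha

reject_left  <- t_obs < c_left
reject_right <- t_obs > c_right
reject_two   <- abs(t_obs) > c_two

c(left = reject_left, right = reject_right, two_sided = reject_two)
```

The two-sided rule splits \(\alpha\) equally between the tails, which is why its critical value is larger in absolute value than the one-sided ones.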

  2. p-value

Calculate the probability, under the null, of obtaining a test statistic at least as extreme as the realization you observed

  • One-sided: \(H_0: \beta_j \ge \beta_{j0}\)

\[ \text{p-value} = P(T<t|H_0) \]

  • One-sided: \(H_0: \beta_j \le \beta_{j0}\)

\[ \text{p-value} = P(T>t|H_0) \]

  • Two-sided: \(H_0: \beta_j = \beta_{j0}\) against \(H_1: \beta_j \neq \beta_{j0}\)

\[ \text{p-value} = P(|T|>|t||H_0) \]

Reject the null if p-value \(< \alpha\)
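A minimal sketch of the p-value calculations using the standard normal distribution function `pnorm()`. The observed statistic is a hypothetical value.

```r
alpha <- 0.05
t_obs <- -2.1   # hypothetical realization of the test statistic

p_left  <- pnorm(t_obs)                                # H0: beta_j >= beta_j0, P(T < t | H0)
p_right <- pnorm(t_obs, lower.tail = FALSE)            # H0: beta_j <= beta_j0, P(T > t | H0)
p_two   <- 2 * pnorm(abs(t_obs), lower.tail = FALSE)   # two-sided, P(|T| > |t| | H0)

# Reject whenever the p-value falls below alpha
c(left = p_left < alpha, right = p_right < alpha, two_sided = p_two < alpha)
```

Because the normal distribution is symmetric, the two-sided p-value is twice the smaller of the two one-sided p-values.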

  3. Confidence Interval

Using the critical value associated with a null hypothesis and significance level, create an interval

\[ CI(\hat{\beta}_j)_{\alpha} = [\hat{\beta}_j-(c \times SE(\hat{\beta}_j)),\hat{\beta}_j+(c \times SE(\hat{\beta}_j))] \]

If the null value \(\beta_{j0}\) lies outside the interval, then we reject the null.

  • We are not testing whether the true population value is close to the estimate; we are asking, given a fixed true population value of the parameter, how likely it is that we would observe this estimate.
  • Can be interpreted as: across repeated samples, \((1-\alpha)\times 100 \%\) of the intervals constructed this way capture the true parameter value.
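The equivalence between the confidence-interval rule and the two-sided critical-value rule can be sketched as follows. The model, data, and null value are illustrative assumptions, and the critical value is taken from the standard normal distribution to match the asymptotic result above.

```r
# Illustrative OLS fit (an assumption, not from the text)
fit   <- lm(mpg ~ wt + hp, data = mtcars)
coefs <- summary(fit)$coefficients

beta_hat <- coefs["wt", "Estimate"]
se       <- coefs["wt", "Std. Error"]
beta_0   <- 0      # null value
alpha    <- 0.05

c_val <- qnorm(1 - alpha / 2)   # two-sided critical value
ci    <- c(lower = beta_hat - c_val * se,
           upper = beta_hat + c_val * se)

# Reject if the null value falls outside the interval ...
reject_ci <- isTRUE(beta_0 < ci[["lower"]] || beta_0 > ci[["upper"]])
# ... which is the same decision as |t| > c
reject_t  <- abs((beta_hat - beta_0) / se) > c_val

ci
c(reject_ci = reject_ci, reject_t = reject_t)
```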

With stronger assumptions (A1-A6), we can consider finite-sample properties

\[ T = \frac{\hat{\beta}_j-\beta_{j0}}{SE(\hat{\beta}_j)} \sim T(n-k) \]

  • The above distributional derivation depends strongly on A4 and A5
  • T has a Student t-distribution because the numerator is normal and the denominator is the square root of a \(\chi^2\) random variable divided by its degrees of freedom.
  • Critical values and p-values will be calculated from the Student t-distribution rather than the standard normal distribution.
  • As \(n \to \infty\), \(T(n-k)\) is asymptotically standard normal.

Rule of thumb

  • if \(n-k>120\): the critical values and p-values from the t-distribution are (almost) the same as the critical values and p-values from the standard normal distribution.

  • if \(n-k<120\)

    • if (A1-A6) hold, then the t-test is an exact finite-sample test
    • if (A1-A3a, A5) hold, because the t-distribution is asymptotically normal, computing the critical values from a t-distribution is still a valid asymptotic test (i.e., the critical values and p-values are not quite right, but the difference goes away as \(n \to \infty\))
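The rule of thumb can be inspected directly by comparing two-sided 5% critical values from the t-distribution with the standard normal one. The degrees of freedom below are arbitrary illustrative choices.

```r
df     <- c(10, 30, 120, 1000)   # illustrative values of n - k
crit_t <- qt(0.975, df)          # t critical values
crit_z <- qnorm(0.975)           # standard normal critical value

data.frame(df         = df,
           t_critical = crit_t,
           z_critical = crit_z,
           gap        = crit_t - crit_z)
```

The gap is always positive (the t-distribution has heavier tails) and shrinks as the degrees of freedom grow.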

19.2.1 Multiple Hypothesis

  • Test multiple parameters at the same time

    • \(H_0: \beta_1 = 0\ \& \ \beta_2 = 0\)
    • \(H_0: \beta_1 = 1\ \& \ \beta_2 = 0\)
  • Performing a series of simple (single-parameter) hypothesis tests does not answer the question, because the joint distribution of the estimators is not the same as the two marginal distributions.

  • The test statistic is based on a restriction written in matrix form.

\[ y=\beta_0+x_1\beta_1 + x_2\beta_2 + x_3\beta_3 + \epsilon \]

The null hypothesis \(H_0: \beta_1 = 0\) & \(\beta_2=0\) can be rewritten as \(H_0: \mathbf{R}\beta -\mathbf{q}=0\) where

  • \(\mathbf{R}\) is an m x k matrix, where m is the number of restrictions and k is the number of parameters. \(\mathbf{q}\) is an m x 1 vector
  • \(\mathbf{R}\) "picks up" the relevant parameters while \(\mathbf{q}\) holds the null values of those parameters

\[ \mathbf{R}= \left( \begin{array}{cccc} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \end{array} \right), \mathbf{q} = \left( \begin{array}{c} 0 \\ 0 \\ \end{array} \right) \]

Test Statistic for OLS estimator for a multiple hypothesis

\[ F = \frac{(\mathbf{R\hat{\beta}-q})'\hat{\Sigma}^{-1}(\mathbf{R\hat{\beta}-q})}{m} \sim^a F(m,n-k) \]

  • \(\hat{\Sigma}\) is the estimator of the asymptotic variance-covariance matrix of \(\mathbf{R}\hat{\beta}-\mathbf{q}\); its inverse enters the statistic

    • If A4 holds, both the homoskedastic and heteroskedastic versions produce valid estimators
    • If A4 does not hold, only the heteroskedastic version produces valid estimators.
  • When m = 1 (a single restriction), the F-statistic is the t-statistic squared.

  • The F distribution is supported only on positive values; check F-Distribution for more details.
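The F statistic can be computed by hand from \(\mathbf{R}\) and \(\mathbf{q}\). This sketch assumes an illustrative four-parameter model on the built-in `mtcars` data and takes \(\hat{\Sigma} = \mathbf{R}\hat{V}\mathbf{R}'\), where \(\hat{V}\) is the homoskedastic covariance estimate returned by `vcov()`; under that assumption the result should match the classical F test comparing the restricted and unrestricted models.

```r
# Illustrative model with k = 4 parameters (an assumption, not from the text)
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)
b   <- coef(fit)
V   <- vcov(fit)                  # homoskedastic covariance estimate of b

# H0: beta_1 = 0 and beta_2 = 0 (the wt and hp coefficients)
R <- rbind(c(0, 1, 0, 0),
           c(0, 0, 1, 0))
q <- c(0, 0)
m <- nrow(R)                      # number of restrictions

d         <- R %*% b - q
Sigma_hat <- R %*% V %*% t(R)     # estimated covariance of R b - q
F_stat    <- as.numeric(t(d) %*% solve(Sigma_hat) %*% d) / m
p_value   <- pf(F_stat, m, df.residual(fit), lower.tail = FALSE)

# Cross-check against the restricted-versus-unrestricted model comparison
fit_restricted <- lm(mpg ~ qsec, data = mtcars)
F_anova        <- anova(fit_restricted, fit)$F[2]

c(F_stat = F_stat, F_anova = F_anova, p_value = p_value)
```

Swapping `V` for a heteroskedasticity-consistent estimate would give the robust version of the test, at which point the match with `anova()` no longer holds.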

19.2.2 Linear Combination

Testing a combination of several parameters at the same time

\[ H_0: \beta_1 -\beta_2 = 0 \\ H_0: \beta_1 - \beta_2 > 0 \\ H_0: \beta_1 - 2\beta_2 =0 \]

Each is a single restriction on a function of the parameters.

Null hypothesis:

\[ H_0: \beta_1 -\beta_2 = 0 \]

can be rewritten as

\[ H_0: \mathbf{R}\beta -\mathbf{q}=0 \]

where \(\mathbf{R}\)=(0 1 -1 0) for the four-parameter model above (one entry per parameter, including the intercept) and \(\mathbf{q}=0\)
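A single linear restriction reduces to a t-test on \(\mathbf{R}\hat{\beta}-\mathbf{q}\). The sketch below assumes an illustrative model with four coefficients (intercept plus three slopes) on the built-in `mtcars` data and uses the homoskedastic covariance estimate from `vcov()`; the restriction itself has no substantive meaning and is chosen only to show the mechanics.

```r
# Illustrative model with four coefficients (an assumption, not from the text)
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# H0: beta_1 - beta_2 = 0, written as R beta - q = 0
R <- matrix(c(0, 1, -1, 0), nrow = 1)
q <- 0

est <- as.numeric(R %*% coef(fit) - q)              # estimated linear combination
se  <- sqrt(as.numeric(R %*% vcov(fit) %*% t(R)))   # its standard error

t_stat  <- est / se
df_res  <- df.residual(fit)
p_value <- 2 * pt(abs(t_stat), df_res, lower.tail = FALSE)

# With one restriction, the F statistic is the squared t statistic
F_stat <- t_stat^2
p_F    <- pf(F_stat, 1, df_res, lower.tail = FALSE)

c(t = t_stat, F = F_stat, p_t = p_value, p_F = p_F)
```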

19.2.3 Application

library("car")
## Loading required package: carData
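# Multiple hypothesis: jointly test (Intercept) = 0 and repwt = 1
mod.davis <- lm(weight ~ repwt, data = Davis)
# white.adjust = TRUE uses a heteroskedasticity-consistent (HC3) covariance matrix
linearHypothesis(mod.davis, c("(Intercept) = 0", "repwt = 1"), white.adjust = TRUE)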
## Linear hypothesis test
## 
## Hypothesis:
## (Intercept) = 0
## repwt = 1
## 
## Model 1: restricted model
## Model 2: weight ~ repwt
## 
## Note: Coefficient covariance matrix supplied.
## 
##   Res.Df Df      F  Pr(>F)  
## 1    183                    
## 2    181  2 3.3896 0.03588 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Linear combination: test whether the income and education coefficients are equal
mod.duncan <- lm(prestige ~ income + education, data = Duncan)
linearHypothesis(mod.duncan, "1*income - 1*education = 0")
## Linear hypothesis test
## 
## Hypothesis:
## income - education = 0
## 
## Model 1: restricted model
## Model 2: prestige ~ income + education
## 
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1     43 7518.9                           
## 2     42 7506.7  1    12.195 0.0682 0.7952

19.2.4 Nonlinear

Suppose that we have q nonlinear functions of the parameters

\[ \mathbf{h}(\theta) = \{ h_1 (\theta), ..., h_q (\theta)\}' \]

Then, the Jacobian matrix (\(\mathbf{H}(\theta)\)), of rank q, is

\[ \mathbf{H}_{q \times p}(\theta) = \left( \begin{array} {ccc} \frac{\partial h_1(\theta)}{\partial \theta_1} & ... & \frac{\partial h_1(\theta)}{\partial \theta_p} \\ . & . & . \\ \frac{\partial h_q(\theta)}{\partial \theta_1} & ... & \frac{\partial h_q(\theta)}{\partial \theta_p} \end{array} \right) \]

where the null hypothesis \(H_0: \mathbf{h} (\theta) = 0\) can be tested against the 2-sided alternative with the Wald statistic

\[ W = \frac{\mathbf{h(\hat{\theta})'\{H(\hat{\theta})[F(\hat{\theta})'F(\hat{\theta})]^{-1}H(\hat{\theta})'\}^{-1}h(\hat{\theta})}}{s^2q} \sim F_{q,n-p} \]
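The following is a hedged sketch of this statistic for a single nonlinear restriction (q = 1). Several pieces are assumptions, not taken from the text: the model is a linear regression on the built-in `mtcars` data, so \(\mathbf{F}(\hat{\theta})\) is interpreted as the Jacobian of the mean function, which for a linear model is simply the design matrix X; the restriction \(h(\theta) = \theta_{wt}\theta_{hp} - 0.1\) is hypothetical; and \(s^2\) is the usual residual variance estimate.

```r
# Illustrative linear model (an assumption); its mean-function Jacobian is X
fit       <- lm(mpg ~ wt + hp, data = mtcars)
theta_hat <- coef(fit)
X         <- model.matrix(fit)
s2        <- summary(fit)$sigma^2   # residual variance estimate s^2
n <- nrow(X); p <- ncol(X)

# Hypothetical nonlinear restriction h(theta) = theta_wt * theta_hp - 0.1 = 0
h <- function(theta) theta[["wt"]] * theta[["hp"]] - 0.1
# Its 1 x p Jacobian, ordered as (Intercept), wt, hp
H <- function(theta) matrix(c(0, theta[["hp"]], theta[["wt"]]), nrow = 1)

q     <- 1
h_hat <- h(theta_hat)
H_hat <- H(theta_hat)

middle <- H_hat %*% solve(crossprod(X)) %*% t(H_hat)   # H (F'F)^{-1} H'
W      <- as.numeric(t(h_hat) %*% solve(middle) %*% h_hat) / (s2 * q)
p_value <- pf(W, q, n - p, lower.tail = FALSE)

# Same statistic via the delta method, since vcov(fit) = s^2 (X'X)^{-1}
W_delta <- as.numeric(h_hat^2 / (H_hat %*% vcov(fit) %*% t(H_hat))) / q

c(W = W, W_delta = W_delta, p_value = p_value)
```

For a genuinely nonlinear regression, the design matrix would be replaced by the Jacobian of the fitted mean function evaluated at \(\hat{\theta}\).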