Chapter 21 Basic Hypothesis Tests for Linear Models
21.1 Introduction
In this section we consider the application of hypothesis testing to linear models. Suppose that we are given the linear model,
$$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_{p-1} x_{p-1} + \epsilon,$$
where the errors $\epsilon \sim N(0, \sigma^2)$ are independent and identically distributed across observations. We are interested in testing the hypothesis that a coefficient $\beta_i$ is equal to some value $b$. In particular, we are most interested in $b = 0$, since setting $\beta_i = 0$ means that $x_i$ is not important in predicting $y$; see Section 21.2. We can also construct confidence intervals for $\beta_i$ (Section 21.3), and in Section 21.4 we extend hypothesis testing to multiple (all) parameters to test whether or not a linear model is useful in a given modelling scenario.
21.2 Tests on a single parameter
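Given the linear model,
$$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_{p-1} x_{p-1} + \epsilon,$$
where $\epsilon \sim N(0, \sigma^2)$, we want to test $H_0: \beta_i = b$ vs. $H_1: \beta_i \neq b$ at significance level $\alpha$, where $b$ is some constant. Typically, we might choose $\alpha = 0.05$ (common alternatives are $\alpha = 0.01$ or $\alpha = 0.1$).
The decision rule is to reject $H_0$ if
$$|T| = \left| \frac{\hat{\beta}_i - b}{\mathrm{se}(\hat{\beta}_i)} \right| > t_{n-p,\alpha/2},$$
where $\mathrm{se}(\hat{\beta}_i)$ is the standard error of the parameter estimate and $t_{n-p,\alpha/2}$ is the upper $\alpha/2$ point of the $t$ distribution with $n-p$ degrees of freedom. Recall from Section 17 that $\mathrm{se}(\hat{\beta}_i) = s \sqrt{c_{ii}}$, where $c_{ii}$ is the diagonal entry of $(\mathbf{Z}^T \mathbf{Z})^{-1}$ corresponding to $\beta_i$, $\mathbf{Z}$ is the design matrix and $s^2$ is the residual sum of squares divided by $n-p$.
A special case of the above test occurs when we choose $b = 0$. The test $H_0: \beta_i = 0$ vs. $H_1: \beta_i \neq 0$ at level $\alpha$ has the decision rule to reject $H_0$ if
$$|T| = \left| \frac{\hat{\beta}_i}{\mathrm{se}(\hat{\beta}_i)} \right| > t_{n-p,\alpha/2}.$$
Note that if we reject $H_0$ we are claiming that the explanatory variable $x_i$ is useful in predicting the response variable $y$ when all the other variables are included in the model.
The test statistic $T$ for $b = 0$ is often reported in the output from statistical software such as R.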
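As an illustration of where this statistic appears in R output, the following sketch fits a linear model to simulated data and recomputes the reported $t$ values by hand. The data, the variable names `x1`, `x2`, `y` and the coefficients used to simulate them are assumptions made purely for this example; they are not part of the notes.

```r
# Simulated data (hypothetical, for illustration only)
set.seed(1)
n  <- 40
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 + rnorm(n)   # x2 has no effect on y by construction

fit <- lm(y ~ x1 + x2)
tab <- coef(summary(fit))     # columns: Estimate, Std. Error, t value, Pr(>|t|)

# Test statistic T = (estimate - b) / se with b = 0, computed by hand
t_manual <- tab[, "Estimate"] / tab[, "Std. Error"]

# Two-sided p-value from the t distribution with n - p degrees of freedom
p_manual <- 2 * pt(-abs(t_manual), df = df.residual(fit))

# Compare the hand-computed values with those reported by summary()
print(cbind(t_manual, t_reported = tab[, "t value"]))
print(cbind(p_manual, p_reported = tab[, "Pr(>|t|)"]))
```

The `t value` column of `summary()` corresponds to the test of $\beta_i = 0$; for a general $b$ the numerator would be the estimate minus $b$.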
Example 21.2.1 (Fuel consumption)
A dataset considers fuel consumption for 50 US states plus Washington DC, that is, $n = 51$ observations. The response fuel is fuel consumption measured in gallons per person. The predictors considered are dlic, the percentage of licensed drivers; tax, motor fuel tax in US cents per gallon; inc, income per person in $1,000s; and road, the log of the number of miles of federal highway. Fitting a linear model of the form
$$\text{fuel} = \beta_0 + \beta_1 \, \text{dlic} + \beta_2 \, \text{tax} + \beta_3 \, \text{inc} + \beta_4 \, \text{road} + \epsilon$$
using R, the output is
Coefficient | Estimate | Standard Error
$\beta_0$ (intercept) | 154.19 | 194.906
$\beta_1$ (dlic) | 4.719 | 1.285
$\beta_2$ (tax) | -4.228 | 2.030
$\beta_3$ (inc) | -6.135 | 2.194
$\beta_4$ (road) | 26.755 | 9.337
Test $H_0: \beta_2 = 0$ vs. $H_1: \beta_2 \neq 0$ at significance level $\alpha = 0.05$, where $\beta_2$ is the coefficient of tax.
Watch Video 31 for a walkthrough in R of testing the null hypothesis.
Video 31: Fuel consumption example.
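Hypothesis test for $\beta_2$.
The decision rule is to reject $H_0$ if
$$|T| = \left| \frac{\hat{\beta}_2}{\mathrm{se}(\hat{\beta}_2)} \right| > t_{n-p,\alpha/2}.$$
Here $n - p = 51 - 5 = 46$, so the critical value is $t_{46,0.025} = 2.013$, and
$$|T| = \left| \frac{-4.228}{2.030} \right| = 2.083 > 2.013.$$
So we reject $H_0$ and conclude that the tax variable is useful for prediction of fuel after having included the other variables.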
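The arithmetic for the tax coefficient can be checked in R from the estimate and standard error in the table alone. This is a minimal sketch: it assumes $n = 51$ observations and $p = 5$ fitted coefficients, as in the model stated for this example, and the variable names are chosen for illustration.

```r
# Estimate and standard error of the tax coefficient, taken from the table
est <- -4.228
se  <-  2.030

n  <- 51        # 50 states plus Washington DC
p  <- 5         # intercept plus four predictors
df <- n - p     # residual degrees of freedom

alpha  <- 0.05
t_stat <- est / se                    # test statistic for H0: coefficient = 0
t_crit <- qt(1 - alpha / 2, df)       # upper alpha/2 point of t with df degrees of freedom
reject <- abs(t_stat) > t_crit        # decision at level alpha

# Two-sided p-value
p_value <- 2 * pt(-abs(t_stat), df)

print(c(t_stat = t_stat, t_crit = t_crit, p_value = p_value))
print(reject)
```

Because the inputs are rounded to three decimal places, the results may differ slightly in the last digit from those printed by `summary()` on the full dataset.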
We note that the $p$-value is approximately 0.043 and therefore we would not reject the null hypothesis at significance level $\alpha = 0.01$.
21.3 Confidence intervals for parameters
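Recall that
$$\frac{\hat{\beta}_i - \beta_i}{\mathrm{se}(\hat{\beta}_i)} \sim t_{n-p},$$
so a $100(1-\alpha)\%$ confidence interval for $\beta_i$ is
$$\hat{\beta}_i \pm t_{n-p,\alpha/2} \, \mathrm{se}(\hat{\beta}_i).$$
Fuel consumption (continued)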
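Consider Example 21.2.1 (Fuel consumption) and construct a 95% confidence interval for $\beta_2$, the coefficient of tax.
With $\hat{\beta}_2 = -4.228$, $\mathrm{se}(\hat{\beta}_2) = 2.030$ and $t_{46,0.025} = 2.013$, the interval is
$$-4.228 \pm 2.013 \times 2.030 = (-8.314, -0.142).$$
This confidence interval does not contain 0 (just), as we would expect from the calculation of the $p$-value in Example 21.2.1 (Fuel consumption) above.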
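A short R sketch of the interval calculation for the tax coefficient, using only the estimate and standard error from the fuel consumption table and assuming $n - p = 46$ residual degrees of freedom. The variable names are illustrative.

```r
# Estimate and standard error of the tax coefficient, taken from the table
est <- -4.228
se  <-  2.030
df  <- 51 - 5             # n - p residual degrees of freedom

level <- 0.95
alpha <- 1 - level
t_crit <- qt(1 - alpha / 2, df)

# Confidence interval: estimate -/+ critical value times standard error
ci <- est + c(-1, 1) * t_crit * se
names(ci) <- c("lower", "upper")
print(ci)

# The interval excludes 0 exactly when the two-sided test rejects at level alpha
excludes_zero <- ci["lower"] > 0 || ci["upper"] < 0
print(unname(excludes_zero))
```

When the fitted model object is available, `confint()` applied to it returns intervals of this form for every coefficient.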
21.4 Tests for the existence of regression
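We want to test
$$H_0: \beta_1 = \beta_2 = \cdots = \beta_{p-1} = 0 \quad \text{vs.} \quad H_1: \beta_i \neq 0$$
for some $i \in \{1, \ldots, p-1\}$ at significance level $\alpha$.
Note that if we reject $H_0$ we are saying that the model
$$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_{p-1} x_{p-1} + \epsilon$$
has some ability to explain the variance that we are observing in $y$. That is, there exists a linear relationship between the explanatory variables and the response variable.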
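If $D_0$ is the model deviance under the null hypothesis and $D_1$ is the model deviance under the alternative hypothesis, then the decision rule is to reject $H_0$ if
$$F = \frac{(D_0 - D_1)/(p-1)}{D_1/(n-p)} > F_{p-1,n-p,\alpha},$$
where $F_{p-1,n-p,\alpha}$ is the upper $\alpha$ point of the $F$ distribution with $p-1$ and $n-p$ degrees of freedom.
For the data in Example 21.2.1 (Fuel consumption), the two competing models are
$$\text{fuel} = \beta_0 + \epsilon$$
and
$$\text{fuel} = \beta_0 + \beta_1 \, \text{dlic} + \beta_2 \, \text{tax} + \beta_3 \, \text{inc} + \beta_4 \, \text{road} + \epsilon.$$
The models have residual sums of squares $D_0$ and $D_1$, respectively. We test $H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$ vs. $H_1: \beta_i \neq 0$ for some $i$ at level $\alpha$. With $p - 1 = 4$ and $n - p = 46$, the test statistic is
$$F = \frac{(D_0 - D_1)/4}{D_1/46} = 11.99,$$
which exceeds the critical value $F_{4,46,\alpha}$ at conventional significance levels (for $\alpha = 0.05$, $F_{4,46,0.05} \approx 2.57$).
Therefore, we reject $H_0$ and can say that the linear model has some power in explaining the variability in fuel.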
Note that the $p$-value for the test is $P(F_{4,46} > 11.99)$, which is less than $10^{-5}$. This is given in R by `1-pf(11.99,4,46)` and is reported in the last line of `summary()` for a linear model in R.
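The deviance-based $F$ test can be sketched in R as follows. Since the fuel dataset itself is not reproduced in these notes, the example uses simulated data; the variable names and the coefficients used in the simulation are assumptions for illustration only. It computes $F$ from the two residual sums of squares and checks that it agrees with the statistic reported by `summary()`.

```r
# Simulated data (hypothetical, for illustration only)
set.seed(2)
n  <- 51
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 3 + 1.5 * x1 - 0.5 * x2 + rnorm(n)

fit0 <- lm(y ~ 1)          # null model: intercept only
fit1 <- lm(y ~ x1 + x2)    # alternative model: all predictors

D0 <- deviance(fit0)       # residual sum of squares under H0
D1 <- deviance(fit1)       # residual sum of squares under H1
p  <- length(coef(fit1))   # number of coefficients, including the intercept

# F statistic and its p-value from the F distribution on (p - 1, n - p) df
F_stat  <- ((D0 - D1) / (p - 1)) / (D1 / (n - p))
p_value <- pf(F_stat, p - 1, n - p, lower.tail = FALSE)

# Critical value at the 5% level
F_crit <- qf(0.95, p - 1, n - p)

# summary() stores the same statistic as (value, numdf, dendf)
fs <- summary(fit1)$fstatistic
print(c(F_manual = F_stat, F_reported = unname(fs["value"]), F_crit = F_crit))
print(p_value)
```

Using `lower.tail = FALSE` in `pf()` is equivalent to `1 - pf(...)` but avoids loss of precision when the $p$-value is very small. The same comparison of nested models is available through `anova(fit0, fit1)`.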