Chapter 6 Multivariate OLS: Where the Action Is

6.1 Computing Corner

Packages needed for this chapter.

library(car)
library(broom)
library(estimatr)
library(lm.beta)
library(tidyverse)

In this chapter you will learn the basics of estimating multivariate OLS models.

6.1.1 Multiple Regression

To estimate a multiple regression (a regression with more than one independent variable) use the same function lm but change the formula argument to include the additional variables. In a simple regression, the formula argument was of the form y ~ x. In a multiple regression, the formula argument takes the form y ~ x1 + x2. To include additional variables, extend the argument in a similar manner y ~ x1 + x2 + x3 + …. The remaining arguments are the same as in the simple regression. You can assign the results to an object just as with a simple regression. The object returned will be the list of 12 elements, but the objects in the list will change to reflect the additional variable(s).

We can use summary, broom::tidy, etc. to view the output.

To make use of the results, you can use any of the functions described in Chapter 3 of this manual. You can also make use of any of the subsetting commands as well.

Estimate a regression with robust standard errors with estimatr::lm_robust with the modified function argument. Alternatively we can add the argument robust = TRUE to jtools::summ.

6.1.2 Multicollinearity

You can directly estimate the VIF’s with the vif() function from the car package. To estimate the VIF’s call ols %>% vif() where ols is the object you created with the lm call.

6.1.3 Standardized Coefficients

Estimate standardized regression coefficients with lm.beta::lm.beta(). ols %>% lm.beta(). Alternatively we can add the argument scale = TRUE to jtools::summ

6.1.4 F tests

F tests in econometrics are generally about the joint significance of multiple variables. Suppose, we estimate the regression on $i=1,2,\ldots n$ observations. $y_i=\beta_0+\beta_1x_{1,i}+\beta_2x_{2,i}+\cdots+\beta_mx_{i,m}+\beta_{m+1}x_{m+1,i}+\cdots+\beta_kx_{i,k} + \epsilon_i$

To test the joint significance of the $\beta_1,\ldots,\beta_m$ in the model we would use an F test to perform the following hypothesis test: $H_0: \beta_1=\beta_2=\cdots=\beta_m=0$ $H_1:\text{@ least one }\beta_j\ne0$

An F test compares the difference in the residual sum of squares under the null and alternative hypotheses. If this difference in large enough relative to the unrestricted standard error, we have evidence to reject the null hypothesis in favor of the alternative hypothesis. The mechanics of the test are as follows:

Estimate the model that does not hold under the null hypothesis, that is, the model above and call it the unrestricted model and retrieve the residual sum of squares. Retrieve the residual sum of squares, $rss_u$ . The residuals from unrestricted model will have $n-k-1$ degrees of freedom. The unrestricted model, U, is: $\text{U: }y_i=\beta_0+\beta_1x_{1,i}+\beta_2x_{2,i}+\cdots+\beta_mx_{i,m}+\beta_{m+1}x_{m+1,i}+\cdots+\beta_kx_{i,k} + \epsilon_i$
Estimate the model that holds under the null hypothesis Restrict the model so that the null hypothesis holds. That restricted model, R, is $\text{R: }y_i=\beta_0+\beta_{m+1}x_{m+1,i}+\beta_{m+2}x_{m+2,i}+\cdots+\beta_kx_{k,i} + \eta_i$ . Retrieve the residual sum of squares $rss_r$ The residual from restricted model will have $n-m-1$ degrees of freedom.
Calculate the difference in the residual sum of squares $rss_r - rss_u$ and divide by its degrees of of freedom $q = (n-m-1)-(n-k-1) = k-m$ . q is the number of restrictions placed on the model. A simple way to calculate the number of restrictions is to count the number of equal signs ( $=$ ) in the null hypothesis.
Calculate $rss_u/(n-k-1)$
Divide the result from 3 by the result from 4. This will give you an F statistic with $k-m$ and $n-k-1$ degrees of freedom.

$F_c=\frac{\frac{rss_r-rss_u}{q}}{\frac{rss_u}{n-k-1}}$

The F-test (Wald test) can be used for any number of restrictions on the unrestricted model. For example, suppose we would like to know if a production function with a Cobb-Douglas form has constant returns to scale. The Cobb-Douglas function for output as a function of labor and capital takes the form $q=al^\alpha k^\beta\epsilon$ . If constant returns to scale hold, $\alpha+\beta=1$ . So we test the following hypothesis: $H_0:\alpha+\beta=1$ $H_1:\alpha+\beta\ne1$

To test this hypothesis form the unrestricted and restricted forms of the model, estimate the models, retrieve the sum of squared residuals, and calculate the F statistic. In the form presented above, the Cobb-Douglas model is not linear in the parameters so it can’t be estimated with OLS. We can make it linear in the parameters by taking the logarithm of both sides. $\ln(q)=\ln(al^\alpha k^\beta\epsilon)$ $\text{U: }\ln(q)=\gamma+\alpha \ln(l)+\beta\ln(k)+\epsilon$ .

Form the restricted model by imposing the null hypothesis on the parameters. From the null hypothesis, $\beta=1-\alpha$ . Substituting for $\beta$ in the restricted model yields the restricted model. $\text{R: }\ln(q)-\ln(k)=\gamma+\alpha[\ln(l)-\ln(k)]+\eta$

The F-stat is: $F_c=\frac{rss_r-rss_u}{\frac{rss_u}{n-k-1}}$

The degrees of freedom are $q=1$ (the number of equal signs in the null hypothesis) and $n-k-1$ .

6.1.4.1 F-test for overall significance.

Estimate the model $y_i=\beta_0+\beta_1x_{1,i}+\beta_2x_{2,i}+\cdots+\beta_kx_{k,i}+\epsilon_i$ . Test the hypothesis $H_0: \beta_1=\beta_2=\cdots=\beta_k=0$ $H_1:\text{@ least one }\beta_j\ne0$

If we reject the null hypothesis, we can say that we have explained some variation in $y$ with variation in at least one of the $x's$ . In other words, we have a model that is significant. If we fail to reject the null hypothesis, our model has no explanatory power. There is no need to calculate the F-statistic to perform this test because it is reported as a matter of course in the base R call summary or in glance from the broom package. The degrees of freedom are $q=k$ (the number of coefficients estimated - 1) and $n-k-1$ .

summary will report the F-statistic, its degrees of freedom (numerator and denominator), and the p-value. glance reports the F as “statistic”, the p-value as “p.value”, $k$ as “df”, and $n-k-1$ as “df.residual”. Note that this test is also a test for the significance of $R^2$ .

6.1.4.2 F-test of linear restrictions

The test we performed above are tests of linear restrictions of the parameters. These hypotheses can be tested directly using car::linearHypothesis from the car package. Performing a test of linear restrictions using car::linearHypothesis requires two arguments: model and hypothesis.matrix.

Let the unrestricted model be $y=\beta_0+\beta_1x_1+\beta_2x_2+\beta_3x_3+\epsilon$ Estimate the model as df %>% lm(y ~ x1 + x2 + x3, .), where df is the data frame containing the data.

Let’s test the hypothesis $\beta_2=\beta_3=0$ versus at that one of the $\beta\text{'s}\ne0$ by including an argument for the null hypothesis as:

df %>% 
  lm(y ~ x1 + x2 + x3, .) %>% 
  car::linearHypothesis(hypothesis.matrix = c("x2 = 0", "x3 = 0"))

The result will be an F-test on the restrictions. The F-statistic, its degrees of freedom, and p-value will be returned.

Let’s test the linear restriction for the Cobb-Douglas model above.²⁰

df %>% 
  lm(log(q) ~ log(l) + log(k), .) %>% 
  car::linearHypothesis(c("log(l) = log(k)"))

Again, the F-statistic, its degrees of freedom, and p-value will be returned.

6.1.4.3 Examples

The Motor Trend Car Road Test (mtcars) data set is part of the datasets in base R. The data were extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models). See ?mtcars for more information on the data. data(mtcars) will load the data into your global environment as mtcars.²¹ We will perform each of the F-tests described above: overall significance, joint significance of a subset of variables, and equality of two coefficients.

6.1.4.3.1 Multiple Regression

Suppose we want to estimate the mpg as a function of the number of cylinders, the displacement, and the gross horsepower, our (unrestricted) model is $mpg=\beta_0+\beta_1cyl+\beta_2disp+\beta_3hp+\epsilon$ .

Let’s estimate the unrestricted model both with and without robust errors.

# estimate model without robust standard errors
mtcars %>% 
  lm(mpg ~ cyl + disp + hp, .) %>% 
  broom::tidy()

# A tibble: 4 x 5
  term        estimate std.error statistic  p.value
  <chr>          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)  34.2       2.59       13.2  1.54e-13
2 cyl          -1.23      0.797      -1.54 1.35e- 1
3 disp         -0.0188    0.0104     -1.81 8.09e- 2
4 hp           -0.0147    0.0147     -1.00 3.25e- 1

# estimate model with robust standard errors
mtcars %>%
  estimatr::lm_robust(mpg ~ cyl + disp + hp, .) %>% 
  broom::tidy()

         term estimate std.error statistic           p.value conf.low conf.high
1 (Intercept)  34.1849    2.4700     13.84 0.000000000000048  29.1253  39.24451
2         cyl  -1.2274    0.5967     -2.06 0.049121075438813  -2.4498  -0.00506
3        disp  -0.0188    0.0083     -2.27 0.031138440490781  -0.0358  -0.00183
4          hp  -0.0147    0.0109     -1.34 0.190818678697032  -0.0371   0.00775
  df outcome
1 28     mpg
2 28     mpg
3 28     mpg
4 28     mpg

# report robust se's 

mtcars %>% 
  lm(mpg ~ cyl + disp + hp, .) %>% 
  jtools::summ(robust = TRUE)

Observations	32
Dependent variable	mpg
Type	OLS linear regression

F(3,28)	30.88
R²	0.77
Adj. R²	0.74

	Est.	S.E.	t val.	p
(Intercept)	34.18	2.61	13.09	0.00
cyl	-1.23	0.64	-1.92	0.06
disp	-0.02	0.01	-2.05	0.05
hp	-0.01	0.01	-1.15	0.26
Standard errors: Robust, type = HC3

6.1.4.4 Multicollinearity

Using the model above $mpg=\beta_0+\beta_1cyl+\beta_2disp+\beta_3hp+\epsilon$ .

We can calculate the VIF’s as follows:

mtcars %>% 
  lm(mpg ~ cyl + disp + hp, .) %>% 
  vif()

 cyl disp   hp 
6.73 5.52 3.35

mtcars %>% 
  lm(mpg ~ cyl + disp + hp, .) %>% 
  jtools::summ(vifs = TRUE)

Observations	32
Dependent variable	mpg
Type	OLS linear regression

F(3,28)	30.88
R²	0.77
Adj. R²	0.74

	Est.	S.E.	t val.	p	VIF
(Intercept)	34.18	2.59	13.19	0.00	NA
cyl	-1.23	0.80	-1.54	0.13	6.73
disp	-0.02	0.01	-1.81	0.08	5.52
hp	-0.01	0.01	-1.00	0.32	3.35
Standard errors: OLS

6.1.4.5 Standardize Regression Coefficients

Using the model $mpg=\beta_0+\beta_1cyl+\beta_2disp+\beta_3hp+\epsilon$ , estimate standardized regression coefficients as follows:

mtcars %>% 
  lm(mpg ~ cyl + disp + hp, .) %>% 
  jtools::summ(scale = TRUE)

Observations	32
Dependent variable	mpg
Type	OLS linear regression

F(3,28)	30.88
R²	0.77
Adj. R²	0.74

	Est.	S.E.	t val.	p
(Intercept)	20.09	0.54	37.20	0.00
cyl	-2.19	1.42	-1.54	0.13
disp	-2.33	1.29	-1.81	0.08
hp	-1.01	1.00	-1.00	0.32
Standard errors: OLS; Continuous predictors are mean-centered and scaled by 1 s.d.

6.1.4.6 F-test for Overall significance

Suppose we want to estimate the mpg as a function of the number of cylinders, the displacement, and the gross horsepower, then our (unrestricted) model is $mpg=\beta_0+\beta_1cyl+\beta_2disp+\beta_3hp+\epsilon$ .

Let’s estimate the unrestricted model

mtcars %>% 
  lm(mpg ~ cyl + disp + hp, .) %>% 
  jtools::summ()

Observations	32
Dependent variable	mpg
Type	OLS linear regression

F(3,28)	30.88
R²	0.77
Adj. R²	0.74

	Est.	S.E.	t val.	p
(Intercept)	34.18	2.59	13.19	0.00
cyl	-1.23	0.80	-1.54	0.13
disp	-0.02	0.01	-1.81	0.08
hp	-0.01	0.01	-1.00	0.32
Standard errors: OLS

The test for overall significance is:

$\begin{align}H_0&:\beta_1=\beta_2=\beta_3=0\\ H_1&: \text{@ least one }\beta_j\ne0\end{align}$

Recall that the F-test is reported as a matter of course in summary from base R and glance from the broom package.

ols_u <- 
  mtcars %>% 
  lm(mpg ~ cyl + disp + hp, .)

ols_u %>% 
  summary()


Call:
lm(formula = mpg ~ cyl + disp + hp, data = .)

Residuals:
   Min     1Q Median     3Q    Max 
-4.089 -2.085 -0.774  1.397  6.918 

Coefficients:
            Estimate Std. Error t value         Pr(>|t|)    
(Intercept)  34.1849     2.5908   13.19 0.00000000000015 ***
cyl          -1.2274     0.7973   -1.54            0.135    
disp         -0.0188     0.0104   -1.81            0.081 .  
hp           -0.0147     0.0147   -1.00            0.325    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.06 on 28 degrees of freedom
Multiple R-squared:  0.768,	Adjusted R-squared:  0.743 
F-statistic: 30.9 on 3 and 28 DF,  p-value: 0.00000000505

mtcars %>% 
  lm(mpg ~ cyl + disp + hp, .) %>%
  glance()

# A tibble: 1 x 12
  r.squared adj.r.squared sigma statistic p.value    df logLik   AIC   BIC
      <dbl>         <dbl> <dbl>     <dbl>   <dbl> <dbl>  <dbl> <dbl> <dbl>
1     0.768         0.743  3.06      30.9 5.05e-9     3  -79.0  168.  175.
# ... with 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>

So we see that $F=30.877$ , $q=3$ , and $df2=28$ . The critical F with $\alpha=.05$ is $2.947$ . Since the calculated F-stat is greater than the critical F-stat, we reject $H_0$ in favor of $H_1$ . That is, the explanatory power of the model is statistical significant.

Here we estimated our model, assigned it to the object ols_u, and piped that object into the summary functions. We should note, that the pipe operator (%>%) allows us to skip this assignment. Skipping this assignment can make our code clearer by making the object we are manipulating more clear than it’s name alone can make it. In the examples below, we will avoid assigning intermediate objects where we can.

6.1.4.7 F-test of Joint Significance

Suppose we’d like to add the weight $wt$ , number of gears $gear$ , and number of carburetors $carb$ together increase the explanatory power of the model at the $\alpha=.05$ , level of significance. Our unrestricted model becomes: $mpg=\beta_0+\beta_1cyl+\beta_2disp+\beta_3hp+\beta_4wt+\beta_5gear+\beta_6carb+\eta$ .

The null and alternative hypotheses are:

$\begin{align} H_0&:\beta_4=\beta_5=\beta_6=0 \\ H_1&:\text{@ least one }\beta_j\ne0 \end{align}$

6.1.4.8 Perform the test “manually”

We can retrieve $ssr$ and a degrees of freedom with broom::glance. broom::glance reports information about the entire model including sum of squared residuals and degrees of freedom, etc., and returns a tibble. We can use dplyr::select to get the values we want in a tibble. The sum of squared residuals, $\sum\hat{e}^{2}_{i}$ , is named deviance while the degrees of freedom we want is named df.residual.

# estimate the unrestricted model and retrieve ssr and df
unrestricted <- 
mtcars %>% 
  lm(mpg ~ cyl + disp + hp + wt + gear + carb, .) %>% 
  broom::glance() %>% 
  select(ssr_u = deviance, df_u = df.residual) 

# estimate the restricted model and retrieve ssr and df
restricted <- 
mtcars %>% 
  lm(mpg ~ cyl + disp + disp + hp, .) %>% 
  broom::glance() %>% 
  select(ssr_r = deviance, df_r = df.residual)

# combine the tibbles and calculate F and F_crit.
# bind_cols "stacks" the tibbles on top of each other

bind_cols(unrestricted, restricted) %>% 
  mutate(q = df_r - df_u,
         numerator = (ssr_r - ssr_u) / q,
         denominator = ssr_u / df_u,
         F_calc = numerator / denominator,
         F_crit = qf(.95, df_r - df_u, df_u),
         p_value = pf(F_calc, q, df_u, lower.tail = FALSE)) %>% 
  select(F_calc, F_crit, p_value, df_u, df_r, q) ->
  wald

wald

# A tibble: 1 x 6
  F_calc F_crit p_value  df_u  df_r     q
   <dbl>  <dbl>   <dbl> <int> <int> <int>
1   4.80   2.99 0.00897    25    28     3

Since 4.796 is greater than 2.991 we can reject $H_0$ in favor of $H_1$ and conclude that wt, am, and carb add significant explanatory power to the model. We can also see that the p-value for our calculated F-statistic is 0.009. Since this is less than $\alpha=.05$ we reject $H_0$ .

6.1.4.9 Perform the test with `linearHypothesis`

mtcars %>% 
  lm(mpg ~ cyl + disp + hp + wt + gear + carb, .) %>% 
  linearHypothesis(c("wt", "gear", "carb"))

Linear hypothesis test

Hypothesis:
wt = 0
gear = 0
carb = 0

Model 1: restricted model
Model 2: mpg ~ cyl + disp + hp + wt + gear + carb

  Res.Df RSS Df Sum of Sq   F Pr(>F)   
1     28 261                           
2     25 166  3      95.5 4.8  0.009 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Of course, we have the same result.

6.1.4.10 Test of Linear Restrictions

Let the model be $\ln(mpg)=\beta_0+\beta_1\ln(cyl)+\beta_2\ln(wt)+\epsilon$ . Suppose we’d like to test $H_0:\beta_1+\beta_2=-1$ against $H_0:\beta_1+\beta_2\ne-1$

6.1.4.10.1 Perform the Test “Manually”

Form the restricted model under $H_0$ . If $H_0$ holds, $\beta_2=-1-\beta_1$ . Substituting into the unrestricted model yields the restricted model: $\text{R: }\ln(mpg)+\ln(wt)=\beta_0+\beta_1(\ln(cyl)-\ln(wt))+\eta$

unrestricted <- 
mtcars %>% 
  lm(log(mpg) ~ log(cyl) + log(wt), .) %>% 
  broom::glance() %>% 
  select(ssr_u = deviance, df_u = df.residual) 

# estimate the restricted model and retrieve ssr and df
restricted <- 
mtcars %>% 
  lm(I(log(mpg)+log(wt)) ~ I(log(cyl) - log(wt)), .) %>% 
  broom::glance() %>% 
  select(ssr_r = deviance, df_r = df.residual)

# combine the tibbles and calculate F and F_crit.
# bind_cols "stacks" the tibbles on top of each other

bind_cols(unrestricted, restricted) %>% 
  mutate(q = df_r - df_u,
         numerator = (ssr_r - ssr_u) / (q),
         denominator = ssr_u / df_u,
         F_calc = numerator / denominator,
         F_crit = qf(.95, df_r - df_u, df_u),
         p_value = pf(F_calc, q, df_u, lower.tail = FALSE)) %>% 
  select(F_calc, F_crit, p_value, df_u, df_r, q) ->
  wald

wald

# A tibble: 1 x 6
  F_calc F_crit p_value  df_u  df_r     q
   <dbl>  <dbl>   <dbl> <int> <int> <int>
1   1.29   4.18   0.266    29    30     1

Since 1.289 is less than 4.183 we can fail to reject $H_0$ and conclude that we have no evidence to suggest that $\beta_1+\beta_2\ne1$ . We can also see that the p-value for our calculated F-statistic is 0.266. Since this is greater than $\alpha=.05$ we fail to reject $H_0$ .

6.1.4.11 Perform the test with `linearHypothesis`

mtcars %>% 
  lm(log(mpg) ~ log(cyl) + log(wt), .) %>%
  linearHypothesis(c("log(cyl) + log(wt) = -1"))

Linear hypothesis test

Hypothesis:
log(cyl)  + log(wt) = - 1

Model 1: restricted model
Model 2: log(mpg) ~ log(cyl) + log(wt)

  Res.Df   RSS Df Sum of Sq    F Pr(>F)
1     30 0.419                         
2     29 0.401  1    0.0178 1.29   0.27

There can be no values <= 0.↩︎
This step is not necessary. You can use mtcars or any base R dataset without first loading into the global environment.↩︎