# Chapter 6 Fixed or random effects

This section was originally prepared for Advanced Methods of Political Analysis (Poli 706) in Spring 2019, for which I served as a TA for Tobias Heinrich. The goal of this part is to address one common question we encounter in our research: when should we use fixed effects, and when random effects?

## 6.1 Prerequisite

Load the following packages for this section.

library(modelr)
library(tidyverse)
library(foreign)
library(plm)

## 6.2 Introduction

Let us write down the following model (from Wooldridge p.435).

$$y_{it}=\beta_{1}x_{it} + a_{i} + u_{it}$$

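Here $a_{i}$ is an unobserved unit-specific effect that is “fixed” over time, while $u_{it}$ is the idiosyncratic error. The fixed effects approach allows $a_{i}$ to be correlated with the explanatory variables. When we add the additional assumption that $a_{i}$ is uncorrelated with every explanatory variable, we have the random effects model.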

Let us get started by using the following data to run a simple OLS.

Panel <- read.dta("http://dss.princeton.edu/training/Panel101.dta")
ols <-lm(y ~ x1, data=Panel)

Now plot the fitted line.

grid <- data.frame(Intercept=1, x1=seq_range(Panel$x1, 10))
grid$pred <- predict(ols,grid)
ggplot(Panel, aes(x1)) +
geom_point(aes(y = y)) +
geom_line(aes(y = pred), data = grid, colour = "red", size = 1)

## 6.3 Fixed effects

There are two common ways to fit fixed effects. The first is to add dummy variables for the group indicator.

fixed_dum <-lm(y ~ x1 + factor(country) - 1, data=Panel)
summary(fixed_dum)
##
## Call:
## lm(formula = y ~ x1 + factor(country) - 1, data = Panel)
##
## Residuals:
##        Min         1Q     Median         3Q        Max
## -8.634e+09 -9.697e+08  5.405e+08  1.386e+09  5.612e+09
##
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)
## x1                2.476e+09  1.107e+09   2.237  0.02889 *
## factor(country)A  8.805e+08  9.618e+08   0.916  0.36347
## factor(country)B -1.058e+09  1.051e+09  -1.006  0.31811
## factor(country)C -1.723e+09  1.632e+09  -1.056  0.29508
## factor(country)D  3.163e+09  9.095e+08   3.478  0.00093 ***
## factor(country)E -6.026e+08  1.064e+09  -0.566  0.57329
## factor(country)F  2.011e+09  1.123e+09   1.791  0.07821 .
## factor(country)G -9.847e+08  1.493e+09  -0.660  0.51190
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.796e+09 on 62 degrees of freedom
## Multiple R-squared:  0.4402, Adjusted R-squared:  0.368
## F-statistic: 6.095 on 8 and 62 DF,  p-value: 8.892e-06
grid_fixdum <- expand(data.frame(Intercept=1, x1=seq_range(Panel$x1, 14), country=unique(Panel$country)),
x1, country)
grid_fixdum$pred <- predict(fixed_dum,grid_fixdum)
ggplot(Panel, aes(x1)) +
geom_point(aes(y = y, colour=country)) +
geom_smooth(aes(y = y), method='lm', se=FALSE, colour="black")+
geom_line(data=grid_fixdum, aes(x=x1, y=pred, colour=country))

The second way is to incorporate fixed effects directly. In R, this is typically referred to as the “within” model.

fixed <- plm(y ~ x1, data=Panel, index=c("country", "year"), model="within")
summary(fixed)
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = y ~ x1, data = Panel, model = "within", index = c("country",
##     "year"))
##
## Balanced Panel: n = 7, T = 10, N = 70
##
## Residuals:
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
## -8.63e+09 -9.70e+08  5.40e+08  0.00e+00  1.39e+09  5.61e+09
##
## Coefficients:
##      Estimate Std. Error t-value Pr(>|t|)
## x1 2475617827 1106675594   2.237  0.02889 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares:    5.2364e+20
## Residual Sum of Squares: 4.8454e+20
## R-Squared:      0.074684
## Adj. R-Squared: -0.029788
## F-statistic: 5.00411 on 1 and 62 DF, p-value: 0.028892
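
The x1 estimate here matches the one from the dummy-variable regression (2.476e+09) because the within estimator subtracts each unit's time average from y and x, which removes $a_{i}$. The following is a minimal sketch on simulated data, not on the Panel data: the variable names, the sample sizes, and the true slope of 2 are all assumptions made for illustration.

```r
# Simulated panel (hypothetical data, not the Panel101 file)
set.seed(1)
n_units <- 7
n_periods <- 10
sim <- data.frame(unit = rep(LETTERS[1:n_units], each = n_periods))

# Unit effect a_i, deliberately correlated with x
a_i <- rep(rnorm(n_units), each = n_periods)
sim$x <- 0.5 * a_i + rnorm(nrow(sim))
sim$y <- 2 * sim$x + a_i + rnorm(nrow(sim))

# Within transformation: subtract each unit's own mean
sim$x_dm <- sim$x - ave(sim$x, sim$unit)
sim$y_dm <- sim$y - ave(sim$y, sim$unit)

# Slope from the demeaned regression vs. the dummy-variable regression
beta_within <- coef(lm(y_dm ~ x_dm - 1, data = sim))[["x_dm"]]
beta_lsdv <- coef(lm(y ~ x + factor(unit) - 1, data = sim))[["x"]]
all.equal(beta_within, beta_lsdv)
```

The two slopes agree up to floating-point error. The standard errors from the demeaned `lm` call are not adjusted for the degrees of freedom used up by the unit means, which is one reason to prefer `plm` in practice.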

We can then extract the fixed effects.

fixef(fixed) 
##           A           B           C           D           E           F
##   880542404 -1057858363 -1722810755  3162826897  -602622000  2010731793
##           G
##  -984717493
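
These values match the country coefficients from the dummy-variable regression (for example, 8.805e+08 for country A). Under the within approach they are not estimated directly but recovered afterwards from the unit means. Assuming the default output of `fixef` follows the standard textbook expression, each effect is

$$\hat{a}_{i} = \bar{y}_{i} - \hat{\beta}_{1}\bar{x}_{i}$$

where $\bar{y}_{i}$ and $\bar{x}_{i}$ are the averages over time for unit $i$ and $\hat{\beta}_{1}$ is the within estimate.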

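We can also test whether the fixed effects model is better than pooled OLS. The null hypothesis is that there are no country-specific effects, so a small p-value favors the fixed effects model.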

pFtest(fixed, ols) 
##
##  F test for individual effects
##
## data:  y ~ x1
## F = 2.9655, df1 = 6, df2 = 62, p-value = 0.01307
## alternative hypothesis: significant effects

## 6.4 Random effects

random <- plm(y ~ x1, data=Panel, index=c("country", "year"), model="random")
summary(random)
## Oneway (individual) effect Random Effect Model
##    (Swamy-Arora's transformation)
##
## Call:
## plm(formula = y ~ x1, data = Panel, model = "random", index = c("country",
##     "year"))
##
## Balanced Panel: n = 7, T = 10, N = 70
##
## Effects:
##                     var   std.dev share
## idiosyncratic 7.815e+18 2.796e+09 0.873
## individual    1.133e+18 1.065e+09 0.127
## theta: 0.3611
##
## Residuals:
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
## -8.94e+09 -1.51e+09  2.82e+08  0.00e+00  1.56e+09  6.63e+09
##
## Coefficients:
##               Estimate Std. Error z-value Pr(>|z|)
## (Intercept) 1037014284  790626206  1.3116   0.1896
## x1          1247001782  902145601  1.3823   0.1669
##
## Total Sum of Squares:    5.6595e+20
## Residual Sum of Squares: 5.5048e+20
## R-Squared:      0.02733
## Adj. R-Squared: 0.013026
## Chisq: 1.91065 on 1 DF, p-value: 0.16689
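
The `theta` reported above is the weight used in the quasi-demeaning behind the random effects estimator: the model is fit on $y_{it} - \theta\bar{y}_{i}$ rather than on fully demeaned data. As a rough check, and assuming the usual balanced-panel formula $\theta = 1 - \sqrt{\sigma_{u}^{2} / (\sigma_{u}^{2} + T\sigma_{a}^{2})}$ is what `plm` applies here, the reported variance components can be plugged in directly (the variable names are mine):

```r
# Variance components copied from the summary(random) output above
sigma2_u <- 7.815e18   # idiosyncratic variance
sigma2_a <- 1.133e18   # individual (country) variance
n_periods <- 10        # T in the balanced panel

# Quasi-demeaning weight under the assumed balanced-panel formula
theta <- 1 - sqrt(sigma2_u / (sigma2_u + n_periods * sigma2_a))
round(theta, 4)
```

The result comes out at about 0.36, in line with the reported theta. A theta of 0 would reduce the estimator to pooled OLS, and a theta of 1 would reduce it to fixed effects.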

### 6.4.1 Fixed or random

You can run a Hausman test to choose between the two models. The null hypothesis is that the individual effects ($a_{i}$) are uncorrelated with the regressors, in which case both estimators are consistent and random effects is the more efficient one. If the p-value is significant, choose fixed effects, since the random effects estimator is then inconsistent.

phtest(fixed, random)
##
##  Hausman Test
##
## data:  y ~ x1
## chisq = 3.674, df = 1, p-value = 0.05527
## alternative hypothesis: one model is inconsistent
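
Here the p-value of 0.055 sits just above the conventional 5% threshold, so the test does not reject random effects at that level, though only narrowly. With a single regressor the statistic is simple enough to rebuild by hand from the two summaries above. The sketch below assumes the textbook form of the Hausman statistic (the squared difference of the estimates over the difference of their variances); the variable names are mine.

```r
# Estimates and standard errors of x1 copied from the outputs above
b_fe <- 2475617827; se_fe <- 1106675594   # within (fixed effects)
b_re <- 1247001782; se_re <- 902145601    # random effects

# Hausman statistic for one coefficient; chi-squared with 1 df under the null
h_stat <- (b_fe - b_re)^2 / (se_fe^2 - se_re^2)
p_value <- pchisq(h_stat, df = 1, lower.tail = FALSE)
round(c(chisq = h_stat, p = p_value), 4)
```

The result should land close to the chisq = 3.674 and p-value = 0.05527 that `phtest` reports.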

## 6.5 More tests

You can also add time fixed effects, for example by including year dummies in the within model.

fixed_time_dum <- plm(y ~ x1 + factor(year), data=Panel, index=c("country", "year"), model="within")
summary(fixed_time_dum)
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = y ~ x1 + factor(year), data = Panel, model = "within",
##     index = c("country", "year"))
##
## Balanced Panel: n = 7, T = 10, N = 70
##
## Residuals:
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
## -7.92e+09 -1.05e+09 -1.40e+08  0.00e+00  1.63e+09  5.49e+09
##
## Coefficients:
##                    Estimate Std. Error t-value Pr(>|t|)
## x1               1389050354 1319849567  1.0524  0.29738
## factor(year)1991  296381558 1503368528  0.1971  0.84447
## factor(year)1992  145369666 1547226548  0.0940  0.92550
## factor(year)1993 2874386795 1503862554  1.9113  0.06138 .
## factor(year)1994 2848156288 1661498927  1.7142  0.09233 .
## factor(year)1995  973941306 1567245748  0.6214  0.53698
## factor(year)1996 1672812557 1631539254  1.0253  0.30988
## factor(year)1997 2991770063 1627062032  1.8388  0.07156 .
## factor(year)1998  367463593 1587924445  0.2314  0.81789
## factor(year)1999 1258751933 1512397632  0.8323  0.40898
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares:    5.2364e+20
## Residual Sum of Squares: 4.0201e+20
## R-Squared:      0.23229
## Adj. R-Squared: 0.00052851
## F-statistic: 1.60365 on 10 and 53 DF, p-value: 0.13113

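Or, more simply, let plm estimate both sets of effects through its effect argument. The x1 estimate is the same as with the year dummies.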

fixed_time <- plm(y ~ x1, data=Panel, index=c("country", "year"), model="within", effect="twoway")
summary(fixed_time)
## Twoways effects Within Model
##
## Call:
## plm(formula = y ~ x1, data = Panel, effect = "twoway", model = "within",
##     index = c("country", "year"))
##
## Balanced Panel: n = 7, T = 10, N = 70
##
## Residuals:
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
## -7.92e+09 -1.05e+09 -1.40e+08  0.00e+00  1.63e+09  5.49e+09
##
## Coefficients:
##      Estimate Std. Error t-value Pr(>|t|)
## x1 1389050354 1319849567  1.0524   0.2974
##
## Total Sum of Squares:    4.1041e+20
## Residual Sum of Squares: 4.0201e+20
## R-Squared:      0.020471
## Adj. R-Squared: -0.27524
## F-statistic: 1.10761 on 1 and 53 DF, p-value: 0.29738

### 6.5.1 Test whether adding time-fixed effects is necessary

pFtest(fixed_time, fixed)
##
##  F test for twoways effects
##
## data:  y ~ x1
## F = 1.209, df1 = 9, df2 = 53, p-value = 0.3094
## alternative hypothesis: significant effects
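
The p-value of 0.31 means we cannot reject the null that the year effects are jointly zero, so time fixed effects do not appear necessary for these data. The F statistic itself can be rebuilt from the residual sums of squares in the two summaries. The sketch below assumes the standard F test for nested models; the variable names are mine.

```r
# Residual sums of squares copied from the summaries above
rss_fixed <- 4.8454e20    # country effects only (restricted model)
rss_twoway <- 4.0201e20   # country and year effects (unrestricted model)
df1 <- 9                  # number of year effects added
df2 <- 53                 # residual degrees of freedom of the two-way model

# F statistic for the joint significance of the added year effects
f_stat <- ((rss_fixed - rss_twoway) / df1) / (rss_twoway / df2)
p_value <- pf(f_stat, df1, df2, lower.tail = FALSE)
round(c(F = f_stat, p = p_value), 4)
```

The values should be close to the F = 1.209 and p-value = 0.3094 reported by `pFtest`, with small differences due to the rounded sums of squares.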