10.3 Fixed Effects Regression
Consider the panel regression model
$$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + u_{it}$$
where the $Z_i$ are unobserved time-invariant heterogeneities across the entities $i=1,\dots,n$. We aim to estimate $\beta_1$, the effect on $Y_i$ of a change in $X_i$ holding constant $Z_i$. Letting $\alpha_i = \beta_0 + \beta_2 Z_i$ we obtain the model
$$Y_{it} = \alpha_i + \beta_1 X_{it} + u_{it}. \tag{10.1}$$
Because it has entity-specific intercepts $\alpha_i$, $i=1,\dots,n$, each of which can be understood as the fixed effect of entity $i$, this model is called the fixed effects model. The variation in the $\alpha_i$, $i=1,\dots,n$, comes from the $Z_i$. (10.1) can be rewritten as a regression model containing $n-1$ dummy regressors and a constant:
$$Y_{it} = \beta_0 + \beta_1 X_{it} + \gamma_2 D2_i + \gamma_3 D3_i + \cdots + \gamma_n Dn_i + u_{it}. \tag{10.2}$$
Model (10.2) has $n$ different intercepts, one for every entity. (10.1) and (10.2) are equivalent representations of the fixed effects model.
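To make the equivalence of (10.1) and (10.2) concrete, the intercepts of the two representations can be matched term by term. This is a sketch under two assumptions that the text implies but does not state: entity 1 is the omitted base category (the dummies start at $D2_i$), and $Dj_i$ equals 1 if $i=j$ and 0 otherwise. Note that $\beta_0$ here is the constant of (10.2), not the constant of the model containing $Z_i$.

$$\alpha_1 = \beta_0, \qquad \alpha_i = \beta_0 + \gamma_i \quad \text{for } i = 2,\dots,n.$$

Under these assumptions each $\gamma_i$ measures the difference between the intercept of entity $i$ and that of entity 1.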
The fixed effects model can be generalized to contain more than just one determinant of Y that is correlated with X and changes over time. Key Concept 10.2 presents the generalized fixed effects regression model.
Key Concept 10.2
The Fixed Effects Regression Model
The fixed effects regression model is
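$$Y_{it} = \beta_1 X_{1,it} + \cdots + \beta_k X_{k,it} + \alpha_i + u_{it} \tag{10.3}$$
with $i=1,\dots,n$ and $t=1,\dots,T$. The $\alpha_i$ are entity-specific intercepts that capture heterogeneities across entities. An equivalent representation of this model is given by
$$Y_{it} = \beta_0 + \beta_1 X_{1,it} + \cdots + \beta_k X_{k,it} + \gamma_2 D2_i + \gamma_3 D3_i + \cdots + \gamma_n Dn_i + u_{it} \tag{10.4}$$
where the $D2_i, D3_i, \dots, Dn_i$ are dummy variables.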
Estimation and Inference
Software packages use a so-called “entity-demeaned” OLS algorithm which is computationally more efficient than estimating regression models with k+n regressors as needed for models (10.3) and (10.4).
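Taking averages over time on both sides of (10.1), separately for each entity $i$, we obtain
$$\frac{1}{T}\sum_{t=1}^{T} Y_{it} = \beta_1 \frac{1}{T}\sum_{t=1}^{T} X_{it} + \alpha_i + \frac{1}{T}\sum_{t=1}^{T} u_{it}, \quad \text{that is,} \quad \overline{Y}_i = \beta_1 \overline{X}_i + \alpha_i + \overline{u}_i.$$
Subtracting this from (10.1) yields
$$Y_{it} - \overline{Y}_i = \beta_1 (X_{it} - \overline{X}_i) + (u_{it} - \overline{u}_i), \quad \text{or} \quad \widetilde{Y}_{it} = \beta_1 \widetilde{X}_{it} + \widetilde{u}_{it}. \tag{10.5}$$
The entity-specific intercept $\alpha_i$ does not vary over time and therefore drops out. In this model, the OLS estimate of the parameter of interest $\beta_1$ is equal to the estimate obtained using (10.2), without the need to estimate $n-1$ dummies and an intercept.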
We conclude that there are two ways of estimating β1 in the fixed effects regression:
OLS of the dummy regression model as shown in (10.2)
OLS using the entity demeaned data as in (10.5)
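The two approaches listed above can be compared on a small simulated panel. The following is a minimal sketch, not part of the application in this chapter: the data are made up, and all object names (`entity`, `alpha`, `n_periods`, and so on) are hypothetical. It only illustrates that both routes return the same slope estimate.

```r
# Simulated balanced panel (hypothetical data, not the Fatalities data set)
set.seed(1)
n         <- 5                                   # number of entities
n_periods <- 4                                   # number of time periods
entity    <- factor(rep(seq_len(n), each = n_periods))

# entity-specific intercepts that are correlated with the regressor
alpha <- rep(rnorm(n, sd = 2), each = n_periods)
x     <- 0.5 * alpha + rnorm(n * n_periods)
y     <- alpha - 0.7 * x + rnorm(n * n_periods)

# (1) OLS on the dummy regression: constant plus n - 1 entity dummies
b_dummy <- coef(lm(y ~ x + entity))["x"]

# (2) OLS on entity-demeaned data, without an intercept
y_dm     <- y - ave(y, entity)
x_dm     <- x - ave(x, entity)
b_within <- coef(lm(y_dm ~ x_dm - 1))["x_dm"]

# both slope estimates agree up to numerical precision
all.equal(unname(b_dummy), unname(b_within))
```

The standard errors reported by `lm()` for the demeaned regression are not directly comparable to those of the dummy regression, because the demeaned fit does not account for the degrees of freedom used up by the entity means.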
Provided the fixed effects regression assumptions stated in Key Concept 10.3 hold, the sampling distribution of the OLS estimator in the fixed effects regression model is normal in large samples. The variance of the estimates can be estimated and we can compute standard errors, t-statistics and confidence intervals for coefficients. In the next section, we see how to estimate a fixed effects model using R and how to obtain a model summary that reports heteroskedasticity-robust standard errors. We leave aside complicated formulas of the estimators. See Chapter 10.5 and Appendix 10.2 of the book for a discussion of theoretical aspects.
Application to Traffic Deaths
Following Key Concept 10.2, the simple fixed effects model for estimation of the relation between traffic fatality rates and the beer taxes is
$$FatalityRate_{it} = \beta_1 BeerTax_{it} + StateFixedEffects + u_{it}, \tag{10.6}$$
a regression of the traffic fatality rate on beer tax and 48 binary regressors, one for each federal state.
We can simply use the function lm() to obtain an estimate of β1.
fatal_fe_lm_mod <- lm(fatal_rate ~ beertax + state - 1, data = Fatalities)
fatal_fe_lm_mod
##
## Call:
## lm(formula = fatal_rate ~ beertax + state - 1, data = Fatalities)
##
## Coefficients:
## beertax stateal stateaz statear stateca stateco statect statede
## -0.6559 3.4776 2.9099 2.8227 1.9682 1.9933 1.6154 2.1700
## statefl statega stateid stateil statein stateia stateks stateky
## 3.2095 4.0022 2.8086 1.5160 2.0161 1.9337 2.2544 2.2601
## statela stateme statemd statema statemi statemn statems statemo
## 2.6305 2.3697 1.7712 1.3679 1.9931 1.5804 3.4486 2.1814
## statemt statene statenv statenh statenj statenm stateny statenc
## 3.1172 1.9555 2.8769 2.2232 1.3719 3.9040 1.2910 3.1872
## statend stateoh stateok stateor statepa stateri statesc statesd
## 1.8542 1.8032 2.9326 2.3096 1.7102 1.2126 4.0348 2.4739
## statetn statetx stateut statevt stateva statewa statewv statewi
## 2.6020 2.5602 2.3137 2.5116 2.1874 1.8181 2.5809 1.7184
## statewy
## 3.2491
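The `- 1` in the formula above removes the common intercept, which is why the output shows one coefficient per state rather than a constant plus dummies for all but one state. A minimal sketch of this behaviour, using a made-up toy data frame (`toy`, `x` and `g` are hypothetical names chosen for illustration), inspects the regressor matrices that R builds for both formula variants:

```r
# Toy data: one numeric regressor and a factor with three groups
toy <- data.frame(x = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6),
                  g = factor(c("a", "a", "b", "b", "c", "c")))

# without a constant: one dummy column per group
X_no_const <- model.matrix(~ x + g - 1, data = toy)
colnames(X_no_const)
# "x"  "ga" "gb" "gc"

# with a constant: the first group becomes the base category
X_const <- model.matrix(~ x + g, data = toy)
colnames(X_const)
# "(Intercept)" "x" "gb" "gc"
```

Both variants span the same set of regressors, so the coefficient on `x` is the same in either case; only the interpretation of the group coefficients changes.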
As discussed in the previous section, it is also possible to estimate β1 by applying OLS to the demeaned data, that is, to run the regression
$$\widetilde{FatalityRate}_{it} = \beta_1 \widetilde{BeerTax}_{it} + \widetilde{u}_{it}.$$
# obtain demeaned data
Fatalities_demeaned <- with(Fatalities,
            data.frame(fatal_rate = fatal_rate - ave(fatal_rate, state),
                       beertax = beertax - ave(beertax, state)))
# estimate the regression
summary(lm(fatal_rate ~ beertax - 1, data = Fatalities_demeaned))
The function ave is convenient for computing group averages. We use it to obtain state specific averages of the fatality rate and the beer tax.
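To see what `ave()` returns, consider a minimal sketch with made-up numbers (the vectors `values` and `group` are hypothetical and unrelated to the Fatalities data). By default `ave()` replaces each observation by the mean of its group, so the result has the same length as the input:

```r
# Toy example: five observations in two groups
values <- c(1, 3, 10, 20, 30)
group  <- c("a", "a", "b", "b", "b")

# group means, repeated for every observation of the group
group_means <- ave(values, group)
group_means
# 2  2 20 20 20

# demeaned values: deviations from the respective group mean
demeaned <- values - group_means
demeaned
# -1   1 -10   0  10
```

Within each group the demeaned values sum to zero, which is the property the entity-demeaned regression relies on.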
Alternatively one may use plm() from the package with the same name.
# install and load the 'plm' package
## install.packages("plm")
library(plm)
As for lm(), we have to specify the regression formula and the data to be used in our call of plm(). Additionally, it is required to pass a vector of names of the entity and time ID variables to the argument index. For Fatalities, the ID variable for entities is named state and the time ID variable is year. Since the fixed effects estimator is also called the within estimator, we set model = "within". Finally, the function coeftest() from the package lmtest allows us to obtain inference based on robust standard errors.
# estimate the fixed effects regression with plm()
fatal_fe_mod <- plm(fatal_rate ~ beertax,
data = Fatalities,
index = c("state", "year"),
model = "within")
# print summary using robust standard errors
coeftest(fatal_fe_mod, vcov. = vcovHC, type = "HC1")
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## beertax -0.65587 0.28880 -2.271 0.02388 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The estimated coefficient is again −0.6559. Note that plm() uses the entity-demeaned OLS algorithm and thus does not report dummy coefficients. The estimated regression function is
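Although the entity-demeaned algorithm does not report the dummy coefficients, they can be recovered afterwards as the entity mean of the dependent variable minus the estimated slope times the entity mean of the regressor. The following base R sketch illustrates this on simulated data; the data and all object names (`entity`, `alpha_true`, `alpha_hat`, and so on) are hypothetical and not taken from the application above.

```r
# Simulated balanced panel (hypothetical data)
set.seed(42)
n          <- 6
n_periods  <- 5
entity     <- factor(rep(seq_len(n), each = n_periods))
alpha_true <- rep(rnorm(n), each = n_periods)
x          <- alpha_true + rnorm(n * n_periods)
y          <- alpha_true + 1.5 * x + rnorm(n * n_periods)

# within estimate of the slope from entity-demeaned data
y_dm <- y - ave(y, entity)
x_dm <- x - ave(x, entity)
b1   <- coef(lm(y_dm ~ x_dm - 1))[["x_dm"]]

# recover the entity intercepts from the entity means
alpha_hat <- as.numeric(tapply(y, entity, mean) - b1 * tapply(x, entity, mean))

# intercepts estimated directly by the dummy regression without a constant
alpha_dummy <- as.numeric(coef(lm(y ~ x + entity - 1))[-1])

# both sets of intercepts agree up to numerical precision
all.equal(alpha_hat, alpha_dummy)
```

For models estimated with plm(), the package provides the function fixef() for extracting the estimated fixed effects, which should serve the same purpose without the manual computation.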
$$\widehat{FatalityRate} = -\underset{(0.29)}{0.66} \times BeerTax + StateFixedEffects. \tag{10.7}$$
The coefficient on BeerTax is negative and significant. The interpretation is that the estimated reduction in traffic fatalities due to an increase in the real beer tax by $1 is 0.66 per 10,000 people, which is still pretty high. Although including state fixed effects eliminates the risk of a bias due to omitted factors that vary across states but not over time, we suspect that there are other omitted variables that vary over time and thus cause a bias.