Introduction to Econometrics with R

10.6 Drunk Driving Laws and Traffic Deaths

There are two major sources of omitted variable bias that none of the models of the relation between traffic fatalities and beer taxes considered so far accounts for: economic conditions and driving laws. Fortunately, Fatalities has data on the state-specific legal drinking age (drinkage), punishment (jail, service) and various economic indicators like the unemployment rate (unemp) and per capita income (income). We may use these covariates to extend the preceding analysis.

These covariates are defined as follows:

unemp: a numeric variable stating the state-specific unemployment rate.
log(income): the logarithm of real per capita income (in 1988 prices).
miles: the state average miles per driver.
drinkage: the state-specific minimum legal drinking age.
drinkagec: a discretized version of drinkage that classifies states into four categories of minimum legal drinking age: \(18\), \(19\), \(20\), and \(21\) and older. R denotes these as [18,19), [19,20), [20,21) and [21,22]. The categories are included as dummy regressors where [21,22] is chosen as the reference category.
punish: a dummy variable with levels yes and no that indicates whether drunk driving is severely punished by mandatory jail time or mandatory community service (first conviction).

First, we define the variables as they are used in the regressions presented in Table 10.1 of the book.

# discretize the minimum legal drinking age
Fatalities$drinkagec <- cut(Fatalities$drinkage,
                            breaks = 18:22, 
                            include.lowest = TRUE, 
                            right = FALSE)

# set minimum drinking age [21, 22] to be the baseline level
Fatalities$drinkagec <- relevel(Fatalities$drinkagec, "[21,22]")

# mandatory jail or community service?
Fatalities$punish <- with(Fatalities, factor(jail == "yes" | service == "yes", 
                                             labels = c("no", "yes")))

# the set of observations on all variables for 1982 and 1988
Fatalities_1982_1988 <- Fatalities[with(Fatalities, year == 1982 | year == 1988), ]
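
To see what the calls to cut(), relevel() and factor() above do, the following minimal sketch applies them to a small made-up vector. The objects age, jail and service are hypothetical toy data and are not taken from Fatalities; only base R is used.

```r
# toy drinking ages (hypothetical values, not taken from Fatalities)
age <- c(18, 18.5, 19, 20.7, 21, 21)

# right = FALSE gives left-closed intervals [a, b);
# include.lowest = TRUE additionally closes the last interval at 22
age_cat <- cut(age, breaks = 18:22, include.lowest = TRUE, right = FALSE)
levels(age_cat)

# move [21,22] to the front so that it becomes the reference category
age_cat <- relevel(age_cat, ref = "[21,22]")
levels(age_cat)

# a logical vector becomes a factor with FALSE -> "no" and TRUE -> "yes"
jail    <- c("no", "yes", "no", "no", "yes", "no")
service <- c("no", "no", "yes", "no", "yes", "no")
punish  <- factor(jail == "yes" | service == "yes", labels = c("no", "yes"))

# cross-tabulate the two derived factors
table(age_cat, punish)
```

Because the first factor level serves as the baseline in a regression, the dummies for the three remaining categories are measured relative to [21,22].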

Next, we estimate all seven models: the first by OLS using lm(), the remaining six using plm().

# load plm, which provides plm() (assumed to be installed)
library(plm)

# estimate all seven models
fatalities_mod1 <- lm(fatal_rate ~ beertax, data = Fatalities)

fatalities_mod2 <- plm(fatal_rate ~ beertax + state, data = Fatalities)

fatalities_mod3 <- plm(fatal_rate ~ beertax + state + year,
                       index = c("state","year"),
                       model = "within",
                       effect = "twoways", 
                       data = Fatalities)

fatalities_mod4 <- plm(fatal_rate ~ beertax + state + year + drinkagec 
                       + punish + miles + unemp + log(income), 
                       index = c("state", "year"),
                       model = "within",
                       effect = "twoways",
                       data = Fatalities)

fatalities_mod5 <- plm(fatal_rate ~ beertax + state + year + drinkagec 
                       + punish + miles,
                       index = c("state", "year"),
                       model = "within",
                       effect = "twoways",
                       data = Fatalities)

fatalities_mod6 <- plm(fatal_rate ~ beertax + year + drinkage 
                       + punish + miles + unemp + log(income), 
                       index = c("state", "year"),
                       model = "within",
                       effect = "twoways",
                       data = Fatalities)

fatalities_mod7 <- plm(fatal_rate ~ beertax + state + year + drinkagec 
                       + punish + miles + unemp + log(income), 
                       index = c("state", "year"),
                       model = "within",
                       effect = "twoways",
                       data = Fatalities_1982_1988)
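
The options model = "within" and effect = "twoways" request the fixed effects estimator with both entity and time effects. As a rough illustration of what this means, the sketch below uses simulated data (not Fatalities) and base R only. It compares OLS on two-way demeaned variables with a regression that includes a full set of state and year dummies. All object names (toy, twoway_demean and so on) are made up for this example, and the demeaning formula used here assumes a balanced panel.

```r
set.seed(1)

# simulated balanced panel: 6 entities observed in 5 periods (toy data)
n_states <- 6
n_years  <- 5
toy <- expand.grid(state = factor(1:n_states), year = factor(1:n_years))

toy$x  <- rnorm(nrow(toy))
alpha  <- rnorm(n_states)[as.integer(toy$state)]  # entity fixed effects
lambda <- rnorm(n_years)[as.integer(toy$year)]    # time fixed effects
toy$y  <- -0.5 * toy$x + alpha + lambda + rnorm(nrow(toy), sd = 0.1)

# two-way within transformation for a balanced panel:
# subtract entity means and time means, add back the overall mean
twoway_demean <- function(v, entity, time) {
  v - ave(v, entity) - ave(v, time) + mean(v)
}

toy$y_dm <- twoway_demean(toy$y, toy$state, toy$year)
toy$x_dm <- twoway_demean(toy$x, toy$state, toy$year)

# (a) OLS on the demeaned data, without an intercept
beta_within <- coef(lm(y_dm ~ x_dm - 1, data = toy))[["x_dm"]]

# (b) OLS with entity and time dummies (least squares dummy variables)
beta_lsdv <- coef(lm(y ~ x + state + year, data = toy))[["x"]]

# both approaches yield the same slope estimate
all.equal(beta_within, beta_lsdv)
```

The slope estimates agree, which is why the state and year terms do not show up as separate coefficients in the plm() output.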

We again use stargazer() (Hlavac, 2018) to generate a comprehensive tabular presentation of the results.

library(stargazer)

# gather robust standard errors in a list
# (for the plm fits, vcovHC() clusters by state by default)
rob_se <- list(sqrt(diag(vcovHC(fatalities_mod1, type = "HC1"))),
               sqrt(diag(vcovHC(fatalities_mod2, type = "HC1"))),
               sqrt(diag(vcovHC(fatalities_mod3, type = "HC1"))),
               sqrt(diag(vcovHC(fatalities_mod4, type = "HC1"))),
               sqrt(diag(vcovHC(fatalities_mod5, type = "HC1"))),
               sqrt(diag(vcovHC(fatalities_mod6, type = "HC1"))),
               sqrt(diag(vcovHC(fatalities_mod7, type = "HC1"))))

# generate the table
stargazer(fatalities_mod1, fatalities_mod2, fatalities_mod3, 
          fatalities_mod4, fatalities_mod5, fatalities_mod6, fatalities_mod7, 
          digits = 3,
          header = FALSE,
          type = "latex", 
          se = rob_se,
          title = "Linear Panel Regression Models of Traffic Fatalities due to Drunk Driving",
          model.numbers = FALSE,
          column.labels = c("(1)", "(2)", "(3)", "(4)", "(5)", "(6)", "(7)"))
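
For the OLS fit in column (1), type = "HC1" refers to the heteroskedasticity-robust covariance estimator with the degrees of freedom correction \(n/(n-k)\). The sketch below computes it by hand for a simulated regression; the data and all object names are assumptions made for illustration. The estimator used for the plm() fits additionally accounts for clustering within entities and is not reproduced here.

```r
set.seed(42)

# simulated data with heteroskedastic errors (toy example)
n <- 200
x <- runif(n, 0, 3)
y <- 1 + 0.5 * x + rnorm(n, sd = 0.2 + 0.5 * x)
fit <- lm(y ~ x)

X <- model.matrix(fit)  # n x k regressor matrix
e <- residuals(fit)     # OLS residuals
k <- ncol(X)

bread <- solve(crossprod(X))  # (X'X)^{-1}
meat  <- crossprod(X * e)     # X' diag(e^2) X

# HC1: sandwich estimator scaled by n / (n - k)
vcov_hc1 <- n / (n - k) * bread %*% meat %*% bread
se_hc1   <- sqrt(diag(vcov_hc1))

# compare with the classical (homoskedasticity-only) standard errors
cbind(classical = sqrt(diag(vcov(fit))), HC1 = se_hc1)
```

If the sandwich package is installed, sqrt(diag(sandwich::vcovHC(fit, type = "HC1"))) should return the same values as se_hc1.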


	Dependent Variable: Fatality Rate

	fatal_rate
	OLS	panel linear
	(1)	(2)	(3)	(4)	(5)	(6)	(7)

beertax	0.365^***	-0.656^**	-0.640^*	-0.445	-0.690^**	-0.456	-0.926^***
	(0.053)	(0.289)	(0.350)	(0.291)	(0.345)	(0.301)	(0.337)

drinkagec[18,19)				0.028	-0.010		0.037
				(0.068)	(0.081)		(0.101)

drinkagec[19,20)				-0.018	-0.076		-0.065
				(0.049)	(0.066)		(0.097)

drinkagec[20,21)				0.032	-0.100^*		-0.113
				(0.050)	(0.055)		(0.123)

drinkage						-0.002
						(0.021)

punishyes				0.038	0.085	0.039	0.089
				(0.101)	(0.109)	(0.101)	(0.161)

miles				0.00001	0.00002^*	0.00001	0.0001^***
				(0.00001)	(0.00001)	(0.00001)	(0.00005)

unemp				-0.063^***		-0.063^***	-0.091^***
				(0.013)		(0.013)	(0.021)

log(income)				1.816^***		1.786^***	0.996
				(0.624)		(0.631)	(0.666)

Constant	1.853^***
	(0.047)


Observations	336	336	336	335	335	335	95
R²	0.093	0.041	0.036	0.360	0.066	0.357	0.659
Adjusted R²	0.091	-0.120	-0.149	0.217	-0.134	0.219	0.157
Residual Std. Error	0.544 (df = 334)
F Statistic	34.394^*** (df = 1; 334)	12.190^*** (df = 1; 287)	10.513^*** (df = 1; 281)	19.194^*** (df = 8; 273)	3.252^*** (df = 6; 275)	25.423^*** (df = 6; 275)	9.194^*** (df = 8; 38)

Note:	^*p<0.1; ^**p<0.05; ^***p<0.01

Table 10.1: Linear Panel Regression Models of Traffic Fatalities due to Drunk Driving
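
The significance stars in Table 10.1 can be checked from the reported estimates and standard errors. The sketch below does this for the coefficient on beertax in all seven columns. It assumes two-sided tests based on the standard normal distribution, which is a large-sample approximation and not necessarily the exact rule stargazer() applies. The numbers are copied from the table; the object names are made up for this example.

```r
# beertax estimates and robust standard errors from Table 10.1
est <- c(0.365, -0.656, -0.640, -0.445, -0.690, -0.456, -0.926)
se  <- c(0.053,  0.289,  0.350,  0.291,  0.345,  0.301,  0.337)

# t-statistics and two-sided p-values (normal approximation)
t_stat <- est / se
p_val  <- 2 * pnorm(-abs(t_stat))

# map p-values to the star convention stated in the table note
stars <- as.character(cut(p_val,
                          breaks = c(0, 0.01, 0.05, 0.1, 1),
                          labels = c("***", "**", "*", ""),
                          include.lowest = TRUE))

data.frame(model = paste0("(", 1:7, ")"),
           estimate = est,
           se = se,
           t = round(t_stat, 2),
           p = round(p_val, 3),
           stars = stars)
```

Under this approximation the stars agree with those printed in the beertax row of the table.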

While columns (2) and (3) recap the results (10.7) and (10.8), column (1) presents an estimate of the coefficient of interest in the naive OLS regression of the fatality rate on beer tax without any fixed effects. We obtain a positive estimate for the coefficient on beer tax that is likely to be upward biased. The model fit is rather poor, too (\(\bar{R}^2 = 0.091\)). The sign of the estimate changes as we extend the model by entity fixed effects in model (2) and by both entity and time fixed effects in model (3). Furthermore, \(\bar{R}^2\) increases substantially as fixed effects are included in the model equation. Note, however, that the \(R^2\) and \(\bar{R}^2\) values shown for the plm() models in Table 10.1 appear to be computed on the within-transformed data, so they do not reflect this increase and cannot be compared directly with column (1). Nonetheless, as discussed before, the magnitudes of both estimates may be too large.

The model specifications (4) to (7) include covariates that are meant to capture the effect of overall state economic conditions as well as the legal framework. Considering (4) as the baseline specification, we observe four interesting results:

Including the covariates does not lead to a major reduction of the estimated effect of the beer tax (\(-0.445\) in (4) compared to \(-0.640\) in (3)). However, the coefficient is not significantly different from zero at the \(5\%\) level as the estimate is rather imprecise.
There is no evidence that the minimum legal drinking age affects traffic fatalities: none of the coefficients on the three dummy variables is significantly different from zero at any common level of significance. Moreover, an \(F\)-test of the joint hypothesis that all three coefficients are zero does not reject. The next code chunk shows how to test this hypothesis.

# test if legal drinking age has no explanatory power
linearHypothesis(fatalities_mod4,
                 test = "F",
                 c("drinkagec[18,19)=0", "drinkagec[19,20)=0", "drinkagec[20,21)=0"), 
                 vcov. = vcovHC, type = "HC1")

## Linear hypothesis test
## 
## Hypothesis:
## drinkagec[18,19) = 0
## drinkagec[19,20) = 0
## drinkagec[20,21) = 0
## 
## Model 1: restricted model
## Model 2: fatal_rate ~ beertax + state + year + drinkagec + punish + miles + 
##     unemp + log(income)
## 
## Note: Coefficient covariance matrix supplied.
## 
##   Res.Df Df      F Pr(>F)
## 1    276                 
## 2    273  3 0.3782 0.7688

There is no evidence that punishment for first offenders has a deterrent effect on drunk driving: the corresponding coefficient is not significant at the \(10\%\) level.
The economic variables significantly explain traffic fatalities. We can check that the coefficients on the unemployment rate and per capita income are jointly significant at the \(0.1\%\) level.

# test if economic indicators have no explanatory power
linearHypothesis(fatalities_mod4, 
                 test = "F",
                 c("log(income)", "unemp"), 
                 vcov. = vcovHC, type = "HC1")

## Linear hypothesis test
## 
## Hypothesis:
## log(income) = 0
## unemp = 0
## 
## Model 1: restricted model
## Model 2: fatal_rate ~ beertax + state + year + drinkagec + punish + miles + 
##     unemp + log(income)
## 
## Note: Coefficient covariance matrix supplied.
## 
##   Res.Df Df      F    Pr(>F)    
## 1    275                        
## 2    273  2 31.577 4.609e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
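
The p-values in the two test outputs above follow from the reported \(F\)-statistics and their degrees of freedom. As a minimal sketch in base R, they can be recomputed with pf(). The statistics are copied from the outputs and are rounded, so small deviations in the last digits are possible; the object names are made up for this example.

```r
# F statistics and degrees of freedom copied from the two outputs above
# drinking age dummies: q = 3 restrictions, 273 residual degrees of freedom
p_drinkage <- pf(0.3782, df1 = 3, df2 = 273, lower.tail = FALSE)

# economic indicators: q = 2 restrictions, 273 residual degrees of freedom
p_economic <- pf(31.577, df1 = 2, df2 = 273, lower.tail = FALSE)

c(drinking_age = p_drinkage, economic = p_economic)

# decisions at the significance levels mentioned in the text
p_drinkage < 0.10   # is the drinking age hypothesis rejected at 10%?
p_economic < 0.001  # is the economic hypothesis rejected at 0.1%?
```

The upper tail probability is used because large values of the \(F\)-statistic speak against the null hypothesis.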

Model (5) omits the economic factors. The coefficient on beer tax changes from \(-0.445\) in (4) to \(-0.690\) in (5). This sensitivity to the inclusion of the economic indicators supports the notion that they should remain in the model.

Results for model (6) demonstrate that the legal drinking age has little explanatory power and that the coefficient of interest is not sensitive to changes in the functional form of the relation between drinking age and traffic fatalities (\(-0.456\) in (6) compared to \(-0.445\) in (4)).

Specification (7) reveals that reducing the amount of available information (we only use the 95 observations for the years 1982 and 1988 here) inflates standard errors but does not lead to drastic changes in coefficient estimates.

Summary

We have not found evidence that severe punishments and increasing the minimum drinking age reduce traffic fatalities due to drunk driving. Nonetheless, there seems to be a negative effect of alcohol taxes on traffic fatalities. This effect is estimated imprecisely, however, and cannot be interpreted as the causal effect of interest as there still may be a bias. The issue is that there may be omitted variables that differ across states and change over time. The resulting bias remains even though we use a panel approach that controls for entity-specific, time-invariant unobservables and for time effects common to all states.

A powerful method that can be used if common panel regression approaches fail is instrumental variables regression. We will return to this concept in Chapter 12.

References

Hlavac, M. (2018). stargazer: Well-Formatted Regression and Summary Statistics Tables (Version 5.2.2). Retrieved from https://CRAN.R-project.org/package=stargazer