30.4 Empirical Studies
30.4.1 Example: The Unintended Consequences of “Ban the Box” Policies
Doleac and Hansen (2020) examine the unintended effects of “Ban the Box” (BTB) policies, which prevent employers from asking about criminal records during the hiring process. The intended goal of BTB was to increase job access for individuals with criminal records. However, the study found that employers, unable to observe criminal history, resorted to statistical discrimination based on race, leading to unintended negative consequences.
Three Types of “Ban the Box” Policies:
- Public employers only
- Private employers with government contracts
- All employers
Identification Strategy
- If any county within a Metropolitan Statistical Area (MSA) adopts BTB, the entire MSA is considered treated.
- If a state adopts BTB statewide, all counties in that state are treated; conversely, if a state prohibits local BTB ordinances (a "ban the ban" law), its counties are untreated.
The basic DiD model is:
$$
Y_{it} = \beta_0 + \beta_1 Post_t + \beta_2 Treat_i + \beta_3 (Post_t \times Treat_i) + \epsilon_{it}
$$
where:
- $Y_{it}$ = employment outcome for individual $i$ at time $t$
- $Post_t$ = indicator for the post-treatment period
- $Treat_i$ = indicator for treated MSAs
- $\beta_3$ = the DiD coefficient, capturing the effect of BTB on employment
- $\epsilon_{it}$ = error term
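The two-period model above can be illustrated on simulated data. This is a minimal sketch, not the paper's estimation: the data-generating process, sample size, and true effect of 2.0 are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated 2x2 DiD: hypothetical data with a true treatment effect of 2.0
n = 4000
treat = rng.integers(0, 2, n)   # Treat_i: 1 if in a treated MSA
post = rng.integers(0, 2, n)    # Post_t: 1 in the post-treatment period
y = 1.0 + 0.5 * post + 0.3 * treat + 2.0 * post * treat + rng.normal(0, 1, n)

# OLS: Y = b0 + b1*Post + b2*Treat + b3*(Post x Treat) + e
X = np.column_stack([np.ones(n), post, treat, post * treat])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta[3])  # the DiD coefficient, close to the true effect of 2.0
```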
Limitations: If different locations adopt BTB at different times, this model is not valid due to staggered treatment timing.
For settings where different MSAs adopt BTB at different times, we use a staggered DiD approach:
$$
E_{imrt} = \alpha + \beta_1 BTB_{mt} W_{imt} + \beta_2 BTB_{mt} B_{imt} + \beta_3 BTB_{mt} H_{imt} + \delta_m + D_{imt}\beta_5 + \lambda_{rt} + (\delta_m \times f(t))\beta_7 + e_{imrt}
$$
where:
- $i$ = individual, $m$ = MSA, $r$ = region (e.g., Midwest, South), $t$ = year
- $W_{imt}$, $B_{imt}$, $H_{imt}$ = indicators for White, Black, and Hispanic individuals
- $BTB_{mt}$ = Ban the Box policy indicator for MSA $m$ at time $t$
- $\delta_m$ = MSA fixed effect
- $D_{imt}$ = individual-level controls
- $\lambda_{rt}$ = region-by-time fixed effect
- $\delta_m \times f(t)$ = MSA-specific linear time trend
Fixed Effects Considerations:
- Including $\lambda_r$ and $\lambda_t$ separately controls only for region-level averages and common time shocks.
- Using $\lambda_{rt}$ is more granular: it allows each region to follow its own time path.
To estimate the effects for Black men specifically, the model simplifies to:
$$
E_{imrt} = \alpha + BTB_{mt}\beta_1 + \delta_m + D_{imt}\beta_5 + \lambda_{rt} + (\delta_m \times f(t))\beta_7 + e_{imrt}
$$
To check for pre-trends and dynamic effects, we estimate:
$$
E_{imrt} = \alpha + BTB_{m(t-3)}\theta_1 + BTB_{m(t-2)}\theta_2 + BTB_{m(t-1)}\theta_3 + BTB_{mt}\theta_4 + BTB_{m(t+1)}\theta_5 + BTB_{m(t+2)}\theta_6 + BTB_{m(t+3)}\theta_7 + \delta_m + D_{imt}\beta_5 + \lambda_{rt} + (\delta_m \times f(t))\beta_7 + e_{imrt}
$$
Key points:
- Leave out $BTB_{m(t-1)}\theta_3$ as the reference category (to avoid perfect collinearity).
- If the pre-adoption coefficients $\theta_1$ or $\theta_2$ differ significantly from zero (the omitted reference), this signals pre-trend problems, possibly reflecting anticipatory behavior before BTB implementation.
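The leads-and-lags check can be demonstrated on simulated data. This is a hedged sketch, not the paper's data or code: a hypothetical policy with no pre-trend and a constant post-adoption effect of 1.5, with the period just before adoption omitted as the reference.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated event study: 150 treated and 150 control units over t = -3..3;
# the true effect is 0 before adoption and 1.5 from t = 0 onward (hypothetical)
rows, ys = [], []
for u in range(300):
    treated = u < 150
    for t in range(-3, 4):
        effect = 1.5 if (treated and t >= 0) else 0.0
        rows.append((treated, t))
        ys.append(effect + rng.normal(0, 1))

# Event-time dummies for treated units, omitting t = -1 as the reference
event_times = [-3, -2, 0, 1, 2, 3]
X = np.ones((len(ys), 1 + len(event_times)))
for j, et in enumerate(event_times, start=1):
    X[:, j] = [1.0 if (tr and t == et) else 0.0 for tr, t in rows]
theta, *_ = np.linalg.lstsq(X, np.array(ys), rcond=None)

# Pre-adoption coefficients (t = -3, -2) should be near 0 (no pre-trend);
# post-adoption coefficients (t >= 0) should be near 1.5
print(dict(zip(event_times, theta[1:].round(2))))
```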
30.4.2 Example: Minimum Wage and Employment
Card and Krueger (1993) famously studied the effect of an increase in the minimum wage on employment, challenging the traditional economic view that higher wages reduce employment.
- Philipp Leppert provides an R-based replication.
- Original datasets are available at David Card’s website.
Setting
- Treatment group: New Jersey (NJ), which increased its minimum wage.
- Control group: Pennsylvania (PA), which did not change its minimum wage.
- Outcome variable: Employment levels in fast-food restaurants.
The study used a Difference-in-Differences approach to estimate the impact:
| | State | After (Post) | Before (Pre) | Difference |
|---|---|---|---|---|
| Treatment | NJ | A | B | A − B |
| Control | PA | C | D | C − D |
| Difference | | A − C | B − D | (A − B) − (C − D) |
where:
- A−B captures the treatment effect plus general time trends.
- C−D captures only the general time trends.
- (A−B)−(C−D) isolates the causal effect of the minimum wage increase.
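The double difference is just arithmetic on the four cell means. The numbers below are purely illustrative placeholders, not the study's actual employment figures.

```python
# Hypothetical cell means for the 2x2 table (illustrative values only)
A = 21.0  # NJ, after
B = 20.4  # NJ, before
C = 21.2  # PA, after
D = 23.3  # PA, before

treatment_change = A - B            # treatment effect + common time trend
control_change = C - D              # common time trend only
did_estimate = treatment_change - control_change
print(did_estimate)                 # (A - B) - (C - D)
```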
For the DiD estimator to be valid, the following conditions must hold:
- Parallel Trends Assumption
- The employment trends in NJ and PA would have been the same in the absence of the policy change.
- Pre-treatment employment trends should be similar between the two states.
- No “Switchers”
- The policy must not induce restaurants to switch locations between NJ and PA (e.g., a restaurant relocating across the border).
- PA as a Valid Counterfactual
- PA represents what NJ would have looked like had it not changed the minimum wage.
- The study focuses on bordering counties to increase comparability.
The main regression specification is:
$$
Y_{jt} = \beta_0 + NJ_j \beta_1 + POST_t \beta_2 + (NJ_j \times POST_t)\beta_3 + X_{jt}\beta_4 + \epsilon_{jt}
$$
where:
- $Y_{jt}$ = employment in restaurant $j$ at time $t$
- $NJ_j$ = 1 if the restaurant is in NJ, 0 if in PA
- $POST_t$ = 1 in the post-policy period, 0 in the pre-policy period
- $(NJ_j \times POST_t)$ = DiD interaction term, capturing the causal effect of NJ's minimum wage increase
- $X_{jt}$ = additional controls (optional)
- $\epsilon_{jt}$ = error term
Notes on Model Specification
$\beta_3$ (the DiD coefficient) is the key parameter of interest, representing the causal impact of the policy.
$\beta_4$ (on the controls $X_{jt}$) is not necessary for unbiasedness but improves efficiency.
If we difference out the pre-period ($\Delta Y_j = Y_{j,Post} - Y_{j,Pre}$), the model simplifies to:
$$
\Delta Y_j = \alpha + NJ_j \beta_1 + \epsilon_j
$$
Here $\beta_1$ takes over the role of the DiD coefficient, and we no longer need $\beta_2$ for the post-treatment period because differencing has removed the common time component.
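The equivalence between the differenced specification and the full interaction model can be checked on simulated data. This is a sketch with an invented data-generating process (true effect 1.2), not the Card and Krueger data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical balanced panel: one pre and one post observation per
# restaurant, with a true DiD effect of 1.2 (invented for illustration)
n = 1000
nj = rng.integers(0, 2, n)                                # NJ_j indicator
y_pre = 20 + 0.5 * nj + rng.normal(0, 1, n)
y_post = y_pre + 0.8 + 1.2 * nj + rng.normal(0, 0.5, n)

# Differenced model: delta Y_j = alpha + NJ_j * b1
dy = y_post - y_pre
b_diff, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), nj]), dy, rcond=None)

# Full interaction model on the stacked panel
y = np.concatenate([y_pre, y_post])
post = np.concatenate([np.zeros(n), np.ones(n)])
nj2 = np.concatenate([nj, nj])
X = np.column_stack([np.ones(2 * n), nj2, post, nj2 * post])
b_full, *_ = np.linalg.lstsq(X, y, rcond=None)

# With a balanced panel, the two estimates of the DiD effect coincide
print(b_diff[1], b_full[3])
```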
An alternative specification uses high-wage NJ restaurants as a control group, arguing that they were not affected by the minimum wage increase. However:
- This approach eliminates cross-state differences, but
- It may be harder to interpret causality, as the control group is not entirely untreated.
A common misconception in DiD is that treatment and control groups must have the same baseline levels of the dependent variable (e.g., employment levels). However:
- DiD only requires parallel trends, meaning the slopes of employment changes should be the same pre-treatment.
- If pre-treatment trends diverge, this threatens validity.
- If post-treatment trends converge, it may suggest policy effects rather than pre-trend violations.
Is Parallel Trends a Necessary or Sufficient Condition?
- Parallel pre-trends are not sufficient: even if pre-treatment trends are parallel, confounders arising after treatment can still bias the estimate.
- Nor are they strictly necessary: the identifying assumption concerns counterfactual post-treatment trends, which could hold even when observed pre-trends diverge.
Thus, we cannot prove a DiD design is valid; we can only present evidence consistent with its assumptions.
30.4.3 Example: The Effects of Grade Policies on Major Choice
Butcher, McEwan, and Weerapana (2014) investigate how grading policies influence students’ major choices. The central theory is that grading standards vary by discipline, which affects students’ decisions.
Why do the highest-achieving students often major in hard sciences?
- Grading Practices Differ Across Majors
- In STEM fields, grading is often stricter, meaning professors are less likely to give students the benefit of the doubt.
- In contrast, softer disciplines (e.g., humanities) may have more lenient grading, making students’ experiences more pleasant.
- Labor Market Incentives
- Degrees with lower market value (e.g., humanities) might compensate by offering a more pleasant academic experience.
- STEM degrees tend to be more rigorous but provide higher job market returns.
To examine how grades influence major selection, the study first estimates an OLS model:
$$
E_{ij} = \beta_0 + X_i \beta_1 + G_j \beta_2 + \epsilon_{ij}
$$
where:
- $E_{ij}$ = indicator for whether student $i$ chooses major $j$
- $X_i$ = student-level attributes (e.g., SAT scores, demographics)
- $G_j$ = average grade in major $j$
- $\beta_2$ = key coefficient, capturing how grading standards influence major choice
Potential Biases in $\hat{\beta}_2$:
- Negative bias:
  - Departments with lower enrollment may offer higher grades to attract students.
  - This endogenous response leads to a downward bias in the OLS estimate.
- Positive bias:
  - STEM majors attract the strongest students, so their grades would naturally be higher if ability were controlled.
  - If ability is not fully accounted for, $\hat{\beta}_2$ may be biased upward.
To address potential endogeneity in OLS, the study uses a difference-in-differences approach:
$$
Y_{idt} = \beta_0 + POST_t \beta_1 + Treat_d \beta_2 + (POST_t \times Treat_d)\beta_3 + X_{idt} + \epsilon_{idt}
$$
where:
- $Y_{idt}$ = average grade in department $d$ at time $t$ for student $i$
- $POST_t$ = 1 in the post-policy period, 0 otherwise
- $Treat_d$ = 1 if the department is treated (i.e., its grade policy changed), 0 otherwise
- $(POST_t \times Treat_d)$ = DiD interaction term, capturing the causal effect of grade policy changes on major choice
- $X_{idt}$ = additional student controls
| Group | Intercept ($\beta_0$) | Treatment ($\beta_2$) | Post ($\beta_1$) | Interaction ($\beta_3$) |
|---|---|---|---|---|
| Treated, Pre | 1 | 1 | 0 | 0 |
| Treated, Post | 1 | 1 | 1 | 1 |
| Control, Pre | 1 | 0 | 0 | 0 |
| Control, Post | 1 | 0 | 1 | 0 |
- The average pre-period outcome for the control group is given by $\beta_0$.
- The key coefficient of interest is $\beta_3$, which captures the post-treatment change for the treated group over and above the change for the control group.
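The table's logic can be verified mechanically: plugging the indicator patterns into the regression function and double-differencing the four group means recovers $\beta_3$ exactly. The coefficient values below are arbitrary, chosen only for illustration.

```python
# Hypothetical coefficient values (arbitrary, for illustration only)
b0, b1, b2, b3 = 2.0, 0.3, 0.4, 0.9

def group_mean(treat, post):
    """Expected outcome implied by the dummy-variable table."""
    return b0 + b2 * treat + b1 * post + b3 * treat * post

treated_change = group_mean(1, 1) - group_mean(1, 0)   # b1 + b3
control_change = group_mean(0, 1) - group_mean(0, 0)   # b1
did = treated_change - control_change
print(did)  # recovers b3 (0.9, up to floating point)
```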
A more flexible specification includes fixed effects:
$$
Y_{idt} = \alpha_0 + (POST_t \times Treat_d)\alpha_1 + \theta_d + \delta_t + X_{idt} + u_{idt}
$$
where:
- $\theta_d$ = department fixed effects (absorbing $Treat_d$)
- $\delta_t$ = time fixed effects (absorbing $POST_t$)
- $\alpha_1$ = effect of the policy change (equivalent to $\beta_3$ in the simpler model)
Why Use Fixed Effects?
- More flexible specification:
- Instead of assuming a uniform treatment effect across groups, this model allows for department-specific differences (θd) and time-specific shocks (δt).
- More precise estimates:
  - Fixed effects absorb department- and time-specific variation that would otherwise end up in the error term, reducing residual variance and making the estimate of $\alpha_1$ more efficient.
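A two-way fixed effects version of this regression can be sketched on simulated data. Everything below (number of departments, years, and the effect size of 1.0) is invented for illustration; the fixed effects are implemented with dummy variables.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical panel: 20 departments x 6 years; treated departments
# change their grade policy from year 3 onward (true effect 1.0)
n_dep, n_year = 20, 6
dep_fe = rng.normal(0, 2, n_dep)    # theta_d
year_fe = rng.normal(0, 1, n_year)  # delta_t
rows = []
for d in range(n_dep):
    treated = d < 10
    for t in range(n_year):
        policy = 1.0 if (treated and t >= 3) else 0.0
        y = dep_fe[d] + year_fe[t] + 1.0 * policy + rng.normal(0, 0.3)
        rows.append((d, t, policy, y))

# TWFE via dummies: intercept, policy, department dummies (drop d = 0),
# year dummies (drop t = 0)
Y = np.array([r[3] for r in rows])
D = np.zeros((len(rows), 2 + (n_dep - 1) + (n_year - 1)))
for i, (d, t, policy, _) in enumerate(rows):
    D[i, 0] = 1.0
    D[i, 1] = policy
    if d > 0:
        D[i, 1 + d] = 1.0
    if t > 0:
        D[i, 1 + (n_dep - 1) + t] = 1.0
alpha, *_ = np.linalg.lstsq(D, Y, rcond=None)
print(alpha[1])  # alpha_1, close to the true effect of 1.0
```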
Interpretation of Results
- If $\alpha_1 > 0$, the policy increased grades in treated departments.
- If $\alpha_1 < 0$, the policy decreased grades in treated departments.