30.4 Empirical Studies
30.4.1 Example: The Unintended Consequences of “Ban the Box” Policies
Doleac and Hansen (2020) examine the unintended effects of “Ban the Box” (BTB) policies, which prevent employers from asking about criminal records during the hiring process. The intended goal of BTB was to increase job access for individuals with criminal records. However, the study found that employers, unable to observe criminal history, resorted to statistical discrimination based on race, leading to unintended negative consequences.
Three Types of “Ban the Box” Policies:
- Public employers only
- Private employers with government contracts
- All employers
Identification Strategy
- If any county within a Metropolitan Statistical Area (MSA) adopts BTB, the entire MSA is considered treated.
- If a state adopts BTB statewide, all counties in that state are treated; conversely, if a state prohibits local BTB ordinances (a "ban the ban" law), its counties are untreated.
The basic DiD model is:
$$
Y_{it} = \beta_0 + \beta_1 Post_t + \beta_2 Treat_i + \beta_3 (Post_t \times Treat_i) + \epsilon_{it}
$$
where:
- $Y_{it}$ = employment outcome for individual $i$ at time $t$
- $Post_t$ = indicator for the post-treatment period
- $Treat_i$ = indicator for treated MSAs
- $\beta_3$ = the DiD coefficient, capturing the effect of BTB on employment
- $\epsilon_{it}$ = error term
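The two-period model above can be illustrated on simulated data. This is a minimal sketch, not the paper's estimation: the data-generating process, sample size, and true effect of 2.0 are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated 2x2 DiD: hypothetical data with a true treatment effect of 2.0
n = 4000
treat = rng.integers(0, 2, n)   # Treat_i: 1 if in a treated MSA
post = rng.integers(0, 2, n)    # Post_t: 1 in the post-treatment period
y = 1.0 + 0.5 * post + 0.3 * treat + 2.0 * post * treat + rng.normal(0, 1, n)

# OLS: Y = b0 + b1*Post + b2*Treat + b3*(Post x Treat) + e
X = np.column_stack([np.ones(n), post, treat, post * treat])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta[3])  # the DiD coefficient, close to the true effect of 2.0
```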
Limitations: If different locations adopt BTB at different times, this model is not valid due to staggered treatment timing.
For settings where different MSAs adopt BTB at different times, we use a staggered DiD approach:
$$
E_{imrt} = \alpha + \beta_1 BTB_{mt} W_{imt} + \beta_2 BTB_{mt} B_{imt} + \beta_3 BTB_{mt} H_{imt} + \delta_m + D_{imt}\beta_5 + \lambda_{rt} + (\delta_m \times f(t))\beta_7 + e_{imrt}
$$
where:
- $i$ = individual, $m$ = MSA, $r$ = region (e.g., Midwest, South), $t$ = year
- $W_{imt}$, $B_{imt}$, $H_{imt}$ = indicators for White, Black, and Hispanic individuals
- $BTB_{mt}$ = Ban the Box policy indicator for MSA $m$ at time $t$
- $\delta_m$ = MSA fixed effect
- $D_{imt}$ = individual-level controls
- $\lambda_{rt}$ = region-by-time fixed effect
- $\delta_m \times f(t)$ = MSA-specific linear time trend
Fixed Effects Considerations:
- Including $\lambda_r$ and $\lambda_t$ separately controls only for region-level averages and common time shocks.
- Using $\lambda_{rt}$ is more granular: it allows each region to follow its own time path.
To estimate the effects for Black men specifically, the model simplifies to:
$$
E_{imrt} = \alpha + BTB_{mt}\beta_1 + \delta_m + D_{imt}\beta_5 + \lambda_{rt} + (\delta_m \times f(t))\beta_7 + e_{imrt}
$$
To check for pre-trends and dynamic effects, we estimate:
$$
E_{imrt} = \alpha + BTB_{m(t-3)}\theta_1 + BTB_{m(t-2)}\theta_2 + BTB_{m(t-1)}\theta_3 + BTB_{mt}\theta_4 + BTB_{m(t+1)}\theta_5 + BTB_{m(t+2)}\theta_6 + BTB_{m(t+3)}\theta_7 + \delta_m + D_{imt}\beta_5 + \lambda_{rt} + (\delta_m \times f(t))\beta_7 + e_{imrt}
$$
Key points:
- Leave out $BTB_{m(t-1)}\theta_3$ as the reference category (to avoid perfect collinearity).
- If the pre-adoption coefficients $\theta_1$ or $\theta_2$ differ significantly from zero (the omitted reference), this signals pre-trend problems, possibly reflecting anticipatory behavior before BTB implementation.
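The leads-and-lags check can be demonstrated on simulated data. This is a hedged sketch, not the paper's data or code: a hypothetical policy with no pre-trend and a constant post-adoption effect of 1.5, with the period just before adoption omitted as the reference.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated event study: 150 treated and 150 control units over t = -3..3;
# the true effect is 0 before adoption and 1.5 from t = 0 onward (hypothetical)
rows, ys = [], []
for u in range(300):
    treated = u < 150
    for t in range(-3, 4):
        effect = 1.5 if (treated and t >= 0) else 0.0
        rows.append((treated, t))
        ys.append(effect + rng.normal(0, 1))

# Event-time dummies for treated units, omitting t = -1 as the reference
event_times = [-3, -2, 0, 1, 2, 3]
X = np.ones((len(ys), 1 + len(event_times)))
for j, et in enumerate(event_times, start=1):
    X[:, j] = [1.0 if (tr and t == et) else 0.0 for tr, t in rows]
theta, *_ = np.linalg.lstsq(X, np.array(ys), rcond=None)

# Pre-adoption coefficients (t = -3, -2) should be near 0 (no pre-trend);
# post-adoption coefficients (t >= 0) should be near 1.5
print(dict(zip(event_times, theta[1:].round(2))))
```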
30.4.2 Example: Minimum Wage and Employment
Card and Krueger (1993) famously studied the effect of an increase in the minimum wage on employment, challenging the traditional economic view that higher wages reduce employment.
- Philipp Leppert provides an R-based replication.
- Original datasets are available at David Card’s website.
Setting
- Treatment group: New Jersey (NJ), which increased its minimum wage.
- Control group: Pennsylvania (PA), which did not change its minimum wage.
- Outcome variable: Employment levels in fast-food restaurants.
The study used a Difference-in-Differences approach to estimate the impact:
| | State | After (Post) | Before (Pre) | Difference |
|---|---|---|---|---|
| Treatment | NJ | A | B | A − B |
| Control | PA | C | D | C − D |
| Difference | | A − C | B − D | (A − B) − (C − D) |
where:
- A−B captures the treatment effect plus general time trends.
- C−D captures only the general time trends.
- (A−B)−(C−D) isolates the causal effect of the minimum wage increase.
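The double difference is just arithmetic on the four cell means. The numbers below are purely illustrative placeholders, not the study's actual employment figures.

```python
# Hypothetical cell means for the 2x2 table (illustrative values only)
A = 21.0  # NJ, after
B = 20.4  # NJ, before
C = 21.2  # PA, after
D = 23.3  # PA, before

treatment_change = A - B            # treatment effect + common time trend
control_change = C - D              # common time trend only
did_estimate = treatment_change - control_change
print(did_estimate)                 # (A - B) - (C - D)
```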
For the DiD estimator to be valid, the following conditions must hold:
- Parallel Trends Assumption
- The employment trends in NJ and PA would have been the same in the absence of the policy change.
- Pre-treatment employment trends should be similar between the two states.
- No “Switchers”
- The policy must not induce restaurants to switch locations between NJ and PA (e.g., a restaurant relocating across the border).
- PA as a Valid Counterfactual
- PA represents what NJ would have looked like had it not changed the minimum wage.
- The study focuses on bordering counties to increase comparability.
The main regression specification is:
$$
Y_{jt} = \beta_0 + NJ_j \beta_1 + POST_t \beta_2 + (NJ_j \times POST_t)\beta_3 + X_{jt}\beta_4 + \epsilon_{jt}
$$
where:
- $Y_{jt}$ = employment in restaurant $j$ at time $t$
- $NJ_j$ = 1 if the restaurant is in NJ, 0 if in PA
- $POST_t$ = 1 in the post-policy period, 0 in the pre-policy period
- $(NJ_j \times POST_t)$ = DiD interaction term, capturing the causal effect of NJ's minimum wage increase
- $X_{jt}$ = additional controls (optional)
- $\epsilon_{jt}$ = error term
Notes on Model Specification
$\beta_3$ (the DiD coefficient) is the key parameter of interest, representing the causal impact of the policy.
$\beta_4$ (on the controls $X_{jt}$) is not necessary for unbiasedness but improves efficiency.
If we difference out the pre-period ($\Delta Y_j = Y_{j,Post} - Y_{j,Pre}$), the model simplifies to:
$$
\Delta Y_j = \alpha + NJ_j \beta_1 + \epsilon_j
$$
Here $\beta_1$ takes over the role of the DiD coefficient, and we no longer need $\beta_2$ for the post-treatment period because differencing has removed the common time component.
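The equivalence between the differenced specification and the full interaction model can be checked on simulated data. This is a sketch with an invented data-generating process (true effect 1.2), not the Card and Krueger data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical balanced panel: one pre and one post observation per
# restaurant, with a true DiD effect of 1.2 (invented for illustration)
n = 1000
nj = rng.integers(0, 2, n)                                # NJ_j indicator
y_pre = 20 + 0.5 * nj + rng.normal(0, 1, n)
y_post = y_pre + 0.8 + 1.2 * nj + rng.normal(0, 0.5, n)

# Differenced model: delta Y_j = alpha + NJ_j * b1
dy = y_post - y_pre
b_diff, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), nj]), dy, rcond=None)

# Full interaction model on the stacked panel
y = np.concatenate([y_pre, y_post])
post = np.concatenate([np.zeros(n), np.ones(n)])
nj2 = np.concatenate([nj, nj])
X = np.column_stack([np.ones(2 * n), nj2, post, nj2 * post])
b_full, *_ = np.linalg.lstsq(X, y, rcond=None)

# With a balanced panel, the two estimates of the DiD effect coincide
print(b_diff[1], b_full[3])
```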
An alternative specification uses high-wage NJ restaurants as a control group, arguing that they were not affected by the minimum wage increase. However:
- This approach eliminates cross-state differences, but
- It may be harder to interpret causality, as the control group is not entirely untreated.
A common misconception in DiD is that treatment and control groups must have the same baseline levels of the dependent variable (e.g., employment levels). However:
- DiD only requires parallel trends, meaning the slopes of employment changes should be the same pre-treatment.
- If pre-treatment trends diverge, this threatens validity.
- If post-treatment trends converge, it may suggest policy effects rather than pre-trend violations.
Is Parallel Trends a Necessary or Sufficient Condition?
- Parallel pre-trends are not sufficient: even if pre-treatment trends are parallel, confounders arising after treatment can still bias the estimate.
- Nor are they strictly necessary: the identifying assumption concerns counterfactual post-treatment trends, which could hold even when observed pre-trends diverge.
Thus, we cannot prove a DiD design is valid; we can only present evidence consistent with its assumptions.
30.4.3 Example: The Effects of Grade Policies on Major Choice
Butcher, McEwan, and Weerapana (2014) investigate how grading policies influence students’ major choices. The central theory is that grading standards vary by discipline, which affects students’ decisions.
Why do the highest-achieving students often major in hard sciences?
- Grading Practices Differ Across Majors
- In STEM fields, grading is often stricter, meaning professors are less likely to give students the benefit of the doubt.
- In contrast, softer disciplines (e.g., humanities) may have more lenient grading, making students’ experiences more pleasant.
- Labor Market Incentives
- Degrees with lower market value (e.g., humanities) might compensate by offering a more pleasant academic experience.
- STEM degrees tend to be more rigorous but provide higher job market returns.
To examine how grades influence major selection, the study first estimates an OLS model:
$$
E_{ij} = \beta_0 + X_i \beta_1 + G_j \beta_2 + \epsilon_{ij}
$$
where:
- $E_{ij}$ = indicator for whether student $i$ chooses major $j$
- $X_i$ = student-level attributes (e.g., SAT scores, demographics)
- $G_j$ = average grade in major $j$
- $\beta_2$ = key coefficient, capturing how grading standards influence major choice
Potential Biases in $\hat{\beta}_2$:
- Negative bias:
  - Departments with lower enrollment may offer higher grades to attract students.
  - This endogenous response leads to a downward bias in the OLS estimate.
- Positive bias:
  - STEM majors attract the strongest students, so their grades would naturally be higher if ability were controlled.
  - If ability is not fully accounted for, $\hat{\beta}_2$ may be biased upward.
To address potential endogeneity in OLS, the study uses a difference-in-differences approach:
$$
Y_{idt} = \beta_0 + POST_t \beta_1 + Treat_d \beta_2 + (POST_t \times Treat_d)\beta_3 + X_{idt} + \epsilon_{idt}
$$
where:
- $Y_{idt}$ = average grade in department $d$ at time $t$ for student $i$
- $POST_t$ = 1 in the post-policy period, 0 otherwise
- $Treat_d$ = 1 if the department is treated (i.e., its grade policy changed), 0 otherwise
- $(POST_t \times Treat_d)$ = DiD interaction term, capturing the causal effect of grade policy changes on major choice
- $X_{idt}$ = additional student controls
| Group | Intercept ($\beta_0$) | Treatment ($\beta_2$) | Post ($\beta_1$) | Interaction ($\beta_3$) |
|---|---|---|---|---|
| Treated, Pre | 1 | 1 | 0 | 0 |
| Treated, Post | 1 | 1 | 1 | 1 |
| Control, Pre | 1 | 0 | 0 | 0 |
| Control, Post | 1 | 0 | 1 | 0 |
- The average pre-period outcome for the control group is given by $\beta_0$.
- The key coefficient of interest is $\beta_3$, which captures the post-treatment change for the treated group over and above the change for the control group.
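The table's logic can be verified mechanically: plugging the indicator patterns into the regression function and double-differencing the four group means recovers $\beta_3$ exactly. The coefficient values below are arbitrary, chosen only for illustration.

```python
# Hypothetical coefficient values (arbitrary, for illustration only)
b0, b1, b2, b3 = 2.0, 0.3, 0.4, 0.9

def group_mean(treat, post):
    """Expected outcome implied by the dummy-variable table."""
    return b0 + b2 * treat + b1 * post + b3 * treat * post

treated_change = group_mean(1, 1) - group_mean(1, 0)   # b1 + b3
control_change = group_mean(0, 1) - group_mean(0, 0)   # b1
did = treated_change - control_change
print(did)  # recovers b3 (0.9, up to floating point)
```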
A more flexible specification includes fixed effects:
$$
Y_{idt} = \alpha_0 + (POST_t \times Treat_d)\alpha_1 + \theta_d + \delta_t + X_{idt} + u_{idt}
$$
where:
- $\theta_d$ = department fixed effects (absorbing $Treat_d$)
- $\delta_t$ = time fixed effects (absorbing $POST_t$)
- $\alpha_1$ = effect of the policy change (equivalent to $\beta_3$ in the simpler model)
Why Use Fixed Effects?
- More flexible specification:
- Instead of assuming a uniform treatment effect across groups, this model allows for department-specific differences (θd) and time-specific shocks (δt).
- More precise estimates:
  - Fixed effects absorb department- and time-specific variation that would otherwise end up in the error term, reducing residual variance and making the estimate of $\alpha_1$ more efficient.
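A two-way fixed effects version of this regression can be sketched on simulated data. Everything below (number of departments, years, and the effect size of 1.0) is invented for illustration; the fixed effects are implemented with dummy variables.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical panel: 20 departments x 6 years; treated departments
# change their grade policy from year 3 onward (true effect 1.0)
n_dep, n_year = 20, 6
dep_fe = rng.normal(0, 2, n_dep)    # theta_d
year_fe = rng.normal(0, 1, n_year)  # delta_t
rows = []
for d in range(n_dep):
    treated = d < 10
    for t in range(n_year):
        policy = 1.0 if (treated and t >= 3) else 0.0
        y = dep_fe[d] + year_fe[t] + 1.0 * policy + rng.normal(0, 0.3)
        rows.append((d, t, policy, y))

# TWFE via dummies: intercept, policy, department dummies (drop d = 0),
# year dummies (drop t = 0)
Y = np.array([r[3] for r in rows])
D = np.zeros((len(rows), 2 + (n_dep - 1) + (n_year - 1)))
for i, (d, t, policy, _) in enumerate(rows):
    D[i, 0] = 1.0
    D[i, 1] = policy
    if d > 0:
        D[i, 1 + d] = 1.0
    if t > 0:
        D[i, 1 + (n_dep - 1) + t] = 1.0
alpha, *_ = np.linalg.lstsq(D, Y, rcond=None)
print(alpha[1])  # alpha_1, close to the true effect of 1.0
```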
Interpretation of Results
- If $\alpha_1 > 0$, the policy increased grades in treated departments.
- If $\alpha_1 < 0$, the policy decreased grades in treated departments.