35.1 Introduction and Motivation

35.1.1 Why Match?

In many observational studies, researchers do not have the luxury of randomization. Subjects (people, firms, schools, etc.) typically select or are selected into treatment based on certain observed and/or unobserved characteristics. This can introduce systematic differences (selection bias) that confound causal inference. Matching attempts to approximate a randomized experiment by “balancing” these observed characteristics between treated and non-treated (control) units.

Goal: Reduce model dependence and clarify causal effects by ensuring that treated and control subjects have sufficiently comparable covariates.
Challenge: Even if matching achieves balance on observed covariates, unobserved confounders remain a threat to identification. Matching is strictly a selection-on-observables identification strategy and does not fix bias from unobserved variables.

To understand why causal inference is difficult in observational studies, consider:

$\begin{aligned} E(Y_i^T | T) - E(Y_i^C | C) &= E(Y_i^T - Y_i^C | T) + \underbrace{[E(Y_i^C | T) - E(Y_i^C | C)]}_{\text{Selection Bias}} \end{aligned}$

The term $E(Y_i^T - Y_i^C | T)$ is the causal effect (specifically the ATT).
The term $E(Y_i^C | T) - E(Y_i^C | C)$ reflects selection bias due to systematic differences in the untreated potential outcome across treated and control groups.

Random assignment ensures:

$E(Y_i^C | T) = E(Y_i^C | C)$

which eliminates selection bias. In observational data, however, this equality rarely holds.
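The decomposition can be checked numerically. The following is a minimal sketch on simulated data, not an analysis from the text: the data-generating process, the constant effect of 1.5, and all variable names are assumptions chosen for illustration. Because both potential outcomes are simulated, the ATT and the selection-bias term can be computed directly and compared with the naive difference in means.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data-generating process (an assumption for illustration):
# x raises both the untreated outcome and the probability of treatment.
x = rng.normal(size=n)
y_c = 1.0 + 2.0 * x + rng.normal(size=n)  # untreated potential outcome Y^C
y_t = y_c + 1.5                           # treated potential outcome Y^T

# Self-selection: units with larger x are more likely to be treated.
treated = rng.random(n) < 1.0 / (1.0 + np.exp(-x))

# Left-hand side: the naive comparison of observed group means.
naive = y_t[treated].mean() - y_c[~treated].mean()

# Right-hand side: ATT plus the selection-bias term.
att = (y_t[treated] - y_c[treated]).mean()
selection_bias = y_c[treated].mean() - y_c[~treated].mean()

# Under random assignment the same bias term is close to zero.
randomized = rng.random(n) < 0.5
bias_random = y_c[randomized].mean() - y_c[~randomized].mean()

print(f"naive difference      : {naive:.3f}")
print(f"ATT + selection bias  : {att + selection_bias:.3f}")
print(f"selection bias (self) : {selection_bias:.3f}")
print(f"selection bias (rand) : {bias_random:.3f}")
```

In this simulation the naive difference overstates the effect because treated units would have had higher outcomes even without treatment.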

Matching aims to mimic randomization by conditioning on covariates $X$:

$E(Y_i^C | X, T) = E(Y_i^C | X, C)$

For example, propensity score matching achieves this balance by conditioning on the propensity score $P(X)$:

$E(Y_i^C | P(X), T) = E(Y_i^C | P(X), C)$

(See Propensity Scores for further discussion.)

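Under matching, the Average Treatment Effect on the Treated (ATT) is typically estimated by comparing each treated unit with the average outcome of its matched controls:

$\frac{1}{N_T} \sum_{i=1}^{N_T} \left(Y_i^T - \frac{1}{N_{C_i}} \sum_{j \in \mathcal{C}_i} Y_j^C\right)$

where $N_T$ is the number of treated units, $\mathcal{C}_i$ denotes the set of matched controls for treated unit $i$, and $N_{C_i}$ is the number of controls in that set.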
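As a minimal sketch of this estimator, the function below matches each treated unit to its `k` nearest controls on a single covariate, with replacement, and averages the differences. The function name `matched_att`, the choice of absolute distance, and the toy data are assumptions for illustration; practical matching usually involves several covariates, a different distance, and dedicated software.

```python
import numpy as np

def matched_att(y, d, x, k=1):
    """Matching estimate of the ATT using the k nearest controls on x."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=bool)
    x = np.asarray(x, dtype=float)
    y_t, x_t = y[d], x[d]
    y_c, x_c = y[~d], x[~d]
    # Distance from every treated unit (rows) to every control (columns).
    dist = np.abs(x_t[:, None] - x_c[None, :])
    # Indices of the k closest controls for each treated unit: the set C_i.
    nearest = np.argsort(dist, axis=1)[:, :k]
    # Inner average over C_i, then outer average over the treated units.
    return float(np.mean(y_t - y_c[nearest].mean(axis=1)))

# Toy data: two treated units and three controls.
y = [5, 7, 2, 3, 6]
d = [1, 1, 0, 0, 0]
x = [1.0, 3.0, 1.1, 2.0, 3.2]

print(matched_att(y, d, x, k=1))  # each treated unit uses its single closest control
print(matched_att(y, d, x, k=2))  # each treated unit uses its two closest controls
```

With `k=1` the treated unit at `x = 1.0` is paired with the control at `x = 1.1` and the treated unit at `x = 3.0` with the control at `x = 3.2`, so the estimate is the mean of the two pairwise outcome differences.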

Standard Errors in Matching

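Matching estimators generally lack a simple closed-form standard error for the ATE or ATT, so uncertainty is commonly estimated by bootstrapping.
The standard bootstrap is not valid for every matching estimator, however. Nearest-neighbor matching with replacement is a known case in which it can fail, so the resampling scheme should suit the matching method used.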

Note: Matching tends to yield larger standard errors than OLS because it reduces the effective sample size by discarding unmatched observations.
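The bootstrap idea can be sketched as follows: resample units with replacement, redo the matching in each resample, and take the standard deviation of the resulting estimates. This is a simplified illustration on simulated data, not a recommended procedure. The helper names `nn_att` and `bootstrap_se`, the data-generating process, and the number of replicates are assumptions, and the naive bootstrap shown here is known to be unreliable for some matching estimators, including nearest-neighbor matching with replacement.

```python
import numpy as np

def nn_att(y, d, x):
    """1-nearest-neighbor matching estimate of the ATT on one covariate."""
    y_t, x_t, y_c, x_c = y[d], x[d], y[~d], x[~d]
    nearest = np.abs(x_t[:, None] - x_c[None, :]).argmin(axis=1)
    return float(np.mean(y_t - y_c[nearest]))

def bootstrap_se(y, d, x, n_boot=500, seed=0):
    """Naive bootstrap: resample units, re-match, re-estimate."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample units with replacement
        d_b = d[idx]
        if d_b.all() or not d_b.any():
            continue  # skip resamples that lack treated or control units
        draws.append(nn_att(y[idx], d_b, x[idx]))
    return float(np.std(draws, ddof=1))

# Simulated data (hypothetical): true effect 1.5, selection on x.
rng = np.random.default_rng(1)
n = 400
x = rng.normal(size=n)
d = rng.random(n) < 1.0 / (1.0 + np.exp(-x))
y = 1.0 + 2.0 * x + 1.5 * d + rng.normal(size=n)

estimate = nn_att(y, d, x)
se = bootstrap_se(y, d, x)
print(f"ATT estimate: {estimate:.3f}, bootstrap SE: {se:.3f}")
```

The key design point is that matching is repeated inside every resample, so the standard error reflects variability in the matching step as well as in the outcome comparison.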

35.1.2 Matching as “Pruning”

Matching can be thought of as “pruning” (a preprocessing step) (G. King, Lucas, and Nielsen 2017). The goal is to prune unmatched or poorly matched units before conducting analysis, reducing model dependence.

Without Matching:

Imbalanced data → Model dependence → Researcher discretion → Biased estimates

With Matching:

Balanced data → Reduces discretion → More credible causal inference
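To make the pruning idea concrete, the sketch below measures covariate imbalance with a standardized mean difference before and after discarding poorly matched units. It is an illustration under stated assumptions rather than the procedure from the cited paper: the simulated data, the caliper of 0.05, and the helper name `smd` are hypothetical choices.

```python
import numpy as np

def smd(x, d):
    """Standardized mean difference of x between treated and control units."""
    pooled_sd = np.sqrt((x[d].var(ddof=1) + x[~d].var(ddof=1)) / 2.0)
    return float((x[d].mean() - x[~d].mean()) / pooled_sd)

# Simulated data (hypothetical): treatment is more likely when x is large.
rng = np.random.default_rng(2)
n = 2000
x = rng.normal(size=n)
d = rng.random(n) < 1.0 / (1.0 + np.exp(-1.5 * x))

# Find the nearest control (with replacement) for each treated unit.
x_t, x_c = x[d], x[~d]
dist = np.abs(x_t[:, None] - x_c[None, :])
nearest = dist.argmin(axis=1)
gap = dist[np.arange(len(x_t)), nearest]

# Prune: keep only treated units whose best match lies within the caliper.
caliper = 0.05  # hypothetical tolerance on |x_treated - x_control|
keep = gap <= caliper

# Pruned sample: retained treated units plus their matched controls.
x_matched = np.concatenate([x_t[keep], x_c[nearest[keep]]])
d_matched = np.concatenate([np.ones(keep.sum(), dtype=bool),
                            np.zeros(keep.sum(), dtype=bool)])

smd_before = smd(x, d)
smd_after = smd(x_matched, d_matched)
print(f"SMD before pruning: {smd_before:.3f}")
print(f"SMD after pruning : {smd_after:.3f}")
print(f"treated units kept: {keep.sum()} of {len(x_t)}")
```

Tightening the caliper improves balance but discards more units, which is the trade-off between balance and sample size that pruning has to manage.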

Degree of Balance Across Designs
Balance of Covariates	Complete Randomization	Fully Exact Matching
Observed	On average	Exact
Unobserved	On average	On average

Fully blocked or exactly matched designs outperform completely randomized ones on:

Imbalance
Model dependence
Efficiency and power
Bias
Robustness
Research costs

35.1.3 Matching with DiD

Matching can be fruitfully combined with difference-in-differences (DiD) when multiple pre-treatment periods are available. Such designs can help correct for selection bias under certain assumptions:

When selection bias is symmetric around the treatment date, standard DiD (implemented symmetrically around the treatment date) remains consistent (Chabé-Ferret 2015).
If selection bias is asymmetric, simulations by Chabé-Ferret (2015) show that symmetric DiD still outperforms matching alone, although having more pre-treatment observations can improve matching performance.

In short, matching is not a universal solution but often provides a helpful preprocessing step before conducting DiD or other causal estimation methods (J. A. Smith and Todd 2005).

References

Chabé-Ferret, Sylvain. 2015. “Analysis of the Bias of Matching and Difference-in-Difference Under Alternative Earnings and Selection Processes.” Journal of Econometrics 185 (1): 110–23.

King, Gary, Christopher Lucas, and Richard A Nielsen. 2017. “The Balance-Sample Size Frontier in Matching Methods for Causal Inference.” American Journal of Political Science 61 (2): 473–89.

Smith, Jeffrey A, and Petra E Todd. 2005. “Does Matching Overcome LaLonde’s Critique of Nonexperimental Estimators?” Journal of Econometrics 125 (1–2): 305–53.