8.6 What should I match on?

  • Selection on observables assumption: Treatment assignment is as good as random conditional on covariates X
    • Match on all observed vars that may affect both treatment D and outcome Y
    • Careful: Do not match on post-treatment variables or colliders, to avoid post-treatment and endogenous selection bias (Elwert and Winship 2014b)
    • Conceptual: Find the subset of your data in which X ⇻ D, i.e. the covariates no longer predict treatment (Tam Cho et al. 2013)
  • Also match on quadratic/polynomial terms where theory suggests a nonlinear relationship (e.g. age as X, with education as D and income as Y)
  • The ultimate benchmark is balance of the covariates X across values of treatment D
  • Does it make sense to combine matching with regression?
    • Yes. Conceptually, matching induces independence between X and D in the matched sample. But there may still be a direct path from X to Y, i.e. X may explain some of the variation in Y independently of D
    • Adding X to the regression reduces the unexplained variance in Y, so you get more precise estimates (smaller standard errors)
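
The two checks above (balance as the benchmark, then regression on the matched sample) can be sketched on simulated data. This is a minimal illustration, not a recommended workflow: the data-generating process, the true effect of 2, the single covariate, and the helper names smd, slope and residuals are all assumptions made for the example. Matching here is plain 1:1 nearest-neighbor on X with replacement, and the regression of Y on D and X is computed via the Frisch-Waugh-Lovell residual trick so that only the standard library is needed.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the simulation is reproducible

# Hypothetical data-generating process: X affects both D and Y.
TRUE_EFFECT = 2.0  # assumed true effect of D on Y
units = []
for _ in range(2000):
    x = random.gauss(0, 1)
    d = 1 if random.random() < 1 / (1 + math.exp(-x)) else 0  # P(D=1) rises with X
    y = TRUE_EFFECT * d + 1.0 * x + random.gauss(0, 1)
    units.append((x, d, y))

treated = [u for u in units if u[1] == 1]
controls = [u for u in units if u[1] == 0]


def smd(a, b):
    """Standardized mean difference of a covariate between two groups."""
    pooled_sd = math.sqrt((statistics.variance(a) + statistics.variance(b)) / 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


def slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals from the simple regression of y on x (with intercept)."""
    b = slope(x, y)
    a = statistics.mean(y) - b * statistics.mean(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]


# 1:1 nearest-neighbor matching on X, with replacement.
matched_controls = [min(controls, key=lambda c: abs(c[0] - t[0])) for t in treated]
matched = treated + matched_controls

# Balance check: X should differ much less across D after matching.
smd_before = smd([u[0] for u in treated], [u[0] for u in controls])
smd_after = smd([u[0] for u in treated], [u[0] for u in matched_controls])

# Naive estimate: raw difference in mean Y, confounded by X.
naive = (statistics.mean(u[2] for u in treated)
         - statistics.mean(u[2] for u in controls))

# Matching only: difference in mean Y within the matched sample.
matched_diff = (statistics.mean(u[2] for u in treated)
                - statistics.mean(u[2] for u in matched_controls))

# Matching plus regression: coefficient on D in Y ~ D + X on the matched sample,
# obtained by regressing the X-residuals of Y on the X-residuals of D.
xs, ds, ys = zip(*matched)
adjusted = slope(residuals(xs, ds), residuals(xs, ys))

print(f"SMD of X before matching: {smd_before:.3f}")
print(f"SMD of X after matching:  {smd_after:.3f}")
print(f"Naive difference in means:          {naive:.3f}")
print(f"Matched difference in means:        {matched_diff:.3f}")
print(f"Matched + regression adjustment:    {adjusted:.3f}")
```

An absolute standardized mean difference below about 0.1 is a common rule of thumb for acceptable balance, not a formal test. With several covariates, the same check would be repeated for each one, including any polynomial terms that were matched on.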

References

Elwert, Felix, and Christopher Winship. 2014b. “Endogenous Selection Bias: The Problem of Conditioning on a Collider Variable.” Annu. Rev. Sociol. 40 (1): 31–53.

Tam Cho, Wendy K, Jason J Sauppe, Alexander G Nikolaev, Sheldon H Jacobson, and Edward C Sewell. 2013. “An Optimization Approach for Making Causal Inferences.” Stat. Neerl. 67 (2): 211–26.