8.9 Packages & Functions

Matching in R can be easily done with the Matching package (Sekhon 2011).

  • Match(): Matches using a variety of algorithms, including propensity score matching (used together with MatchBalance())
    • Y: Vector containing the outcome of interest. Missing values are not allowed. Without a Y vector no causal effect estimates will be produced, only a matched dataset.
    • Tr: Vector indicating the observations which are in the treatment regime and those which are not (TRUE/FALSE or 0/1)
    • X: Matrix containing the variables we wish to match on. Contains actual observed covariates or the propensity score or a combination of both.
    • M = 1: A scalar for the number of matches which should be found.
    • …and there are various other arguments; see ?Match
  • MatchBalance(): Provides balance statistics
    • formul =: Formula listing the variables we wish to obtain univariate balance statistics for. The dependent variable is the treatment indicator; the right-hand side may include many functions of the covariates.
    • data =: Data frame containing all variables in the formula (if no data frame is supplied, the variables are looked up via lexical scoping)
    • match.out =: The output object from Match(). With it you get balance statistics before and after matching; without it you get statistics for the raw data only.
    • nboots =: The number of bootstrap samples to be run (check ?MatchBalance for recommended values).
  • GenMatch() [used together with Match()]: Finds optimal balance using multivariate matching, with a genetic search algorithm determining the weight given to each covariate
    • Tr: Vector indicating the observations which are in the treatment regime and those which are not (TRUE/FALSE or 0/1)
    • X: Matrix containing variables we wish to match on
    • BalanceMatrix = X: Matrix containing variables we wish to achieve balance on
    • estimand = "ATT": Choose the estimand; "ATE" and "ATC" are also available
    • M = 1: A scalar for the number of matches which should be found
    • pop.size = 100: Number of individuals genoud uses to solve the optimization problem (see ?GenMatch for recommended values).
    • max.generations = 100: Maximum number of generations that genoud will run when optimizing.
  • rgenoud package/function: “function that combines evolutionary algorithm methods with a derivative-based (quasi-Newton) method to solve difficult optimization problems” (Sekhon and Mebane 1998)
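
To make the arguments above concrete, here is a minimal sketch of how Match(), MatchBalance() and GenMatch() might be combined. The data are simulated purely for illustration (the variable names x1, x2, treat, y and the true effect of 2 are assumptions, not part of any real dataset), and the small pop.size and max.generations values are chosen only so the example runs quickly, not as recommended settings.

```r
library(Matching)

# Simulated data (hypothetical): two covariates, a treatment whose probability
# depends on them, and an outcome with a true treatment effect of 2.
set.seed(42)
n     <- 500
x1    <- rnorm(n)
x2    <- rnorm(n)
treat <- rbinom(n, 1, plogis(-1 + 0.5 * x1 - 0.5 * x2))
y     <- 2 * treat + x1 + x2 + rnorm(n)
dat   <- data.frame(y, treat, x1, x2)

# Estimate the propensity score with a logistic regression.
ps <- glm(treat ~ x1 + x2, family = binomial, data = dat)$fitted.values

# X combines the observed covariates and the propensity score.
X <- cbind(x1 = dat$x1, x2 = dat$x2, ps = ps)

# One-to-one matching (M = 1) for the ATT.
m_out <- Match(Y = dat$y, Tr = dat$treat, X = X, M = 1, estimand = "ATT")
summary(m_out)

# Balance statistics before and after matching.
mb <- MatchBalance(treat ~ x1 + x2, data = dat,
                   match.out = m_out, nboots = 500)

# Genetic matching: search for covariate weights, then hand them to Match().
# pop.size and max.generations are kept small here only for speed.
gen_out <- GenMatch(Tr = dat$treat, X = X, BalanceMatrix = X,
                    estimand = "ATT", M = 1,
                    pop.size = 50, max.generations = 10,
                    wait.generations = 2, print.level = 0)

m_gen <- Match(Y = dat$y, Tr = dat$treat, X = X, M = 1,
               estimand = "ATT", Weight.matrix = gen_out)
summary(m_gen)
```

Passing the GenMatch() result to Match() through Weight.matrix is what links the two functions; the balance of the resulting match can then be checked with MatchBalance() in the same way as for m_out.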

There are several other packages you might want to try. Coarsened Exact Matching (Iacus, King, and Porro 2009) is implemented in the cem package. Another option is the MatchIt package (Ho et al. 2006). Importantly, look out for developments by Fredrik Sävje et al. (quickmatch) on generalized full matching. Packages allowing for matching across multiple treatment categories are under development. To visualize balance, the cobalt package may be helpful.
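
As a rough illustration of two of these alternatives, the sketch below runs nearest-neighbour and coarsened exact matching through MatchIt and inspects balance with cobalt. It assumes a recent MatchIt release (4.0 or later, where method = "cem" is built in) and again uses simulated data with hypothetical variable names.

```r
library(MatchIt)
library(cobalt)

# Simulated data (hypothetical), with fewer treated than control units.
set.seed(42)
n   <- 500
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$treat <- rbinom(n, 1, plogis(-1 + 0.5 * dat$x1 - 0.5 * dat$x2))
dat$y     <- 2 * dat$treat + dat$x1 + dat$x2 + rnorm(n)

# Nearest-neighbour matching on an estimated propensity score.
m_nn <- matchit(treat ~ x1 + x2, data = dat, method = "nearest")

# Coarsened exact matching through the same interface
# (assumes MatchIt >= 4.0).
m_cem <- matchit(treat ~ x1 + x2, data = dat, method = "cem")

# Balance table before (un = TRUE) and after matching.
bal.tab(m_nn, un = TRUE)

# Love plot of absolute standardized mean differences.
p <- love.plot(m_nn, stats = "mean.diffs", abs = TRUE)

# Extract the matched sample for the outcome analysis.
matched <- match.data(m_nn)
```

The matched data frame returned by match.data() carries a weights column that should be used in any subsequent outcome model.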

References

Ho, Daniel, Kosuke Imai, Gary King, and Elizabeth Stuart. 2006. “MatchIt: Nonparametric Preprocessing for Parametric Causal Inference.” R Package Version, 2–2.

Iacus, Stefano, Gary King, and Giuseppe Porro. 2009. “Cem: Software for Coarsened Exact Matching.” Journal of Statistical Software 30 (1): 1–27.

Sekhon, Jasjeet S. 2011. “Multivariate and Propensity Score Matching Software with Automated Balance Optimization: The Matching Package for R.” Journal of Statistical Software 42 (7): 1–52.

Sekhon, Jasjeet S, and Walter R Mebane. 1998. “Genetic Optimization Using Derivatives.” Political Analysis: An Annual Publication of the Methodology Section of the American Political Science Association 7 (1): 187–210.