## 29.7 Genetic Matching

• Genetic matching (GM) generalizes propensity score and Mahalanobis distance matching: it iteratively checks covariate balance and updates a distance measure that can combine the propensity score with the Mahalanobis distance.

• GM is arguably a “superior” method to nearest-neighbor or full matching when the data are imbalanced.

• It uses a genetic search algorithm to find a weight for each covariate such that the matched sample has optimal balance.

• Implementation

  • Matching can be done with replacement.

  • Balance can be assessed with paired $t$-tests (dichotomous variables) and Kolmogorov-Smirnov tests (multinomial and continuous variables).
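To make the role of the covariate weights more concrete, the following is a minimal base-R sketch of a weighted (generalized) Mahalanobis distance: the difference between two units is standardized with the Cholesky factor of the covariate covariance matrix, and a diagonal weight matrix then rescales each standardized component. The helper name `gen_mahalanobis`, the simulated data, and the chosen weights are assumptions made for illustration; this is not the internal code of any matching package.

```r
# Illustrative sketch only: gen_mahalanobis() is a hypothetical helper,
# not a function exported by a matching package.
gen_mahalanobis <- function(x_i, x_j, S, W = diag(length(x_i))) {
  R <- chol(S)                        # upper triangular, t(R) %*% R equals S
  z <- forwardsolve(t(R), x_i - x_j)  # standardized difference between units
  sqrt(drop(crossprod(z, W %*% z)))   # apply the weights, then take the root
}

set.seed(1)
# Simulated covariates (hypothetical values, not the lalonde data)
X_sim <- cbind(age = rnorm(50, 30, 8), educ = rnorm(50, 11, 2))
S <- cov(X_sim)

# With identity weights this reduces to the ordinary Mahalanobis distance
d_plain <- gen_mahalanobis(X_sim[1, ], X_sim[2, ], S)
d_check <- sqrt(mahalanobis(X_sim[1, ], X_sim[2, ], S))

# Up-weighting the first standardized component enlarges the distance
# whenever the two units differ on that component
d_weighted <- gen_mahalanobis(X_sim[1, ], X_sim[2, ], S, W = diag(c(10, 1)))
```

In this sketch the weights are fixed by hand; in GM they are the quantities the genetic search adjusts, with each candidate set of weights judged by the covariate balance it produces.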

Packages: `Matching` (provides `GenMatch()`, `Match()`, and `MatchBalance()`)

```r
library(Matching)
data(lalonde)
attach(lalonde)

# The covariates we want to match on
X <- cbind(age, educ, black, hisp, married, nodegr, u74, u75, re75, re74)

# The covariates we want to obtain balance on
BalanceMat <-
    cbind(age,
          educ,
          black,
          hisp,
          married,
          nodegr,
          u74,
          u75,
          re75,
          re74,
          I(re74 * re75))

# Call GenMatch() to find the optimal weight to give each covariate in 'X'
# so that balance is achieved on the covariates in 'BalanceMat'.
# This is only an example, so the population size is set to just 16 via the
# 'pop.size' option to keep GenMatch() quick.
# This is *WAY* too small for actual problems.
# For details see http://sekhon.berkeley.edu/papers/MatchingJSS.pdf.
genout <-
    GenMatch(
        Tr = treat,
        X = X,
        BalanceMatrix = BalanceMat,
        estimand = "ATE",
        M = 1,
        pop.size = 16,
        max.generations = 10,
        wait.generations = 1
    )

# The outcome variable (re78 divided by 1000)
Y <- re78 / 1000

# Now that GenMatch() has found the optimal weights, estimate the causal
# effect of interest using those weights
mout <-
    Match(
        Y = Y,
        Tr = treat,
        X = X,
        estimand = "ATE",
        Weight.matrix = genout
    )
summary(mout)

# Check whether balance has actually been obtained on the variables of interest
mb <-
    MatchBalance(
        treat ~ age + educ + black + hisp + married + nodegr
        + u74 + u75 + re75 + re74 + I(re74 * re75),
        match.out = mout,
        nboots = 500
    )
```
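As a rough illustration of the two kinds of balance tests named in this section, the base-R sketch below applies a paired t-test to a dichotomous covariate and a Kolmogorov-Smirnov test to a continuous covariate. The matched pairs are simulated, and the variable names (`married_t`, `married_c`, `earn_t`, `earn_c`) are hypothetical. `MatchBalance()` reports comparable statistics, but this sketch is not its implementation; for example, the package uses a bootstrap version of the KS test, which `ks.test()` does not.

```r
set.seed(42)
n_pairs <- 200

# Simulated matched pairs (hypothetical data): element k of each *_t vector
# is a treated unit and element k of the *_c vector is its matched control
married_t <- rbinom(n_pairs, 1, 0.20)
married_c <- rbinom(n_pairs, 1, 0.20)
earn_t <- rexp(n_pairs, rate = 1 / 5)
earn_c <- rexp(n_pairs, rate = 1 / 5)

# Paired t-test for the dichotomous covariate
p_ttest <- t.test(married_t, married_c, paired = TRUE)$p.value

# Two-sample Kolmogorov-Smirnov test for the continuous covariate
p_ks <- ks.test(earn_t, earn_c)$p.value

# Collect the p-values; small values would flag remaining imbalance
balance <- c(married_ttest = p_ttest, earn_ks = p_ks)
print(round(balance, 3))

# The least balanced covariate is the one with the smallest p-value
worst <- names(balance)[which.min(balance)]
```

Looking at the smallest p-value is a natural summary here because the weakest covariate determines how convincing the overall balance is.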