## 18.3 Generalized linear models

• GLMs: see Wikipedia and the examples here
• Very easy to fit in R
• e.g. Logistic regression: predicting a binary outcome (voter participation: yes vs. no)
• e.g. Poisson regression: predicting an outcome variable that represents counts (number of wars started in a given year)
• Functions
• glm(): R function to estimate a generalized linear model
• glm(formula, family = familytype(link = linkfunction), data = )
• family =: Specify a choice of variance and link functions
• Every family comes with a default link function (see here), e.g. logit for binomial() and log for poisson(), but you can change it via the link argument
• glm(outcome ~ x1 + x2 + x3, data = somedata, family = binomial()): Logistic model
• glm(outcome ~ x1 + x2 + x3, data = somedata, family=poisson()): Poisson model
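As a minimal sketch of overriding a family's default link, the same binary outcome can be fit once with the default logit link and once with a probit link. The built-in mtcars data and the model am ~ wt are assumptions chosen purely for illustration; they are not part of the examples in this section.

```r
# Illustrative only: mtcars ships with base R; am (0/1) serves as a binary outcome
fit_logit  <- glm(am ~ wt, data = mtcars, family = binomial())                 # default link (logit)
fit_probit <- glm(am ~ wt, data = mtcars, family = binomial(link = "probit")) # link overridden

# The link actually used is stored in the model's family object
family(fit_logit)$link
family(fit_probit)$link

# Same data and predictor, different link: coefficients are on different scales
cbind(logit = coef(fit_logit), probit = coef(fit_probit))
```

Because the two links put coefficients on different scales, it is usually easier to compare such models through their predicted probabilities than through the raw coefficients.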
swiss2 <- swiss

# Dichotomize Catholic (% Catholic) at 50: "low" (<= 50) vs. "high" (> 50)
swiss2$d.Catholic <- cut(swiss2$Catholic,
                         breaks = c(-Inf, 50, Inf),
                         labels = c("low", "high"))
class(swiss2$d.Catholic) # Factor!
swiss2$d.Catholic

# Logistic regression
# d.Catholic is a binary factor ("low" vs. "high") and
# Agriculture and Education are continuous predictors
fit <- glm(d.Catholic ~ Agriculture + Education, data = swiss2, family = binomial())

# Display results
summary(fit)

# Coefficients give the change in the log odds of the outcome for a one-unit increase in the predictor variable
# e.g. for every one-unit increase in Agriculture, the log odds of "high" Catholic (vs. "low") change by the Agriculture coefficient reported in summary(fit).

confint(fit) # CIs based on the profiled log-likelihood function

confint.default(fit) # Wald CIs based on standard errors
exp(coef(fit)) # exponentiated coefficients (odds ratios)
exp(confint(fit)) # 95% CI for exponentiated coefficients
predict(fit, type = "response") # predicted probabilities
residuals(fit, type = "deviance") # deviance residuals
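To make the log-odds interpretation more concrete, the following sketch converts the linear predictor into predicted probabilities for new observations. It refits the logistic model under the name fit_logit so the block runs on its own; the two province profiles in newdata are hypothetical values picked only for illustration.

```r
# Rebuild the binary outcome so this block is self-contained
swiss2 <- swiss
swiss2$d.Catholic <- cut(swiss2$Catholic,
                         breaks = c(-Inf, 50, Inf),
                         labels = c("low", "high"))
fit_logit <- glm(d.Catholic ~ Agriculture + Education,
                 data = swiss2, family = binomial())

# Hypothetical province profiles (illustrative values, not from the data)
newdata <- data.frame(Agriculture = c(20, 60), Education = c(10, 10))

# Linear predictor (log odds) and predicted probability of "high"
log_odds <- predict(fit_logit, newdata = newdata, type = "link")
prob     <- predict(fit_logit, newdata = newdata, type = "response")

# plogis() is the inverse logit, so both routes give the same probabilities
all.equal(unname(plogis(log_odds)), unname(prob))

# Odds ratio for a one-unit increase in Agriculture, holding Education fixed
exp(coef(fit_logit)["Agriculture"])
```

With a factor outcome, glm() treats the first level ("low") as failure and models the probability of the other level ("high"), which is why the predictions refer to "high".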

# Poisson regression (template)
# count is a count outcome and x1-x3 are continuous predictors;
# count, x1-x3, and mydata are placeholders for your own data
fit <- glm(count ~ x1 + x2 + x3, data = mydata, family = poisson())
summary(fit) # display results
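
The Poisson call above uses placeholder names, so it does not run as written. As a runnable sketch, the same pattern can be tried on the built-in warpbreaks data. That dataset and the model breaks ~ wool + tension are assumptions chosen for illustration; note that its predictors are factors rather than continuous variables.

```r
# warpbreaks ships with base R: breaks is a count, wool and tension are factors
fit_pois <- glm(breaks ~ wool + tension, data = warpbreaks, family = poisson())
summary(fit_pois) # display results

# Coefficients are on the log scale; exponentiating gives rate ratios
exp(coef(fit_pois))

# Expected counts for the first few observations
head(predict(fit_pois, type = "response"))
```

An exponentiated coefficient here is read as a multiplicative change in the expected count relative to the reference level of the factor.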