18.3 Generalized linear models

  • Generalized linear models (GLMs): see the Wikipedia article on generalized linear models for background and examples
  • Very easy to fit in R
  • e.g. Logistic regression
    • Predicting a binary outcome (Voter participation: Yes vs. No)
  • e.g. Poisson regression
    • Predicting an outcome variable that represents counts (Number of wars started in a given year)
  • Functions
    • glm(): R function to estimate a generalized linear model
    • glm(formula, family = familytype(link = linkfunction), data = )
      • family =: Specifies the error distribution (and hence the variance function) together with the link function
        • Every family comes with a default link function (see ?family), but you can change it via the link argument
    • glm(outcome ~ x1 + x2 + x3, data = somedata, family = binomial()): Logistic model
    • glm(outcome ~ x1 + x2 + x3, data = somedata, family=poisson()): Poisson model
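As a minimal sketch of the family and link arguments described above, the following inspects the default links of three common families and then overrides the default for a binomial model. The use of the built-in mtcars data (am as a binary outcome, wt as predictor) and the object names fit_logit and fit_probit are assumptions made purely for illustration.

```r
# Default link functions of some common families
binomial()$link  # "logit"
poisson()$link   # "log"
gaussian()$link  # "identity"

# Illustrative data: mtcars ships with R; am is 0/1 (transmission type)
# Logistic model using the default logit link
fit_logit <- glm(am ~ wt, data = mtcars, family = binomial())

# Same outcome and predictor, but with a probit link instead of the default
fit_probit <- glm(am ~ wt, data = mtcars, family = binomial(link = "probit"))

# Confirm which link each fitted model actually used
family(fit_logit)$link
family(fit_probit)$link
```

Changing the link changes the scale of the coefficients, so estimates from the logit and probit fits are not directly comparable.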
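# Work on a copy of the built-in swiss data
swiss2 <- swiss

# Dichotomize Catholic (% of population): <= 50 is "low", > 50 is "high"
# cut() is base R, so no additional package is needed
swiss2$d.Catholic <- cut(swiss2$Catholic,
                         breaks = c(-Inf, 50, Inf),
                         labels = c("low", "high"))
class(swiss2$d.Catholic) # Factor!
swiss2$d.Catholic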


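# Logistic regression
# d.Catholic is a binary factor ("low" is the reference level, so the model
# describes the probability of "high");
# Agriculture and Education are continuous predictors
fit <- glm(d.Catholic ~ Agriculture + Education, data = swiss2, family = binomial())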

# Display results
summary(fit)

# Each coefficient gives the change in the log odds of the outcome for a one-unit increase in that predictor, holding the other predictors constant
# e.g. for a one-unit increase in Agriculture, the log odds of a "high" (vs. "low") Catholic share change by the Agriculture estimate reported in summary(fit)

confint(fit) # CIs based on the profiled log-likelihood function

confint.default(fit) # Wald CIs based on the standard errors
exp(coef(fit)) # exponentiated coefficients (odds ratios)
exp(confint(fit)) # 95% CI for exponentiated coefficients
predict(fit, type="response") # predicted probabilities of "high"
residuals(fit, type="deviance") # deviance residuals
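To make the log-odds interpretation more concrete, here is a small self-contained sketch that refits the same logistic model and converts between log odds, odds ratios, and probabilities. The object names (fit_logit, new_obs) and the predictor values Agriculture = 50 and Education = 10 are hypothetical and chosen only for illustration.

```r
# Rebuild the binary outcome from the built-in swiss data
swiss2 <- swiss
swiss2$d.Catholic <- cut(swiss2$Catholic,
                         breaks = c(-Inf, 50, Inf),
                         labels = c("low", "high"))

fit_logit <- glm(d.Catholic ~ Agriculture + Education,
                 data = swiss2, family = binomial())

# Coefficients are on the log-odds scale; exponentiating gives odds ratios
b <- coef(fit_logit)
odds_ratio <- exp(b)

# Hypothetical province (values are illustrative, not taken from the data)
new_obs <- data.frame(Agriculture = 50, Education = 10)

# Linear predictor (log odds) computed by hand and via predict()
log_odds_manual <- sum(b * c(1, new_obs$Agriculture, new_obs$Education))
log_odds <- predict(fit_logit, newdata = new_obs, type = "link")

# Predicted probability of "high"; plogis() is the inverse logit
prob <- predict(fit_logit, newdata = new_obs, type = "response")
all.equal(unname(prob), unname(plogis(log_odds)))
```

An odds ratio above 1 means the odds of "high" rise as the predictor increases; below 1 means they fall.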

# Poisson regression (template)
# count is a count outcome and x1-x3 are continuous predictors;
# mydata, count, and x1-x3 are placeholders for your own data
fit <- glm(count ~ x1 + x2 + x3, data = mydata, family = poisson())
summary(fit) # display results
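Because the Poisson template above relies on placeholder data, the following is a runnable sketch using the built-in warpbreaks dataset (number of warp breaks per loom, by wool type and tension level). The choice of dataset and the object names fit_pois, rate_ratios, and expected_counts are assumptions for illustration; note that the predictors here are factors rather than continuous variables.

```r
# warpbreaks ships with R: breaks is a count, wool and tension are factors
fit_pois <- glm(breaks ~ wool + tension, data = warpbreaks, family = poisson())

# Display results; coefficients are on the log scale (default log link)
summary(fit_pois)

# Exponentiated coefficients are multiplicative effects on the expected count
rate_ratios <- exp(coef(fit_pois))
rate_ratios

# Expected number of breaks for each observation
expected_counts <- predict(fit_pois, type = "response")
head(expected_counts)
```

A rate ratio of, say, 0.8 for a factor level would mean the expected count is 20% lower than at the reference level, holding the other predictor fixed.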