Categorical Regression in Stata and R
Spring 2022
1 Introduction
This website houses all the information you need to learn the basics of coding a number of different categorical and count models in Stata and R. It does not contain everything taught in class, but it will help you apply that knowledge to running these models on your own. The Stata labs on this website were adapted from materials by Ewurama Okai.
1.1 Lab Structure
This course will contain 8 labs and an optional review lab at the end of the course. Our lab sessions will alternate between learning the regression models covered in the course and practicing fundamental coding skills. We will also set aside time for you to workshop your final projects before the end of the course.
Each lab will contain links to download script files (in .do or .r format) and overviews of key concepts.
Lab Topics
Note: All lab topics are tentative and subject to change.
- Lab 1: Introduction
- Lab 2: Linear Probability Models
- Lab 3: Logistic Regression
- Lab 4: Fundamentals Review
- Lab 5: Fundamentals Review + Likelihood Ratio Tests
- Lab 6: Probit Regression
- Lab 7: Multinomial Logistic Regression & Ordinal Regression
- Lab 8: Poisson Regression & Negative Binomial Regression
- Lab 9: Optional Review
1.2 Finding Data
When selecting data, consider:
- The research question you would like to answer
- Whether the dataset contains a categorical or count outcome variable (so that it suits the models we will be learning in this class)
- The unit/level of analysis in the dataset (individual? school? district? state?)
- The main independent and dependent variables you want to analyze
- Other relevant variables to include in your model
Some places to find datasets:
- Inter-university Consortium for Political and Social Research
- National Center for Education Statistics
- UNData
- World Values Survey
- General Social Survey
- Princeton’s Office of Population Research Data Archive
- Harvard Dataverse
- U.S. Government’s Open Data
- Chicago Open Data
- COVID-19 Open Data Repository