12.1 Descriptive vs. causal questions (Recap)
12.1.1 Descriptive questions (and analysis)
- How are observations distributed across values of Y? (univariate)
- e.g. How are observations distributed across income categories?
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
303 | 42 | 172 | 270 | 369 | 1281 | 853 | 1344 | 1295 | 353 | 356 |
- How are observations distributed across values of Y and X? (bivariate)
- e.g. How are observations distributed across income and gender values?
- We can add as many variables/dimensions as we like
- How are observations distributed across income, gender and time values?
- Normally we summarize those distributions using models (associational inference)
- e.g. means of income across gender across time
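The descriptive questions above map onto a few base R calls. The following is a minimal sketch, assuming a hypothetical data frame `toy` with columns `income`, `gender` and `year`; the values are made up for illustration and are not the survey data behind the table above.

```r
# Hypothetical toy data (illustrative values only, not the survey data)
toy <- data.frame(
  income = c(2, 5, 5, 7, 3, 5, 8, 7),
  gender = c("f", "m", "f", "m", "f", "m", "f", "m"),
  year   = c(2006, 2006, 2006, 2006, 2007, 2007, 2007, 2007)
)

# Univariate: how are observations distributed across values of Y?
income_dist <- table(toy$income)
income_dist

# Bivariate: how are observations distributed across values of Y and X?
table(toy$income, toy$gender)

# Summarising the joint distribution: mean income by gender and year
income_means <- aggregate(income ~ gender + year, data = toy, FUN = mean)
income_means
```

Adding a further dimension only means adding another variable to `table()` or to the right-hand side of the formula in `aggregate()`.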
12.1.2 Causal questions (and analysis)
- Is there a causal link between the distribution across values of Y and values of D?
- Do differences in D cause differences in Y? (see app)
- In practice we tend to summarise those distributions:
- Continuous variables: Compare means
- Categorical variables (several categories): Compare probabilities across categories
Victim status | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
no victim | 259 | 36 | 135 | 214 | 320 | 1142 | 782 | 1228 | 1193 | 326 | 331 |
victim | 44 | 6 | 37 | 56 | 48 | 139 | 70 | 114 | 101 | 27 | 25 |
mean(d$trust2006[d$victim2006==0])
## [1] NA
mean(d$trust2006[d$victim2006==1])
## [1] NA
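Both calls return `NA`, most likely because the variables contain missing values: `mean()` returns `NA` as soon as one element is missing, and subsetting with a logical vector that contains `NA` itself yields `NA` elements. The usual remedy is `na.rm = TRUE`. Since the data frame `d` is not available here, the sketch below rebuilds the non-missing observations from the cross-table above, assuming its columns are the `trust2006` values 0 to 10. The names `trust_values`, `counts`, `group_shares` and `trust_by_group` are illustrative.

```r
# Cell counts copied from the cross-table above (rows: victim status,
# columns: assumed to be trust values 0 to 10); NAs are not in the table
trust_values <- 0:10
counts <- rbind(
  "no victim" = c(259, 36, 135, 214, 320, 1142, 782, 1228, 1193, 326, 331),
  "victim"    = c(44, 6, 37, 56, 48, 139, 70, 114, 101, 27, 25)
)
colnames(counts) <- trust_values

# Categorical comparison: share of each trust value within each group
group_shares <- prop.table(counts, margin = 1)
round(group_shares, 3)

# Continuous comparison: group means, rebuilt from the counts
trust_by_group <- apply(counts, 1, function(n) weighted.mean(trust_values, w = n))
round(trust_by_group, 2)

# With the original data frame d, the equivalent calls would be
# (assuming the NAs stem from missing values):
# mean(d$trust2006[d$victim2006 == 0], na.rm = TRUE)
# mean(d$trust2006[d$victim2006 == 1], na.rm = TRUE)
```

In this reconstruction the means come out at roughly 6.2 for non-victims and 5.5 for victims. On its own, that difference is still a descriptive comparison; whether it reflects a causal effect of D on Y is the separate question raised in this subsection.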