12.1 Descriptive vs. causal questions (recap)

12.1.1 Descriptive questions (and analysis)

  • How are observations distributed across values of Y? (univariate) [1]
    • e.g. How are observations distributed across income categories?

Table 11.1: Univariate distribution of trust (2006)
Trust (0-10)       0     1     2     3     4     5     6     7     8     9    10
Observations     303    42   172   270   369  1281   853  1344  1295   353   356
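
A minimal sketch of how the univariate distribution in Table 11.1 can be summarized. The counts are typed in from the table; the object names (`trust_values`, `trust_counts`, and so on) are made up for illustration. With the raw data, `table(d$trust2006)` would presumably give the same counts, assuming `d` is the data frame used further below.

```r
# Counts copied from Table 11.1 (trust scale 0-10)
trust_values <- 0:10
trust_counts <- c(303, 42, 172, 270, 369, 1281, 853, 1344, 1295, 353, 356)
names(trust_counts) <- trust_values

# Number of observations with a valid trust value
n_total <- sum(trust_counts)

# Relative frequencies: share of observations at each trust value
rel_freq <- trust_counts / n_total
print(round(rel_freq, 3))

# Mean of trust, computed from the frequency table
mean_trust <- weighted.mean(trust_values, trust_counts)
print(round(mean_trust, 3))
```

The total here is 6638, slightly more than the N = 6633 reported for the joint distribution with victimization, which would be consistent with a few observations lacking a victimization value.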


  • How are observations distributed across values of Y and X? (bivariate)
    • e.g. How are observations distributed across income and gender values?
  • We can add as many variables/dimensions as we like
    • How are observations distributed across income, gender and time values?
  • Normally we summarize those distributions using models (associational inference)
    • e.g. mean income by gender over time
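
As a rough illustration of summarizing a multivariate distribution by group means, the sketch below uses a tiny toy data frame. All values and names (`toy`, `income`, `gender`, `year`) are invented for the example and do not come from the data used in these notes.

```r
# Toy data: invented values, for illustration only
toy <- data.frame(
  income = c(2000, 2500, 3000, 3500, 2200, 2700, 3200, 3700),
  gender = c("f", "m", "f", "m", "f", "m", "f", "m"),
  year   = c(2005, 2005, 2005, 2005, 2006, 2006, 2006, 2006)
)

# Joint distribution of observations across gender and year
print(table(toy$gender, toy$year))

# Summarize the distribution of income with one mean per gender-year cell
group_means <- aggregate(income ~ gender + year, data = toy, FUN = mean)
print(group_means)
```

Each row of `group_means` replaces a whole conditional distribution of income with a single number, which is the sense in which the bullet above speaks of summarizing distributions.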

12.1.2 Causal questions (and analysis)

  • Is there a causal link between the distribution across values of Y and values of D?
    • Do differences in D cause differences in Y? (see app)
  • In practice, we tend to summarize those distributions:
    • Continuous variables: Compare means
    • Categorical variables (several categories): Compare probabilities of the categories

Table 11.2: Joint distribution of trust and victimization (2006, N = 6633)
Trust (0-10)       0     1     2     3     4     5     6     7     8     9    10
no victim        259    36   135   214   320  1142   782  1228  1193   326   331
victim            44     6    37    56    48   139    70   114   101    27    25
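
To compare probabilities across categories, the joint counts in Table 11.2 can be turned into conditional distributions of trust within each victimization group. This is a sketch with the counts typed in from the table. The object names and the cut-off for "low trust" (values 0 to 4) are assumptions made for the example.

```r
# Counts copied from Table 11.2 (rows: victimization, columns: trust 0-10)
joint <- rbind(
  "no victim" = c(259, 36, 135, 214, 320, 1142, 782, 1228, 1193, 326, 331),
  "victim"    = c(44, 6, 37, 56, 48, 139, 70, 114, 101, 27, 25)
)
colnames(joint) <- 0:10

# Conditional distribution of trust within each group (each row sums to 1)
cond_dist <- prop.table(joint, margin = 1)
print(round(cond_dist, 3))

# Example comparison: probability of a low trust value (0-4) in each group
# (the 0-4 cut-off is an arbitrary choice for illustration)
low_trust_share <- rowSums(cond_dist[, as.character(0:4)])
print(round(low_trust_share, 3))
```

Row-wise proportions are used because the two groups differ greatly in size (5966 vs. 667 observations), so the raw counts are not directly comparable.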


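# mean() returns NA if the selected values include missing values,
# so drop them with na.rm = TRUE
mean(d$trust2006[d$victim2006==0], na.rm = TRUE)  # mean trust, non-victims
mean(d$trust2006[d$victim2006==1], na.rm = TRUE)  # mean trust, victims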
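
The two group means can also be recovered from the counts in Table 11.2 without the raw data. The sketch below does this with `weighted.mean()`. The object names are made up for illustration, and the result only reflects the 6633 observations with valid values on both variables.

```r
# Counts copied from Table 11.2 (trust scale 0-10)
trust_values <- 0:10
n_no_victim  <- c(259, 36, 135, 214, 320, 1142, 782, 1228, 1193, 326, 331)
n_victim     <- c(44, 6, 37, 56, 48, 139, 70, 114, 101, 27, 25)

# Mean trust within each group, weighting each trust value by its count
mean_no_victim <- weighted.mean(trust_values, n_no_victim)
mean_victim    <- weighted.mean(trust_values, n_victim)

# Difference in means (victims minus non-victims)
diff_means <- mean_victim - mean_no_victim

print(round(c(no_victim  = mean_no_victim,
              victim     = mean_victim,
              difference = diff_means), 3))
# no_victim: 6.204, victim: 5.478, difference: -0.726
```

On its own, this difference describes an association between victimization and trust. Whether differences in D cause differences in Y is the separate, causal question.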

  1. See Gerring (2012) for a discussion of “What?” and “Why?” questions.