Measurement: Contingency tables
- Discussion
- Q: What do the tables show (start with Table 3.3 then 3.4)?
- Q: What does a cell in those tables show?
- Q: What would the univariate distribution for income look like behind Table 3.4?
- Q: What is a missing value/NA? Do any appear in Tables 3.3 and 3.4?
- Q: What would the data underlying Tables 3.3 and 3.4 look like in a data frame?
Table 3.3: Contingency table of discipline (this class)
| Economics | Sociology |
|---|---|
| 4 | 7 |
Table 3.4: Contingency table of Income/Gender
| | female | male |
|---|---|---|
| high income | 5 | 0 |
| low income | 2 | 3 |
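As a minimal sketch of what the data behind Table 3.4 could look like in a data frame, the R code below builds one row per person and cross-tabulates the two variables. The data frame name `survey` and the column names `income` and `gender` are assumptions for illustration; only the cell counts are taken from Table 3.4.

```r
# Hypothetical long-format data: one row per person (10 people in total).
# Cell counts follow Table 3.4; the names `survey`, `income`, `gender` are illustrative.
survey <- data.frame(
  income = c(rep("high income", 5), rep("low income", 2), rep("low income", 3)),
  gender = c(rep("female", 5), rep("female", 2), rep("male", 3))
)

# Cross-tabulate the two variables to recover the contingency table.
# Empty combinations (here: high income / male) appear with a count of 0.
tab <- table(survey$income, survey$gender)
tab

# Univariate (marginal) distribution of income: the row sums of the table
rowSums(tab)
```

Each row of `survey` is one unit, and each cell of `tab` counts the units that share a combination of values. Summing over gender gives the univariate distribution of income.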
Notes
- Lessons to be learned
- We can conceptualize data as units that have been assigned to cells through observation. Each cell represents a particular value on the variable (or a combination of values on several variables).
- “A contingency table (also known as a cross tabulation or crosstab) is a type of table in a matrix format that displays the (multivariate) frequency distribution of the variables.” (Wikipedia)
- Displays how observations are distributed across the values of our variable(s).
- Contingency tables are “underused”.
- Ideally, display contingency tables including the missing values (NAs) when working in R.
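One way to include missings in R is the `useNA` argument of `table()`. The sketch below uses a made-up vector with one missing value; the data are hypothetical and are not the class data from Table 3.3.

```r
# Hypothetical discipline variable with one missing value (NA)
discipline <- c("Economics", "Sociology", "Sociology", NA, "Economics", "Sociology")

# Default behaviour: NA values are dropped from the table without warning
tab_default <- table(discipline)
tab_default

# useNA = "always" adds an NA category even when its count is zero;
# useNA = "ifany" adds it only when missings are actually present
tab_with_na <- table(discipline, useNA = "always")
tab_with_na
```

Comparing the totals of the two tables shows how many observations the default display leaves out.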