Example 1
- High-quality projects start from clearly defined concepts.
- Population: What is the population we want to learn about?
- Sample: What is the sample we are using?
- Random sample of Germans; students in this classroom
- Concept: What feature do we want to learn about?
- Income (students): [Low = below 1500, High = above 1500]
- Measure: How do we measure the concept?
- Survey question (income): How high is your income?
- Variable/scale: What does the variable look like?
- Income: has values, e.g. 0 = “low” and 1 = “high”
- Measurement process:
- Assign values of our variable to individuals (each cell = particular value)
- Data: Sample of individuals grouped according to values (cells)
- Q: What is measurement error? Example?
- Terminology: Unit vs. observation vs. unit of analysis
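The measurement process above can be sketched in a few lines of Python. This is only an illustration: the reported incomes are invented, and treating an income of exactly 1500 as "low" is an assumption, since the definition above only covers "below" and "above" 1500.

```python
from collections import Counter

THRESHOLD = 1500  # cutoff taken from the concept definition above

# Hypothetical answers to the survey question, one per individual.
reported_income = [2100, 1800, 1650, 2400, 1900, 900, 1200, 700, 1100, 1300]


def code_income(income):
    """Assign a value of the variable: 0 = low, 1 = high.

    Assumption: exactly 1500 is coded as low (the source leaves this open).
    """
    return 1 if income > THRESHOLD else 0


# Measurement: each individual gets exactly one value of the variable.
income_var = [code_income(x) for x in reported_income]

# Data: individuals grouped by value (one cell per value).
labels = {0: "low income", 1: "high income"}
freq = Counter(labels[v] for v in income_var)

for label in ("high income", "low income"):
    print(label, freq[label])
```

With these invented answers, five individuals fall into each cell.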
Table .: Distribution of Income
| some_data | Freq |
|---|---|
| high income | 5 |
| low income | 5 |
- Discussion
- Q: What do the tables show?
- Q: What does a cell in those tables show?
- Q: What is a missing value/NA? Where are they in the tables above?
- Q: What would the data look like in a dataframe?
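One possible answer to the dataframe question, sketched in Python with a plain list of records standing in for a dataframe: one row per individual (unit), one key per variable. The rows are hypothetical and chosen only to match the income and gender counts in this example. The final row is an invented nonrespondent, added to show how a missing value (here `None`) sits in the unit-level data but does not appear in a frequency table.

```python
from collections import Counter

# Hypothetical unit-level data: one row per individual, one key per variable.
genders = ["female"] * 7 + ["male"] * 3
incomes = ["high income"] * 5 + ["low income"] * 5
rows = [
    {"id": i + 1, "gender": g, "income": inc}
    for i, (g, inc) in enumerate(zip(genders, incomes))
]

# Invented nonrespondent: income is missing (None plays the role of NA).
rows.append({"id": 11, "gender": "male", "income": None})

# A frequency table is built from observed values only.
observed = [r["income"] for r in rows if r["income"] is not None]
freq = Counter(observed)
n_missing = sum(r["income"] is None for r in rows)

for r in rows[:3]:
    print(r)
print(dict(freq), "missing:", n_missing)
```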
Table .: Joint distribution of Income/Gender
| | female | male |
|---|---|---|
| high income | 5 | 0 |
| low income | 2 | 3 |
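A joint distribution like this one is a count of individuals per combination of values. The sketch below rebuilds it from hypothetical unit-level pairs (chosen to match the table) and then sums across gender to recover the distribution of income alone.

```python
from collections import Counter

# Hypothetical (income, gender) pairs, one per individual, matching the table.
pairs = (
    [("high income", "female")] * 5
    + [("low income", "female")] * 2
    + [("low income", "male")] * 3
)

# Each cell counts the individuals with that combination of values.
cells = Counter(pairs)  # combinations that never occur count as 0

income_levels = ["high income", "low income"]
gender_levels = ["female", "male"]
table = {
    inc: {g: cells[(inc, g)] for g in gender_levels}
    for inc in income_levels
}

# Summing over gender gives the (marginal) distribution of income.
income_margin = {inc: sum(table[inc].values()) for inc in income_levels}

for inc in income_levels:
    print(inc, table[inc], "total:", income_margin[inc])
```

The empty cell (high income, male) shows up as 0, not as a missing value: no individual has that combination, but nothing is unobserved.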
Table .: Univariate distribution of education (2006)
| Economics | Political Science | Sociology |
|---|---|---|
| 2 | 2 | 16 |