Topic 1 Basic statistical concepts
1.1 Types of statistical analysis
- Descriptive: goal is to summarize the data we have in hand.
- Inferential: goal is to use the data we have in hand to draw conclusions about the larger population the data come from.
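The contrast between the two types of analysis can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `sample` values are made up, and the interval uses a normal approximation (with a sample this small, a t-based interval would be somewhat wider).

```python
from statistics import NormalDist, mean, stdev

# Hypothetical sample: exam scores for 10 students drawn from a larger population.
sample = [72, 85, 90, 66, 78, 88, 95, 70, 81, 75]

# Descriptive: summarize the data we have in hand.
sample_mean = mean(sample)
sample_sd = stdev(sample)  # sample standard deviation (n - 1 denominator)

# Inferential: use the sample to say something about the population mean.
# Assumption: normal approximation for a 95% confidence interval.
z = NormalDist().inv_cdf(0.975)  # roughly 1.96
standard_error = sample_sd / len(sample) ** 0.5
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"Sample mean: {sample_mean:.1f}, sample SD: {sample_sd:.1f}")
print(f"Approximate 95% CI for the population mean: ({ci_low:.1f}, {ci_high:.1f})")
```

The first two statistics describe only the 10 scores observed; the interval is a statement about the unobserved population, which is what makes it inferential.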
1.2 The different scales of data measurement
- Nominal level: defines different categories (e.g. different colleges).
- Ordinal level: defines different categories and establishes an order between them (e.g. ranking of best colleges in the US).
- Interval level: defines an order between values, and equal differences between values have the same meaning everywhere on the scale (e.g. SAT score, temperature in degrees Celsius or Fahrenheit).
- Ratio level: interval level of measurement with an absolute 0 (e.g. GPA). By an absolute 0, we mean that a value of 0 indicates the absence of what we are measuring, so ratios between values are meaningful. Note that this is not the case, for example, with temperature measured in degrees Celsius or Fahrenheit (0 does not mean the absence of temperature; it is a value like any other). Because we have a true 0 value, a ratio scale does not have negative values.
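The difference between the interval and ratio levels can be checked numerically. The sketch below is illustrative: the two temperatures are arbitrary, and the Kelvin scale is brought in as an example of a temperature scale that does have an absolute 0 (it is not mentioned in the notes above).

```python
import math

def celsius_to_kelvin(celsius):
    """Convert degrees Celsius (interval scale) to kelvins (ratio scale)."""
    return celsius + 273.15

# Two arbitrary temperatures, chosen only for illustration.
warm_c, cool_c = 20.0, 10.0
warm_k, cool_k = celsius_to_kelvin(warm_c), celsius_to_kelvin(cool_c)

# Differences have the same meaning on both scales.
diff_c = warm_c - cool_c
diff_k = warm_k - cool_k
print("Same difference on both scales:", math.isclose(diff_c, diff_k))

# Ratios do not: the Celsius ratio depends on where 0 was arbitrarily placed.
ratio_c = warm_c / cool_c
ratio_k = warm_k / cool_k
print(f"Ratio in Celsius: {ratio_c:.3f}")
print(f"Ratio in Kelvin:  {ratio_k:.3f}")
```

On the Celsius scale 20 looks like "twice" 10, but the same two temperatures in kelvins differ by only a few percent. Statements such as "twice as much" are meaningful only on a ratio scale.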
1.3 Types of datasets
A collection of variables for a given group forms a dataset. Note that when we perform statistical analysis, we want to work with the individual observations (the microdata), not the aggregate values.
There are four main types of datasets:
- Cross-sectional data: 1 time period. Many subjects. Many variables.
- Panel data (longitudinal data): Several time periods. Several subjects. Many variables. We have data about the same subjects in every period.
- Pooled (repeated) cross-sectional data: Several time periods. Several subjects. Many variables. A new sample of subjects is drawn in each period, so we do not have data about the same subjects in every period.
- Time series data: Several time periods. 1 subject. Many variables.
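The four definitions above can be turned into a small classification rule. The following is a minimal sketch, not a standard library routine: `classify_dataset` is a hypothetical helper, and it assumes each observation is identified by a `(subject, period)` pair. As a simplification, it treats a dataset as a panel only when exactly the same subjects appear in every period.

```python
def classify_dataset(observations):
    """Classify a dataset from its (subject, period) pairs.

    Hypothetical helper for illustration; follows the four definitions above.
    """
    subjects_by_period = {}
    for subject, period in observations:
        subjects_by_period.setdefault(period, set()).add(subject)

    if not subjects_by_period:
        raise ValueError("at least one observation is required")

    all_subjects = set().union(*subjects_by_period.values())

    if len(subjects_by_period) == 1:
        return "cross-sectional"         # one time period
    if len(all_subjects) == 1:
        return "time series"             # several periods, one subject
    if all(s == all_subjects for s in subjects_by_period.values()):
        return "panel"                   # same subjects in every period
    return "pooled cross-sectional"      # subjects differ across periods


# Made-up examples: subjects are labelled A-D, periods are years.
print(classify_dataset([("A", 2020), ("B", 2020)]))
print(classify_dataset([("A", 2020), ("A", 2021)]))
print(classify_dataset([("A", 2020), ("B", 2020), ("A", 2021), ("B", 2021)]))
print(classify_dataset([("A", 2020), ("B", 2020), ("C", 2021), ("D", 2021)]))
```

The rule looks only at which subjects are observed in which periods; the number of variables recorded per observation does not affect the classification.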