Chapter 2 M2: Descriptives
This module’s focus is on describing variables and values that we’ve observed. We’ll see both visualization tools and some numerical summary measures, whether for a single variable or for the relationships between two (or more!) variables. Some key skills are:
- Interpret contingency tables and proportions
- Interpret bar plots
- Be snarky about pie charts
- Describe the distribution of numerical values in terms of shape, center, and spread
- Find and interpret numerical summary measures like the mean, median, SD, variance, and IQR
- Discuss and decide when it’s appropriate to use each kind of summary (considering things like outliers and skew)
- Use terms like symmetry, skew, tail, uni/bi/multimodal, etc. to describe shape
- Interpret histograms and box plots, and decide which one would be more appropriate for a given context/dataset
- Explain why we might want to use a transformation on a quantitative variable
  - You don’t need to be able to magically decide on the Best Transformation for a given situation :) But it’s good to think about the kind of effect some common transformations can have. For example, the log “shrinks in” large values to reduce right skew, while linear transformations like unit changes don’t alter the shape of a distribution, just its center and spread.
- Interpret visualizations for multiple variables (mosaic plots, different types of bar plots, scatterplots, faceted and side-by-side plots)
- Decide which visualizations are most appropriate for a given dataset/context
- Describe the relationship between numerical variables (think direction, form, and strength: is the relationship positive/negative, linear/nonlinear, weak/medium/strong?)
- Interpret a correlation value
- Write the basic linear regression equation and explain what each piece represents
- Define a residual and interpret it in context
- Formally interpret \(R^2\) (the “proportion of variation in the response explained by the model” thing)
- Notice when someone is doing unreasonable extrapolation, explain the problem, and avoid it in your own work :)
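To make a few of these skills concrete, here is a minimal sketch in Python, using only the standard library. The data values are made up purely for illustration, and the helper names `summarize` and `skew_gap` are hypothetical, not something from the course materials. The first part computes the numerical summaries and compares a log transformation with a unit change; the second part fits a simple linear regression by hand, then computes residuals and \(R^2\). The "mean minus median, divided by SD" quantity is just an informal indicator of skew, not a formal skewness measure.

```python
import math
import statistics

# Made-up, right-skewed sample (illustration only, not course data).
values = [1, 2, 2, 3, 3, 4, 5, 8, 40]

def summarize(data):
    """Return common center and spread measures for a list of numbers."""
    # Quartile conventions vary between software packages, so the IQR
    # reported here may differ slightly from other tools.
    q1, _, q3 = statistics.quantiles(data, n=4)
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "sd": statistics.stdev(data),          # sample SD (n - 1 denominator)
        "variance": statistics.variance(data), # sample variance
        "iqr": q3 - q1,
    }

def skew_gap(data):
    """(mean - median) / SD: an informal hint about skew direction."""
    s = summarize(data)
    return (s["mean"] - s["median"]) / s["sd"]

raw_gap = skew_gap(values)
log_gap = skew_gap([math.log(v) for v in values])   # log pulls in the large value
unit_gap = skew_gap([2.54 * v for v in values])     # unit change, e.g. inches to cm

print(summarize(values))
print("raw gap:", raw_gap, "log gap:", log_gap, "unit-change gap:", unit_gap)

# Simple linear regression y-hat = b0 + b1 * x, on made-up data.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
x_bar, y_bar = statistics.mean(x), statistics.mean(y)
s_x, s_y = statistics.stdev(x), statistics.stdev(y)

# Correlation: average product of standardized deviations (n - 1 denominator).
r = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ((n - 1) * s_x * s_y)

b1 = r * s_y / s_x        # slope
b0 = y_bar - b1 * x_bar   # intercept

fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]  # observed minus predicted

sse = sum(e ** 2 for e in residuals)         # variation left unexplained
sst = sum((yi - y_bar) ** 2 for yi in y)     # total variation in the response
r_squared = 1 - sse / sst

print("r:", r, "slope:", b1, "intercept:", b0, "R^2:", r_squared)
```

A few things to notice when you run it: the mean sits above the median in the raw data (right skew), the log shrinks that gap, and the unit change leaves it untouched because it rescales center and spread together. In the regression part, the residuals sum to (essentially) zero and \(R^2\) matches the squared correlation, which holds for simple linear regression with one predictor. Plugging in an \(x\) far outside 1 to 5 would still return a number from `b0 + b1 * x`, but that would be extrapolation: the data give no evidence that the line continues out there.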