# Inferential Reasoning in Data Analysis

From data and models, to the things we care about

# Introduction

People who analyze data are usually interested in something other than the data they analyze. A financial analyst might use patterns and anomalies in market data to create an investment strategy for the upcoming year. A physician might reference data from a randomized controlled trial when deciding what drug to prescribe to a patient. A basketball coach might plan player rotations after looking at data collected from their next opponent’s recent matches. Members of a local board of education might look at data from state standardized tests to decide whether to approve a proposed change to the 6th grade English Language Arts curriculum. And so on.

In all of these examples, the purpose of using data is to learn about something other than the data:

- The financial analyst wants to predict future events. The data are about past events.
- The physician is interested in the health of the patient she’s currently seeing; she’s not writing prescriptions for the people in the randomized controlled trial.
- The basketball coach is putting together a plan for the next game; the data describe previous games that are over and were played against other teams.
- The members of the education board are trying to decide whether a new ELA curriculum will be more effective than the previous one in helping upcoming classes of 6th graders learn to read and write. The data describe children who are not in their district, and who are no longer in 6th grade.

So, how do we justify this? What gives us permission to treat what we found in some data as the answer to a question about anything outside those data? This is the challenge of inference. It’s tempting to meet the challenge glibly:

- “The trend is obvious”
- “I took a large sample”
- “We controlled for outside variables”
- “The outcome is statistically significant”
- “We used a Bayesian analysis, so the hypothesis has a 0.001% chance of being correct”
- “The tiny confidence interval means there’s almost no uncertainty”

Every one of these *could* be an important component in a larger inferential argument. But every one of them could also be an irrelevant distraction.
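To make this concrete, here is a small simulation sketch of how “I took a large sample” and “the tiny confidence interval” can both be true and still beside the point. Everything in it is an assumption made for illustration: the population, the inclusion rule (units with larger values are more likely to end up in the data), and the sample sizes are all hypothetical.

```python
import math
import random
import statistics


def mean_ci(values, z=1.96):
    """Normal-approximation 95% confidence interval for a mean."""
    m = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return m - z * se, m + z * se


rng = random.Random(0)

# Hypothetical population of 200,000 values centered near 50.
population = [rng.gauss(50, 10) for _ in range(200_000)]
true_mean = statistics.fmean(population)


def included(x):
    """Assumed biased selection: larger values are more likely to be sampled."""
    probability = min(1.0, max(0.0, (x - 20) / 60))
    return rng.random() < probability


# A very large sample, but collected through the biased mechanism.
biased_sample = [x for x in population if included(x)]

# A much smaller simple random sample, for comparison.
random_sample = rng.sample(population, 500)

biased_ci = mean_ci(biased_sample)
random_ci = mean_ci(random_sample)

print(f"true mean:            {true_mean:.2f}")
print(f"biased, n={len(biased_sample)}: ({biased_ci[0]:.2f}, {biased_ci[1]:.2f})")
print(f"random, n={len(random_sample)}:    ({random_ci[0]:.2f}, {random_ci[1]:.2f})")
```

The biased sample is far larger and its interval far narrower, yet that interval sits a few points above the true mean and does not contain it. The interval's width only describes sampling variability under the assumed sampling process; it says nothing about whether that process matches how the data were actually collected.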

To make sense of any statistical results, we’ll probably want details about how the sample was acquired, how measurements were taken, and what the reason for performing the analysis was in the first place. Was it to make a decision? Test a hypothesis? Estimate an unknown quantity? Search for a pattern? Predict future events? These questions matter; there’s nothing inherently meaningful in taking some data and performing mathematical operations on them. We need to establish a logical path between the data and whatever goal they’re meant to serve.

This might all sound obvious. I’m only repeating general principles commonly covered at the beginning of an introductory statistics course. The problem, in my view, is that these principles are too easily digested in their general form. So I’ve made a point to include examples where statistical reasoning that sounds straightforward in the abstract becomes confusing or even incoherent in application. To learn how something works, it’s helpful to also see where it fails.

Introductory statistics textbooks typically present the logic of inference using carefully curated examples, chosen to demonstrate this logic as cleanly as possible. And this is appropriate - to understand a data analysis tool, look first at how it works. But to use it competently without getting into trouble, you’ll want to know as much as possible about how it *doesn’t* work. Along these lines, we’ll see that:

- Boilerplate statistical interpretations that are easily memorized can also be unsatisfying in practice.
- “Rules of thumb” for making data analysis decisions will make sense in some, but not all, situations - otherwise they’d just be “rules”.
- When we’re lucky, software warns us that we’re trying to do something dangerous (e.g. with built-in model assumption checks, or notification that a model fitting algorithm failed to converge). But usually this isn’t possible; results that don’t answer your real-world questions are usually just as easy to calculate as those that do.
- Failing to implement a needed statistical adjustment can ruin an analysis. Also, implementing an unneeded statistical adjustment can ruin an analysis. Also, measurement error can prevent statistical adjustments from working properly in the first place.
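The last point in the list, about measurement error undermining statistical adjustment, can be sketched with a short simulation. The setup is hypothetical: a confounder `z` drives both `x` and `y`, `x` has no effect on `y` at all, and `z_noisy` is an error-laden measurement of `z`. The helper names (`slope`, `residuals`, `adjusted_slope`) are made up for this sketch; adjustment is done by regressing the control variable out of both `x` and `y` and then regressing the residuals on each other.

```python
import random
import statistics


def slope(x, y):
    """Least-squares slope from regressing y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(y, w):
    """Residuals of y after a simple linear regression on w."""
    b = slope(w, y)
    mw, my = statistics.fmean(w), statistics.fmean(y)
    return [yi - my - b * (wi - mw) for yi, wi in zip(y, w)]


def adjusted_slope(x, y, w):
    """Slope of y on x after adjusting both for the control variable w."""
    return slope(residuals(x, w), residuals(y, w))


rng = random.Random(1)
n = 20_000

z = [rng.gauss(0, 1) for _ in range(n)]        # true confounder
x = [zi + rng.gauss(0, 1) for zi in z]         # x depends on z only
y = [zi + rng.gauss(0, 1) for zi in z]         # y depends on z only, not on x
z_noisy = [zi + rng.gauss(0, 1) for zi in z]   # confounder measured with error

unadjusted = slope(x, y)
adjusted_true = adjusted_slope(x, y, z)
adjusted_noisy = adjusted_slope(x, y, z_noisy)

print(f"no adjustment:              {unadjusted:.3f}")
print(f"adjusting for true z:       {adjusted_true:.3f}")
print(f"adjusting for noisy z:      {adjusted_noisy:.3f}")
```

Under these assumptions, the unadjusted slope is clearly nonzero even though `x` has no effect on `y`, adjusting for the true confounder brings it close to zero, and adjusting for the noisy measurement removes only part of the spurious association. Nothing in the output of the third regression signals that the adjustment fell short.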

Making sense of statistical results requires judgment. Hopefully it’s well-informed judgment that we can defend if questioned. But the task of making judgments can’t be delegated to any “objective” or automated process. There is no flowchart for drawing well-justified inferences from data, nor a set of step-by-step rules for critically assessing such inferences. The tools for analyzing data are mathematical; we can describe them with precision and count on them to work the same way every time. But interpreting results requires consultation with that messy world beyond our data, the one which compelled us to do an analysis in the first place. This book is here to help you with that.

## Course topics outline

- What this class is and is not about
- The logic of classical statistical inference
- Bayesian vs. frequentist probability
- Models and assumptions
- Estimation vs. testing
- Quantifying magnitude
- Correlation, causation, and statistical control
- A few statistical artifacts
- The warning label