3.9 Observational Studies vs. Experiments

In some studies, researchers do not assign study participants to groups/conditions. One example of this is the murderous nurse study. In this study, the two groups being compared, shifts when Gilbert was working and shifts when Gilbert was not working, were not assigned by the researchers — the groups were “just there.” When the groups are not formed by the researcher using randomization, the study is referred to as an observational study.

Observational studies vs. experiments

In an experiment, participants are randomly assigned to comparison groups.

In an observational study, the groups are “just there”; participants are not randomly assigned to the groups.

3.9.1 Hypothesis tests for observational studies

Data in observational studies may still be affected by random chance. For example, in the murderous nurse study, the death rate on a given shift is at least partially affected by randomness.

Thus, the null hypothesis for an observational study that compares two groups is similar to the null hypothesis for an experiment:

Null hypothesis for an observational study: There is no difference between the groups and any observed difference is due to random chance.

To conduct a hypothesis test on an observational study that compares two groups, researchers use methods similar to those used for data from an experiment. For both kinds of studies, we can use a shuffler model and a randomization test. The reason we can use the same methods is that the null hypothesis is the same for both types of study. By combining all of the data together, we model the hypothesis that there is no difference between the groups. By randomly re-shuffling the data into groups, we find the expected variation due to random chance.
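The shuffling procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the actual murderous nurse analysis: the shift counts are made up, and the group labels "A" and "B", the function names, and the number of repetitions are all assumptions chosen for the example.

```python
import random

def diff_in_proportions(outcomes, labels):
    """Proportion of 1s in group "A" minus the proportion of 1s in group "B"."""
    a = [o for o, g in zip(outcomes, labels) if g == "A"]
    b = [o for o, g in zip(outcomes, labels) if g == "B"]
    return sum(a) / len(a) - sum(b) / len(b)

def randomization_test(outcomes, labels, reps=2000, seed=0):
    """Shuffle the group labels many times and count how often the shuffled
    difference is at least as large as the observed difference."""
    rng = random.Random(seed)
    observed = diff_in_proportions(outcomes, labels)
    shuffled = list(labels)  # copy so the original labels are left untouched
    at_least_as_large = 0
    for _ in range(reps):
        rng.shuffle(shuffled)  # re-deal the same labels at random
        if diff_in_proportions(outcomes, shuffled) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / reps

# HYPOTHETICAL data (not the real study): 1 = a death occurred on the shift.
# Group A: 20 shifts, 8 with a death. Group B: 80 shifts, 4 with a death.
outcomes = [1] * 8 + [0] * 12 + [1] * 4 + [0] * 76
labels = ["A"] * 20 + ["B"] * 80

observed, p_value = randomization_test(outcomes, labels)
print("observed difference in proportions:", observed)
print("estimated one-sided p-value:", p_value)
```

Shuffling the labels while leaving the outcomes fixed is what models the null hypothesis: if the group a shift belongs to makes no difference, any assignment of labels to shifts is equally plausible. The same code would apply unchanged to data from an experiment.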

3.9.2 Drawing causal inferences from observational studies

In an observational study, researchers have less control over the timing of an intervention and the makeup of the groups. This means that it can be more difficult to establish the three criteria for causation. In particular, it may be more difficult to establish timing (that the cause came before the effect) and, especially, to rule out plausible alternative explanations, because the groups may differ in ways other than the one being studied. For example, in the murderous nurse study it may be that Gilbert worked on shifts that tended to have more high-risk patients.
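To see how a difference in group makeup can produce an alternative explanation, consider a small sketch with made-up numbers. Everything here is hypothetical: the counts, the "present"/"absent" and "high"/"low" labels, and the `rate` helper are invented for illustration and say nothing about the real study. In this invented data, the death rate depends only on patient risk, yet the overall comparison still shows a gap because one group has far more high-risk shifts.

```python
# HYPOTHETICAL counts: (group, patient risk) -> (shifts with a death, total shifts)
counts = {
    ("present", "high"): (16, 40),
    ("present", "low"): (1, 10),
    ("absent", "high"): (4, 10),
    ("absent", "low"): (4, 40),
}

def rate(group, risk=None):
    """Death rate for a group, optionally restricted to one risk level."""
    cells = [v for (g, r), v in counts.items()
             if g == group and (risk is None or r == risk)]
    deaths = sum(d for d, _ in cells)
    total = sum(n for _, n in cells)
    return deaths / total

# Overall comparison, ignoring patient risk
crude_diff = rate("present") - rate("absent")

# The same comparison made separately within each risk level
high_diff = rate("present", "high") - rate("absent", "high")
low_diff = rate("present", "low") - rate("absent", "low")

print("overall difference:", crude_diff)
print("difference among high-risk shifts:", high_diff)
print("difference among low-risk shifts:", low_diff)
```

When the comparison is made within each risk level, the gap in this invented data disappears. A researcher making a causal argument from observational data would need to show that an explanation like this one does not account for the observed difference.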

It is especially important to scrutinize causal claims from observational studies, as sometimes these claims can be misleading and can even be construed as unethical. In 1988, results released to the public from the National Household Survey on Drug Abuse created the false perception that crack cocaine smoking was related to ethnicity. The analysis, which was based on observational data (researchers cannot assign race), showed that the rates of crack use among Black and Latinx people were twice as high as among White people. The data were re-analyzed in 1992 by researchers from Johns Hopkins University to take into account social factors such as where the users lived and how easily the drug could be obtained. They found that after adjusting for these factors, there were no differences among racial groups in the use of crack cocaine.

That said, it is not impossible to make a causal argument from an observational study. This is an active area of research, and in fact, the 2021 Nobel Prize in Economics was awarded to researchers “for their methodological contributions to the analysis of causal relationships” in observational studies. Researchers use a variety of techniques to establish the three criteria, including statistical techniques to “control” for confounding variables. We won’t get into these techniques in this class, but the important thing to look for is how the researchers are making their argument. In the end, making an argument for a cause-and-effect inference is a human activity, and each of us has to evaluate the plausibility of the argument.