1.2 Does correlation imply causation?
Even if we have identified a significant association between two variables, this does not mean we have proven that increasing one variable will cause an increase (or decrease) in the other. All we have shown is that there is an association. It may be that changes in one of the variables do cause changes in the other. Or it may be that the two variables don't have much to do with each other at all!
For example, even though we have seen that there is a significant positive association between income and happiness, either of the following could be true:
- Increases in income cause increases in happiness.
- Increases in happiness cause increases in income.
Proving that either of the above statements is true is possible, but it would require an appropriate study design.
It could also be that neither of the above statements is true! If this were the case, the association may be an example of spurious correlation.
Spurious correlation occurs when we see an association between two variables, but this association is not causal. The association could have happened by chance, or there could be a "confounding factor" - a third variable with which both variables are associated.
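To see how a confounding factor can produce an association on its own, here is a minimal simulation sketch. Everything in it is made up for illustration: the variables `z`, `x` and `y`, the noise levels, and the sample size are assumptions, not data from any study. The variable `z` plays the role of the confounder, and neither `x` nor `y` appears in the formula that generates the other.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000

# Hypothetical confounder that influences both variables.
z = [rng.gauss(0, 1) for _ in range(n)]

# x and y each depend on z plus their own independent noise.
# Neither one is used to generate the other, so neither causes the other.
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = pearson(x, y)

# Subtract the confounder's contribution. This is only possible here
# because we wrote the data-generating process ourselves.
x_without_z = [xi - zi for xi, zi in zip(x, z)]
y_without_z = [yi - zi for yi, zi in zip(y, z)]
r_adjusted = pearson(x_without_z, y_without_z)

print(f"correlation of x and y:            {r_xy:.2f}")
print(f"correlation after removing z:      {r_adjusted:.2f}")
```

Under these assumptions, `x` and `y` should show a clear positive correlation (around 0.5 in theory), while the correlation should fall to roughly zero once the confounder's contribution is removed. In real data we rarely know the confounder this precisely, which is why study design matters.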
Let's consider an example. The chart below displays both the number of people who drowned after falling out of a fishing boat, and the marriage rate in Kentucky, US, from 1999 to 2010:
We can see that the higher the marriage rate, the more people drowned after falling out of a fishing boat. In fact, the correlation between these two variables is 0.9524 (Vigen 2015), which is extremely high! Does this mean that marriage is dangerous? Since it is quite difficult to come up with a logical explanation as to why these two variables would be related to each other, it is much more likely that this is a case of spurious correlation. Do you think this spurious correlation happened by chance? Or do you think there may have been other factors at play? Sometimes, spurious correlations simply do not make sense. Check out this site for more strange examples of spurious correlation!
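One way to think about the "by chance" explanation is to ask how strong a correlation we might find if we searched through many unrelated variables. The sketch below is an illustration only: it generates 200 hypothetical series of pure random noise, each 12 values long (matching the 12 years from 1999 to 2010), and reports the strongest correlation found between any pair. The number of series and the use of normally distributed noise are assumptions chosen for the example.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(1)  # fixed seed so the run is repeatable
n_series = 200          # hypothetical number of unrelated variables
n_years = 12            # 1999 to 2010 inclusive

# Every series is independent random noise, so no pair is truly related.
series = [[rng.gauss(0, 1) for _ in range(n_years)] for _ in range(n_series)]

# Compare every pair of series and keep the strongest correlation found.
strongest = max(abs(pearson(a, b)) for a, b in combinations(series, 2))

print(f"pairs compared: {n_series * (n_series - 1) // 2}")
print(f"strongest correlation found by chance: {strongest:.2f}")
```

With only 12 data points per series and nearly 20,000 pairs to compare, some pair is very likely to show a strong correlation even though every series is unrelated noise. A high correlation picked out of many candidates is therefore weak evidence of any real relationship.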