21.2 The Formal Notation of Causality
A common mistake is defining causation using probability:
X causes Y if P(Y|X)>P(Y).
But seeing X (1st level) and finding that Y is more likely doesn’t mean that X raises the probability of Y.
The association could arise because either
- X causes Y, or
- Z affects both X and Y.
We might be able to use control variables and compare P(Y|X,Z=z)>P(Y|Z=z). But then the questions become:
- How to choose Z?
- Did you choose enough Z?
- Did you choose the right Z?
Hence, the previous statement is incorrect. The correct causal statement is:
P(Y|do(X))>P(Y).
With causal diagrams and do-calculus, we can formally express interventions and answer questions at the 2nd level (Intervention).
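
To make the difference between seeing and doing concrete, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than part of the text above: a binary confounder Z drives both X and Y, X has no effect on Y at all, and the probabilities (0.5, 0.9/0.1, 0.8/0.2) and the helper names `simulate` and `prob_y` are made up for the example. Under these assumptions P(Y|X) should come out well above P(Y) (about 0.74 versus 0.5 in theory), while P(Y|do(X)) should stay at P(Y).

```python
import random


def simulate(n, do_x=None, seed=0):
    """Draw n samples (x, y, z) from a toy model where Z -> X and Z -> Y.

    X never influences Y. If do_x is given, X is set by intervention,
    which cuts the arrow from Z to X (this is what do(X) means here).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                      # hypothetical P(Z=1) = 0.5
        if do_x is None:
            x = rng.random() < (0.9 if z else 0.1)  # observational: Z drives X
        else:
            x = do_x                                # intervention: X is forced
        y = rng.random() < (0.8 if z else 0.2)      # Y depends on Z only
        rows.append((x, y, z))
    return rows


def prob_y(rows, keep=lambda x, y, z: True):
    """Estimate P(Y=1) among the rows selected by `keep`."""
    selected = [y for (x, y, z) in rows if keep(x, y, z)]
    return sum(selected) / len(selected)


observed = simulate(200_000)
intervened = simulate(200_000, do_x=True, seed=1)

p_y = prob_y(observed)                                      # P(Y)
p_y_given_x = prob_y(observed, lambda x, y, z: x)           # P(Y | X=1)
p_y_do_x = prob_y(intervened)                               # P(Y | do(X=1))
p_y_given_x_z1 = prob_y(observed, lambda x, y, z: x and z)  # P(Y | X=1, Z=1)
p_y_given_z1 = prob_y(observed, lambda x, y, z: z)          # P(Y | Z=1)

print(f"P(Y)            = {p_y:.3f}")
print(f"P(Y|X=1)        = {p_y_given_x:.3f}")
print(f"P(Y|do(X=1))    = {p_y_do_x:.3f}")
print(f"P(Y|X=1,Z=1)    = {p_y_given_x_z1:.3f}")
print(f"P(Y|Z=1)        = {p_y_given_z1:.3f}")
```

In this toy model, conditioning on Z removes the spurious association only because Z is, by construction, the single confounder. With real data we rarely know that, which is exactly the "How to choose Z?" problem.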