21.2 The Formal Notation of Causality
A common mistake is defining causation using probability:
\[ X \text{ causes } Y \text{ if } P(Y | X) > P(Y). \]
Seeing \(X\) (1st level, association) and finding that \(Y\) is more probable doesn’t mean that \(X\) causes \(Y\).
It could be either that
- \(X\) causes \(Y\), or
- \(Z\) affects both \(X\) and \(Y\) (a confounder). We might be able to use control variables and check whether \(P(Y | X, Z = z) > P(Y | Z = z)\). But then the questions become:
- How to choose \(Z\)?
- Did you choose enough \(Z\)?
- Did you choose the right \(Z\)?
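A small simulation can make the confounding case concrete. The sketch below assumes a hypothetical binary confounder \(Z\) that raises the probability of both \(X\) and \(Y\), while \(X\) has no effect on \(Y\) at all. The probabilities (0.8/0.2 and 0.7/0.3), the sample size, and the function names `simulate` and `prob_y` are illustrative choices, not part of the original discussion.

```python
import random


def simulate(n=200_000, seed=0):
    """Draw (z, x, y) triples in which Z drives both X and Y; X never affects Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                   # hypothetical confounder
        x = rng.random() < (0.8 if z else 0.2)   # Z makes X more likely
        y = rng.random() < (0.7 if z else 0.3)   # Z makes Y more likely; x is not used
        rows.append((z, x, y))
    return rows


def prob_y(rows, keep):
    """Estimate P(Y = 1) among the rows for which keep(z, x) is true."""
    ys = [y for z, x, y in rows if keep(z, x)]
    return sum(ys) / len(ys)


rows = simulate()
p_y = prob_y(rows, lambda z, x: True)
p_y_given_x = prob_y(rows, lambda z, x: x)
print(f"P(Y)       = {p_y:.3f}")
print(f"P(Y | X=1) = {p_y_given_x:.3f}")

# Hold Z fixed: within each stratum, compare P(Y | X=1, Z=z) with P(Y | Z=z).
gaps = {}
for z_val in (False, True):
    with_x = prob_y(rows, lambda z, x: x and z == z_val)
    without_x = prob_y(rows, lambda z, x: z == z_val)
    gaps[z_val] = with_x - without_x
    print(f"Z={int(z_val)}: P(Y | X=1, Z) - P(Y | Z) = {gaps[z_val]:+.3f}")
```

Under these assumptions, \(P(Y | X) > P(Y)\) holds even though \(X\) does not cause \(Y\), and the gap shrinks toward zero once \(Z\) is held fixed. The adjustment only works here because the simulation knows exactly which \(Z\) to control for, which is the difficulty the questions above point to.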
Hence, the probabilistic definition above is incorrect. The correct causal statement uses the do-operator:
\[ P(Y | do(X)) > P(Y). \]
With causal diagrams and do-calculus, we can formally express interventions and answer questions at the 2nd level (Intervention).
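To illustrate the difference between seeing and doing, the following sketch simulates a toy model under stated assumptions: a hypothetical binary confounder \(Z\) influences both \(X\) and \(Y\), and `effect` is an assumed additive causal effect of \(X\) on \(P(Y = 1)\). Conditioning (`given_x`) filters units by their observed \(X\), whereas intervening (`do_x`) forces \(X\) for every unit and so cuts the link from \(Z\) to \(X\). All names and numbers are illustrative.

```python
import random


def estimate_p_y(n=200_000, seed=1, effect=0.0, given_x=None, do_x=None):
    """Estimate P(Y = 1) in a toy model with a binary confounder Z.

    effect:  hypothetical amount added to P(Y = 1) when X is true.
    given_x: keep only units whose X equals this value (conditioning, "seeing").
    do_x:    force X to this value for every unit, ignoring Z (intervening, "doing").
    """
    rng = random.Random(seed)
    hits = total = 0
    for _ in range(n):
        z = rng.random() < 0.5
        if do_x is None:
            x = rng.random() < (0.8 if z else 0.2)   # X follows Z as usual
        else:
            x = do_x                                  # the Z -> X link is cut
        p_y1 = (0.7 if z else 0.3) + (effect if x else 0.0)
        y = rng.random() < p_y1
        if given_x is not None and x != given_x:
            continue
        total += 1
        hits += y
    return hits / total


results = {}
for effect in (0.0, 0.2):
    results[effect] = {
        "base": estimate_p_y(effect=effect),
        "see": estimate_p_y(effect=effect, given_x=True),
        "do": estimate_p_y(effect=effect, do_x=True),
    }
    r = results[effect]
    print(f"effect={effect}: P(Y)={r['base']:.3f}  "
          f"P(Y|X=1)={r['see']:.3f}  P(Y|do(X=1))={r['do']:.3f}")
```

When `effect` is zero, \(P(Y | X)\) still exceeds \(P(Y)\) because of \(Z\), but \(P(Y | do(X))\) stays close to \(P(Y)\). When `effect` is positive, \(P(Y | do(X)) > P(Y)\), which matches the causal statement above.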