21.2 The Formal Notation of Causality

A common mistake is defining causation using probability:

X causes Y if P(Y|X)>P(Y).

Seeing X (1st level, association) raise the probability of Y does not by itself mean that X causes Y.

It could be either that

  1. X causes Y, or
  2. Z affects both X and Y. We might be able to use control variables and check whether P(Y|X,Z=z)>P(Y|Z=z). But then the questions become:
    1. How to choose Z?
    2. Did you choose enough Z?
    3. Did you choose the right Z?
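
To make the second case concrete, here is a minimal sketch that enumerates a small discrete model exactly. The model and all its numbers are assumptions chosen for illustration: binary Z drives both X and Y, and X has no effect on Y at all. Under these assumptions, P(Y|X) still exceeds P(Y), while the association disappears once we condition on Z.

```python
from itertools import product

# Hypothetical model (not from the text): Z -> X and Z -> Y, no arrow X -> Y.
P_Z1 = 0.5
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}
P_Y1_GIVEN_Z = {0: 0.1, 1: 0.9}  # Y depends on Z only


def bern(p_one, value):
    """Probability that a Bernoulli(p_one) variable equals value."""
    return p_one if value == 1 else 1.0 - p_one


def joint(x, y, z):
    """P(X=x, Y=y, Z=z) from the factorization P(z) P(x|z) P(y|z)."""
    return bern(P_Z1, z) * bern(P_X1_GIVEN_Z[z], x) * bern(P_Y1_GIVEN_Z[z], y)


def prob(event, given=lambda x, y, z: True):
    """P(event | given), summing the joint over all eight outcomes."""
    num = den = 0.0
    for x, y, z in product((0, 1), repeat=3):
        if given(x, y, z):
            den += joint(x, y, z)
            if event(x, y, z):
                num += joint(x, y, z)
    return num / den


def y_is_1(x, y, z):
    return y == 1


p_y = prob(y_is_1)                                              # P(Y=1)
p_y_given_x = prob(y_is_1, lambda x, y, z: x == 1)              # P(Y=1 | X=1)
p_y_given_z = prob(y_is_1, lambda x, y, z: z == 1)              # P(Y=1 | Z=1)
p_y_given_xz = prob(y_is_1, lambda x, y, z: x == 1 and z == 1)  # P(Y=1 | X=1, Z=1)

print(p_y, p_y_given_x, p_y_given_z, p_y_given_xz)
```

In this toy model the control variable works only because Z is known to be the sole common cause. With real data that knowledge is usually missing, which is exactly what the three questions above are about.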

Hence, the previous statement is incorrect. The correct causal statement is:

P(Y|do(X))>P(Y).

With causal diagrams and do-calculus, we can formally express interventions and answer questions at the 2nd level (Intervention).
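
The difference between seeing and doing can be sketched on a small model with a known diagram. The graph (Z -> X, Z -> Y, X -> Y), the probabilities, and the `effect` parameter below are all hypothetical. The sketch assumes the standard truncated factorization for an intervention: do(X=1) removes the factor P(x|z), which amounts to cutting the arrow from Z into X.

```python
# Hypothetical graph: Z -> X, Z -> Y, and X -> Y with strength `effect`.
P_Z = {0: 0.5, 1: 0.5}
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}


def p_y1(x, z, effect):
    """P(Y=1 | X=x, Z=z) for the assumed mechanism of Y."""
    return 0.1 + effect * x + 0.6 * z


def p_x(x, z):
    """P(X=x | Z=z)."""
    return P_X1_GIVEN_Z[z] if x == 1 else 1.0 - P_X1_GIVEN_Z[z]


def marginal_y(effect):
    """P(Y=1): sum over x and z of P(z) P(x|z) P(y|x,z)."""
    return sum(P_Z[z] * p_x(x, z) * p_y1(x, z, effect)
               for z in (0, 1) for x in (0, 1))


def observed_y_given_x1(effect):
    """P(Y=1 | X=1): plain conditioning, so Z still influences X."""
    num = sum(P_Z[z] * p_x(1, z) * p_y1(1, z, effect) for z in (0, 1))
    den = sum(P_Z[z] * p_x(1, z) for z in (0, 1))
    return num / den


def do_y_given_x1(effect):
    """P(Y=1 | do(X=1)): the factor P(x|z) is dropped (arrow Z -> X cut)."""
    return sum(P_Z[z] * p_y1(1, z, effect) for z in (0, 1))


# effect = 0.0: X does not cause Y; effect = 0.2: X does cause Y.
for effect in (0.0, 0.2):
    print(effect, marginal_y(effect),
          observed_y_given_x1(effect), do_y_given_x1(effect))
```

With `effect = 0.0`, P(Y|X) is larger than P(Y) even though P(Y|do(X)) equals P(Y), so only the do-expression gives the right verdict. With `effect = 0.2`, P(Y|do(X)) exceeds P(Y), but by less than the observational P(Y|X) would suggest.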