38.1 Basic Notation and Graph Structures

Directed acyclic graphs (DAGs) are composed of three basic building blocks that define relationships between variables: chains, forks, and colliders.

  1. Mediators (Chains)

X → Z → Y

  • Variable Z mediates the effect of X on Y.
  • Controlling for Z blocks the indirect effect of X on Y.
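  • Use case in marketing: Email promotion (X) → customer interest (Z) → purchase (Y). Controlling for interest blocks the indirect path, so any association that remains between promotion and purchase reflects a direct effect; in a pure chain with no direct arrow from X to Y, none remains.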
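The chain can be illustrated with a small simulation. This is a minimal sketch, not a recommended estimation procedure: the coefficients (0.8 and 0.7), the sample size, the seed, and the helper names `corr` and `residuals` are all assumptions chosen for illustration. It compares the raw correlation between X and Y with the correlation that remains after Z has been regressed out of both.

```python
import random


def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


def residuals(target, control):
    """Residuals of `target` after a simple linear regression on `control`."""
    n = len(target)
    mean_t, mean_c = sum(target) / n, sum(control) / n
    slope = sum((c - mean_c) * (t - mean_t) for c, t in zip(control, target)) / sum(
        (c - mean_c) ** 2 for c in control
    )
    return [(t - mean_t) - slope * (c - mean_c) for t, c in zip(target, control)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 20_000

# Hypothetical chain: promotion (X) -> interest (Z) -> purchase (Y).
promotion = [rng.gauss(0, 1) for _ in range(n)]
interest = [0.8 * x + rng.gauss(0, 1) for x in promotion]
purchase = [0.7 * z + rng.gauss(0, 1) for z in interest]

# Association without controlling for the mediator.
marginal = corr(promotion, purchase)

# Association after removing the part of X and Y explained by Z.
partial = corr(residuals(promotion, interest), residuals(purchase, interest))

print(f"corr(X, Y)     = {marginal:.3f}")
print(f"corr(X, Y | Z) = {partial:.3f}")
```

Under these assumptions the marginal correlation is clearly positive, while the partial correlation is close to zero, because the simulated data contain no direct arrow from X to Y.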

  2. Common Causes (Forks)

X ← Z → Y

  • Z is a confounder, creating a spurious association between X and Y.
  • To estimate the causal effect of X on Y, Z must be controlled for.
  • Use case in finance: An economic indicator (Z) affects both stock investment decisions (X) and market returns (Y).

Key concept: If Z is not controlled for, X and Y may appear correlated because they share a cause, not because one causes the other.
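A fork can be simulated in the same spirit. The sketch below assumes a binary economic indicator and made-up effect sizes (1.0 and 2.0); the variable names and the `corr` helper are hypothetical. X has no effect on Y in the simulated data, so any correlation between them comes from Z alone. Controlling for Z is done here by stratification: the correlation is computed separately within each level of Z.

```python
import random


def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(1)  # fixed seed so the sketch is repeatable
n = 40_000

# Hypothetical fork: economic indicator (Z) drives both investment (X)
# and market return (Y). There is no arrow from X to Y.
strong_economy = [rng.random() < 0.5 for _ in range(n)]
investment = [1.0 * z + rng.gauss(0, 1) for z in strong_economy]
market_return = [2.0 * z + rng.gauss(0, 1) for z in strong_economy]

# Ignoring Z: X and Y look related.
overall = corr(investment, market_return)

# Controlling for Z by stratifying on it.
within = {}
for level in (False, True):
    xs = [x for x, z in zip(investment, strong_economy) if z == level]
    ys = [y for y, z in zip(market_return, strong_economy) if z == level]
    within[level] = corr(xs, ys)

print(f"corr(X, Y) overall        = {overall:.3f}")
print(f"corr(X, Y) weak economy   = {within[False]:.3f}")
print(f"corr(X, Y) strong economy = {within[True]:.3f}")
```

The overall correlation is spurious by construction; within each stratum of Z it should shrink toward zero.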

  3. Common Effects (Colliders)

X → Z ← Y

  • Z is a collider, and controlling for it induces a spurious association between X and Y.
  • Do not control for Z or its descendants.
  • Use case in HR analytics: Two independent hiring factors (X = education, Y = experience) both influence a decision variable Z (hiring outcome). Conditioning on being hired can create an artificial correlation between education and experience.
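The hiring example can be sketched as a simulation. Everything numeric here is an assumption made for illustration: education and experience are drawn as independent standard normal scores, and a candidate is hired when the sum of the two plus some noise exceeds an arbitrary threshold of 1.0. The `corr` helper is likewise hypothetical.

```python
import random


def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(2)  # fixed seed so the sketch is repeatable
n = 20_000

# Hypothetical collider: education (X) -> hired (Z) <- experience (Y).
# X and Y are generated independently of each other.
education = [rng.gauss(0, 1) for _ in range(n)]
experience = [rng.gauss(0, 1) for _ in range(n)]
hired = [
    e + x + 0.5 * rng.gauss(0, 1) > 1.0  # assumed hiring rule
    for e, x in zip(education, experience)
]

# Whole applicant pool: no conditioning on the collider.
corr_all = corr(education, experience)

# Hired candidates only: conditioning on the collider.
edu_hired = [e for e, h in zip(education, hired) if h]
exp_hired = [x for x, h in zip(experience, hired) if h]
corr_hired = corr(edu_hired, exp_hired)

print(f"corr(X, Y) all applicants = {corr_all:.3f}")
print(f"corr(X, Y) hired only     = {corr_hired:.3f}")
```

Among all applicants the two factors are roughly uncorrelated. Among the hired they become negatively correlated: a hired candidate with little education must, under this rule, tend to have more experience.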

Other Concepts

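  • Descendants: Any variable downstream of a node, that is, reachable from it by following the arrows. Controlling for a descendant acts as partial control for its ancestor: conditioning on a descendant of a collider partially opens the path through the collider, and conditioning on a descendant of a mediator partially blocks the mediated path.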
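  • d-Separation: A graphical criterion for reading conditional independence off a DAG. A path is blocked by a set of variables Z if it contains a chain or fork whose middle node is in Z, or a collider such that neither the collider nor any of its descendants is in Z. If every path between X and Y is blocked by Z, then X is d-separated from Y given Z, which implies that X and Y are conditionally independent given Z.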
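d-separation can also be checked mechanically. The sketch below is one way to do it, using the standard moralized ancestral graph construction instead of enumerating paths: keep only the queried nodes and their ancestors, connect parents that share a child, drop arrow directions, delete the conditioning set, and test whether X can still reach Y. The function names `ancestors_of` and `d_separated` and the edge-list representation are assumptions for illustration, not part of any particular library.

```python
from itertools import combinations


def ancestors_of(parents, nodes):
    """Return `nodes` together with all of their ancestors."""
    seen = set()
    stack = list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def d_separated(edges, x, y, given=()):
    """True if x and y are d-separated by `given` in the DAG `edges`.

    `edges` is a list of (cause, effect) pairs; x and y must not be in `given`.
    """
    given = set(given)
    parents = {}
    for cause, effect in edges:
        parents.setdefault(effect, set()).add(cause)

    # Step 1: restrict to the ancestral set of the queried variables.
    keep = ancestors_of(parents, {x, y} | given)

    # Step 2: moralize (link co-parents) and drop edge directions.
    neighbours = {node: set() for node in keep}
    for child in keep:
        child_parents = parents.get(child, set())
        for p in child_parents:
            neighbours[p].add(child)
            neighbours[child].add(p)
        for p, q in combinations(child_parents, 2):
            neighbours[p].add(q)
            neighbours[q].add(p)

    # Step 3: search from x, never stepping onto a conditioned node.
    seen, stack = set(), [x]
    while stack:
        node = stack.pop()
        if node in seen or node in given:
            continue
        if node == y:
            return False  # an open connection exists
        seen.add(node)
        stack.extend(neighbours[node])
    return True


chain = [("X", "Z"), ("Z", "Y")]
fork = [("Z", "X"), ("Z", "Y")]
collider = [("X", "Z"), ("Y", "Z"), ("Z", "W")]  # W is a descendant of Z

print(d_separated(chain, "X", "Y"))            # False: chain is open
print(d_separated(chain, "X", "Y", {"Z"}))     # True: mediator blocks it
print(d_separated(fork, "X", "Y", {"Z"}))      # True: confounder blocks it
print(d_separated(collider, "X", "Y"))         # True: collider blocks it
print(d_separated(collider, "X", "Y", {"W"}))  # False: descendant opens it
```

The last two calls mirror the collider rule above: X and Y are separated when nothing is conditioned on, and become connected once the collider or one of its descendants enters the conditioning set.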