20.8 Putting It All Together: Comparing Objectives

As an overarching illustration, let $\hat{f}$ be any trained predictor (an ML model, a regression, etc.) and let $\hat{\beta}$ be a parameter estimator from a structural or causal model. Their tasks differ along four dimensions:

  • Form of Output
    • $\hat{f}$ is a function $\hat{f}: \mathcal{X} \to \mathcal{Y}$.
    • $\hat{\beta}$ is a vector of parameters with theoretical meaning.
  • Criterion
    • Prediction: Minimizes the expected predictive loss $\mathbb{E}[L(Y, \hat{f}(X))]$.
    • Causal Inference: Seeks $\beta$ such that $Y = m_\beta(X)$ is a correct structural representation, for example by minimizing the bias of $\hat{\beta}$ or by imposing orthogonality (moment) conditions in the method-of-moments style.
  • Validity
    • Prediction: Usually validated by out-of-sample experiments or cross-validation.
    • Causal Inference: Validated by theoretical identification arguments, which rest on assumptions such as exogeneity, randomization, or the absence of omitted confounders.
  • Interpretation
    • Prediction: “$\hat{f}(x)$ is our best guess of $Y$ for new $x$.”
    • Causal Inference: “$\beta$ measures how $Y$ changes if we intervene on $X$.”
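
The contrast above can be made concrete with a small simulation. The following is a minimal sketch, not a definitive implementation: the data-generating process, the coefficient values, and the helper names `ols` and `predict` are all assumptions introduced for illustration. A confounder `u` drives both `x` and `y`, so a regression of `y` on `x` alone can pass a predictive check (out-of-sample error) while its slope fails as an estimate of the structural effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
true_beta = 1.0  # assumed structural effect of X on Y in this simulation

# Hypothetical data-generating process: U confounds X and Y.
u = rng.normal(size=n)
x = 0.8 * u + rng.normal(size=n)
y = true_beta * x + 2.0 * u + rng.normal(size=n)

# Hold out half of the sample to validate prediction out of sample.
half = n // 2
train, test = slice(0, half), slice(half, n)


def ols(features, target):
    """Least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def predict(coef, features):
    """Apply fitted coefficients (intercept first) to new features."""
    design = np.column_stack([np.ones(features.shape[0]), features])
    return design @ coef


# Predictor f_hat: uses X only, judged by out-of-sample loss.
coef_naive = ols(x[train], y[train])
mse_naive_test = np.mean((y[test] - predict(coef_naive, x[test])) ** 2)
mse_baseline_test = np.mean((y[test] - y[train].mean()) ** 2)

# Estimator beta_hat: adjusts for U, judged by closeness to true_beta.
xu = np.column_stack([x, u])
coef_adjusted = ols(xu[train], y[train])

beta_naive = coef_naive[1]
beta_adjusted = coef_adjusted[1]

print(f"test MSE, X-only predictor: {mse_naive_test:.3f}")
print(f"test MSE, mean baseline:    {mse_baseline_test:.3f}")
print(f"slope on X, X-only model:   {beta_naive:.3f}")
print(f"slope on X, adjusted for U: {beta_adjusted:.3f}")
print(f"true structural effect:     {true_beta:.3f}")
```

Under these assumed coefficients, the population slope of the X-only regression is Cov(X, Y)/Var(X) = 3.24/1.64, or about 1.98, which is roughly twice the structural effect of 1. The X-only model is still a legitimate predictor, since it beats the mean baseline on held-out data drawn from the same distribution. What fails is the causal reading of its slope, and no amount of cross-validation would reveal that. Only the identification assumption (here, that `u` is observed and adjusted for) justifies interpreting the adjusted slope as the effect of intervening on `x`.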