20.8 Putting It All Together: Comparing Objectives
As an overarching illustration, let $\hat{f}$ be any trained predictor (a machine learning model, a regression, etc.) and let $\hat{\beta}$ be a parameter estimator from a structural or causal model. The two serve different tasks:
- Form of Output
  - $\hat{f}$ is a function $\hat{f}: X \to Y$.
  - $\hat{\beta}$ is a vector of parameters with theoretical meaning.
- Criterion
  - Prediction: Minimizes the expected predictive loss $E[L(Y, \hat{f}(X))]$.
  - Causal Inference: Seeks $\beta$ such that $Y = m_\beta(X)$ is a correct structural representation, for example by minimizing bias in $\hat{\beta}$ or by satisfying orthogonality (moment) conditions, as in method-of-moments estimation.
- Validity
  - Prediction: Usually validated by out-of-sample evaluation or cross-validation.
  - Causal Inference: Validated by theoretical identification arguments that rest on assumptions such as exogeneity, randomization, or no omitted confounders.
- Interpretation
  - Prediction: “$\hat{f}(x)$ is our best guess of $Y$ for a new $x$.”
  - Causal Inference: “$\beta$ measures how $Y$ changes if we intervene on $X$.”
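
The contrast above can be made concrete with a small simulation. The following is a minimal sketch, not a definitive implementation: it assumes a hypothetical linear data-generating process in which a variable `u` confounds `x` and `y`, and the names `beta_true`, `ols`, `naive_slope`, and `adjusted_slope` are illustrative choices rather than anything defined in the text. A regression of `y` on `x` alone is judged the way a predictor is judged (held-out error), while the regression that also adjusts for `u` is judged the way an estimator is judged (closeness of its coefficient to the structural parameter).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data-generating process (an assumption for illustration):
# u confounds x and y; the structural effect of x on y is beta_true.
beta_true = 2.0
u = rng.normal(size=n)
x = 0.8 * u + rng.normal(size=n)
y = beta_true * x + 3.0 * u + rng.normal(size=n)


def ols(design, target):
    """Least-squares coefficients for target ~ design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


train, test = slice(0, n // 2), slice(n // 2, n)
ones = np.ones(n)

# Prediction objective: fit f_hat from x alone and validate out of sample.
X_pred = np.column_stack([ones, x])
coef_pred = ols(X_pred[train], y[train])
mse_pred = np.mean((y[test] - X_pred[test] @ coef_pred) ** 2)
mse_baseline = np.mean((y[test] - y[train].mean()) ** 2)

# Estimation objective: adjust for the confounder so that the coefficient
# on x targets the structural parameter beta_true.
X_adj = np.column_stack([ones, x, u])
coef_adj = ols(X_adj[train], y[train])

naive_slope = coef_pred[1]      # predictive slope, absorbs the effect of u
adjusted_slope = coef_adj[1]    # slope after adjusting for u

print(f"held-out MSE, x-only model: {mse_pred:.3f}")
print(f"held-out MSE, mean baseline: {mse_baseline:.3f}")
print(f"slope on x without adjusting for u: {naive_slope:.3f}")
print(f"slope on x adjusting for u: {adjusted_slope:.3f}")
```

In this setup the x-only model is a perfectly reasonable predictor: it beats the mean baseline on held-out data. Its slope, however, is not a valid answer to "how does $Y$ change if we intervene on $X$", because it mixes the structural effect with the association induced by `u`. Cross-validation alone cannot reveal this; only the identification assumption (here, that adjusting for `u` removes confounding) justifies reading the adjusted slope causally.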