20.8 Putting It All Together: Comparing Objectives

As an overarching illustration, let $\hat{f}$ be any trained predictor (an ML model, a fitted regression, etc.) and let $\hat{\beta}$ be a parameter estimate from a structural or causal model. The two serve different tasks:

Form of Output
- $\hat{f}$ is a function $\hat{f}: \mathcal{X} \to \mathcal{Y}$.
- $\hat{\beta}$ is a vector of parameters with theoretical meaning.
Criterion
- Prediction: Minimizes the expected predictive loss $\mathbb{E}[L(Y,\hat{f}(X))]$.
- Causal Inference: Seeks $\beta$ such that $Y = m_\beta(X)$ is a correct structural representation, for example by minimizing bias in $\hat{\beta}$ or by satisfying orthogonality (moment) conditions, as in method-of-moments estimation.
Validity
- Prediction: Usually validated by out-of-sample evaluation (a held-out test set) or cross-validation.
- Causal Inference: Validated by theoretical identification arguments, such as assumptions about exogeneity, randomization, or the absence of omitted confounders.
Interpretation
- Prediction: “$\hat{f}(x)$ is our best guess of $Y$ for new $x$.”
- Causal Inference: “$\beta$ measures how $Y$ changes if we intervene on $X$.”
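
The contrast above can be made concrete with a small simulation. The following is a minimal sketch, not a definitive implementation: it assumes a hypothetical linear data-generating process $Y = \beta X + \gamma U + \varepsilon$ with an observed confounder $U$ that also drives $X$, and the names `simulate`, `ols`, and `predict`, as well as the values $\beta = 1$ and $\gamma = 2$, are illustrative choices that do not come from the text. It fits a predictor $\hat{f}$ that uses $X$ alone, then compares its slope with an estimate $\hat{\beta}$ obtained by adjusting for $U$ and with one obtained from randomized $X$.

```python
import numpy as np

def simulate(n, rng, beta=1.0, gamma=2.0, randomize_x=False):
    """Draw (x, u, y) from the assumed model Y = beta*X + gamma*U + noise."""
    u = rng.normal(size=n)  # confounder: affects both X and Y
    if randomize_x:
        # X assigned independently of U (an idealized experiment)
        x = rng.normal(scale=np.sqrt(2.0), size=n)
    else:
        # Observational regime: X depends on U
        x = u + rng.normal(size=n)
    y = beta * x + gamma * u + rng.normal(size=n)
    return x, u, y

def ols(columns, y):
    """Least-squares coefficients, with an intercept as the first entry."""
    design = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict(coef, columns):
    """Evaluate the fitted linear function on new regressors."""
    return coef[0] + sum(c * col for c, col in zip(coef[1:], columns))

rng = np.random.default_rng(0)
n = 20_000
x_tr, u_tr, y_tr = simulate(n, rng)
x_te, u_te, y_te = simulate(n, rng)

# Prediction task: fit f_hat on X alone, validate out of sample.
coef_f = ols([x_tr], y_tr)
slope_naive = coef_f[1]
mse_f = np.mean((y_te - predict(coef_f, [x_te])) ** 2)
mse_mean = np.mean((y_te - y_tr.mean()) ** 2)  # baseline: predict the mean

# Causal task: estimate beta by adjusting for the confounder U.
beta_hat = ols([x_tr, u_tr], y_tr)[1]

# Validation by randomization: X no longer depends on U.
x_rct, _, y_rct = simulate(n, rng, randomize_x=True)
beta_rct = ols([x_rct], y_rct)[1]
mse_f_rct = np.mean((y_rct - predict(coef_f, [x_rct])) ** 2)

print(f"slope of f_hat on X:        {slope_naive:.3f}")
print(f"beta_hat (adjusted for U):  {beta_hat:.3f}")
print(f"beta_hat (randomized X):    {beta_rct:.3f}")
print(f"test MSE, f_hat vs mean:    {mse_f:.3f} vs {mse_mean:.3f}")
print(f"MSE of f_hat under intervention: {mse_f_rct:.3f}")
```

Under these assumptions the population slope of $Y$ on $X$ alone is $\beta + \gamma\,\mathrm{Cov}(X,U)/\mathrm{Var}(X) = 1 + 2 \cdot \tfrac{1}{2} = 2$. So $\hat{f}$ can pass out-of-sample validation on observational data while its slope overstates the effect of intervening on $X$, whereas the adjusted and randomized estimates should land near $\beta = 1$. The same $\hat{f}$ also predicts less accurately once $X$ is set by intervention, because the association it exploited between $X$ and $U$ no longer holds.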