17.1 Definition of Marginal Effects

Mathematically, the marginal effect of an independent variable $X$ on the expected value of a dependent variable $Y$ is given by:

$\frac{\partial E[Y|X]}{\partial X}$

which represents the instantaneous rate of change of $E[Y|X]$ with respect to $X$.

For a linear regression model:

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$

the marginal effect of $X_j$ is simply $\beta_j$. However, in more complex cases, such as nonlinear models, interaction effects, or transformations, marginal effects are not directly given by the regression coefficients and must be computed explicitly.
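To illustrate why a coefficient alone is not the marginal effect once an interaction is present, here is a minimal sketch. It assumes a conditional mean of the form $E[Y|X] = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2$; the coefficient values and function names are hypothetical and chosen only for illustration.

```python
# Hypothetical coefficients (not estimated from any data) for
# E[Y|X] = b0 + b1*x1 + b2*x2 + b3*x1*x2
b0, b1, b2, b3 = 1.0, 0.5, -0.3, 0.2


def expected_y(x1, x2):
    """Conditional mean with an interaction between x1 and x2."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def marginal_effect_x1(x2):
    """Partial derivative of E[Y|X] with respect to x1: b1 + b3*x2."""
    return b1 + b3 * x2


# The marginal effect of x1 equals b1 only when x2 = 0.
me_low = marginal_effect_x1(0.0)
me_high = marginal_effect_x1(2.0)
print(me_low, me_high)
```

Under these assumed values the marginal effect of $X_1$ changes with $X_2$, which is why it has to be evaluated at specific covariate values rather than read off the coefficient table.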

17.1.1 Analytical Derivation of Marginal Effects

In models where $E[Y|X]$ is a differentiable function of $X$, marginal effects are computed using calculus. The derivative of a function $f(x)$ is given by:

$f'(x) \equiv \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$

Example: Quadratic Function

Consider the function:

$f(x) = x^2.$

The marginal effect is derived as follows:

$\begin{aligned} f'(x) &= \lim_{h \to 0} \frac{(x+h)^2 - x^2}{h} \\ &= \lim_{h \to 0} \frac{x^2 + 2xh + h^2 - x^2}{h} \\ &= \lim_{h \to 0} \frac{2xh + h^2}{h} \\ &= \lim_{h \to 0} (2x + h). \end{aligned}$

As $h \to 0$, the marginal effect simplifies to:

$f'(x) = 2x.$

Thus, the marginal effect of $x$ on $f(x)$ depends on $x$ itself: a small change $\Delta x$ changes $f(x)$ by approximately $2x \, \Delta x$.
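The limit in this example can also be checked numerically. The sketch below evaluates the difference quotient of $f(x) = x^2$ at an arbitrarily chosen point ($x = 3$) for shrinking values of $h$; the function names are illustrative.

```python
def f(x):
    """The quadratic function from the example."""
    return x ** 2


def difference_quotient(x, h):
    """(f(x + h) - f(x)) / h, which equals 2x + h for f(x) = x^2."""
    return (f(x + h) - f(x)) / h


x = 3.0  # arbitrary evaluation point
quotients = {h: difference_quotient(x, h) for h in (1.0, 0.1, 0.01, 0.001)}

for h, q in quotients.items():
    # Each quotient should match 2x + h up to floating-point rounding.
    print(f"h = {h:<6} quotient = {q:.6f}   2x + h = {2 * x + h:.6f}")
```

As $h$ shrinks, the quotient approaches $2x = 6$, matching the analytical result.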

17.1.2 Numerical Approximation of Marginal Effects

In practice, analytical differentiation may be infeasible, particularly when dealing with complex functions, large datasets, or models without closed-form derivatives. In such cases, numerical differentiation provides an alternative.

17.1.2.1 One-Sided Numerical Approximation

A simple way to approximate the derivative is the forward difference formula:

$\begin{aligned} f'(x) &= \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \\ & \approx \frac{f(x+h) - f(x)}{h} \end{aligned}$

where $h$ is a small step size.
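A minimal sketch of the forward difference, applied to the logistic function as a stand-in for a nonlinear conditional mean. The choice of function, the evaluation point, and the default step `h=1e-5` are assumptions made for illustration.

```python
import math


def logistic(x):
    """Logistic function, used here as an example nonlinear mean."""
    return 1.0 / (1.0 + math.exp(-x))


def forward_difference(f, x, h=1e-5):
    """One-sided approximation of f'(x): (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h


x0 = 0.5  # arbitrary evaluation point
approx = forward_difference(logistic, x0)

# Known closed-form derivative of the logistic function: p * (1 - p).
p = logistic(x0)
exact = p * (1.0 - p)

print(f"forward difference: {approx:.8f}")
print(f"analytical:         {exact:.8f}")
```

The two values agree to several decimal places; the remaining gap is the truncation error of the one-sided formula.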

17.1.2.2 Two-Sided Numerical Approximation

A more accurate method is the central difference formula:

$f'_2(x) \approx \frac{f(x+h) - f(x-h)}{2h}.$

Its truncation error is of order $h^2$, compared with order $h$ for the forward difference, so it is generally preferred in computational implementations.
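The difference in accuracy can be seen in a small comparison. This sketch uses $f(x) = e^x$ at $x = 1$, where the exact derivative is known, with an illustrative step size of $h = 10^{-3}$; these choices are assumptions, not part of the original text.

```python
import math


def forward_difference(f, x, h):
    """One-sided approximation: (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h


def central_difference(f, x, h):
    """Two-sided approximation: (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


x0, h = 1.0, 1e-3
exact = math.exp(x0)  # the derivative of exp(x) is exp(x)

forward_error = abs(forward_difference(math.exp, x0, h) - exact)
central_error = abs(central_difference(math.exp, x0, h) - exact)

print(f"forward error: {forward_error:.2e}")
print(f"central error: {central_error:.2e}")
```

With the same step size, the central difference error is several orders of magnitude smaller than the forward difference error in this example.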

17.1.2.3 Choosing an Appropriate $h$

The choice of $h$ is critical (Gould, Pitblado, and Poi 2010, chap. 1):

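Too small: The subtraction in the numerator loses significant digits to floating-point rounding, which makes the estimate numerically unstable.
Too large: The finite difference no longer tracks the local slope, so the truncation error grows and the approximation becomes less accurate.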

A common heuristic is to set $h = 10^{-5}$ or a small fraction of the standard deviation of $X$.
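The trade-off can be sketched by evaluating a central difference at three step sizes. The test function ($e^x$ at $x = 1$) and the specific values of $h$ are assumptions chosen to make the two failure modes visible in double-precision arithmetic.

```python
import math


def central_difference(f, x, h):
    """Two-sided approximation: (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


x0 = 1.0
exact = math.exp(x0)  # the derivative of exp(x) is exp(x)

# Too large, the heuristic value, and too small (illustrative choices).
step_sizes = (1e-1, 1e-5, 1e-13)
errors = {h: abs(central_difference(math.exp, x0, h) - exact) for h in step_sizes}

for h, err in errors.items():
    print(f"h = {h:.0e}   absolute error = {err:.2e}")
```

In this example the intermediate step $h = 10^{-5}$ gives the smallest error: the large step suffers from truncation error and the tiny step from rounding error.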

Comparison of Analytical and Numerical Methods

| Method | Advantages | Disadvantages |
|---|---|---|
| Analytical | Provides exact expressions | Requires differentiability, not always feasible |
| Numerical | Works for any function that can be evaluated, easy to implement | Requires careful choice of step size $h$ |

Numerical derivatives are often preferred in empirical applications, especially when working with complex models or machine learning algorithms.

Comparison of Analytical and Numerical Approaches for Marginal Effects and Standard Errors

| | Analytical Derivation | Numerical Approximation |
|---|---|---|
| Marginal Effects | Uses calculus (rules of differentiation) | Uses finite differences to approximate derivatives |
| Standard Errors | Derived using variance rules | Estimated via the delta method using the variance-covariance matrix |
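The numerical column of this comparison can be sketched as follows, assuming a quadratic conditional mean $E[Y|x] = \beta_0 + \beta_1 x + \beta_2 x^2$. The coefficient estimates, the variance-covariance matrix, and all function names are hypothetical. The marginal effect is obtained by a central difference in $x$, and the delta-method standard error uses a finite-difference gradient of that marginal effect with respect to the coefficients.

```python
import math

# Hypothetical estimates and variance-covariance matrix (illustration only).
beta_hat = [1.0, 0.8, -0.15]
vcov = [
    [0.040, -0.010, 0.001],
    [-0.010, 0.020, -0.003],
    [0.001, -0.003, 0.002],
]


def mean_y(beta, x):
    """Assumed conditional mean: b0 + b1*x + b2*x^2."""
    b0, b1, b2 = beta
    return b0 + b1 * x + b2 * x ** 2


def marginal_effect(beta, x, h=1e-5):
    """Central-difference derivative of the conditional mean in x."""
    return (mean_y(beta, x + h) - mean_y(beta, x - h)) / (2.0 * h)


def gradient_wrt_beta(beta, x, h=1e-4):
    """Finite-difference gradient of the marginal effect in each coefficient."""
    grad = []
    for j in range(len(beta)):
        up, down = list(beta), list(beta)
        up[j] += h
        down[j] -= h
        grad.append((marginal_effect(up, x) - marginal_effect(down, x)) / (2.0 * h))
    return grad


def delta_method_se(beta, vcov, x):
    """Delta method: sqrt(g' V g), with g the gradient above."""
    g = gradient_wrt_beta(beta, x)
    k = len(beta)
    variance = sum(g[i] * vcov[i][j] * g[j] for i in range(k) for j in range(k))
    return math.sqrt(variance)


x0 = 2.0  # arbitrary evaluation point
me = marginal_effect(beta_hat, x0)
se = delta_method_se(beta_hat, vcov, x0)
print(f"marginal effect at x = {x0}: {me:.4f} (delta-method SE {se:.4f})")
```

For this assumed model the analytical marginal effect is $\beta_1 + 2\beta_2 x$ with gradient $(0, 1, 2x)$, so the numerical results can be checked directly against the closed form.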

References

Gould, W., J. Pitblado, and B. Poi. 2010. Maximum Likelihood Estimation with Stata. College Station, TX: Stata Press.