16.2 Average Marginal Effect Algorithm

For one-sided derivative $\frac{\partial p(\mathbf{X},\beta)}{\partial X}$ in the probability scale

Estimate your model
For each observation $i$
1. Calculate $\hat{Y}_{i0}$ which is the prediction in the probability scale using observed values
2. Increase $X$ (variable of interest) by a “small” amount $h$ ( $X_{new} = X + h$ )
  - When $X$ is continuous, $h = (|\bar{X}| + 0.001) \times 0.001$ where $\bar{X}$ is the mean value of $X$
  - When $X$ is discrete, $h = 1$
3. Calculate $\hat{Y}_{i1}$ which is the prediction in the probability scale using new $X$ and other variables’ observed vales.
4. Calculate the difference between the two predictions as fraction of $h$ : $\frac{\bar{Y}_{i1} - \bar{Y}_{i0}}{h}$
Average numerical derivative is $E[\frac{\bar{Y}_{i1} - \bar{Y}_{i0}}{h}] \approx \frac{\partial p (Y|\mathbf{X}, \beta)}{\partial X}$

Two-sided derivatives

Estimate your model
For each observation $i$
1. Calculate $\hat{Y}_{i0}$ which is the prediction in the probability scale using observed values
2. Increase $X$ (variable of interest) by a “small” amount $h$ ( $X_{1} = X + h$ ) and decrease $X$ (variable of interest) by a “small” amount $h$ ( $X_{2} = X - h$ )
  - When $X$ is continuous, $h = (|\bar{X}| + 0.001) \times 0.001$ where $\bar{X}$ is the mean value of $X$
  - When $X$ is discrete, $h = 1$
3. Calculate $\hat{Y}_{i1}$ which is the prediction in the probability scale using new $X_1$ and other variables’ observed vales.
4. Calculate $\hat{Y}_{i2}$ which is the prediction in the probability scale using new $X_2$ and other variables’ observed vales.
5. Calculate the difference between the two predictions as fraction of $h$ : $\frac{\bar{Y}_{i1} - \bar{Y}_{i2}}{2h}$
Average numerical derivative is $E[\frac{\bar{Y}_{i1} - \bar{Y}_{i2}}{2h}] \approx \frac{\partial p (Y|\mathbf{X}, \beta)}{\partial X}$

library(margins)
library(tidyverse)

data("mtcars")
mod <- lm(mpg ~ cyl * disp * hp, data = mtcars)
margins::margins(mod) %>% summary()
#>  factor     AME     SE       z      p    lower   upper
#>     cyl -4.0592 3.7619 -1.0790 0.2806 -11.4324  3.3141
#>    disp -0.0350 0.0132 -2.6471 0.0081  -0.0610 -0.0091
#>      hp -0.0284 0.0185 -1.5348 0.1248  -0.0647  0.0079

# function for variable
get_mae <- function(mod, var = "disp") {
    data = mod$model
    
    pred <- predict(mod)
    
    if (class(mod$model[[{
        {
            var
        }
    }]]) == "numeric") {
        h = (abs(mean(mod$model[[var]])) + 0.01) * 0.01
    } else {
        h = 1
    }
    
    data[[{
        {
            var
        }
    }]] <- data[[{
{
var
}
}]] - h

    pred_new <- predict(mod, newdata = data)

    mean(pred - pred_new) / h
}

get_mae(mod, var = "disp")
#> [1] -0.03504546