# 5 Integration

In statistical applications we often need to compute quantities of the form $$\mathbb{E}_f g(X) = \int g(x) f(x)\,dx$$ where $$X$$ is a random variable drawn from a distribution with probability density function $$f$$. Another quantity that we often need to compute is the normalizing constant for a probability density function. If $$X$$ has a density that is proportional to $$p(x\mid\theta)$$ then its normalizing constant is $$\int p(x\mid\theta)\,dx$$.

In both problems—computing the expectation and computing the normalizing constant—an integral must be evaluated.
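As a minimal sketch of both computations, the following uses a composite trapezoid rule (a simple deterministic quadrature, chosen here only for illustration) on the unnormalized density $$p(x) = e^{-x^2/2}$$, whose normalizing constant is known to be $$\sqrt{2\pi}$$; the function names are hypothetical, not from any particular library.

```python
import math

def trapezoid(f, a, b, n=10_000):
    # Composite trapezoid rule for int_a^b f(x) dx with n equal panels.
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return h * total

# Unnormalized density p(x) = exp(-x^2 / 2); its normalizing
# constant is sqrt(2 * pi), so f = p / Z is the standard normal.
p = lambda x: math.exp(-x * x / 2)
Z = trapezoid(p, -10, 10)               # normalizing constant
f = lambda x: p(x) / Z                  # normalized density

# Expectation E_f[g(X)] with g(x) = x^2; for the standard normal this is 1.
m2 = trapezoid(lambda x: x * x * f(x), -10, 10)
print(Z, m2)
```

The truncation of the integration range to $$[-10, 10]$$ is harmless here because the Gaussian tails are negligible beyond that interval.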

Approaches to solving the integration problem roughly fall into two categories. The first category involves identifying a sequence of estimates of the integral that eventually converges to the true value as some index (usually involving time or resources) goes to infinity. Adaptive quadrature, independent Monte Carlo, and Markov chain Monte Carlo techniques all fall into this category. Given enough time and resources, these techniques should converge to the true value of the integral.

The second category of techniques involves identifying a class of alternative functions that are easier to work with, finding the member of that class that best matches the true function, and then computing the integral with the alternative function instead. Laplace approximation, variational inference, and approximate Bayesian computation (ABC) fall into this category of approaches. For a given dataset, these approaches will not recover the true value of the integral regardless of time and resources, but as the sample size increases, the approximations get better.
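The second category can be sketched with the Laplace approximation: replace the integrand $$e^{h(x)}$$ by the Gaussian obtained from a second-order expansion of $$h$$ around its mode $$x_0$$, giving $$\int e^{h(x)}\,dx \approx e^{h(x_0)} \sqrt{2\pi / (-h''(x_0))}$$. The log-density below, $$h_n(x) = n(-x^2/2 - x^4)$$, is a hypothetical example chosen so that $$n$$ plays the role of the sample size; the helper names are illustrative only.

```python
import math

def laplace_Z(h, hpp0, x0=0.0):
    # Laplace approximation to int exp(h(x)) dx: expand h to second order
    # around its mode x0, yielding exp(h(x0)) * sqrt(2*pi / -h''(x0)).
    return math.exp(h(x0)) * math.sqrt(2 * math.pi / -hpp0)

def riemann_Z(h, a=-5.0, b=5.0, dx=1e-3):
    # Fine Riemann sum used as a stand-in for the exact integral.
    n = int((b - a) / dx)
    return sum(math.exp(h(a + i * dx)) * dx for i in range(n))

# h_n(x) = n * (-x^2/2 - x^4) has its mode at 0 with h_n''(0) = -n.
# The Laplace approximation is fixed once computed, but its relative
# error shrinks as n (the "sample size") grows.
for n in (1, 10, 100):
    h = lambda x, n=n: n * (-x * x / 2 - x ** 4)
    print(n, laplace_Z(h, -n), riemann_Z(h))
```

This illustrates the trade-off: for fixed data the approximation never equals the true integral, but the mismatch between the true function and its best Gaussian surrogate shrinks as the sample size grows.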