Chapter 1 Method of Moments

1.1 Introduction

Method of moments estimation is based solely on the law of large numbers, which we repeat here:

Let M_1, M_2, \dots be independent random variables having a common distribution with mean \mu_M. Then the sample means converge to the distributional mean as the number of observations increases:

\bar{M}_n = \frac{1}{n}\sum_{i=1}^n M_i \to \mu_M, \text{ as } n \to \infty.
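As a quick numerical check of this convergence, the sketch below simulates sample means of i.i.d. Exponential(1) draws, whose distributional mean is 1; the choice of distribution and of the seed is ours, purely for illustration.

```python
import numpy as np

# Law of large numbers: sample means of i.i.d. Exponential(1) draws
# (distributional mean 1) approach 1 as n grows.
rng = np.random.default_rng(0)

for n in [10, 1_000, 100_000, 10_000_000]:
    sample = rng.exponential(scale=1.0, size=n)
    print(f"n = {n:>10,}  sample mean = {sample.mean():.5f}")
```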

To show how the method of moments determines an estimator, we first consider the case of one parameter. We start with independent random variables X_1, X_2, \dots chosen according to the probability density f_X(x|\theta) associated with an unknown parameter value \theta. The common mean of the X_i, \mu_X, is a function k(\theta) of \theta. For example, if the X_i are continuous random variables, then

\mu_X = \int_{-\infty}^{\infty} x f_X(x|\theta)\,dx = k(\theta).

The law of large numbers states that

\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i \to \mu_X, \text{ as } n \to \infty.

Thus, if the number of observations n is large, the distributional mean, \mu_X = k(\theta), should be well approximated by the sample mean, i.e.,

\bar{X} \approx k(\theta).

This can be turned into an estimator \hat{\theta} by setting

\bar{X} = k(\hat{\theta})

and solving for \hat{\theta}.
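For example, if the X_i are exponential with rate \theta, then k(\theta) = E_\theta X_1 = 1/\theta, and solving \bar{X} = k(\hat{\theta}) gives \hat{\theta} = 1/\bar{X}. A minimal sketch of this one-parameter case (the exponential model is our illustrative choice, not fixed by the text above):

```python
import numpy as np

# One-parameter method of moments for an Exponential(theta) model:
# k(theta) = E[X] = 1/theta, so xbar = k(theta_hat) gives theta_hat = 1/xbar.
rng = np.random.default_rng(1)
theta = 2.0                                # true rate parameter
x = rng.exponential(scale=1/theta, size=50_000)

theta_hat = 1 / x.mean()                   # method of moments estimate
print(f"true theta = {theta}, estimate = {theta_hat:.4f}")
```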

1.2 The Procedure

More generally, for independent random variables X_1, X_2, \dots chosen according to the probability distribution derived from the parameter value \theta, and a real-valued function h, if k(\theta) = E_\theta h(X_1), then

\frac{1}{n}\sum_{i=1}^n h(X_i) \to k(\theta), \text{ as } n \to \infty.

The method of moments results from the choices h(x) = x^m. Write

\mu_m = E_\theta X^m = k_m(\theta)

for the m-th moment.
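In code, the m-th sample moment is just the average of the m-th powers of the data; a small helper along these lines (the name sample_moment is ours) captures the computation used in the sketches below.

```python
import numpy as np

def sample_moment(x, m):
    """Return the m-th sample moment, (1/n) * sum_i x_i^m."""
    return np.mean(np.asarray(x, dtype=float) ** m)
```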

Our estimation procedure follows these four steps to link the sample moments to the parameter estimates; a code sketch working through the steps appears after the list.

  • Step 1. If the model has d parameters, we compute the functions k_m for the first d moments,

\mu_1 = k_1(\theta_1, \theta_2, \dots, \theta_d), \mu_2 = k_2(\theta_1, \theta_2, \dots, \theta_d), \dots, \mu_d = k_d(\theta_1, \theta_2, \dots, \theta_d),

obtaining d equations in d unknowns.

  • Step 2. We then solve for the d parameters as functions of the moments.

\theta_1 = g_1(\mu_1, \mu_2, \dots, \mu_d), \theta_2 = g_2(\mu_1, \mu_2, \dots, \mu_d), \dots, \theta_d = g_d(\mu_1, \mu_2, \dots, \mu_d).

  • Step 3. Now, based on the data x = (x_1, x_2,...,x_n), we compute the first d sample moments

\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i, \bar{x^2} = \frac{1}{n}\sum_{i=1}^n x_i^2, \dots, \bar{x^d} = \frac{1}{n}\sum_{i=1}^n x_i^d.

  • Step 4. We replace the distributional moments \mu_m by the sample moments \bar{x^m}; the solutions in Step 2 then give the formulas for the method of moments estimators (\hat{\theta}_1,\hat{\theta}_2,\dots,\hat{\theta}_d). For the data x, these estimates are

\hat{\theta}_1(x) = g_1(\bar{x}, \bar{x^2}, \dots, \bar{x^d}), \hat{\theta}_2(x) = g_2(\bar{x}, \bar{x^2}, \dots, \bar{x^d}), \dots, \hat{\theta}_d(x) = g_d(\bar{x}, \bar{x^2}, \dots, \bar{x^d}).
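As a concrete instance of Steps 1 through 4, here is a sketch for a two-parameter Gamma(\alpha, \beta) model (shape \alpha, scale \beta); the gamma model and its closed-form inversion in Step 2 are our illustrative choices, not part of the procedure itself.

```python
import numpy as np

# Method of moments for Gamma(alpha, beta), shape alpha and scale beta.
# Step 1: mu_1 = alpha*beta,  mu_2 = alpha*beta^2*(alpha + 1).
# Step 2: invert to get alpha = mu_1^2/(mu_2 - mu_1^2),
#                       beta  = (mu_2 - mu_1^2)/mu_1.
rng = np.random.default_rng(2)
x = rng.gamma(shape=3.0, scale=2.0, size=100_000)

# Step 3: compute the first two sample moments.
m1 = np.mean(x)
m2 = np.mean(x ** 2)

# Step 4: plug the sample moments into the Step-2 formulas.
alpha_hat = m1 ** 2 / (m2 - m1 ** 2)
beta_hat = (m2 - m1 ** 2) / m1
print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
```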

1.3 Example

Consider the uniform distribution on the interval [a,b], U(a,b). If W \sim U(a,b), then we have

\mu_1 = \mathbb{E}[W] = \frac{1}{2}(a+b) \\ \mu_2 = \mathbb{E}[W^2] = \frac{1}{3}(a^2 + ab +b^2)

Since \mu_1 = \frac{1}{2}(a+b) and \mu_2 - \mu_1^2 = \mathrm{Var}(W) = \frac{1}{12}(b-a)^2, solving these two equations for a and b gives

a = \mu_1 - \sqrt{3(\mu_2-\mu_1^2)} \\ b = \mu_1 + \sqrt{3(\mu_2-\mu_1^2)}

Given a set of samples w_1, w_2, \dots, w_n, we can substitute the sample moments \hat{\mu}_1 = \bar{w} and \hat{\mu}_2 = \bar{w^2} into these formulas to obtain the estimates \hat{a} and \hat{b}.
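A minimal sketch of this uniform example, assuming only the formulas above (the true endpoints a = -1, b = 4 and the seed are arbitrary choices):

```python
import numpy as np

# Method of moments for U(a, b) using the formulas derived above.
rng = np.random.default_rng(3)
a, b = -1.0, 4.0
w = rng.uniform(low=a, high=b, size=100_000)

mu1_hat = np.mean(w)           # first sample moment
mu2_hat = np.mean(w ** 2)      # second sample moment
half_width = np.sqrt(3 * (mu2_hat - mu1_hat ** 2))   # estimates (b - a)/2

a_hat = mu1_hat - half_width
b_hat = mu1_hat + half_width
print(f"a_hat = {a_hat:.3f}, b_hat = {b_hat:.3f}")
```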