Chair of Spatial Data Science and Statistical Learning

Lecture 1 Parameter Estimation

1.1 Overview

In this part we will learn the two most basic inferential methods in statistics, namely Maximum Likelihood Estimation (MLE) in Section 1.3 and the Method of Moments (MoM) in Section 1.4. We will address the question of why and how we estimate parameters for given distributions.

Parameter estimation is essential when fitting a statistical distribution to data. Hence, before we can estimate our parameters, we first have to make an assumption about the underlying distribution.

1.2 Example: Coin Flipping

Let’s motivate this topic with a simple coin flipping example, where there are two possible outcomes, Heads and Tails: \(\Omega = \{H, T\}\). We define the probability of getting Heads as \(p_H\).

1.2.1 Two Coin Flips

You flipped a coin twice, getting Heads once and Tails once. Is this coin fair? What is the estimated probability for Heads on this coin?

Note that you have to reverse your thinking from most introductory stats classes: usually, you are asked to estimate the probability of different possible events, given a distribution already equipped with certain parameters. Now we want to find the probability of the events we observed, given different possible parameter values. This is how we will find out which parameter value is more plausible for the data at hand.

For the given example, the joint probability is calculated as follows for two different values of the parameter \(p_H\):

Fair game (i.e., \(p_H=0.5\)): \[\begin{eqnarray*} P(H,T|p_H=0.5)&\overset{\text{ind.}}{=}&P(H|p_H=0.5) \cdot P(T|p_H=0.5)\\ &=&0.5\cdot 0.5\\ &=&0.25 \end{eqnarray*}\]

Unfair game (i.e., \(p_H \neq 0.5\)): \[\begin{eqnarray*} P(H,T|p_H=0.4)&\overset{\text{ind.}}{=}&P(H|p_H=0.4) \cdot P(T|p_H=0.4)\\ &=&0.4\cdot 0.6\\ &=&0.24 \end{eqnarray*}\]

The domain of \(p_H\) ranges from zero (Heads never comes up) to one (Heads is certain to come up). Therefore, we can easily plot the joint probability of the observed data as a function of the parameter while keeping the data fixed:
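This curve can be reproduced with a minimal Python sketch; the helper name `likelihood` and the grid of parameter values are illustrative choices, not part of the lecture material:

```python
import numpy as np
import matplotlib.pyplot as plt

# Joint probability of h Heads and t Tails under independent
# flips with P(Heads) = p (the quantity computed by hand above).
def likelihood(p, h, t):
    return p**h * (1 - p)**t

p_grid = np.linspace(0, 1, 201)   # parameter domain [0, 1]
L = likelihood(p_grid, h=1, t=1)  # two flips: one Heads, one Tails

plt.plot(p_grid, L)
plt.xlabel(r"$p_H$")
plt.ylabel(r"$P(H, T \mid p_H)$")
plt.show()

print(p_grid[np.argmax(L)])  # -> 0.5
```

Calling `likelihood(p_grid, h=1, t=2)` reproduces the three-flip case of the next subsection, with the maximum of the grid search near \(p_H = \frac{1}{3}\).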

It can be observed that our data is most likely for a parameter value of \(p_H = 0.5\), which suggests the game is fair.

1.2.2 Three Coin Flips

You flipped a coin three times, getting Heads once and Tails twice. Is this coin fair? What is the estimated probability for Heads on this coin?

We again calculate the joint probabilities for different parameter values:

Fair game (i.e., \(p_H=0.5\)): \[\begin{eqnarray*} P(H, T, T|p_H\!=\!0.5)&\overset{\text{ind.}}{=}&P(H|p_H\!=\!0.5) \cdot P(T|p_H\!=\!0.5) \cdot P(T|p_H\!=\!0.5)\\ &=&0.5\cdot 0.5\cdot 0.5\\ &=&0.125 \end{eqnarray*}\]

Unfair game (i.e., \(p_H \neq 0.5\)): \[\begin{eqnarray*} P(H, T, T|p_H\!=\!0.4)&\overset{\text{ind.}}{=}&P(H|p_H\!=\!0.4) \cdot P(T|p_H\!=\!0.4) \cdot P(T|p_H\!=\!0.4)\\ &=&0.4\cdot 0.6 \cdot 0.6\\ &=&0.144 \end{eqnarray*}\]

The resulting joint probability of the data for all possible values of \(p_H\) looks as follows:

We find that the observed data has the highest probability for the parameter value \(p_H = \frac{1}{3}\), indicating that the game is not fair.
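This maximum can also be checked analytically (a short sketch based on the joint probability from above, \(L(p_H) = p_H (1-p_H)^2\)), by setting the derivative to zero:

\[\begin{eqnarray*} \frac{d}{dp_H}\, p_H (1-p_H)^2 &=& (1-p_H)^2 - 2 p_H (1-p_H)\\ &=& (1-p_H)(1-3p_H) \overset{!}{=} 0, \end{eqnarray*}\]

which is solved by \(p_H = \frac{1}{3}\) (the second root, \(p_H = 1\), yields probability zero and is therefore no maximum).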

1.2.3 Formalization of the Question

In principle, the following questions are identical:

  • I flipped my coin \(n\) times and took down the results. What is the probability of flipping Heads with this coin?

  • If \(p\) is the probability of Heads: for which \(p\) do the coin tosses I observed have the highest probability?

  • What does the Maximum-Likelihood-Estimator \(\hat{p}_{\text{ML}}\) look like for the observations \(X_1,\ldots,X_n\)?

In the next section we will translate the question to a mathematical optimization problem.
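As a preview of this translation (a sketch, assuming the usual coding \(x_i = 1\) for Heads and \(x_i = 0\) for Tails), the third question asks for

\[\hat{p}_{\text{ML}} = \underset{p \in [0,1]}{\arg\max}\; \prod_{i=1}^{n} p^{x_i} (1-p)^{1-x_i},\]

i.e., the parameter value under which the observed flips are most probable.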

1.3 Maximum Likelihood Estimation

1.3.1 Procedure: Bernoulli Distribution

1.3.2 Recipe for MLE

1.3.3 Example: Mean Google Star Ratings

1.3.4 Derivation: MLE for the Variance

1.3.5 Why Maximum Likelihood Estimation?

1.4 Method of Moments

1.4.1 Moments

1.4.2 Concept

1.4.3 Example

1.5 Bias