Chapter 2 Maximum Likelihood Estimation
2.1 Introduction
Maximum likelihood estimation (MLE) is a method of estimating the parameters of a statistical model from observed data, and it is one of the most widely used estimation methods.
The method of maximum likelihood selects the set of values of the model parameters that maximizes the likelihood function. Intuitively, this maximizes the “agreement” of the selected model with the observed data.
Maximum likelihood estimation gives a unified approach to estimation.
2.2 The Principle of Maximum Likelihood
We take Poisson-distributed random variables as an example. Suppose that $X_1, X_2, \dots, X_N$ are i.i.d. discrete random variables such that $X_i \sim \text{Pois}(\theta)$, with a pmf (probability mass function) defined as:
$$\Pr(X_i = x_i) = \frac{\exp(-\theta)\,\theta^{x_i}}{x_i!}$$
where θ is an unknown parameter to estimate.
Question: What is the probability of observing the particular sample $\{x_1, x_2, \dots, x_N\}$, assuming that a Poisson distribution with as yet unknown parameter $\theta$ generated the data?
This probability is equal to
$$\Pr\big((X_1 = x_1) \cap \dots \cap (X_N = x_N)\big)$$
Since the variables $X_i$ are i.i.d., this joint probability is equal to the product of the marginal probabilities:
$$\Pr\big((X_1 = x_1) \cap \dots \cap (X_N = x_N)\big) = \prod_{i=1}^{N} \Pr(X_i = x_i)$$
Given the pmf of the Poisson distribution, we have:
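$$\Pr\big((X_1 = x_1) \cap \dots \cap (X_N = x_N)\big) = \prod_{i=1}^{N} \frac{\exp(-\theta)\,\theta^{x_i}}{x_i!} = \frac{\exp(-\theta N)\,\theta^{\sum_{i=1}^{N} x_i}}{\prod_{i=1}^{N} x_i!}$$
This joint probability is a function of $\theta$ (the unknown parameter) and corresponds to the likelihood of the sample $\{x_1, x_2, \dots, x_N\}$, denoted by
$$L(x_1, \dots, x_N \mid \theta) = \Pr\big((X_1 = x_1) \cap \dots \cap (X_N = x_N)\big)$$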
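As a quick numerical check of the factorization above, the following minimal sketch evaluates the Poisson likelihood both as a product of marginal pmfs and through the closed-form expression. The sample values and the function names (`poisson_pmf`, `likelihood_product`, `likelihood_closed_form`) are hypothetical and chosen only for illustration.

```python
import math

def poisson_pmf(x, theta):
    """Pr(X = x) for X ~ Pois(theta)."""
    return math.exp(-theta) * theta**x / math.factorial(x)

def likelihood_product(sample, theta):
    """Joint probability as the product of the marginal pmfs."""
    return math.prod(poisson_pmf(x, theta) for x in sample)

def likelihood_closed_form(sample, theta):
    """exp(-theta * N) * theta^(sum of x_i) / prod(x_i!)."""
    n = len(sample)
    denominator = math.prod(math.factorial(x) for x in sample)
    return math.exp(-theta * n) * theta**sum(sample) / denominator

# Hypothetical sample of counts (illustration only, not real data).
sample = [2, 0, 3, 1, 4]

# The two expressions should agree for any theta > 0.
for theta in (0.5, 2.0, 3.5):
    print(theta,
          likelihood_product(sample, theta),
          likelihood_closed_form(sample, theta))
```

Both columns should agree up to floating-point rounding, and the likelihood changes with $\theta$ for a fixed sample, which is what makes it usable as a criterion for choosing $\theta$.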
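Consider maximizing the likelihood function $L(x_1, \dots, x_N \mid \theta)$ with respect to $\theta$. Since the logarithm is strictly increasing, $L$ and $\ln L$ attain their maximum at the same value of $\theta$, so we usually maximize $\ln L(x_1, \dots, x_N \mid \theta)$ instead. We call this the log-likelihood function: $\ell(x_1, \dots, x_N \mid \theta) = \ln L(x_1, \dots, x_N \mid \theta)$, or simply $\ell(\theta)$. In this case:
$$\ell(x_1, \dots, x_N \mid \theta) = -\theta N + \ln(\theta) \sum_{i=1}^{N} x_i - \ln\left(\prod_{i=1}^{N} x_i!\right)$$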
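The simplest way to find the $\theta$ that maximizes $\ell(\theta)$ is to take the derivative with respect to $\theta$ and set it equal to zero:
$$\frac{\partial \ell(\theta)}{\partial \theta} = -N + \frac{1}{\theta} \sum_{i=1}^{N} x_i$$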
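To make sure that we have found a maximum rather than a minimum of $\ell(\theta)$, we should also check that the second derivative is negative (which holds as long as at least one $x_i$ is positive):
$$\frac{\partial^2 \ell(\theta)}{\partial \theta^2} = -\frac{1}{\theta^2} \sum_{i=1}^{N} x_i < 0$$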
Therefore, the maximum likelihood estimator $\hat{\theta}_{\text{mle}}$ is:
$$\hat{\theta}_{\text{mle}} = \frac{1}{N} \sum_{i=1}^{N} x_i$$
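The result can be checked numerically. The sketch below, under the assumption of a small hypothetical sample, evaluates the Poisson log-likelihood on a grid of candidate values of $\theta$ and compares the grid maximizer with the sample mean. It also evaluates the first and second derivatives at the estimate. The function names and the grid resolution are illustrative choices, and `math.lgamma(x + 1)` is used as a numerically stable way to compute $\ln(x!)$.

```python
import math

def log_likelihood(theta, sample):
    """Poisson log-likelihood: -theta*N + ln(theta)*sum(x_i) - sum(ln(x_i!))."""
    n = len(sample)
    log_factorials = sum(math.lgamma(x + 1) for x in sample)  # ln(x!) = lgamma(x + 1)
    return -theta * n + math.log(theta) * sum(sample) - log_factorials

def score(theta, sample):
    """First derivative of the log-likelihood with respect to theta."""
    return -len(sample) + sum(sample) / theta

def second_derivative(theta, sample):
    """Second derivative of the log-likelihood with respect to theta."""
    return -sum(sample) / theta**2

# Hypothetical sample of counts (illustration only, not real data).
sample = [2, 0, 3, 1, 4, 2, 5, 1]

# Closed-form MLE: the sample mean.
theta_hat = sum(sample) / len(sample)

# Brute-force maximization over a grid of theta values in (0, 10].
grid = [k / 1000 for k in range(1, 10001)]
theta_grid = max(grid, key=lambda t: log_likelihood(t, sample))

print("sample mean:        ", theta_hat)
print("grid maximizer:     ", theta_grid)
print("score at theta_hat: ", score(theta_hat, sample))
print("second derivative:  ", second_derivative(theta_hat, sample))
```

A grid search is used here only because it makes no assumptions beyond the formula for $\ell(\theta)$; in practice a numerical optimizer would typically replace it when no closed form is available.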
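For the Laplace model with location parameter $\mu$ and scale parameter $b$, the maximum likelihood estimates are:
$$\hat{\mu} = \operatorname{median}(x_i), \qquad \hat{b} = \frac{1}{n} \sum_{i=1}^{n} |x_i - \hat{\mu}|$$
Note that these differ from the method-of-moments (MOM) results.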
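The Laplace estimates can be computed directly from a sample. The following minimal sketch assumes the standard Laplace density $f(x) = \frac{1}{2b}\exp(-|x - \mu| / b)$; the sample values and the names `laplace_mle` and `laplace_log_likelihood` are hypothetical and used only for illustration.

```python
import math
import statistics

def laplace_mle(sample):
    """Return (mu_hat, b_hat): the sample median and the mean absolute deviation from it."""
    mu_hat = statistics.median(sample)
    b_hat = sum(abs(x - mu_hat) for x in sample) / len(sample)
    return mu_hat, b_hat

def laplace_log_likelihood(mu, b, sample):
    """Log-likelihood under the density (1 / (2b)) * exp(-|x - mu| / b)."""
    n = len(sample)
    return -n * math.log(2 * b) - sum(abs(x - mu) for x in sample) / b

# Hypothetical sample (illustration only, not real data).
sample = [1.2, -0.4, 3.1, 0.8, 2.0]

mu_hat, b_hat = laplace_mle(sample)
print("mu_hat:", mu_hat)
print("b_hat: ", b_hat)

# Perturbing either estimate should not increase the log-likelihood.
best = laplace_log_likelihood(mu_hat, b_hat, sample)
print(best >= laplace_log_likelihood(mu_hat + 0.3, b_hat, sample))
print(best >= laplace_log_likelihood(mu_hat, b_hat * 1.2, sample))
```

The perturbation check at the end is only a sanity test at two nearby points, not a proof that the estimates are the global maximizers.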