7.4 Estimation

We can estimate model parameters by maximum likelihood, but first we must write down the likelihood. Because the conditional intensity function fully specifies a point process, the likelihood can be written entirely in terms of it.

Given a dataset of event times $t_1,t_2,\dots,t_n$ observed on the interval $[0,T]$, the log-likelihood for a model $\lambda(t\mid H_t; \theta)$ with parameter vector $\theta$ is

$\ell(\theta) = \sum_{i=1}^n\log\lambda(t_i\mid H_{t_i}; \theta) - \int_0^T \lambda(t\mid H_t;\theta)\,dt.$

The log-likelihood has two parts:

The sum “rewards” a model for having high intensity where the event times are located.
The integral, because it is subtracted, rewards a model for having low intensity where no events are located.

With point process models, it is just as important to get it right where the points are not located as it is to model where the points are located.
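The two parts of the log-likelihood can be made concrete with a minimal Python sketch. The function name `log_likelihood`, the callable `intensity(t, history)`, and the midpoint-rule approximation of the integral are all assumptions made for illustration, not something prescribed by the text.

```python
import math

def log_likelihood(event_times, T, intensity, n_grid=10_000):
    """Approximate point-process log-likelihood on [0, T].

    `intensity(t, history)` is a hypothetical callable that returns the
    (strictly positive) conditional intensity at time t, given the list
    of event times that occurred before t.
    """
    events = sorted(event_times)

    # Part 1: sum of log-intensities at the observed event times.
    # The history passed for t_i holds the events that precede it.
    log_sum = sum(math.log(intensity(t, events[:i]))
                  for i, t in enumerate(events))

    # Part 2: integral of the intensity over [0, T], approximated with
    # the midpoint rule on an evenly spaced grid.
    h = T / n_grid
    integral = 0.0
    j = 0  # number of events before the current grid midpoint
    for k in range(n_grid):
        t = (k + 0.5) * h
        while j < len(events) and events[j] < t:
            j += 1
        integral += intensity(t, events[:j]) * h

    return log_sum - integral

# Made-up data: three events observed on [0, 4].
events, T = [0.5, 1.2, 3.1], 4.0

# Two constant-intensity candidates that ignore the history.
ll_moderate = log_likelihood(events, T, lambda t, hist: 0.75)
ll_too_high = log_likelihood(events, T, lambda t, hist: 5.0)
```

The high-intensity candidate earns a larger sum at the event times, but it is penalized more heavily by the integral over the long stretches with no events, so `ll_moderate` comes out larger than `ll_too_high`.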

Typically, the log-likelihood $\ell(\theta)$ is a nonlinear function of $\theta$ with no closed-form maximizer, so it must be maximized numerically using standard nonlinear optimization routines.
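As a rough illustration of numerical maximization, the sketch below assumes a toy one-parameter model with intensity $\lambda(t;\beta) = \exp(\beta t)$, which is not a model from this section. Its log-likelihood is $\beta\sum_i t_i - (e^{\beta T}-1)/\beta$, and the score equation has no closed-form solution. The data are made up, and `golden_section_max` is a hypothetical hand-written helper used only to keep the example dependency-free; in practice a library optimizer such as `scipy.optimize.minimize` applied to the negative log-likelihood would be a more typical choice.

```python
import math

def loglik(beta, events, T):
    """Log-likelihood for the toy intensity lambda(t; beta) = exp(beta * t)."""
    log_sum = beta * sum(events)  # sum of log(exp(beta * t_i))
    # Integral of exp(beta * t) over [0, T]; the limit as beta -> 0 is T.
    integral = T if beta == 0 else math.expm1(beta * T) / beta
    return log_sum - integral

def golden_section_max(f, lo, hi, tol=1e-8):
    """Maximize a unimodal function f on [lo, hi] by golden-section search."""
    invphi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - invphi * (b - a)
    d = a + invphi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:
            # The maximum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else:
            # The maximum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = f(d)
    return (a + b) / 2

# Made-up event times on [0, 3], with events becoming denser over time.
events, T = [0.4, 1.1, 1.9, 2.3, 2.6, 2.9], 3.0

# This log-likelihood is concave in beta, so a bracketing search is adequate.
beta_hat = golden_section_max(lambda b: loglik(b, events, T), -5.0, 5.0)
```

Because the events cluster toward the end of the interval, the fitted `beta_hat` is positive, which corresponds to an intensity that increases over time.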

For a stationary Poisson process with rate $\lambda$, the intensity is constant, so the log-likelihood reduces to the familiar

$\ell(\lambda) = n\log\lambda - \lambda T.$

Setting the derivative $n/\lambda - T$ equal to zero shows that the maximum likelihood estimate of $\lambda$ is $\hat{\lambda}=n/T.$
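The closed-form result can be checked with a small simulation. The sketch below is illustrative only: the true rate, the observation window, the seed, and the helper names `simulate_poisson` and `poisson_loglik` are all assumptions. It simulates a stationary Poisson process from exponential inter-arrival times, computes $n/T$, and compares the log-likelihood there with its value at nearby rates.

```python
import math
import random

def simulate_poisson(rate, T, rng):
    """Event times of a stationary Poisson process on [0, T]."""
    times = []
    t = rng.expovariate(rate)  # exponential waiting time to the first event
    while t <= T:
        times.append(t)
        t += rng.expovariate(rate)
    return times

def poisson_loglik(lam, n, T):
    """Log-likelihood n*log(lam) - lam*T of a stationary Poisson process."""
    return n * math.log(lam) - lam * T

rng = random.Random(0)        # fixed seed so the run is reproducible
true_rate, T = 2.0, 500.0     # assumed values for the illustration
events = simulate_poisson(true_rate, T, rng)

n = len(events)
lam_hat = n / T               # closed-form maximum likelihood estimate

# Evaluate the log-likelihood at rates slightly below and above lam_hat.
grid = [lam_hat * (1 + d) for d in (-0.2, -0.05, -0.01, 0.01, 0.05, 0.2)]
best_on_grid = max(poisson_loglik(lam, n, T) for lam in grid)
```

None of the perturbed rates achieves a higher log-likelihood than `lam_hat`, and with a window this long the estimate should land close to the rate used in the simulation.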