Chapter 6 Maximum Likelihood Estimation
In general, when we observe independent and identically distributed data \(y_1,\dots,y_n\sim p(y;\theta)\), the maximum likelihood estimate of the parameter vector \(\theta\) is the value that maximizes the log-likelihood of \(\theta\), which can be written as \(\sum_{i=1}^n \log p(y_i; \theta)\). However, what if the data are not independent? How can we write down and maximize the log-likelihood in that case?
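Before turning to dependent data, it may help to see the independent case in concrete form. The sketch below is only an illustration under stated assumptions: it takes \(p(y;\theta)\) to be an exponential density with rate \(\theta\), so that \(\log p(y;\theta) = \log\theta - \theta y\), and it uses a handful of made-up observations. The function names `log_likelihood` and `mle_exponential` are hypothetical and are not taken from the text.

```python
import math


def log_likelihood(theta, ys):
    """Sum of log p(y_i; theta), assuming an exponential model with rate theta."""
    if theta <= 0:
        return float("-inf")  # the rate must be positive
    return sum(math.log(theta) - theta * y for y in ys)


def mle_exponential(ys):
    """Closed-form maximizer for this assumed model: n / sum(y_i)."""
    return len(ys) / sum(ys)


# Hypothetical observations, chosen only for illustration.
ys = [0.5, 1.2, 0.3, 2.0, 1.0]
theta_hat = mle_exponential(ys)

# Crude grid search as a sanity check on the closed-form answer.
grid = [0.01 * k for k in range(1, 501)]
theta_grid = max(grid, key=lambda t: log_likelihood(t, ys))

print("closed-form estimate:", theta_hat)
print("grid-search estimate:", theta_grid)
```

The sum inside `log_likelihood` is exactly the expression \(\sum_{i=1}^n \log p(y_i;\theta)\), and it is valid only because the observations are assumed independent. When they are not, the joint density no longer factors into a product of marginals, so this sum is no longer the log-likelihood.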