Chapter 13 Birth and Death Process, MCMC for Discrete Distribution (Lecture on 02/16/2021)

Last class we discussed the Poisson process and the birth and death process. In a Poisson process, $N(t)$ is the number of events that have happened up to time $t$, while in a birth and death process, $N(t)$ is the number of individuals in the system at time $t$. The definition of the birth and death process is given in Definition 12.2.

If we define $p_j(t)=P(N(t)=j)$, then, using the results we discussed at the end of last class, we can show that
\begin{equation}
p_j'(t)=-(\lambda_j+\mu_j)p_j(t)+\lambda_{j-1}p_{j-1}(t)+\mu_{j+1}p_{j+1}(t)
\tag{13.1}
\end{equation}

If the population starts with one individual, then $P(N(0)=1)=1$, or $p_1(0)=1$, is the boundary condition. Note that the boundary condition may change if we have other information about the process.

The birth and death process has some special cases:

  1. Pure birth process: $\mu_i=0$ for all $i$.

  2. If $\mu_i=0$ for all $i$ and $\lambda_i=\lambda$ for all $i$, then it becomes a Poisson process. Only the boundary condition changes, since for the Poisson process $N(0)=0$.

  3. If $\lambda_i=i\lambda$ and $\mu_i=i\mu$ for all $i$, then it is called a linear birth and death process.

  4. If $\mu_i=0$ and $\lambda_i=i\lambda$ for all $i$, then it is called a linear birth process.

  5. If $\mu_i=i\mu$ and $\lambda_i=0$ for all $i$, then it is called a linear death process.

Solving (13.1) for $p_j(t)$ with general $\lambda_i,\mu_i$ is difficult. However, closed-form expressions exist if we concentrate on the linear birth and death process or one of its variants (for example, $\lambda_i=i\lambda+a$ for all $i$).
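Even when no closed form is available, the system (13.1) can be integrated numerically on a truncated state space. Below is a minimal sketch of this idea (assuming NumPy/SciPy are available; the rates, which follow the immigration variant $\lambda_i=i\lambda+a$ with $\mu_i=i\mu$, the truncation level, and the time horizon are all hypothetical choices for illustration). The printed total mass checks that little probability escapes past the truncation level.

```python
# Sketch: numerically integrate the forward equations (13.1) on the truncated
# state space {0, 1, ..., N_MAX}. Hypothetical rates: lambda_j = j*lam + a, mu_j = j*mu.
import numpy as np
from scipy.integrate import solve_ivp

N_MAX, lam, a, mu = 300, 0.4, 0.2, 0.1
states = np.arange(N_MAX + 1)
birth = lam * states + a                 # lambda_j
death = mu * states                      # mu_j

def forward_eqs(t, p):
    # p_j'(t) = -(lambda_j + mu_j) p_j + lambda_{j-1} p_{j-1} + mu_{j+1} p_{j+1}
    dp = -(birth + death) * p
    dp[1:] += birth[:-1] * p[:-1]        # inflow from state j-1 (a birth)
    dp[:-1] += death[1:] * p[1:]         # inflow from state j+1 (a death)
    return dp

p0 = np.zeros(N_MAX + 1)
p0[1] = 1.0                              # boundary condition p_1(0) = 1
sol = solve_ivp(forward_eqs, (0.0, 5.0), p0, t_eval=[5.0], rtol=1e-8)
p_t = sol.y[:, -1]
print("total probability mass kept:", p_t.sum())
print("E[N(5)] from the ODE system:", states @ p_t)
```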

Last class we discussed how to solve for $p_j(t)$ from the differential equations. However, sometimes interest lies in the mean $M(t)=E[N(t)]$. We will see techniques to find $M(t)$ for different variants of the birth and death process.

The simplest case of the birth and death processes is the Poisson process. Recall that the differential equations for the Poisson process were
\begin{equation}
p_j'(t)=\lambda p_{j-1}(t)-\lambda p_j(t),\quad j\geq 1,\qquad p_0(0)=1
\tag{13.2}
\end{equation}

We then have
$$M(t)=E(N(t))=\sum_{j=1}^{\infty}jp_j(t)$$

By taking derivatives with respect to $t$, and using (13.2), we get
\begin{equation}
\begin{split}
M'(t)&=\sum_{j=1}^{\infty}jp_j'(t)=\sum_{j=1}^{\infty}j\left[\lambda p_{j-1}(t)-\lambda p_j(t)\right]\\
&=-\lambda\sum_{j=1}^{\infty}jp_j(t)+\lambda\sum_{j=1}^{\infty}jp_{j-1}(t)\\
&=-\lambda M(t)+\lambda\sum_{j=0}^{\infty}(j+1)p_j(t)\\
&=-\lambda M(t)+\lambda\sum_{j=0}^{\infty}jp_j(t)+\lambda\sum_{j=0}^{\infty}p_j(t)\\
&=-\lambda M(t)+\lambda M(t)+\lambda\quad\left(\text{as}\ \sum_{j=0}^{\infty}p_j(t)=1\right)\\
&=\lambda
\end{split}
\end{equation}

Therefore, we have $\int_0^t M'(s)\,ds=\int_0^t\lambda\,ds$, which implies $M(t)-M(0)=\lambda t$, or $M(t)=M(0)+\lambda t$. Note that $M(0)=E(N(0))=0$ for a Poisson process. Thus, $M(t)=\lambda t$.

This is not a surprising result, given Theorem 12.1, where we showed that $N(t)$ follows a Poisson distribution with parameter $\lambda t$. However, here we get the same result using a different approach.
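As a quick numerical sanity check on $M(t)=\lambda t$ (a sketch of my own, with a hypothetical rate and time horizon), we can simulate Poisson processes by drawing $\mathrm{Exponential}(\lambda)$ inter-arrival times and averaging $N(t)$ across paths:

```python
# Monte Carlo check of E[N(t)] = lambda * t for a Poisson process.
import numpy as np

rng = np.random.default_rng(0)
lam, t, n_paths = 2.0, 3.0, 100_000        # hypothetical rate, horizon, replications

counts = np.empty(n_paths)
for k in range(n_paths):
    arrivals = np.cumsum(rng.exponential(1.0 / lam, size=200))  # plenty of arrivals
    counts[k] = np.searchsorted(arrivals, t)                    # N(t): arrivals up to t
print("simulated E[N(t)]:", counts.mean())   # should be close to 6.0
print("lambda * t       :", lam * t)
```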

Now let us do the same for a linear birth process. For a linear birth process, $\lambda_j=j\lambda$ for all $j$ and $\mu_j=0$. Therefore, the differential equations from the birth and death process boil down to
\begin{equation}
p_j'(t)=-\lambda jp_j(t)+\lambda(j-1)p_{j-1}(t),\quad j\geq 1
\tag{13.7}
\end{equation}
By definition, $M(t)=E(N(t))=\sum_{j=1}^{\infty}jp_j(t)$. Taking derivatives with respect to $t$ and using (13.7), we have
\begin{equation}
\begin{split}
M'(t)&=\sum_{j=1}^{\infty}jp_j'(t)=\sum_{j=1}^{\infty}j\left[-\lambda jp_j(t)+\lambda(j-1)p_{j-1}(t)\right]\\
&=-\lambda\sum_{j=1}^{\infty}j^2p_j(t)+\lambda\sum_{j=1}^{\infty}j(j-1)p_{j-1}(t)
\end{split}
\tag{13.9}
\end{equation}
Using the identity $j(j-1)=(j-1)^2+(j-1)$, we can further simplify (13.9) as
\begin{equation}
\begin{split}
M'(t)&=-\lambda\sum_{j=1}^{\infty}j^2p_j(t)+\lambda\sum_{j=1}^{\infty}\left[(j-1)^2+(j-1)\right]p_{j-1}(t)\\
&=-\lambda\sum_{j=1}^{\infty}j^2p_j(t)+\lambda\sum_{j=1}^{\infty}(j-1)^2p_{j-1}(t)+\lambda\sum_{j=1}^{\infty}(j-1)p_{j-1}(t)\\
&=\lambda\sum_{j=1}^{\infty}(j-1)p_{j-1}(t)\qquad(\text{the first two terms cancel after re-indexing }k=j-1)\\
&=\lambda M(t)
\end{split}
\end{equation}
We therefore have the differential equation $M'(t)=\lambda M(t)$, or $M'(t)-\lambda M(t)=0$. The solution again comes from an integrating factor. We have
$$e^{-\lambda t}M'(t)-e^{-\lambda t}\lambda M(t)=0,$$
which gives us
$$\frac{d}{dt}\left[e^{-\lambda t}M(t)\right]=0.$$
Integrating on both sides, we have
$$\int_0^t\frac{d}{ds}\left[e^{-\lambda s}M(s)\right]ds=e^{-\lambda t}M(t)-M(0)=0.$$
Thus, $M(t)=M(0)e^{\lambda t}$. If $M(0)=1$, that is, the process starts with one individual, then $M(t)=e^{\lambda t}$. In general, if $M(0)=n_0$, then $M(t)=n_0e^{\lambda t}$.
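The result $M(t)=n_0e^{\lambda t}$ can also be checked by simulation. The sketch below (with a hypothetical rate, horizon, and starting size) uses the fact that, with $n$ individuals present, the waiting time to the next birth in a linear birth (Yule) process is $\mathrm{Exponential}(n\lambda)$:

```python
# Simulation check of M(t) = n0 * exp(lambda * t) for the linear birth process.
import numpy as np

rng = np.random.default_rng(1)
lam, t_end, n0, n_paths = 0.5, 2.0, 1, 20_000    # hypothetical settings

def simulate_linear_birth(lam, t_end, n0):
    t, n = 0.0, n0
    while True:
        t += rng.exponential(1.0 / (lam * n))    # next birth after an Exp(n * lambda) wait
        if t > t_end:
            return n
        n += 1

sizes = np.array([simulate_linear_birth(lam, t_end, n0) for _ in range(n_paths)])
print("simulated M(t) :", sizes.mean())          # ~ exp(0.5 * 2) = e = 2.718...
print("n0 * exp(lam*t):", n0 * np.exp(lam * t_end))
```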

Now consider the linear birth and death process. Since $\lambda_j=j\lambda$ and $\mu_j=j\mu$, the differential equation for $p_j(t)$ is
$$p_j'(t)=-j(\lambda+\mu)p_j(t)+(j-1)\lambda p_{j-1}(t)+(j+1)\mu p_{j+1}(t)$$
Thus, using this equation we have
\begin{equation}
\begin{split}
M'(t)&=\sum_{j=1}^{\infty}jp_j'(t)=\sum_{j=1}^{\infty}j\left[-j(\lambda+\mu)p_j(t)+(j-1)\lambda p_{j-1}(t)+(j+1)\mu p_{j+1}(t)\right]\\
&=-(\lambda+\mu)\sum_{j=1}^{\infty}j^2p_j(t)+\lambda\sum_{j=1}^{\infty}j(j-1)p_{j-1}(t)+\mu\sum_{j=1}^{\infty}j(j+1)p_{j+1}(t)
\end{split}
\end{equation}
Using the two facts
$$\begin{cases}j(j-1)=(j-1)^2+(j-1)\\ j(j+1)=(j+1)^2-(j+1)\end{cases}$$
we further have
\begin{equation}
\begin{split}
M'(t)&=-(\lambda+\mu)\sum_{j=1}^{\infty}j^2p_j(t)+\lambda\sum_{j=1}^{\infty}\left[(j-1)^2+(j-1)\right]p_{j-1}(t)+\mu\sum_{j=1}^{\infty}\left[(j+1)^2-(j+1)\right]p_{j+1}(t)\\
&=-(\lambda+\mu)\sum_{j=1}^{\infty}j^2p_j(t)+\lambda\sum_{j=1}^{\infty}(j-1)^2p_{j-1}(t)+\lambda\sum_{j=1}^{\infty}(j-1)p_{j-1}(t)+\mu\sum_{j=1}^{\infty}(j+1)^2p_{j+1}(t)-\mu\sum_{j=1}^{\infty}(j+1)p_{j+1}(t)\\
&=-\lambda\sum_{j=1}^{\infty}j^2p_j(t)-\mu\sum_{j=1}^{\infty}j^2p_j(t)+\lambda\sum_{j=1}^{\infty}(j-1)^2p_{j-1}(t)+\lambda\sum_{j=1}^{\infty}(j-1)p_{j-1}(t)\\
&\qquad+\mu\left[\sum_{j=1}^{\infty}(j+1)^2p_{j+1}(t)+1\cdot p_1(t)\right]-\mu\left[\sum_{j=1}^{\infty}(j+1)p_{j+1}(t)+1\cdot p_1(t)\right]\\
&=\lambda\sum_{j=1}^{\infty}(j-1)p_{j-1}(t)-\mu\sum_{j=1}^{\infty}jp_j(t)\\
&=\lambda M(t)-\mu M(t)
\end{split}
\end{equation}

Thus, the differential equation is $M'(t)=(\lambda-\mu)M(t)$. Using the same integrating-factor technique, we finally get $M(t)=M(0)e^{(\lambda-\mu)t}$.

The result implies that if $\lambda>\mu$, the expected population size will grow over time, but if $\lambda<\mu$, the population will go extinct over time.
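The mean $M(t)=M(0)e^{(\lambda-\mu)t}$ can likewise be checked with an event-by-event (Gillespie-style) simulation. In the sketch below (hypothetical rates, horizon, and starting size), with $n$ individuals present the next event occurs after an $\mathrm{Exponential}(n(\lambda+\mu))$ waiting time and is a birth with probability $\lambda/(\lambda+\mu)$:

```python
# Simulation check of M(t) = M(0) * exp((lambda - mu) * t) for the linear
# birth and death process.
import numpy as np

rng = np.random.default_rng(2)
lam, mu, t_end, n0, n_paths = 0.6, 0.4, 3.0, 5, 20_000   # hypothetical settings

def simulate_birth_death(lam, mu, t_end, n0):
    t, n = 0.0, n0
    while n > 0:                                    # state 0 is absorbing (extinction)
        t += rng.exponential(1.0 / (n * (lam + mu)))
        if t > t_end:
            break
        n += 1 if rng.random() < lam / (lam + mu) else -1
    return n

sizes = np.array([simulate_birth_death(lam, mu, t_end, n0) for _ in range(n_paths)])
print("simulated M(t)          :", sizes.mean())
print("n0 * exp((lam - mu) * t):", n0 * np.exp((lam - mu) * t_end))
```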

Let us now check why MCMC (Markov chain Monte Carlo) works for sampling from a discrete distribution.

Suppose the parameter of interest is $\theta\in\Theta$, where $\Theta$ is a discrete set. Let $\pi(\theta)$ denote the prior distribution; the posterior distribution is then given by
$$\pi(\theta|\mathbf{x})=\frac{f(\mathbf{x}|\theta)\pi(\theta)}{\sum_{\psi\in\Theta}f(\mathbf{x}|\psi)\pi(\psi)}$$
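For a finite $\Theta$ the posterior can be computed directly by normalization. Here is a toy illustration (my own example: $\Theta$ consists of three candidate Binomial success probabilities, the prior is uniform, and the data are hypothetical):

```python
# Toy discrete posterior: pi(theta | x) = f(x | theta) pi(theta) / sum over psi.
import numpy as np
from scipy.stats import binom

theta = np.array([0.2, 0.5, 0.8])               # discrete parameter space Theta
prior = np.full(len(theta), 1.0 / len(theta))   # uniform prior pi(theta)
x, n = 7, 10                                    # hypothetical data: 7 successes in 10 trials

likelihood = binom.pmf(x, n, theta)             # f(x | theta)
posterior = likelihood * prior
posterior /= posterior.sum()                    # divide by the sum over psi in Theta
print(dict(zip(theta.tolist(), np.round(posterior, 4).tolist())))
```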

Let $\boldsymbol{\pi}=(\pi(\theta|\mathbf{x}):\theta\in\Theta)=(\pi_1,\pi_2,\cdots)$. We construct a Markov chain $\{X_n\}_{n\geq 1}$. Suppose $X_n=i$; we define $X_{n+1}$ using the following ingredients.

  1. Let $H=(h_{ij}:i,j\in\Theta)$ be an arbitrary stochastic matrix, called the proposal matrix, such that $P(Y=j|X_n=i)=h_{ij}$.

  2. Let the acceptance probability matrix $\mathbf{A}=(a_{ij}:i,j\in\Theta)$ be a matrix with entries satisfying $0\leq a_{ij}\leq 1$. The $a_{ij}$'s are called “acceptance probabilities”, and $a_{ij}$ is given by \begin{equation} a_{ij}=\min\left\{1,\frac{\pi_j h_{ji}}{\pi_i h_{ij}}\right\} \tag{13.22} \end{equation}

The algorithm is: given that $Y=j$, we set $$X_{n+1}=\left\{\begin{aligned} & j & \text{with probability}\, a_{ij}\\ & X_n & \text{with probability}\, 1-a_{ij}\end{aligned}\right.$$ Now we find the transition probability matrix $P=(p_{ij}:i,j\in\Theta)$, where $p_{ij}=P(X_{n+1}=j|X_n=i)$.
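The following sketch implements this chain for a small discrete target (the target vector $\boldsymbol{\pi}$ and the uniform proposal matrix $H$ are hypothetical stand-ins; in the Bayesian setting $\boldsymbol{\pi}$ would be the vector of posterior probabilities):

```python
# Metropolis-Hastings chain on a discrete state space, following (13.22) and the
# accept/stay rule above. Target pi and proposal H are toy choices.
import numpy as np

rng = np.random.default_rng(3)
pi = np.array([0.1, 0.2, 0.3, 0.4])            # target probabilities for states 0..3
K = len(pi)
H = np.full((K, K), 1.0 / K)                   # proposal matrix h_ij (uniform here)

def mh_step(i):
    j = rng.choice(K, p=H[i])                                  # propose Y = j w.p. h_ij
    a_ij = min(1.0, (pi[j] * H[j, i]) / (pi[i] * H[i, j]))     # acceptance prob (13.22)
    return j if rng.random() < a_ij else i                     # accept j, else stay at i

n_iter, state = 200_000, 0
visits = np.zeros(K)
for _ in range(n_iter):
    state = mh_step(state)
    visits[state] += 1
print("empirical frequencies:", np.round(visits / n_iter, 3))
print("target pi            :", pi)
```

Note that only the ratio $\pi_j/\pi_i$ enters $a_{ij}$, so $\boldsymbol{\pi}$ needs to be known only up to a constant; in particular, the normalizing sum in the denominator of the posterior never has to be computed.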

If $i\neq j$, the chain moves to $j$ only if $j$ is proposed and the proposal gets accepted, so \begin{equation} \begin{split} p_{ij}&=P(Y=j|X_n=i)P(\text{proposal is accepted})\\ &=h_{ij}a_{ij} \end{split} \tag{13.23} \end{equation}

If $i=j$, then \begin{equation} \begin{split} p_{ii}&=1-\sum_{k\neq i}p_{ik}\\ &=1-\sum_{k\neq i}h_{ik}a_{ik} \end{split} \tag{13.24} \end{equation}

Now we claim that $\pi_kp_{kj}=\pi_jp_{jk}$ for all $j,k\in\Theta$, where $\boldsymbol{\pi}$ is the vector of posterior probabilities (the detailed balance condition). We now prove it.

First, suppose $j\neq k$ and consider the case $\pi_jh_{jk}<\pi_kh_{kj}$. In that case, using the definition of $a_{kj}$, we have \begin{equation} a_{kj}=\frac{\pi_jh_{jk}}{\pi_kh_{kj}} \tag{13.25} \end{equation} Therefore, by (13.23) and (13.25), \begin{equation} \begin{split} \pi_kp_{kj}&=\pi_kh_{kj}a_{kj}\\ &=\pi_kh_{kj}\frac{\pi_jh_{jk}}{\pi_kh_{kj}}\\ &=\pi_jh_{jk} \end{split} \tag{13.26} \end{equation} In addition, \begin{equation} \begin{split} \pi_jp_{jk}&=\pi_jh_{jk}a_{jk}\\ &=\pi_jh_{jk}=\pi_kp_{kj} \end{split} \tag{13.27} \end{equation} because $a_{jk}=1$ by definition. The claim is satisfied in this case.

Now, for the other case $\pi_jh_{jk}>\pi_kh_{kj}$, we have $a_{jk}=\frac{\pi_kh_{kj}}{\pi_jh_{jk}}$ by definition. Then \begin{equation} \pi_jp_{jk}=\pi_jh_{jk}a_{jk}=\pi_kh_{kj} \tag{13.28} \end{equation} Now \begin{equation} \pi_kp_{kj}=\pi_kh_{kj}a_{kj}=\pi_kh_{kj} \tag{13.29} \end{equation} because $a_{kj}=1$ by definition, so again $\pi_kp_{kj}=\pi_jp_{jk}$. (When $j=k$ the claim is trivial, and when $\pi_jh_{jk}=\pi_kh_{kj}$ both acceptance probabilities equal one, so the two sides again agree.)

If a probability vector $\boldsymbol{\pi}$ satisfies the detailed balance equations with the transition probabilities of a Markov chain, then $\boldsymbol{\pi}$ is a stationary distribution of that chain. Therefore, the full posterior distribution is the stationary distribution of the MCMC chain. Assuming all the other regularity conditions hold for the chain, the stationary distribution is also the limiting distribution. Thus, the limiting distribution of the Markov chain is the vector of posterior probabilities. Therefore, if you run the chain long enough, eventually you will be sampling from the posterior distribution.
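As a small numerical illustration of this argument (continuing the toy target above, now with a non-uniform proposal matrix chosen for variety), one can build $P$ from (13.23)-(13.24) and verify both detailed balance and stationarity directly:

```python
# Build P = (p_ij) from a proposal matrix H and acceptance probabilities (13.22),
# then check pi_k p_kj = pi_j p_jk and pi P = pi for a toy target pi.
import numpy as np

pi = np.array([0.1, 0.2, 0.3, 0.4])
K = len(pi)
H = np.array([[0.0, 0.5, 0.5, 0.0],        # rows sum to 1; support is symmetric
              [0.5, 0.0, 0.5, 0.0],
              [0.25, 0.25, 0.25, 0.25],
              [0.0, 0.0, 0.5, 0.5]])

A = np.ones((K, K))
for i in range(K):
    for j in range(K):
        if i != j and H[i, j] > 0:
            A[i, j] = min(1.0, pi[j] * H[j, i] / (pi[i] * H[i, j]))

P = H * A                                   # p_ij = h_ij a_ij for i != j   (13.23)
np.fill_diagonal(P, 0.0)
np.fill_diagonal(P, 1.0 - P.sum(axis=1))    # p_ii = 1 - sum_{k != i} p_ik  (13.24)

flows = pi[:, None] * P                     # flows[k, j] = pi_k p_kj
print("rows of P sum to 1:", np.allclose(P.sum(axis=1), 1.0))
print("detailed balance  :", np.allclose(flows, flows.T))
print("pi P = pi         :", np.allclose(pi @ P, pi))
```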