Chapter 12 Poisson Process, Birth and Death Process (Lecture on 02/11/2021)

Up until Lecture 10, we have been discussing discrete-time, discrete-state-space stochastic processes \(\{X_n:n\geq 1\}\). That is, \(X_n\in\mathcal{S}\) where \(\mathcal{S}\) is discrete, and the indexing set of \(X\) is also discrete, for example \(\mathbb{N}\). However, there are important stochastic processes for which \(\mathcal{S}\) is discrete but the indexing set is continuous. These processes may change their values at any instant of time rather than only at specified epochs; in other words, the process can change instantaneously. Such a process is a family \(\{X(t):t\geq 0\}\) of random variables indexed by the half line \([0,\infty)\), with \(X(t)\) taking values in some discrete set \(\mathcal{S}\).

An important process of this type is called the Poisson process.

Definition 12.1 (Poisson Process) A Poisson process with intensity \(\lambda\) is a process \(N=\{N(t):t\geq 0\}\) taking values in \(\mathcal{S}=\{0,1,2,\cdots\}\) such that

  1. \(N(0)=0\). If \(s\leq t\), then \(N(s)\leq N(t)\).

  2. \(P(N(t+h)=n+m|N(t)=n)=\left\{\begin{aligned} & \lambda h+o(h) & m=1\\ & 1-\lambda h+o(h) & m=0 \\ & o(h) & m>1\end{aligned}\right.\) where \(o(h)\) represents any function such that \(\frac{o(h)}{h}\to 0\) as \(h\to 0\). That is, a function of \(h\) that decays to 0 faster than \(h\) (at a super-linear rate).

  3. If \(s<t\), then \(N(t)-N(s)\), which denotes the number of events in the interval \((s,t]\), is independent of the number of events during \([0,s]\); i.e., \(N(t)-N(s)\) is independent of \(N(s)\).

The Poisson process with intensity \(\lambda\) is the process \(N(t)\) that represents the number of events that have occurred up to time \(t\). The first condition requires \(N(0)=0\): no events have occurred at time 0. As time increases, the number of events can only increase. The second condition says that, given \(n\) events have occurred in the interval \([0,t]\), the probability that \(m\) events occur in the interval \((t,t+h]\) is given by the formula in condition 2. The \(o(h)\) term signifies that if \(h\) is small, then the probability that more than one event occurs in \((t,t+h]\) is negligible. The third condition means that for any two disjoint intervals, the number of events occurring in one interval is independent of the number of events occurring in the other.
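The conditions of Definition 12.1 translate directly into a simulation: discretize \([0,t]\) into steps of length \(h\) and let one event occur in each step with probability \(\lambda h\), ignoring the \(o(h)\) terms. The following is a minimal sketch (function name and parameters are illustrative, not from the text):

```python
import random

def simulate_poisson_counts(lam, t, h, n_paths, seed=0):
    """Simulate N(t) by discretizing [0, t] into steps of length h.

    Following condition 2 of Definition 12.1, an event occurs in each
    short interval of length h with probability about lam * h; here we
    draw a Bernoulli(lam * h) per step and count the successes,
    ignoring the o(h) terms.
    """
    rng = random.Random(seed)
    n_steps = round(t / h)
    counts = []
    for _ in range(n_paths):
        n = 0  # N(0) = 0, per condition 1
        for _ in range(n_steps):
            if rng.random() < lam * h:
                n += 1  # one event in this step; two or more have prob o(h)
        counts.append(n)
    return counts

lam, t = 2.0, 3.0
counts = simulate_poisson_counts(lam, t, h=1e-3, n_paths=1000)
mean = sum(counts) / len(counts)
print(mean)  # should be close to lam * t = 6
```

For small \(h\) the simulated counts have mean close to \(\lambda t\), consistent with Theorem 12.1 below.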

Applications: Here are some typical scenarios where the Poisson process is used to model data.

  1. The number of car accidents in a city within a time interval.

  2. The number of phone calls received by a call center within a time interval.

Theorem 12.1 Let \(N(t)\) be the number of events in \([0,t]\) under the definition of Poisson process (Definition 12.1), then \(P(N(t)=j)=\frac{(\lambda t)^j}{j!}e^{-\lambda t}\) for \(j=0,1,\cdots\). Thus, \(N(t)\sim Pois(\lambda t)\).

Proof. We begin by considering \(P(N(t+h)=j)\). We have \[\begin{equation} \begin{split} P(N(t+h)=j)&=\sum_iP(N(t+h)=j|N(t)=i)P(N(t)=i)\\ &=P(N(t+h)=j|N(t)=j-1)P(N(t)=j-1)\\ &+P(N(t+h)=j|N(t)=j)P(N(t)=j)\\ &+\sum_{i\neq j,j-1}P(N(t+h)=j|N(t)=i)P(N(t)=i) \end{split} \tag{12.1} \end{equation}\]

By definition of Poisson process, we have \[\begin{equation} \begin{split} &P(N(t+h)=j|N(t)=j-1)=\lambda h+o(h)\\ &P(N(t+h)=j|N(t)=j)=1-\lambda h+o(h)\\ &P(N(t+h)=j|N(t)=i)=o(h),\quad \forall i\neq j,j-1 \end{split} \tag{12.2} \end{equation}\]

Thus, (12.1) becomes \[\begin{equation} \begin{split} P(N(t+h)=j)&=(\lambda h+o(h))P(N(t)=j-1)+(1-\lambda h+o(h))P(N(t)=j)\\ &+\sum_{i\neq j,j-1}o(h)P(N(t)=i) \end{split} \tag{12.3} \end{equation}\] To simplify notation, define \(p_j(t)=P(N(t)=j)\). By (12.3) we have \[\begin{equation} \begin{split} p_j(t+h)&=(\lambda h+o(h))p_{j-1}(t)+(1-\lambda h+o(h))p_j(t)+\sum_{i\neq j,j-1}o(h)p_i(t)\\ &=\lambda hp_{j-1}(t)+(1-\lambda h)p_j(t)+o(h)[p_{j-1}(t)+p_j(t)+\sum_{i\neq j,j-1}p_i(t)]\\ &=\lambda hp_{j-1}(t)+(1-\lambda h)p_j(t)+o(h) \quad \text{(since the bracketed sum equals 1)} \end{split} \tag{12.4} \end{equation}\] (12.4) implies that \[\begin{equation} \frac{p_j(t+h)-p_j(t)}{h}=-\lambda p_j(t)+\lambda p_{j-1}(t)+\frac{o(h)}{h} \tag{12.5} \end{equation}\]
Taking the limit as \(h\to 0\) on both sides of (12.5), we have \[\begin{equation} \begin{split} p^{\prime}_j(t)&=\lim_{h\to 0}\frac{p_j(t+h)-p_j(t)}{h}\\ &=\lim_{h\to 0}[-\lambda p_j(t)+\lambda p_{j-1}(t)+\frac{o(h)}{h}]=-\lambda p_j(t)+\lambda p_{j-1}(t) \end{split} \tag{12.6} \end{equation}\] for \(j\neq 0\).

Now note that \[\begin{equation} \begin{split} P(N(t+h)=0)&=P(N(t+h)=0|N(t)=0)P(N(t)=0)\\ &=(1-\lambda h+o(h))P(N(t)=0) \end{split} \tag{12.7} \end{equation}\] (only the \(i=0\) term contributes, since \(N\) is nondecreasing), which implies \(p_0(t+h)=(1-\lambda h+o(h))p_0(t)\), or \(\frac{p_0(t+h)-p_0(t)}{h}=-\lambda p_0(t)+\frac{o(h)}{h}\). Taking the limit as \(h\to 0\) on both sides, we obtain the differential equation \[\begin{equation} p^{\prime}_0(t)=-\lambda p_0(t) \tag{12.8} \end{equation}\]

The boundary conditions are given by \[\begin{equation} p_j(0)=\left\{\begin{aligned} & 1 & j=0 \\ & 0 & j\neq 0 \end{aligned}\right. \tag{12.9} \end{equation}\] This is because \(p_j(0)=P(N(0)=j)\) and, by the definition of the Poisson process, \(N(0)=0\).

We start with (12.8), that is, \(p^{\prime}_0(t)+\lambda p_0(t)=0\). Multiplying both sides by \(e^{\lambda t}\), we have \[\begin{equation} e^{\lambda t}p^{\prime}_0(t)+\lambda e^{\lambda t} p_0(t)=0 \tag{12.10} \end{equation}\] Thus, we have \[\begin{equation} \frac{d}{dt}[e^{\lambda t}p_0(t)]=0 \tag{12.11} \end{equation}\] Integrating both sides from 0 to \(t\), \[\begin{equation} \int_0^t \frac{d}{ds}[e^{\lambda s}p_0(s)]ds=e^{\lambda t}p_0(t)-p_0(0)=0 \tag{12.12} \end{equation}\] By the boundary conditions, \(p_0(0)=1\), therefore \[\begin{equation} p_0(t)=e^{-\lambda t} \tag{12.13} \end{equation}\]

Now take \(j=1\); we have \[\begin{equation} p_1^{\prime}(t)=-\lambda p_1(t)+\lambda p_0(t)=-\lambda p_1(t)+\lambda e^{-\lambda t} \tag{12.14} \end{equation}\] Thus, \(p_1^{\prime}(t)+\lambda p_1(t)=\lambda e^{-\lambda t}\). Multiplying both sides by \(e^{\lambda t}\), we have \[\begin{equation} \frac{d}{dt}[e^{\lambda t}p_1(t)]=\lambda \tag{12.15} \end{equation}\] Integrating both sides from 0 to \(t\), we have \[\begin{equation} \int_0^t \frac{d}{ds}[e^{\lambda s}p_1(s)]ds=e^{\lambda t}p_1(t)-p_1(0)=\lambda t \tag{12.16} \end{equation}\] By the boundary conditions, \(p_1(0)=0\), therefore \[\begin{equation} p_1(t)=\lambda te^{-\lambda t} \tag{12.17} \end{equation}\]

Thus, we have shown that \(P(N(t)=0)=e^{-\lambda t}\) and \(P(N(t)=1)=\lambda te^{-\lambda t}\). Now we use induction to establish the general result.

Assume \(p_{j-1}(t)=e^{-\lambda t}\frac{(\lambda t)^{j-1}}{(j-1)!}\), we will show \(p_j(t)=e^{-\lambda t}\frac{(\lambda t)^j}{j!}\).

From (12.6) and the induction hypothesis, we have \[\begin{equation} p^{\prime}_j(t)=-\lambda p_j(t)+\lambda e^{-\lambda t}\frac{(\lambda t)^{j-1}}{(j-1)!} \tag{12.18} \end{equation}\] which implies \[\begin{equation} p^{\prime}_j(t)+\lambda p_j(t)=\lambda e^{-\lambda t}\frac{(\lambda t)^{j-1}}{(j-1)!} \tag{12.19} \end{equation}\] Again multiplying both sides of (12.19) by \(e^{\lambda t}\), we get \[\begin{equation} \frac{d}{dt}[e^{\lambda t}p_j(t)]=\frac{\lambda(\lambda t)^{j-1}}{(j-1)!} \tag{12.20} \end{equation}\] Integrating from 0 to \(t\), we get \[\begin{equation} \begin{split} &\int_0^t\frac{d}{ds}[e^{\lambda s}p_j(s)]ds=e^{\lambda t}p_j(t)-p_j(0)\\ &=\frac{\lambda^j}{(j-1)!}\int_0^ts^{j-1}ds=\frac{\lambda^j}{(j-1)!}\frac{t^j}{j}=\frac{(\lambda t)^j}{j!} \end{split} \tag{12.21} \end{equation}\] Since the boundary conditions give \(p_j(0)=0\), we finally have \[\begin{equation} p_j(t)=e^{-\lambda t}\frac{(\lambda t)^j}{j!} \tag{12.22} \end{equation}\] which completes the induction.
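The system of differential equations (12.6) and (12.8) with boundary condition (12.9) can also be checked numerically. The sketch below (using a simple Euler discretization; names and step sizes are illustrative) integrates the equations forward in time and compares the result with the Poisson p.m.f. of Theorem 12.1:

```python
import math

def forward_equations(lam, t, j_max, dt=1e-4):
    """Euler-integrate p0'(t) = -lam*p0(t), eq. (12.8), and
    p_j'(t) = -lam*p_j(t) + lam*p_{j-1}(t), eq. (12.6), starting from
    the boundary condition (12.9): p_0(0) = 1, p_j(0) = 0 for j != 0."""
    p = [1.0] + [0.0] * j_max
    for _ in range(round(t / dt)):
        new = [p[0] - lam * p[0] * dt]  # Euler step for (12.8)
        for j in range(1, j_max + 1):
            new.append(p[j] + (-lam * p[j] + lam * p[j - 1]) * dt)  # (12.6)
        p = new
    return p

lam, t = 1.5, 2.0
p = forward_equations(lam, t, j_max=20)
exact = [math.exp(-lam * t) * (lam * t) ** j / math.factorial(j) for j in range(21)]
err = max(abs(a - b) for a, b in zip(p, exact))
print(err)  # small discretization error, vanishing as dt -> 0
```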

Theorem 12.1 reveals the deep connection between the Poisson process and the Poisson distribution: the number of events occurring in \((0,t]\) follows a Poisson distribution with parameter \(\lambda t\). As the length of the interval increases, the mean of this Poisson distribution increases linearly.
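A standard construction of the Poisson process, not derived in these notes, generates events with i.i.d. \(\text{Exp}(\lambda)\) interarrival times. A quick Monte Carlo check (illustrative code, not from the text) confirms that the resulting count \(N(t)\) has mean and variance close to \(\lambda t\), as Theorem 12.1 predicts:

```python
import random

def sample_count(lam, t, rng):
    """Count arrivals in (0, t] by summing i.i.d. Exponential(lam)
    interarrival times (a standard construction, not derived here)."""
    clock, n = 0.0, 0
    while True:
        clock += rng.expovariate(lam)
        if clock > t:
            return n
        n += 1

rng = random.Random(42)
lam, t, reps = 2.0, 1.5, 5000
samples = [sample_count(lam, t, rng) for _ in range(reps)]
emp_mean = sum(samples) / reps
emp_var = sum((x - emp_mean) ** 2 for x in samples) / reps
print(emp_mean, emp_var)  # both should be close to lam * t = 3, as for Pois(3)
```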

Definition 12.2 (Birth and death process) \(\{N(t):t\geq 0\}\) is a birth and death process if there exist \(\{\lambda_i\}_{i\geq 0}\), \(\{\mu_i\}_{i\geq 0}\) with \(\mu_0=0\) and \(\mu_i,\lambda_i\geq 0\) such that \[\begin{equation} \begin{split} &P(N(t+h)=i+1|N(t)=i)=\lambda_i h+o(h)\\ &P(N(t+h)=i-1|N(t)=i)=\mu_i h+o(h)\\ &P(N(t+h)=i|N(t)=i)=1-(\lambda_i+\mu_i) h+o(h)\\ &P(N(t+h)=i+m|N(t)=i)=o(h)\quad |m|>1\\ \end{split} \tag{12.23} \end{equation}\]

(12.23) tells us, given that the state of the process at time \(t\) is \(i\), the probabilities of the possible transitions during \((t,t+h]\).

There is a basic difference between the birth and death process and the Poisson process: in the Poisson process the number of events can only increase over time, while in the birth and death process the state can also decrease. An increase by one is called a birth and a decrease by one is called a death. For example, suppose that up to time \(t\) the process is in state \(i\). Then between \(t\) and \(t+h\) there can be a birth, a death, or no change, with probabilities given by the first three equations in (12.23). \(\lambda_i\) is called the birth rate and \(\mu_i\) is called the death rate. The last equation in (12.23) says that the probability of more than one birth or death in a small time interval is negligible.
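The transition probabilities (12.23) translate directly into a simulation: in each short interval of length \(h\), move up with probability roughly \(\lambda_i h\), down with probability roughly \(\mu_i h\), and otherwise stay put. A minimal sketch, with hypothetical queue-style rates (constant birth rate, constant death rate for \(i\geq 1\)) chosen for illustration:

```python
import random

def simulate_birth_death(birth, death, t, h, seed=1):
    """One trajectory of a birth and death process, discretized using
    (12.23): in each interval of length h, a birth occurs with
    probability about birth(i)*h, a death with probability about
    death(i)*h, and otherwise the state is unchanged.

    birth, death: functions i -> lambda_i and i -> mu_i, with death(0) = 0.
    """
    rng = random.Random(seed)
    i = 0
    path = [i]
    for _ in range(round(t / h)):
        u = rng.random()
        if u < birth(i) * h:
            i += 1  # birth
        elif u < (birth(i) + death(i)) * h:
            i -= 1  # death (never happens at i = 0, since mu_0 = 0)
        path.append(i)
    return path

# Hypothetical rates: births at rate 1.0, deaths at rate 1.5 when i >= 1.
path = simulate_birth_death(lambda i: 1.0, lambda i: 1.5 if i > 0 else 0.0,
                            t=50.0, h=1e-3)
print(path[0], min(path), max(path))
```

Unlike the Poisson process, the trajectory moves both up and down, but it can never go below 0 because \(\mu_0=0\).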

Applications: Here are some typical scenarios where the birth and death process is used to model data.

  1. Demography: how the population of a particular community evolves over time.

  2. Queuing theory: at a counter, consider the number of people in the queue. A person being served and leaving the queue is a “death”, and a person joining the queue is a “birth”.

  3. Mathematical biology: to model the evolution of bacterial populations.

We will use a similar technique to study the birth and death process. Again write \(p_j(t)=P(N(t)=j)\); for \(P(N(t+h)=j)\) we have \[\begin{equation} \begin{split} P(N(t+h)=j)&=\sum_iP(N(t+h)=j|N(t)=i)P(N(t)=i)\\ &=P(N(t+h)=j|N(t)=j)P(N(t)=j)\\ &+P(N(t+h)=j|N(t)=j-1)P(N(t)=j-1)\\ &+P(N(t+h)=j|N(t)=j+1)P(N(t)=j+1)\\ &+\sum_{i\neq j,j-1,j+1}P(N(t+h)=j|N(t)=i)P(N(t)=i) \end{split} \tag{12.24} \end{equation}\] It implies that \[\begin{equation} p_j(t+h)=[1-(\lambda_j+\mu_j)h]p_j(t)+[\lambda_{j-1}h]p_{j-1}(t)+[\mu_{j+1}h]p_{j+1}(t)+o(h) \tag{12.25} \end{equation}\]
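Since (12.25) expresses \(p_j(t+h)\) in terms of the probabilities at time \(t\), it can be iterated directly as a numerical scheme. The sketch below assumes a truncated state space and hypothetical rates (neither is from the text), applies (12.25) repeatedly with the \(o(h)\) term dropped, and checks that total probability is conserved:

```python
def bd_step(p, lam, mu, h):
    """One application of (12.25): p_j(t+h) = [1-(lam_j+mu_j)h]p_j(t)
    + lam_{j-1}h p_{j-1}(t) + mu_{j+1}h p_{j+1}(t), dropping o(h)."""
    n = len(p)
    new = [0.0] * n
    for j in range(n):
        new[j] = (1.0 - (lam[j] + mu[j]) * h) * p[j]
        if j > 0:
            new[j] += lam[j - 1] * h * p[j - 1]
        if j < n - 1:
            new[j] += mu[j + 1] * h * p[j + 1]
    return new

# Hypothetical rates on a truncated state space {0, ..., 30}.
n = 31
lam = [1.0] * (n - 1) + [0.0]  # no births out of the top state (truncation)
mu = [0.0] + [1.5] * (n - 1)   # mu_0 = 0, as required by Definition 12.2
p = [1.0] + [0.0] * (n - 1)    # start in state 0
h = 1e-3
for _ in range(round(10.0 / h)):
    p = bd_step(p, lam, mu, h)
print(sum(p))  # total probability is conserved (up to rounding error)
```

Because every unit of probability leaving state \(j\) enters state \(j-1\) or \(j+1\), the update conserves \(\sum_j p_j(t)=1\) exactly on the truncated space.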