Chapter 3 Markov Chain: Definition and Basic Properties (Lecture on 01/12/2021)

The next few chapters are mainly about discrete-time, discrete-state-space stochastic processes, studied primarily through Markov chains.

Let $X=\{X(t,\omega):t\in T,\omega\in\Omega\}$ with $X(t,\omega)\in S$, and consider the case where both $T$ and $S$ are discrete. Define the random variable $X_t(\omega)=X(t,\omega)$ and consider the sequence of random variables $\{X_0,X_1,\cdots\}$ taking values in some countable set $S$, called the state space. Each $X_n$ is a discrete random variable that takes one of $N$ possible values, where $N=|S|$; we allow $N=\infty$. This is the setup for a discrete-time, discrete-state stochastic process.

Definition 3.1 (Markov Chain) The process $X=\{X_0,X_1,\cdots\}$ is a Markov chain if it satisfies the Markov condition: $\begin{equation} P(X_{n}=s|X_0=x_0,X_1=x_1,\cdots,X_{n-1}=x_{n-1})=P(X_{n}=s|X_{n-1}=x_{n-1}),\quad\forall s \tag{3.1} \end{equation}$

The Markov property described in this way is equivalent to

$\begin{equation} P(X_{n}=s|X_{n_1}=x_{n_1},\cdots,X_{n_k}=x_{n_k})=P(X_{n}=s|X_{n_k}=x_{n_k}),\quad\forall s \tag{3.2} \end{equation}$ for all $n_1<n_2<\cdots<n_k\leq n-1$.

We have assumed that $X$ takes values in some countable set $S$. Since $S$ is countable, it can be put in one-to-one correspondence with some subset $S^{\prime}$ of the integers. Thus, without loss of generality, $X_n=i$ means that the chain is in the $i$th state at the $n$th time point.

The evolution of a chain is described by the transition probabilities, defined as $P(X_{n+1}=j|X_n=i)$. In general this probability may depend on $n$, $i$ and $j$. We will restrict our attention to the case where the transition probabilities do not depend on $n$.

Definition 3.2 (Homogeneous Chain) The chain $X=\{X_0,X_1,\cdots\}$ is called homogeneous (time homogeneous) if $P(X_{n+1}=j|X_n=i)=P(X_1=j|X_0=i)$ for all $n,i,j$. For a homogeneous chain, we define the transition matrix $P=\{p_{ij}\}_{i,j=1}^{|S|}$ as the $|S|\times|S|$ matrix of transition probabilities $p_{ij}=P(X_{n+1}=j|X_n=i)$.

The beauty of the homogeneity assumption is that the distribution of the whole stochastic process can be specified through the transition matrix (together with the distribution of $X_0$). Suppose the stochastic process has an infinite index set $T$ and a finite state space $S$. Under homogeneity, the problem of specifying the infinite-dimensional distribution of $(X_0,X_1,\cdots)$ reduces to the problem of specifying a finite-dimensional matrix $P$, which simplifies the problem considerably.
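To illustrate how a transition matrix alone drives the evolution of a homogeneous chain, here is a minimal simulation sketch. The function name `simulate_chain`, the two-state matrix, the fixed starting state and the labeling of states as `0, 1, ...` are all assumptions made for illustration; they do not come from the lecture.

```python
import random


def simulate_chain(P, x0, n_steps, rng):
    """Sample X_0, ..., X_{n_steps} from a homogeneous chain.

    P is a list of rows with P[i][j] = p_ij, x0 is the starting state,
    and rng is a random.Random instance. States are labeled 0..len(P)-1.
    """
    states = list(range(len(P)))
    path = [x0]
    for _ in range(n_steps):
        current = path[-1]
        # The next state is drawn using only the current state (Markov
        # condition) and the same matrix P at every step (homogeneity).
        nxt = rng.choices(states, weights=P[current])[0]
        path.append(nxt)
    return path


# Hypothetical two-state transition matrix, chosen only for illustration.
P = [[0.5, 0.5],
     [0.25, 0.75]]
path = simulate_chain(P, x0=0, n_steps=10, rng=random.Random(0))
print(path)
```

The same matrix `P` is reused at every step, which is exactly what homogeneity permits.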

Theorem 3.1 (Properties of Transition Probability Matrix) If $P$ is a transition probability matrix, then

$0\leq p_{ij}\leq 1$ , $\forall i,j$ .
$\sum_{j}p_{ij}=1$ , $\forall i$ .
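The two properties in Theorem 3.1 are easy to check numerically for a finite state space. The sketch below is one possible check; the helper name `is_transition_matrix`, the tolerance `tol` and the example matrix are assumptions for illustration only.

```python
from fractions import Fraction


def is_transition_matrix(P, tol=1e-12):
    """Return True if the square matrix P (a list of rows) has all entries
    in [0, 1] and every row summing to 1, up to the tolerance tol."""
    n = len(P)
    for row in P:
        if len(row) != n:
            return False  # a transition matrix must be |S| x |S|
        if any(p < 0 or p > 1 for p in row):
            return False  # first property: 0 <= p_ij <= 1
        if abs(sum(row) - 1) > tol:
            return False  # second property: each row sums to 1
    return True


# Hypothetical two-state example with exact rational entries.
P = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(3, 4)]]
print(is_transition_matrix(P))
```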

Definition 3.3 (The n-step Transition Probability Matrix) The n-step transition probability matrix is defined as $P(m,m+n)=\{p_{ij}(m,m+n)\}_{i,j=1}^{|S|}$, where $p_{ij}(m,m+n)=P(X_{m+n}=j|X_m=i)$.

Theorem 3.2 (Chapman-Kolmogorov Equation) $\begin{equation} p_{ij}(m,m+n+r)=\sum_{k}p_{ik}(m,m+n)p_{kj}(m+n,m+n+r),\quad \forall r \tag{3.3} \end{equation}$

Intuitively, this means the $(n+r)$-step transition probability matrix can be decomposed into an $n$-step and an $r$-step transition probability matrix.

Proof.

$\begin{equation} \begin{split} p_{ij}(m,m+n+r)&=P(X_{m+n+r}=j|X_m=i)=\sum_k P(X_{m+n+r}=j,X_{m+n}=k|X_m=i)\\ &=\sum_k P(X_{m+n+r}=j|X_{m+n}=k,X_m=i)P(X_{m+n}=k|X_m=i)\\ &=\sum_k P(X_{m+n+r}=j|X_{m+n}=k)P(X_{m+n}=k|X_m=i) \quad \text{(by the Markov property)}\\ &=\sum_k p_{ik}(m,m+n)p_{kj}(m+n,m+n+r) \end{split} \end{equation}$

Since the transition probability matrix is $P(m,m+n+r)=\{p_{ij}(m,m+n+r)\}_{i,j=1}^{|S|}$, the Chapman-Kolmogorov equation tells us $\begin{equation} P(m,m+n+r)=P(m,m+n)P(m+n,m+n+r),\quad \forall n,m,r \tag{3.4} \end{equation}$ Specifically, we can take $r=n=1$ and we have $\begin{equation} P(m,m+2)=P(m,m+1)P(m+1,m+2),\quad \forall m \tag{3.5} \end{equation}$ If we further assume time homogeneity, then $P(m,m+1)=P(m+1,m+2)=P$ and (3.5) becomes $\begin{equation} P(m,m+2)=P^2 \tag{3.6} \end{equation}$ In general, applying (3.4) repeatedly, we have $\begin{equation} P(m,m+n)=P^n,\quad \forall n \tag{3.7} \end{equation}$ That is, if the one-step transition probability matrix is $P=\{p_{ij}\}_{i,j=1}^{|S|}$ where $p_{ij}=P(X_{n+1}=j|X_n=i)$, then the $(i,j)$th entry of the n-step transition probability matrix is $P(X_{m+n}=j|X_m=i)=(P^n)_{i,j}$, where $(P^n)_{i,j}$ denotes the $(i,j)$th entry of $P^n$.
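For a homogeneous chain on a finite state space, (3.4) and (3.7) can be checked directly with matrix powers. The following is a small sketch using exact rational arithmetic; the helpers `mat_mul` and `mat_pow` and the two-state matrix are hypothetical names and values chosen for illustration.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(P, n):
    """Return P^n for n >= 0 by repeated multiplication (P^0 = identity)."""
    size = len(P)
    result = [[Fraction(int(i == j)) for j in range(size)]
              for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, P)
    return result


# Hypothetical two-state transition matrix.
P = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(3, 4)]]

P2 = mat_pow(P, 2)  # two-step transition matrix
P3 = mat_pow(P, 3)  # three-step transition matrix
P5 = mat_pow(P, 5)  # five-step transition matrix

# Chapman-Kolmogorov with n = 2 and r = 3: P^5 should equal P^2 P^3.
ck_holds = (P5 == mat_mul(P2, P3))
print(P2, ck_holds)
```

Exact fractions are used here so that the equality check is not affected by floating-point rounding.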

Lemma 3.1 Let $\mu_i(n)=P(X_n=i)$, that is, the marginal probability that $X_n$ is in the $i$th state. Write $\boldsymbol{\mu}(n)$ for the row vector $(\mu_i(n),i\in S)$. Then, for a homogeneous chain,

$\begin{equation} \boldsymbol{\mu}(m+n)=\boldsymbol{\mu}(m)P^n \tag{3.8} \end{equation}$

This lemma gives the relationship between the marginal probability vector of $X$ at time $m$ and at time $m+n$.

Proof.

$\begin{equation} \begin{split} \mu_j(m+n)&=P(X_{m+n}=j)=\sum_iP(X_{m+n}=j|X_m=i)P(X_m=i)\\ &=\sum_ip_{ij}(m,m+n)\mu_i(m)\\ &=\sum_i(P^n)_{i,j}\mu_i(m)=(\boldsymbol{\mu}(m)P^n)_j \end{split} \end{equation}$ where the third equality uses (3.7). Since this is true for all $j\in S$, we have $\boldsymbol{\mu}(m+n)=\boldsymbol{\mu}(m)P^n$.
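As a numerical companion to (3.8), the marginal distribution can be propagated one step at a time by multiplying a row vector by $P$. The sketch below assumes a hypothetical two-state chain that starts in the first state with probability 1; the function names `step` and `marginal` are illustrative.

```python
from fractions import Fraction


def step(mu, P):
    """Return the row vector mu P, i.e. one step of (3.8)."""
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]


def marginal(mu_m, P, n):
    """Return mu(m+n) = mu(m) P^n by applying `step` n times."""
    mu = list(mu_m)
    for _ in range(n):
        mu = step(mu, P)
    return mu


# Hypothetical two-state transition matrix and initial distribution.
P = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(3, 4)]]
mu0 = [Fraction(1), Fraction(0)]  # chain starts in the first state

mu2 = marginal(mu0, P, 2)  # distribution of X_2
print(mu2)
```

Multiplying the vector by $P$ repeatedly avoids forming $P^n$ explicitly, which is a design choice of this sketch rather than part of the lemma.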

Example 3.1 (Simple Random Walk) Suppose $X_1,X_2,\cdots$ are independent with $X_n=\left\{\begin{aligned} & 1 & \text{with probability } p \\ & -1 & \text{with probability } 1-p \end{aligned}\right.$ for all $n\in\mathbb{N}$. Consider the stochastic process given by $S_n(\omega)=X_1(\omega)+\cdots+X_n(\omega)$. The state space of this stochastic process is $S=\{0,\pm 1,\pm 2,\cdots\}$, and $S_n$ is a Markov chain. The one-step transition probability is given by

$P(S_n=j|S_{n-1}=i)=\left\{\begin{aligned} & p & j=i+1 \\ & 1-p & j=i-1 \\ & 0 & \text{otherwise} \end{aligned}\right.$ For the n-step transition probability, we are interested in $p_{ij}(n)=P(S_{m+n}=j|S_m=i)$, which does not depend on $m$ by homogeneity. Suppose the $n$ steps consist of $a$ upward moves and $b$ downward moves. Then

$\begin{equation} \left\{\begin{aligned} & a+b=n \\ & a-b=j-i \end{aligned}\right.\Longrightarrow \left\{\begin{aligned} & a=\frac{n+j-i}{2} \\ & b=\frac{n-j+i}{2} \end{aligned}\right. \tag{3.9} \end{equation}$

Such a path exists only if $a$ and $b$ are non-negative integers, that is, $n+j-i$ is even and $|j-i|\leq n$. There are ${n \choose a}$ orderings of the moves, each with probability $p^a(1-p)^b$, so the n-step transition probability is given by

$\begin{equation} p_{ij}(n)=P(S_{m+n}=j|S_m=i)=\left\{\begin{aligned} & {n \choose a}p^a(1-p)^b & n+j-i \text{ even and } |j-i|\leq n \\ & 0 & \text{otherwise} \end{aligned}\right. \tag{3.10} \end{equation}$

where $a$ and $b$ are given by (3.9).
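The closed form (3.10) can be sanity-checked against a step-by-step propagation of the walk's distribution. In the sketch below, `rw_n_step` implements (3.10) with the extra explicit guard $|j-i|\leq n$ (so that $a$ and $b$ are non-negative), and `rw_n_step_bruteforce` is a hypothetical helper that applies the one-step transition probabilities $n$ times. The value $p=1/3$ is an arbitrary choice for illustration.

```python
from fractions import Fraction
from math import comb


def rw_n_step(i, j, n, p):
    """n-step transition probability of the simple random walk via (3.10)."""
    if (n + j - i) % 2 != 0 or abs(j - i) > n:
        return Fraction(0)
    a = (n + j - i) // 2  # number of upward moves, from (3.9)
    b = n - a             # number of downward moves
    return comb(n, a) * p**a * (1 - p)**b


def rw_n_step_bruteforce(i, j, n, p):
    """Same probability, obtained by applying the one-step rule n times."""
    dist = {i: Fraction(1)}  # all mass starts at state i
    for _ in range(n):
        new = {}
        for s, pr in dist.items():
            new[s + 1] = new.get(s + 1, 0) + pr * p        # move up
            new[s - 1] = new.get(s - 1, 0) + pr * (1 - p)  # move down
        dist = new
    return dist.get(j, Fraction(0))


p = Fraction(1, 3)  # arbitrary illustrative value
print(rw_n_step(0, 2, 4, p), rw_n_step_bruteforce(0, 2, 4, p))
```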

Example 3.2 (Ehrenfest Diffusion Models) Suppose there are a total of $2A$ balls in 2 boxes, labeled $b$ and $B$. At each time, we choose a ball at random and shift it from its box of origin to the other box. Let $X_n$ be the number of balls in box $b$ at time $n$; then $X_n$ is a Markov chain. We have

$\begin{equation} P(X_{n+1}=A+j|X_n=A+i)=\left\{\begin{aligned} & \frac{A-i}{2A} & j=i+1 \\ & \frac{A+i}{2A} & j=i-1 \end{aligned}\right. \tag{3.11} \end{equation}$

for all $i=-A,\cdots,A$, and all other transition probabilities are 0.
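The transition probabilities (3.11) can be assembled into a $(2A+1)\times(2A+1)$ matrix. The sketch below indexes states by $k=A+i$, the number of balls in box $b$, so that $k$ runs from $0$ to $2A$; this re-indexing and the function name `ehrenfest_matrix` are assumptions made for illustration.

```python
from fractions import Fraction


def ehrenfest_matrix(A):
    """Transition matrix of the Ehrenfest chain with 2A balls.

    Row and column k (0 <= k <= 2A) correspond to k balls in box b,
    i.e. k = A + i in the notation of (3.11).
    """
    N = 2 * A
    P = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    for k in range(N + 1):
        if k < N:
            # A ball from box B is chosen: (A - i) / (2A) = (N - k) / N.
            P[k][k + 1] = Fraction(N - k, N)
        if k > 0:
            # A ball from box b is chosen: (A + i) / (2A) = k / N.
            P[k][k - 1] = Fraction(k, N)
    return P


P = ehrenfest_matrix(2)  # 4 balls in total, states 0..4
for row in P:
    print(row)
```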

Definition 3.4 (Persistent State) State $i$ is called persistent (recurrent) if $P(X_n=i\, \text{for some}\, n\geq 1|X_0=i)=1$. That is, the probability that the chain eventually returns to $i$, having started from $i$, is 1.

If this probability is less than 1, state $i$ is called transient.

We will be interested in the first passage probabilities, defined as $f_{ij}(n)=P(X_1\neq j,\cdots,X_{n-1}\neq j,X_n=j|X_0=i)$. This is the probability that state $j$ is first visited at time $n$, starting from state $i$. Write $f_{ij}=\sum_{n=1}^{\infty}f_{ij}(n)$; this is the probability that state $j$ is ever visited, starting from state $i$. In particular, state $i$ is persistent if and only if $f_{ii}=1$. If $f_{ij}=1$, we are interested in the constraints it implies on the transition probabilities.
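For a homogeneous chain on a finite state space, the quantities $f_{ij}(n)$ can be computed numerically. The sketch below relies on the recursion $f_{ij}(1)=p_{ij}$ and $f_{ij}(n)=\sum_{k\neq j}p_{ik}f_{kj}(n-1)$, which follows from conditioning on $X_1$ and is not stated in the text above; the function name `first_passage`, the 0-based state labels and the two-state matrix are illustrative assumptions.

```python
from fractions import Fraction


def first_passage(P, j, n_max):
    """Return a list f with f[n][i] = f_ij(n) for n = 1..n_max.

    f[0] is an unused placeholder so that indices match the time n.
    """
    size = len(P)
    f = [None, [P[i][j] for i in range(size)]]  # f_ij(1) = p_ij
    for _ in range(2, n_max + 1):
        prev = f[-1]
        # First step goes to some k != j, then j is first hit n-1 steps later.
        f.append([sum(P[i][k] * prev[k] for k in range(size) if k != j)
                  for i in range(size)])
    return f


# Hypothetical two-state transition matrix with states labeled 0 and 1.
P = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(3, 4)]]

f = first_passage(P, j=0, n_max=30)
# Partial sum of f_00(n) for n = 1..30, a lower bound on f_00.
partial = sum(f[n][0] for n in range(1, 31))
print(float(partial))
```

For this particular matrix the partial sums of $f_{00}(n)$ increase towards 1, which is consistent with state 0 being persistent.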