Chapter 8 Stochastic Processes and Markov Chains
8.1 Stochastic Processes
A stochastic process is a family of random variables $\{X_t : t \in T\}$ where the index $t$ belongs to some parameter set $T$, and each of the random variables $X_t$ can take the same set of possible values. The set of possible values that the random variables can take is known as the state space, denoted $S$.
Typically the indices and parameter set $T$ have a time-related interpretation. With this in mind, informally a stochastic process can be thought of as $X_t$ describing the state of some system at time $t$. When the set $T$ is discrete, for example $T = \{0, 1, 2, \ldots\}$, then the stochastic process is known as a discrete-time process. When the set $T$ is continuous, for example $T = [0, \infty)$, then the stochastic process is known as a continuous-time process. In DATA2002, we will predominantly consider stochastic processes that are discrete-time processes.
Furthermore, throughout the remainder of DATA2002, we will only consider stochastic processes for which the state space $S$ is discrete, that is, each of the $X_t$ are discrete random variables.
Consider a fair six-sided die that we roll repeatedly, keeping track of our total cumulative score. Let $X_n$ be the total of the first $n$ dice rolls. What is the parameter set $T$, and what is the state space $S$? Also calculate the PMFs of $X_1$ and $X_2$.
The index $n$ corresponds to the number of dice rolls that have been made, and so will be a positive integer. Therefore $T = \{1, 2, 3, \ldots\}$. After $n$ dice rolls, the cumulative total will be an integer between $n$ and $6n$. All positive integers will be included in at least one of these intervals, and since all random variables must share a common state space, it is necessary to take $S = \{1, 2, 3, \ldots\}$. The cumulative total score after one die roll is just the value shown on that roll; that is, the PMF of $X_1$ is given by $P(X_1 = k) = \tfrac{1}{6}$ for $k = 1, 2, \ldots, 6$.
The possible outcomes for rolling the die twice are summarized by the following table.
Total $k$:        2     3     4     5     6     7     8     9    10    11    12
$P(X_2 = k)$:   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
Note from Example 8.1.2 that the random variable $X_n$ having an element in its state space is not the same as there being a non-zero probability of $X_n$ taking that value. For example $P(X_1 = 7) = 0$, since it is impossible to have a total of $7$ after just one roll, but $7$ does remain in the state space of $X_1$.
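Assuming the die in Example 8.1.2 is a fair six-sided die, the PMF of $X_2$ can be checked by brute force: the following Python sketch enumerates the 36 equally likely pairs of rolls and tallies the cumulative totals.

```python
from itertools import product
from fractions import Fraction
from collections import Counter

# Enumerate all ordered pairs of rolls of a fair six-sided die (an assumption;
# substitute a different range if the die has a different number of sides).
sides = range(1, 7)
pmf = Counter()
for first, second in product(sides, repeat=2):
    pmf[first + second] += Fraction(1, 36)

for total in sorted(pmf):
    print(total, pmf[total])   # e.g. P(X_2 = 7) = 6/36 = 1/6
```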
Deja Brew and Latte Da are the two chains of cafes in Nottingham. Mary Berry travels to Nottingham and decides to try out Latte Da. Mary then falls into a habit: if she has been to Latte Da there is a chance she chooses to go to Deja Brew on her next trip to Nottingham, and a chance she will go back to Latte Da. Alternatively, if Mary has been to Deja Brew, then there is a chance that she decides to go to Latte Da on her next day out in Nottingham, and a chance that she returns to Deja Brew.
This can be summarised by the following diagram:

Mary Berry's behaviour can be modelled by a stochastic process. Let $S = \{1, 2\}$ be the state space, and let $X_n$ indicate whether Mary goes to Latte Da or Deja Brew on her $n$th trip to Nottingham, with $X_n = 1$ corresponding to Latte Da and $X_n = 2$ to Deja Brew.
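The numerical transition probabilities of Example 8.1.3 are not reproduced above, so the following Python sketch simulates Mary's trips using placeholder values `p_LD` and `p_DL`; only the two-state structure, not the specific numbers, is taken from the example.

```python
import random

# Placeholder probabilities (NOT the values from Example 8.1.3):
# p_LD = P(Deja Brew next | Latte Da now), p_DL = P(Latte Da next | Deja Brew now).
p_LD, p_DL = 0.3, 0.6

def simulate_trips(n_trips, seed=None):
    """Simulate which cafe Mary visits on each of n_trips trips, starting at Latte Da."""
    rng = random.Random(seed)
    state, path = "Latte Da", ["Latte Da"]
    for _ in range(n_trips - 1):
        if state == "Latte Da":
            state = "Deja Brew" if rng.random() < p_LD else "Latte Da"
        else:
            state = "Latte Da" if rng.random() < p_DL else "Deja Brew"
        path.append(state)
    return path

print(simulate_trips(10, seed=1))
```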
A sample path is a sequence of realised values for the random variables of a stochastic process.
Consider again the random variables $X_1, X_2, \ldots$ of Example 8.1.2, obtained by calculating the cumulative score of repeated rolls of a fair six-sided die. The following table outlines examples of sample paths observed for the stochastic process:

For , we rolled successively.
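Further sample paths for Example 8.1.2 are easy to generate by simulation. The sketch below assumes a fair six-sided die and records the running total after each roll.

```python
import random

def sample_path(n_rolls, seed=None):
    """One realised sample path X_1, ..., X_n of the cumulative score process."""
    rng = random.Random(seed)
    total, path = 0, []
    for _ in range(n_rolls):
        total += rng.randint(1, 6)   # one roll of a fair six-sided die (assumed)
        path.append(total)
    return path

print(sample_path(8, seed=42))
```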
Using the stochastic process of Example 8.1.2, what is ?
Using the stochastic process of Example 8.1.2, what is ?
Using the stochastic process of Example 8.1.2, what is ?
These are all examples of questions that we will learn to answer.
What should also be noted in Example 8.1.2 is that in performing the underlying experiment to observe $X_2$, we necessarily observe $X_1$ (before we can roll the die twice, we have to roll it once). Therefore it could be argued that it is more natural to study $X_1$ together with the conditional distribution of $X_2$ given $X_1$, rather than $X_1$ and $X_2$ separately. This argument holds more generally, since the parameter set typically has a time-related or sequential interpretation: we can study the conditional distribution of $X_{n+1}$ given $X_n$.
This highlights a key feature of stochastic processes: the transitions from one state to another state.
8.2 Definition of Markov Chains
A Markov chain is a type of stochastic process.
A stochastic process $\{X_n\}$ is a Markov chain if
$$P(X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \ldots, X_0 = i_0) = P(X_{n+1} = j \mid X_n = i)$$
for all $n$ and all states $i_0, \ldots, i_{n-1}, i, j$.
This is to say that when trying to predict the future state of a Markov chain, it is only important to know the present state and not how the process arrived at this state.
The stochastic processes in Example 8.1.2 and Example 8.1.3 were Markov Chains.
Set . To define for , one tosses a coin. If a head is obtained then set , and if tails is obtained set . Explain why this process is not a Markov chain.
In the event of tossing a head, the next state clearly depends on the values of both the current state and the previous state, not just on the current state. This contradicts Definition 8.2.1 of a Markov chain.
Identifying Markov chains within your workplace is a key skill for DATA2002. Many Markov chains will have a great number of states.
A Markov chain is called time-homogeneous if $P(X_{n+1} = j \mid X_n = i) = P(X_1 = j \mid X_0 = i)$ for all $n$ and for all states $i, j$.
Informally, time-homogeneity means that the probabilities governing transitions between states do not change with time. In this module, we only consider time-homogeneous Markov chains.
Clearly Example 8.1.3 is a time-homogeneous Markov chain.
Consider a time-homogeneous Markov chain $\{X_n\}$. The transition probabilities are the values $p_{i,j} = P(X_{n+1} = j \mid X_n = i)$ for states $i, j$.
8.3 Transition Matrix
The transition probabilities introduced in Definition 8.2.4 play a key role in the analysis of Markov chains. As an efficient way of storing them, we introduce the transition probability matrix.
The matrix $P$, where the entry in the $i$th row and $j$th column is the transition probability $p_{i,j}$, is called the transition probability matrix.
Note that any transition probability matrix $P$ satisfies:
$p_{i,j} \geq 0$ for all $i$ and $j$, or equivalently every matrix entry is non-negative;
$\sum_{j} p_{i,j} = 1$ for all $i$, or equivalently the entries in each row of the matrix sum to $1$.
Any matrix that satisfies both of these conditions is called a stochastic matrix.
What is the transition probability matrix of the stochastic process in Example 8.1.3?
Allowing the index 1 to correspond to Latte Da and the index 2 to correspond to Deja Brew, from the description of the problem we have
What is the transition probability matrix of the stochastic process in Example 8.1.2?
Suppose that $X_n = i$ for some positive integer $i$. Then since the fair die has an equal chance of landing on each of $1, 2, 3, 4, 5, 6$, there is a $\tfrac{1}{6}$ chance that $X_{n+1}$ is equal to each of $i+1$, $i+2$, $i+3$, $i+4$, $i+5$, $i+6$. Specifically $p_{i,j} = \tfrac{1}{6}$ where $j \in \{i+1, i+2, \ldots, i+6\}$, and $p_{i,j} = 0$ otherwise. Therefore the transition probability matrix is given by
$$P = \begin{pmatrix} 0 & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & 0 & 0 & \cdots \\ 0 & 0 & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & 0 & \cdots \\ 0 & 0 & 0 & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \cdots \\ \vdots & \vdots & \vdots & & & & & & & \ddots \end{pmatrix}$$
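Because the state space is infinite, this matrix can only be stored approximately on a computer. The sketch below (again assuming a fair six-sided die) builds the matrix truncated to cumulative totals $1, \ldots, N$; rows near the truncation boundary no longer sum to one, which is purely an artefact of cutting the matrix off.

```python
import numpy as np

N = 20                       # truncate the (infinite) state space to totals 1, ..., N
P = np.zeros((N, N))
for i in range(1, N + 1):    # from total i, each of i+1, ..., i+6 has probability 1/6
    for step in range(1, 7):
        if i + step <= N:
            P[i - 1, i + step - 1] = 1 / 6

print(P.sum(axis=1)[:5])     # [1. 1. 1. 1. 1.] -- rows far from the boundary are stochastic
```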
This notion of transition probabilities can be extended from one-step transitions to $n$-step transitions. Specifically, set $p^{(n)}_{i,j} = P(X_{m+n} = j \mid X_m = i)$. Note that due to the Markov chains we are considering being time-homogeneous, we have $P(X_{m+n} = j \mid X_m = i) = P(X_n = j \mid X_0 = i)$ for every $m$. This is to say, only the number of steps between our known state and the future event we wish to find the probability of matters, not the time at which this is happening.
The matrix $P^{(n)}$, where the entry in the $i$th row and $j$th column is the $n$-step transition probability $p^{(n)}_{i,j}$, is called the $n$-step transition probability matrix.
What is the -step transition probability matrix of the stochastic process in Example 8.1.3?
Again allowing the index 1 to correspond to Latte Da and the index 2 to correspond to Deja Brew, from the description of the problem we can calculate
The following theorem gives us an easy way to calculate the $n$-step transition probability matrix.
Chapman-Kolmogorov Equations: The transition probability matrix $P$ and the $n$-step transition probability matrix $P^{(n)}$ satisfy the equation $P^{(n)} = P^{n}$, that is, the $n$-step transition probability matrix is the $n$th matrix power of $P$.
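A quick numerical sanity check of the Chapman-Kolmogorov equations, using an arbitrary made-up 3-state stochastic matrix rather than any example from these notes: summing over the intermediate state by hand gives the same 2-step matrix as the matrix power.

```python
import numpy as np

# An arbitrary 3-state stochastic matrix (rows sum to 1), used only for illustration.
P = np.array([
    [0.2, 0.5, 0.3],
    [0.4, 0.1, 0.5],
    [0.6, 0.0, 0.4],
])

# 2-step probabilities obtained directly by conditioning on the intermediate state.
two_step_direct = np.array([[sum(P[i, k] * P[k, j] for k in range(3))
                             for j in range(3)] for i in range(3)])

print(np.allclose(two_step_direct, np.linalg.matrix_power(P, 2)))   # True
```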
What is the -step transition probability matrix of the stochastic process in Example 8.1.2?
By the Chapman-Kolmogorov equations, the question is equivalent to calculating the corresponding matrix power of $P$, where $P$ is the transition probability matrix calculated in Example 8.3.3. Specifically
What is the -step transition probability matrix of the stochastic process in Example 8.1.3?
By the Chapman-Kolmogorov equations, the question is equivalent to calculating the corresponding matrix power of $P$, where $P$ is the transition probability matrix calculated in Example 8.3.2. Specifically
Approximate the probability that Mary Berry goes to Deja Brew on some trip infinitely far in the future.
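One way to approach this numerically is to raise the transition matrix to a large power and read off a row: for this chain every row of $P^n$ settles down to the same limiting distribution. The probabilities of Example 8.1.3 are not reproduced above, so the matrix below uses placeholder values purely to illustrate the method.

```python
import numpy as np

# Placeholder transition matrix (NOT the values from Example 8.1.3);
# index 0 = Latte Da, index 1 = Deja Brew.
P = np.array([
    [0.7, 0.3],
    [0.6, 0.4],
])

# For large n the rows of P^n are (almost) identical; the second entry of either row
# then approximates the long-run probability of a Deja Brew visit.
print(np.linalg.matrix_power(P, 50))
```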
8.4 Hitting Probabilities and Hitting Times
Given a Markov chain and a current state, one can ask about the probabilities of ever achieving other states.
Given two distinct states $i$ and $j$ of a Markov chain, the hitting probability, denoted $h_{i,j}$, is the probability of ever reaching state $j$ in the future given that the current state is $i$. Mathematically,
$$h_{i,j} = P(X_n = j \text{ for some } n \geq 1 \mid X_0 = i).$$
Often the best way to calculate $h_{i,j}$ is to split the possible sample paths into cases based on the first step. Specifically,
$$h_{i,j} = \sum_{k} P(X_1 = k \text{ and } X_n = j \text{ for some } n \geq 1 \mid X_0 = i).$$
The right-hand side of this equation can then be conditioned using the general probability identity $P(A \cap B) = P(A \mid B)\,P(B)$. This is fully illustrated by the following example:
A company has a new website which consists of 4 pages: a Front Page which links to all other pages, an About Us page and a Contact Us page which both link to each other and back to the Front Page, and an Our Staff page which has no links on it. A prospective customer starts at a random page, and chooses from the available links on that page with equal probability to decide which page to visit next. At all future times they repeat this procedure. Thinking of the above process as a Markov chain, and in doing so labeling the pages as $1, 2, 3, 4$ according to the order they are listed above, calculate the hitting probabilities , , , .
The Markov chain can be described diagrammatically:

The transition matrix is therefore
$$P = \begin{pmatrix} 0 & \tfrac{1}{3} & \tfrac{1}{3} & \tfrac{1}{3} \\ \tfrac{1}{2} & 0 & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$
where the final row records that a customer who reaches the Our Staff page has no links to follow and so remains there.
Calculating the hitting probability using the recommended method:
So we have an expression for in terms of . Let’s calculate :
Substituting the second equation into the first and solving gives the required hitting probability. By the symmetry of the structure of the website with regard to the About Us and Contact Us pages, the corresponding hitting probability with these two pages interchanged takes the same value.
As a side note to Example 8.4.2, note that once one reaches the Our Staff page there is no hyperlink away from this page. Therefore $h_{4,j} = 0$ for every other state $j$.
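The same first-step equations can also be solved numerically. The sketch below uses the four-page transition matrix written out above (with zero-based state indices in the code) and iterates the first-step equations from zero; the iterates converge to the hitting probabilities into a chosen target page.

```python
import numpy as np

# Website chain: 0 = Front Page, 1 = About Us, 2 = Contact Us, 3 = Our Staff.
P = np.array([
    [0,   1/3, 1/3, 1/3],
    [1/2, 0,   1/2, 0  ],
    [1/2, 1/2, 0,   0  ],
    [0,   0,   0,   1  ],
])

def hitting_probabilities(P, target, n_iter=500):
    """Iterate h_i <- p_{i,target} + sum_{k != target} p_{i,k} h_k starting from h = 0.
    The entry for i = target is the probability of ever *returning* to the target."""
    h = np.zeros(P.shape[0])
    for _ in range(n_iter):
        reach = np.where(np.arange(P.shape[0]) == target, 1.0, h)
        h = P @ reach
    return h

print(hitting_probabilities(P, target=1).round(4))   # reach About Us: [0.6 0.7 0.8 0. ]
```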
Calculate , and in Example 8.4.2.
Hitting probabilities can be decomposed into a sum of probabilities that the transition to the target state takes place for the first time in exactly $n$ steps.
Given two distinct states $i$ and $j$ of a Markov chain, define $f^{(n)}_{i,j}$ to be the probability of reaching state $j$ for the first time in exactly $n$ steps given that the current state is $i$. Mathematically,
$$f^{(n)}_{i,j} = P(X_n = j \text{ and } X_m \neq j \text{ for } 1 \leq m \leq n-1 \mid X_0 = i).$$
What is the difference between $f^{(n)}_{i,j}$ and $p^{(n)}_{i,j}$?
Definition 8.4.1 and Definition 8.4.3 collectively lead to the identity
$$h_{i,j} = \sum_{n=1}^{\infty} f^{(n)}_{i,j}.$$
Consider again the company website from Example 8.4.2. Starting on the Front Page, I begin to browse the website by clicking on hyperlinks. What is the probability that I return to the Front Page for the first time after $n$ movements, where $n \geq 2$?
Mathematically the question is asking us to calculate $f^{(n)}_{1,1}$. To do so, we will look at all paths starting at state $1$ which first return to state $1$ after $n$ time steps. Since there is no link to leave the Our Staff page, that is, there is no way for the Markov chain to leave state $4$, such a path will not visit this page of the website. Furthermore the path only visits state $1$, the Front Page, at the beginning and the end; all other intermediary states will necessarily alternate between states $2$ and $3$. Such a path therefore moves from the Front Page to either About Us or Contact Us on the first step (probability $\tfrac{2}{3}$), alternates between About Us and Contact Us for each of the following $n-2$ steps (probability $\tfrac{1}{2}$ per step), and finally returns to the Front Page (probability $\tfrac{1}{2}$). Therefore
$$f^{(n)}_{1,1} = \tfrac{2}{3}\left(\tfrac{1}{2}\right)^{n-1} \quad \text{for } n \geq 2,$$
while $f^{(1)}_{1,1} = 0$ since the Front Page does not link to itself.
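This conditioning argument corresponds to the recursion $f^{(n)}_{i,j} = \sum_{k \neq j} p_{i,k}\, f^{(n-1)}_{k,j}$, which is straightforward to implement. The sketch below computes the first-passage probabilities for the four-page website chain written out above and checks them against the closed form $\tfrac{2}{3}\left(\tfrac{1}{2}\right)^{n-1}$.

```python
import numpy as np

# Website chain: 0 = Front Page, 1 = About Us, 2 = Contact Us, 3 = Our Staff.
P = np.array([
    [0,   1/3, 1/3, 1/3],
    [1/2, 0,   1/2, 0  ],
    [1/2, 1/2, 0,   0  ],
    [0,   0,   0,   1  ],
])

def first_passage(P, target, n_max):
    """f[n][i] = P(first visit to `target` happens at step n | chain starts at i)."""
    n_states = P.shape[0]
    f = np.zeros((n_max + 1, n_states))
    f[1] = P[:, target]
    mask = np.where(np.arange(n_states) == target, 0.0, 1.0)   # forbid earlier visits to target
    for n in range(2, n_max + 1):
        f[n] = P @ (mask * f[n - 1])
    return f

f = first_passage(P, target=0, n_max=6)
for n in range(2, 7):
    print(n, round(f[n][0], 5), round((2 / 3) * (1 / 2) ** (n - 1), 5))   # the two columns agree
```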
The concept of hitting a state in exactly $n$ steps leads to consideration of the random variable $T_{i,j}$, defined formally below, which governs when the process first hits state $j$ starting from state $i$.
The hitting time, denoted $T_{i,j}$, is the random variable which returns the quantity $\min\{n \geq 1 : X_n = j\}$ for a randomly chosen sample path of the Markov chain started from state $i$.
If we never reach state $j$ from state $i$ in the sample path then $T_{i,j} = \infty$.
Note that $P(T_{i,j} = n) = f^{(n)}_{i,j}$, and $P(T_{i,j} < \infty) = h_{i,j}$.
Consider again the Markov chain of Example 8.4.2 based on traversing a company's website. Calculate .
We calculate
In the calculation of , why did we ignore the path where ?
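Which expected hitting time the exercise above asks for is not specified here, so as an illustration the sketch below computes the expected number of clicks needed to first reach the Front Page from the About Us and Contact Us pages of the four-page chain, using first-step analysis; both expectations work out to 2.

```python
import numpy as np

# Website chain: 0 = Front Page, 1 = About Us, 2 = Contact Us, 3 = Our Staff.
P = np.array([
    [0,   1/3, 1/3, 1/3],
    [1/2, 0,   1/2, 0  ],
    [1/2, 1/2, 0,   0  ],
    [0,   0,   0,   1  ],
])

# First-step analysis: e_i = 1 + sum_{k != 0} p_{i,k} e_k, solved over states {1, 2}.
# These two states cannot reach the absorbing Our Staff page, so both expectations are finite.
states = [1, 2]
Q = P[np.ix_(states, states)]                       # transitions that avoid the Front Page
e = np.linalg.solve(np.eye(len(states)) - Q, np.ones(len(states)))
print(e)                                            # [2. 2.]
```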
Renewal Equation: Given a Markov chain with $n$-step transition probabilities $p^{(n)}_{i,j}$ and first-passage probabilities $f^{(n)}_{i,j}$, one has
$$p^{(n)}_{i,j} = \sum_{m=1}^{n} f^{(m)}_{i,j}\, p^{(n-m)}_{j,j}, \qquad \text{where } p^{(0)}_{j,j} = 1.$$
Informally, the renewal equation states that the probability of going from state $i$ to state $j$ over $n$ steps can be conditioned on the time at which we first visit state $j$.
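The renewal equation is easy to verify numerically. The sketch below checks it on the four-page website chain from Example 8.4.2 for one choice of states and number of steps; the two sides agree to machine precision.

```python
import numpy as np

# Website chain: 0 = Front Page, 1 = About Us, 2 = Contact Us, 3 = Our Staff.
P = np.array([
    [0,   1/3, 1/3, 1/3],
    [1/2, 0,   1/2, 0  ],
    [1/2, 1/2, 0,   0  ],
    [0,   0,   0,   1  ],
])

i, j, n = 1, 0, 6                              # check p^{(6)} from About Us to the Front Page

# First-passage probabilities f^{(m)}_{i,j} for m = 1, ..., n.
n_states = P.shape[0]
f = np.zeros(n + 1)
g = P[:, j].copy()                             # g[k] = f^{(1)}_{k,j}
f[1] = g[i]
for m in range(2, n + 1):
    g = P @ np.where(np.arange(n_states) == j, 0.0, g)
    f[m] = g[i]

# Return probabilities p^{(r)}_{j,j}, with p^{(0)}_{j,j} = 1.
p_jj = [np.linalg.matrix_power(P, r)[j, j] for r in range(n)]

lhs = np.linalg.matrix_power(P, n)[i, j]
rhs = sum(f[m] * p_jj[n - m] for m in range(1, n + 1))
print(lhs, rhs)                                # the two values agree
```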