Chapter 1 Basic Definitions of Stochastic Process, Kolmogorov Consistency Theorem (Lecture on 01/05/2021)

Motivation: Why are we studying stochastic processes?

One of the classical statistical inference problems is the following: we are given a response variable \(Y\) and some covariates \(\mathbf{x}\), and we are interested in the functional relationship between \(Y\) and \(\mathbf{x}\), usually written as \(Y=f(\mathbf{x})+\epsilon\). Traditional Bayesian inference would assume some parametric form of \(f(\mathbf{x})\). Denoting the parameters in \(f(\mathbf{x})\) as \(\boldsymbol{\beta}\) and placing a prior on \(\boldsymbol{\beta}\), we can use Bayesian inference to obtain the updated posterior distribution of \(\boldsymbol{\beta}\) based on the data. But what if the function \(f(\mathbf{x})\) is unknown and has some complex form? Then we need stochastic processes. A stochastic process is a distribution on an infinite-dimensional space of functions. We can assume the unknown function \(f\) has some well-studied stochastic process as its prior, then use the data to obtain the posterior distribution of \(f\).

Definition 1.1 (Stochastic Process) Consider a probability space \((\Omega,\mathcal{F},P)\) where \(\Omega\) is the sample space of the experiment, as well as an index set \(T\) and a state space \(S\). A stochastic process is a collection \(X=\{X(\omega,t):\omega\in\Omega,t\in T\}\) such that

  1. For any \(n\) and any set of index points \(t_1,\cdots,t_n\in T\), \((X_{t_1},\cdots,X_{t_n})\) is an \(n\)-dimensional random variable (random vector) defined on the probability space \((\Omega,\mathcal{F},P)\) and taking values in \(S^n=S\times\cdots\times S\). Hence, for each fixed \(t\), \(X_t(\cdot)=X(\cdot,t): (\Omega,\mathcal{F},P)\to S\) is a random variable.

  2. For any fixed \(\omega\in\Omega\), \(X_{\omega}(\cdot)=X(\omega,\cdot): T\to S\) is a function defined on \(T\) and taking values in \(S\), referred to as a sample path of \(X\).

For a stochastic process, \(T\) is called the index set and \(S\), the state space, is the range of values taken by the stochastic process.
Understanding both parts of the definition of a stochastic process is important. Viewing the process as a function of \(\omega\) for fixed \(t\), we understand the distribution of the stochastic process. Its behavior as a function of \(t\) for fixed \(\omega\) tells us how smooth the sample paths of the stochastic process are.
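
As a quick illustration of these two views, here is a minimal Python sketch (not from the lecture; the toy process \(X(t,\omega)=Z(\omega)\cos t\) with \(Z\sim N(0,1)\) is an assumed example): fixing \(t\) and varying \(\omega\) gives a random variable, while fixing \(\omega\) and varying \(t\) gives a sample path.

```python
# Toy illustration of Definition 1.1 (assumed example, not from the lecture):
# the process X(t, omega) = Z(omega) * cos(t), with Z ~ N(0, 1) playing the
# role of the randomness omega.
import numpy as np

rng = np.random.default_rng(0)

def X(t, z):
    """Value of the process at time t for an outcome omega with Z(omega) = z."""
    return z * np.cos(t)

# View 1: fix t, vary omega  ->  X_t is a random variable.
t_fixed = 1.0
z_draws = rng.standard_normal(10_000)            # many outcomes omega
samples_at_t = X(t_fixed, z_draws)
print("E[X_t] ~", samples_at_t.mean(), " Var(X_t) ~", samples_at_t.var())

# View 2: fix omega, vary t  ->  t -> X(t, omega) is a sample path.
z_fixed = rng.standard_normal()                  # one outcome omega
t_grid = np.linspace(0.0, 2 * np.pi, 200)
path = X(t_grid, z_fixed)                        # a deterministic function of t
print("sample path values at the first 5 grid points:", path[:5])
```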

Types of stochastic processes: There are four types of stochastic processes.

  • When \(T\) is a discrete space and \(S\) is continuous, the stochastic process is called a discrete time, continuous state process.

  • When \(T\) is a discrete space and \(S\) is discrete as well, the stochastic process is called a discrete time, discrete state process.

  • When \(T\) is a continuous space and \(S\) is discrete, the stochastic process is called a continuous time, discrete state process.

  • When both \(T\) and \(S\) are continuous, the stochastic process is called a continuous time, continuous state process.

Example 1.1 Specify the type of each of the following stochastic processes:

  1. Let \(Z\) be a random variable such that \(P(Z=1)=P(Z=-1)=\frac{1}{2}\). Define the stochastic process \(X=\{X(t,\omega):t\geq 0,\omega\in\Omega\}\) such that \(X(t,\omega)=Z(\omega)\). This is a continuous time, discrete state stochastic process, where \(S=\{-1,1\}\) and \(T=\mathbb{R}^+\).

  2. Let \(X_1,X_2,\cdots\) be i.i.d. random variables with \(P(X_n=1)=P(X_n=-1)=\frac{1}{2}\) and let \(H_n=X_1+\cdots+X_n\) for \(n\in\mathbb{N}\). Define the stochastic process \(H=\{H(n,\omega):n\in\mathbb{N},\omega\in\Omega\}\) such that \(H(n,\omega)=H_n(\omega)\); this is the simple random walk. It is a discrete time, discrete state process, where \(S=\mathbb{Z}\) and \(T=\mathbb{N}\).

  3. Let \(Z\) be a random variable such that \(Z\sim N(0,1)\). Define the stochastic process to be \(X=\{X(n,\omega):n\in\mathbb{N},\omega\in\Omega\}\) such that \(X(n,\omega)=Z(\omega)\sin(\frac{2\pi}{n})\). This is a discrete time, continuous state process, where \(S=\mathbb{R}\) and \(T=\mathbb{N}\).

  4. Consider a collection \(\{X(t,\omega):t\geq 0,\omega\in\Omega\}\) such that

    1. \(X_0(\omega)=X(0,\omega)=0\).
    2. For \(0\leq s<t<\infty\), \(X_t-X_s\sim N(0,t-s)\).
    3. For \(0\leq s<t<\infty\), \(X_t-X_s\) is independent of \(X_s\).
    4. The sample path \(t\to X_t\) is a continuous function.
This stochastic process is known as Brownian motion. It is a continuous time, continuous state process where \(S=\mathbb{R}\) and \(T=\mathbb{R}^+\).
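
To make examples 2 and 4 concrete, here is a small simulation sketch (the grid, step counts, and random seed are arbitrary choices, not part of the lecture) generating one sample path of the simple random walk and one of Brownian motion from its independent Gaussian increments.

```python
# Simulation sketch for Example 1.1, items 2 and 4 (arbitrary sizes and seed).
import numpy as np

rng = np.random.default_rng(1)

# Item 2: simple random walk H_n = X_1 + ... + X_n with P(X_i = +/-1) = 1/2.
n_steps = 1000
steps = rng.choice([-1, 1], size=n_steps)
H = np.cumsum(steps)                              # discrete time, discrete state
print("H_1000 =", H[-1])

# Item 4: Brownian motion on a grid, built from independent increments
# X_t - X_s ~ N(0, t - s).
t = np.linspace(0.0, 1.0, 1001)                   # grid on [0, 1]
dt = np.diff(t)
increments = rng.normal(loc=0.0, scale=np.sqrt(dt))
B = np.concatenate([[0.0], np.cumsum(increments)])  # enforces X_0 = 0
print("B_1 =", B[-1], "(one draw from N(0, 1))")
```
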
In studying a stochastic process, both the distributional properties (condition (1) in Definition 1.1) and the properties of the sample path (condition (2) in Definition 1.1) need to be understood. We will mostly deal with the former; however, we will discuss the latter for important stochastic processes.
Definition 1.2 (Finite Dimensional Distributions (f.d.d.s.)) In regard to condition (1) in Definition 1.1, the distribution function of the random vector \((X_{t_1},\cdots,X_{t_n})\), denoted as \(F_{t_1,\cdots,t_n}(x_1,\cdots,x_n)=P(X_{t_1}\leq x_1,\cdots,X_{t_n}\leq x_n)\) where \(t_1,\cdots,t_n\in T\), is called a finite dimensional distribution of the stochastic process; the collection of all such distribution functions, over all \(n\) and all choices of \(t_1,\cdots,t_n\), is the set of finite dimensional distributions (f.d.d.s.).
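
As a concrete instance, for Brownian motion the vector \((X_{t_1},\cdots,X_{t_n})\) is multivariate normal with mean zero and covariance \(Cov(X_{t_i},X_{t_j})=\min(t_i,t_j)\), so its f.d.d. is a multivariate normal distribution function. The sketch below evaluates one such f.d.d. numerically (it assumes scipy is available; the index and evaluation points are arbitrary).

```python
# Evaluate an f.d.d. of Brownian motion (sketch; assumes scipy is installed).
import numpy as np
from scipy.stats import multivariate_normal

t = np.array([0.5, 1.0, 2.0])                 # indexing points t_1, t_2, t_3
Sigma = np.minimum.outer(t, t)                # covariance matrix min(t_i, t_j)
x = np.array([0.0, 0.5, 1.0])                 # evaluation point (x_1, x_2, x_3)

# F_{t_1,t_2,t_3}(x_1,x_2,x_3) = P(X_{t_1} <= x_1, X_{t_2} <= x_2, X_{t_3} <= x_3)
F = multivariate_normal(mean=np.zeros(3), cov=Sigma).cdf(x)
print("F_{0.5,1,2}(0, 0.5, 1) ~", F)
```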

Can two different stochastic processes have the same f.d.d.s.? Yes!

Example 1.2 (Two stochastic processes with the same f.d.d.s.) Define \(X=\{X(t,\omega):t\in\mathbb{R},\omega\in\Omega\}\) such that \(X(t,\omega)=0\) for all \(t\in\mathbb{R}\) and \(\omega\in\Omega\), so \(X\) is identically 0. Then define \(Y=\{Y(t,\omega):t\in\mathbb{R},\omega\in\Omega\}\) as \(Y(t,\omega)=\left\{\begin{aligned} &0 & |Z(\omega)|\neq t\\ &1 & \text{otherwise} \end{aligned}\right.\) where \(Z\sim N(0,1)\).

Let \(t_1,\cdots,t_n\in T\), then we have \[\begin{equation} \begin{split} P((X_{t_1},\cdots,X_{t_n})\neq (Y_{t_1},\cdots,Y_{t_n}))&\leq P(\exists i, X_{t_i}\neq Y_{t_i})\\ &\leq \sum_{i=1}^nP(X_{t_i}\neq Y_{t_i})=\sum_{i=1}^nP(|Z|=t_i)=0 \end{split} \tag{1.1} \end{equation}\]

Therefore, \((X_{t_1},\cdots,X_{t_n})\) has the same distribution as \((Y_{t_1},\cdots,Y_{t_n})\) for any \(n\). The two stochastic processes \(X\) and \(Y\) have the same finite dimensional distributions.

However, the two stochastic processes are not identical: the second process has a discontinuous sample path (with a jump at \(t=|Z(\omega)|\)), while the first has a continuous (constant) sample path.
The finite dimensional distributions of \(X\) and \(Y\) are the same, but the infinite dimensional distributions may differ, because the union bound in (1.1) does not extend over the uncountably many indices \(t\in T\): the event \(\{\exists t\in T: X_t\neq Y_t\}=\{|Z|\in T\}\) can have positive probability (here it has probability one, since \(T=\mathbb{R}\)).
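
The following quick simulation (a sketch, not a proof; the seed and index points are arbitrary) illustrates why the f.d.d.s agree: on any finite set of index points, \(|Z|\) almost surely misses every \(t_i\), so the simulated values of \(X\) and \(Y\) coincide.

```python
# Numerical illustration of Example 1.2 (sketch; arbitrary seed and grid).
import numpy as np

rng = np.random.default_rng(2)
t_points = np.array([0.1, 0.5, 1.0, 2.5])     # a finite collection of index points

for _ in range(5):                            # a few independent outcomes omega
    Z = rng.standard_normal()
    X_vals = np.zeros_like(t_points)          # X is identically 0
    Y_vals = np.where(np.abs(Z) == t_points, 1.0, 0.0)
    assert np.array_equal(X_vals, Y_vals)     # |Z| never equals any t_i in practice
print("X and Y agreed at every sampled finite collection of index points.")
```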

One would like to define a stochastic process through its f.d.d.s., and it turns out that, under suitable consistency conditions on the f.d.d.s., a stochastic process can be constructed from (and is determined by) its finite dimensional distributions. This is known as the Kolmogorov consistency theorem.

Theorem 1.1 (Kolmogorov Consistency Theorem) Assume that for each \(n\) and for each set of indexing points \(\mathbf{t}=(t_1,\cdots,t_n)\), we are given an f.d.d. \(F_{\mathbf{t}}\). If the collection of all such \(F_{\mathbf{t}}\) satisfies

  1. \(F_{t_1,\cdots,t_n,t_{n+1}}(x_1,\cdots,x_n,x_{n+1})\to F_{t_1,\cdots,t_n}(x_1,\cdots,x_n)\) as \(x_{n+1}\to\infty\).

  2. For all \(n\), \(\mathbf{x}=(x_1,\cdots,x_n),\mathbf{t}=(t_1,\cdots,t_n)\) and any permutation \(\boldsymbol{\pi}=(\pi(1),\cdots,\pi(n))\) of \(\{1,\cdots,n\}\), \(F_{\boldsymbol{\pi}(\mathbf{t})}(\boldsymbol{\pi}(\mathbf{x}))=F_{\mathbf{t}}(\mathbf{x})\).

If these two conditions are satisfied, then there exists a stochastic process whose finite dimensional distributions are exactly the \(F_{\mathbf{t}}\). If we add the additional condition that the sample paths are right continuous, then the resulting stochastic process is unique.
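
As an illustration, the Gaussian f.d.d.s built from the Brownian-motion covariance \(C(s,t)=\min(s,t)\) satisfy condition (1): marginalizing out the last coordinate (sending \(x_{n+1}\to\infty\)) recovers the lower-dimensional f.d.d. The sketch below checks this numerically with a very large value of \(x_{n+1}\) (it assumes scipy is available; the index and evaluation points are arbitrary).

```python
# Numerical check of consistency condition (1) for Gaussian f.d.d.s (sketch).
import numpy as np
from scipy.stats import multivariate_normal

def fdd_cdf(t, x):
    """F_{t_1,...,t_n}(x_1,...,x_n) for zero-mean Gaussian f.d.d.s with kernel min(s, t)."""
    Sigma = np.minimum.outer(t, t)
    return multivariate_normal(mean=np.zeros(len(t)), cov=Sigma).cdf(x)

t_n = np.array([0.5, 1.0, 2.0])
x_n = np.array([0.2, -0.1, 0.7])

lhs = fdd_cdf(np.append(t_n, 3.0), np.append(x_n, 50.0))  # x_{n+1} effectively at infinity
rhs = fdd_cdf(t_n, x_n)
print(lhs, rhs)   # the two values agree up to numerical integration error
```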

Definition 1.3 (Useful Functions of a Stochastic Process) Using the information provided by the set of f.d.d.s., we can define several useful functions for a stochastic process.

  1. For any \(t\in T\), the mean function of \(X\) is \(\mu(t)=\mu_X(t)=E(X_t)\).

  2. For any \(t_i,t_j\in T\), the (auto)covariance function is given by \[\begin{equation} C(t_i,t_j)=Cov(X_{t_i},X_{t_j})=E(X_{t_i}X_{t_j})-\mu_X(t_i)\mu_X(t_j) \tag{1.2} \end{equation}\] and the (auto)correlation function is given by \[\begin{equation} r(t_i,t_j)=Corr(X_{t_i},X_{t_j})=\frac{Cov(X_{t_i}X_{t_j})}{\sqrt{Var(X_{t_i})Var(X_{t_j})}} \tag{1.3} \end{equation}\]
Proposition 1.1 (Property of the Covariance Function) The covariance function is a non-negative definite function, meaning that for every finite \(k\), any \(t_1,\cdots,t_k\in T\), and any real constants \(z_1,\cdots,z_k\), we have \[\begin{equation} \sum_{i=1}^k\sum_{j=1}^kz_iz_jC(t_i,t_j)\geq 0 \tag{1.4} \end{equation}\] where \(C(\cdot,\cdot)\) is the autocovariance function.
Proof. Consider the variance of \(\sum_{i=1}^kz_iX_{t_i}\). By the bilinearity of covariance and the non-negativity of variance, we have \[\begin{equation} Var\left(\sum_{i=1}^kz_iX_{t_i}\right)=\sum_{i=1}^k\sum_{j=1}^kz_iz_jC(t_i,t_j)\geq 0 \tag{1.5} \end{equation}\] Therefore, condition (1.4) must be satisfied by any covariance function.
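
A quick numerical sanity check of Proposition 1.1 (a sketch, not part of the lecture; the index points and weights below are arbitrary) using the Brownian-motion covariance \(C(s,t)=\min(s,t)\):

```python
# Check non-negative definiteness of a covariance function numerically (sketch).
import numpy as np

t = np.array([0.3, 0.7, 1.2, 2.0, 5.0])       # arbitrary index points t_1, ..., t_k
C = np.minimum.outer(t, t)                    # the matrix [C(t_i, t_j)]
print("smallest eigenvalue:", np.linalg.eigvalsh(C).min())   # >= 0

z = np.array([1.0, -2.0, 0.5, 3.0, -1.0])     # arbitrary real constants z_1, ..., z_k
print("quadratic form z' C z =", z @ C @ z)   # equals Var(sum_i z_i X_{t_i}) >= 0
```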

If \(C(t_i,t_j)=0\) for all \(t_i\neq t_j\), then the stochastic process is called a white noise process. For example, if \(\epsilon_1,\epsilon_2,\cdots\stackrel{i.i.d.}{\sim} N(0,\sigma^2)\), then \(Cov(\epsilon_i,\epsilon_j)=0\) for all \(i\neq j\). If we define the stochastic process \(X=\{X(n,\omega):n\in \mathbb{N},\omega\in\Omega\}\) such that \(X(n,\omega)=\epsilon_n(\omega)\), then \(X\) is a white noise process.
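
A brief simulation sketch (the value of \(\sigma\) and the sample size are arbitrary choices) showing that i.i.d. \(N(0,\sigma^2)\) draws behave as white noise, with sample autocovariance near zero at every nonzero lag:

```python
# White noise: sample autocovariance of i.i.d. Gaussians is ~0 off lag 0 (sketch).
import numpy as np

rng = np.random.default_rng(4)
sigma = 2.0
eps = rng.normal(0.0, sigma, size=100_000)

def sample_autocov(x, lag):
    """Sample autocovariance between X_n and X_{n+lag} (lag >= 1)."""
    return np.mean((x[:-lag] - x.mean()) * (x[lag:] - x.mean()))

print("lag 0 (variance):", eps.var())                              # ~ sigma^2 = 4
print("lags 1..3:", [round(sample_autocov(eps, k), 4) for k in range(1, 4)])
```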