# Chapter 1 Basic Definitions of Stochastic Process, Kolmogorov Consistency Theorem (Lecture on 01/05/2021)

**Motivation:** Why do we study stochastic processes?

A classical statistical inference problem is the following: we are given a response variable \(Y\) and some covariates \(\mathbf{x}\), and we are interested in the functional relationship between \(Y\) and \(\mathbf{x}\), usually written as \(Y=f(\mathbf{x})+\epsilon\). Traditional Bayesian inference would assume some parametric form for \(f(\mathbf{x})\). Denoting the parameters in \(f(\mathbf{x})\) by \(\boldsymbol{\beta}\) and placing a prior on \(\boldsymbol{\beta}\), we can use Bayesian inference to obtain the posterior distribution of \(\boldsymbol{\beta}\) given the data. But what if the function \(f(\mathbf{x})\) is unknown and has some complex form? Then we need stochastic processes. **A stochastic process is a distribution on an infinite dimensional space of functions.** We can assign the unknown function \(f\) a well-studied stochastic process as its prior, then use the data to obtain the posterior distribution of \(f\).

**Definition 1.1 (Stochastic Process)** Consider a probability space \((\Omega,\mathcal{F},P)\) where \(\Omega\) is the sample space of the experiment, as well as an index set \(T\) and a state space \(S\). A stochastic process is a collection \(X=\{X(\omega,t):\omega\in\Omega,t\in T\}\) such that

1. For any \(n\) and any set of index points \(t_1,\cdots,t_n\in T\), \((X_{t_1},\cdots,X_{t_n})\) is an \(n\)-dimensional random variable (random vector) defined on the probability space \((\Omega,\mathcal{F},P)\) and taking values in \(S^n=S\times\cdots\times S\). Hence, for each fixed \(t\), \(X_t(\cdot)=X(\cdot,t): (\Omega,\mathcal{F},P)\to S\) is a random variable.

2. For any fixed \(\omega\in\Omega\), \(X_{\omega}(\cdot)=X(\omega,\cdot): T\to S\) is a function defined on \(T\) and taking values in \(S\), referred to as a sample path of \(X\).

**Types of stochastic process:** There are four types of stochastic processes.

- When \(T\) is discrete and \(S\) is continuous, the stochastic process is called a discrete time, continuous state process.

- When \(T\) is discrete and \(S\) is discrete as well, the stochastic process is called a discrete time, discrete state process.

- When \(T\) is continuous and \(S\) is discrete, the stochastic process is called a continuous time, discrete state process.

- When both \(T\) and \(S\) are continuous, the stochastic process is called a continuous time, continuous state process.

**Example 1.1 **Specify types of the stochastic process:

Let \(Z\) be a random variable such that \(P(Z=1)=P(Z=-1)=\frac{1}{2}\). Define the stochastic process \(X=\{X(t,\omega):t\geq 0,\omega\in\Omega\}\) such that \(X(t,\omega)=Z(\omega)\). This is a continuous time, discrete state stochastic process, where \(S=\{-1,1\}\) and \(T=\mathbb{R}^+\).

Let \(X_1,X_2,\cdots\) be i.i.d. with \(P(X_n=1)=P(X_n=-1)=\frac{1}{2}\) and let \(H_n=X_1+\cdots+X_n\) for \(n\in\mathbb{N}\) (a simple random walk). Define the stochastic process \(H=\{H(n,\omega):n\in\mathbb{N},\omega\in\Omega\}\) such that \(H(n,\omega)=H_n(\omega)\). This is a discrete time, discrete state process, where \(S=\mathbb{Z}\) and \(T=\mathbb{N}\).

Let \(Z\) be a random variable such that \(Z\sim N(0,1)\). Define the stochastic process to be \(X=\{X(n,\omega):n\in\mathbb{N},\omega\in\Omega\}\) such that \(X(n,\omega)=Z(\omega)\sin(\frac{2\pi}{n})\). This is a discrete time, continuous state process, where \(S=\mathbb{R}\) and \(T=\mathbb{N}\).
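As a quick illustration, the three examples above can be simulated directly. The following minimal sketch uses numpy; the library choice, seed, and variable names are my own, not part of the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Example (1): X(t, w) = Z(w) with P(Z = 1) = P(Z = -1) = 1/2.
# Each sample path is constant in t, so one draw determines the whole path.
z = rng.choice([-1, 1])
path1 = lambda t: z  # continuous time, discrete state

# Example (2): simple random walk H_n = X_1 + ... + X_n.
steps = rng.choice([-1, 1], size=20)
path2 = np.cumsum(steps)  # discrete time, discrete state

# Example (3): X(n, w) = Z(w) * sin(2*pi / n) with Z ~ N(0, 1).
z3 = rng.standard_normal()
n = np.arange(1, 21)
path3 = z3 * np.sin(2 * np.pi / n)  # discrete time, continuous state

print(path1(3.7), path2[-1], path3[0])
```

Note how each process is one random draw away from a deterministic function of \(t\): fixing the draw (\(\omega\)) gives a sample path, while fixing a time gives a random variable.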

Consider a collection \(\{X(t,\omega):t\geq 0,\omega\in\Omega\}\) such that

- \(X_0(\omega)=X(0,\omega)=0\).

- For \(0\leq s<t<\infty\), \(X_t-X_s\sim N(0,t-s)\).

- For \(0\leq s<t<\infty\), \(X_t-X_s\) is independent of \(X_s\).

- The sample path \(t\to X_t\) is a continuous function.

This process is called **Brownian motion**. It is a continuous time, continuous state process, where \(S=\mathbb{R}\) and \(T=\mathbb{R}^+\).
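The defining properties suggest a direct way to simulate Brownian motion on a grid: sample independent Gaussian increments with variances equal to the grid spacings and take cumulative sums. A minimal sketch (numpy, the grid size, and the seed are my own choices):

```python
import numpy as np

rng = np.random.default_rng(1)

# One discretized Brownian path on [0, 1]: X_0 = 0 and, over a grid
# 0 = t_0 < t_1 < ... < t_m, the increments X_{t_k} - X_{t_{k-1}} are
# independent N(0, t_k - t_{k-1}).
m = 1000
t = np.linspace(0.0, 1.0, m + 1)
increments = rng.normal(0.0, np.sqrt(np.diff(t)))
X = np.concatenate([[0.0], np.cumsum(increments)])

# Check the defining property X_1 - X_0 ~ N(0, 1) across many paths:
paths = np.cumsum(rng.normal(0.0, np.sqrt(1.0 / m), size=(5000, m)), axis=1)
print(paths[:, -1].var())  # should be close to Var(X_1) = 1
```

The cumulative sum enforces both the independent-increments property and \(X_0=0\); continuity of the sample path holds only in the limit of a finer and finer grid.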

**Definition 1.2 (Finite Dimensional Distributions (f.d.d.s.))** In regard to condition (1) in Definition 1.1, the distribution function of the random vector \((X_{t_1},\cdots,X_{t_n})\) is \(F_{t_1,\cdots,t_n}(x_1,\cdots,x_n)=P(X_{t_1}\leq x_1,\cdots,X_{t_n}\leq x_n)\), where \(t_1,\cdots,t_n\in T\). The collection of all such distribution functions is called the finite dimensional distributions of the stochastic process.

**Can two stochastic processes have the same f.d.d.s.? Yes!**

**Example 1.2 (Two stochastic processes with the same f.d.d.s.)** Define \(X=\{X(t,\omega):t\in\mathbb{R},\omega\in\Omega\}\) such that \(X(t,\omega)=0\) for all \(t\in\mathbb{R}\) and \(\omega\in\Omega\), so \(X\) is identically 0. Then define \(Y=\{Y(t,\omega):t\in\mathbb{R},\omega\in\Omega\}\) as \(Y(t,\omega)=\left\{\begin{aligned} &0 & |Z(\omega)|\neq t\\ &1 & o.w. \end{aligned}\right.\) where \(Z\sim N(0,1)\).

Let \(t_1,\cdots,t_n\in T\), then we have \[\begin{equation} \begin{split} P((X_{t_1},\cdots,X_{t_n})\neq (Y_{t_1},\cdots,Y_{t_n}))&\leq P(\exists i, X_{t_i}\neq Y_{t_i})\\ &\leq \sum_{i=1}^nP(X_{t_i}\neq Y_{t_i})=\sum_{i=1}^nP(|Z|=t_i)=0 \end{split} \tag{1.1} \end{equation}\]

Therefore, \((X_{t_1},\cdots,X_{t_n})\) has the same distribution as \((Y_{t_1},\cdots,Y_{t_n})\) for any \(n\). The two stochastic processes \(X\) and \(Y\) have the same finite dimensional distributions.
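The argument in (1.1) can be illustrated numerically: sampling \(Y\) at any fixed, finite set of times essentially never hits the probability-zero event \(|Z|=t_i\). A small simulation sketch (numpy and the chosen times are assumptions of mine):

```python
import numpy as np

rng = np.random.default_rng(2)

# X(t) = 0 for all t; Y(t) = 1 if |Z| = t and 0 otherwise, with Z ~ N(0, 1).
# For fixed times t_1, ..., t_n, each event {|Z| = t_i} has probability 0,
# so sampled vectors (Y_{t_1}, ..., Y_{t_n}) agree with (X_{t_1}, ..., X_{t_n}).
times = np.array([0.5, 1.0, 2.0])
Z = rng.standard_normal(10000)                 # 10000 independent draws of Z
Y = (np.abs(Z)[:, None] == times[None, :]).astype(float)
X = np.zeros_like(Y)

print(np.array_equal(X, Y))  # equality fails only on a probability-zero event
```

The f.d.d.s. cannot see the difference between the processes because it lives on a measure-zero set of times for each \(\omega\); only a path-level property such as continuity distinguishes them.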

However, the two stochastic processes are not identical: the second has a discontinuous sample path, while the first has a continuous sample path. One would like to define a stochastic process through its f.d.d.s., and it turns out that, by adding suitable consistency restrictions on the f.d.d.s., a stochastic process can be uniquely extended from its finite dimensional distributions. This is known as the Kolmogorov consistency theorem.

**Theorem 1.1 (Kolmogorov Consistency Theorem)** Assume that for each \(n\) and for each set of indexing points \(\mathbf{t}=(t_1,\cdots,t_n)\), we define a f.d.d. \(F_{\mathbf{t}}\). Suppose the collection of all such \(F_{\mathbf{t}}\) satisfies:

1. (Consistency under marginalization) \(F_{t_1,\cdots,t_n,t_{n+1}}(x_1,\cdots,x_n,x_{n+1})\to F_{t_1,\cdots,t_n}(x_1,\cdots,x_n)\) as \(x_{n+1}\to\infty\).

2. (Symmetry under permutation) For all \(n\), \(\mathbf{x}=(x_1,\cdots,x_n)\), \(\mathbf{t}=(t_1,\cdots,t_n)\) and any permutation \(\boldsymbol{\pi}=(\pi(1),\cdots,\pi(n))\) of \(\{1,\cdots,n\}\), \(F_{\boldsymbol{\pi}(\mathbf{t})}(\boldsymbol{\pi}(\mathbf{x}))=F_{\mathbf{t}}(\mathbf{x})\).

If these two conditions are satisfied, then there exists a stochastic process with these finite dimensional distributions. If we add the additional condition that the sample paths are right continuous, then the resulting stochastic process is **unique**.
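The two conditions can be checked empirically for the f.d.d.s. of a concrete process. The sketch below estimates joint distribution functions of a simple random walk by Monte Carlo; the helper `F` and all parameters are illustrative assumptions of mine, not part of the theorem:

```python
import numpy as np

rng = np.random.default_rng(3)

# Empirical f.d.d.s. of a simple random walk H_n, estimated from many paths.
paths = np.cumsum(rng.choice([-1, 1], size=(20000, 10)), axis=1)

def F(ts, xs):
    """Empirical joint CDF: the fraction of paths with H_{t_i} <= x_i for all i."""
    inside = np.all(paths[:, np.array(ts) - 1] <= np.array(xs), axis=1)
    return inside.mean()

# Condition 1 (marginalization): letting x_{n+1} -> infinity drops a coordinate.
print(F([2, 5], [1, np.inf]), F([2], [1]))

# Condition 2 (permutation symmetry): permuting times and arguments together.
print(F([2, 5], [1, 3]), F([5, 2], [3, 1]))
```

Because every `F` here is computed from one common family of sample paths, both identities hold exactly for the empirical CDFs, mirroring how f.d.d.s. derived from an actual process automatically satisfy the theorem's hypotheses.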

**Definition 1.3 (Useful Functions of Stochastic Process) **Using the information provided by the set of f.d.d.s., we can define several useful functions for a stochastic process.

- For any \(t\in T\), the mean function of \(X\) is \(\mu(t)=\mu_X(t)=E(X_t)\).

- For any \(t_i,t_j\in T\), the (auto)covariance function is given by \[\begin{equation} C(t_i,t_j)=Cov(X_{t_i},X_{t_j})=E(X_{t_i}X_{t_j})-\mu_X(t_i)\mu_X(t_j) \tag{1.2} \end{equation}\] and the (auto)correlation function is given by \[\begin{equation} r(t_i,t_j)=Corr(X_{t_i},X_{t_j})=\frac{Cov(X_{t_i},X_{t_j})}{\sqrt{Var(X_{t_i})Var(X_{t_j})}} \tag{1.3} \end{equation}\]
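For Brownian motion these functions are known in closed form: \(\mu(t)=0\) and \(C(s,t)=\min(s,t)\). A Monte Carlo sketch can recover them empirically (numpy, the grid, and the sample sizes are my choices):

```python
import numpy as np

rng = np.random.default_rng(4)

# Estimate mu(t) and C(s, t) for Brownian motion on a grid; theory gives
# mu(t) = 0 and C(s, t) = min(s, t).
m, n_paths = 50, 100000
t = np.linspace(0.0, 1.0, m + 1)[1:]            # grid points t_1 < ... < t_m
inc = rng.normal(0.0, np.sqrt(1.0 / m), size=(n_paths, m))
X = np.cumsum(inc, axis=1)                      # X[:, k] is the path at time t[k]

mu_hat = X.mean(axis=0)                         # should be close to 0 everywhere
C_hat = np.cov(X, rowvar=False)                 # C_hat[i, j] ~ min(t[i], t[j])
print(mu_hat[-1], C_hat[9, 29], min(t[9], t[29]))
```

The estimate \(C(t_i,t_j)\approx\min(t_i,t_j)\) reflects that, for \(s<t\), \(Cov(X_s,X_t)=Cov(X_s,X_s)+Cov(X_s,X_t-X_s)=s\) by the independent-increments property.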

**Proposition 1.1 (Property of Covariance Function)** The covariance function is a non-negative definite function. That is, for all (finite) \(k\), for any \(t_1,\cdots,t_k\in T\) and real constants \(z_1,\cdots,z_k\), we have \[\begin{equation} \sum_{i=1}^k\sum_{j=1}^kz_iz_jC(t_i,t_j)\geq 0 \tag{1.4} \end{equation}\] where \(C(\cdot,\cdot)\) is the autocovariance function.

*Proof.* Consider the variance of \(\sum_{i=1}^kz_iX_{t_i}\). We have \[\begin{equation} Var\left(\sum_{i=1}^kz_iX_{t_i}\right)=\sum_{i=1}^k\sum_{j=1}^kz_iz_jC(t_i,t_j)\geq 0 \tag{1.5} \end{equation}\] Therefore, condition (1.4) must be satisfied by covariance functions.
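Proposition 1.1 can also be checked numerically: taking the Brownian-motion covariance \(C(t_i,t_j)=\min(t_i,t_j)\), the quadratic form in (1.4) should be non-negative for arbitrary points and weights. A sketch (the use of numpy and random test points is my own):

```python
import numpy as np

rng = np.random.default_rng(5)

# Brownian-motion covariance C(s, t) = min(s, t) on arbitrary points:
# the quadratic form sum_i sum_j z_i z_j C(t_i, t_j) is never negative.
t = rng.uniform(0.0, 10.0, size=8)
C = np.minimum(t[:, None], t[None, :])          # C[i, j] = min(t_i, t_j)

Z = rng.standard_normal((100, 8))               # 100 random weight vectors z
quads = np.einsum('ij,jk,ik->i', Z, C, Z)       # each entry is z' C z
print(quads.min())                              # should be non-negative

print(np.linalg.eigvalsh(C).min())              # eigenvalues also non-negative
```

Equivalently, non-negative definiteness means every covariance matrix built from \(C\) has non-negative eigenvalues, which is what the last line checks.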

If \(C(t_i,t_j)=0\) for all \(t_i\neq t_j\), the stochastic process is called a **white noise process**. For example, if \(\epsilon_1,\epsilon_2,\cdots\stackrel{i.i.d.}{\sim} N(0,\sigma^2)\), then \(Cov(\epsilon_i,\epsilon_j)=0\) for all \(i\neq j\). If we define the stochastic process \(X=\{X(n,\omega):n\in \mathbb{N},\omega\in\Omega\}\) such that \(X(n,\omega)=\epsilon_n(\omega)\), then \(X\) is a white noise process.
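The white noise example can be verified by estimating the covariance function from simulated i.i.d. Gaussian draws; the sample covariance matrix should be approximately \(\sigma^2 I\) (numpy and the chosen \(\sigma\) are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)

# White noise: X_n = eps_n with eps_n i.i.d. N(0, sigma^2), so the
# covariance function vanishes off the diagonal: C(i, j) = 0 for i != j.
sigma = 2.0
eps = rng.normal(0.0, sigma, size=(100000, 5))  # many paths observed at n = 1..5
C_hat = np.cov(eps, rowvar=False)

off = C_hat[~np.eye(5, dtype=bool)]             # the 20 off-diagonal entries
print(np.diag(C_hat))                           # should be close to sigma^2 = 4
print(np.abs(off).max())                        # should be close to 0
```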