Chapter 1 Basic Definitions of Stochastic Process, Kolmogorov Consistency Theorem (Lecture on 01/05/2021)
Motivation: Why are we studying stochastic processes?
A classical statistical inference problem is the following: we are given a response variable \(Y\) and some covariates \(\mathbf{x}\), and we are interested in the functional relationship between \(Y\) and \(\mathbf{x}\), usually written as \(Y=f(\mathbf{x})+\epsilon\). Traditional Bayesian inference would assume some parametric form for \(f(\mathbf{x})\). Denoting the parameters in \(f(\mathbf{x})\) by \(\boldsymbol{\beta}\) and placing a prior on \(\boldsymbol{\beta}\), we can use Bayesian inference to obtain the updated posterior distribution of \(\boldsymbol{\beta}\) given the data. But what if the function \(f(\mathbf{x})\) is unknown and has some complex form? Then we need stochastic processes. A stochastic process is a distribution on an infinite-dimensional space of functions. We can assign the unknown function \(f\) some well-studied stochastic process as a prior, then use the data to obtain the posterior distribution of \(f\).
Definition 1.1 (Stochastic Process) Consider a probability space \((\Omega,\mathcal{F},P)\) where \(\Omega\) is the sample space of the experiment, as well as an index set \(T\) and a state space \(S\). A stochastic process is a collection \(X=\{X(\omega,t):\omega\in\Omega,t\in T\}\) such that
For any \(n\) and any set of index points \(t_1,\cdots,t_n\in T\), \((X_{t_1},\cdots,X_{t_n})\) is an \(n\)-dimensional random variable (random vector) defined on the probability space \((\Omega,\mathcal{F},P)\) and taking values in \(S^n=S\times\cdots\times S\). Hence, for each fixed \(t\), \(X_t(\cdot)=X(\cdot,t): (\Omega,\mathcal{F},P)\to S\) is a random variable.
For any fixed \(\omega\in\Omega\), \(X_{\omega}(\cdot)=X(\omega,\cdot): T\to S\) is a function defined on \(T\) and taking values in \(S\), referred to as a sample path of \(X\).
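The two viewpoints in the definition (fixing \(t\) gives a random variable; fixing \(\omega\) gives a sample path) can be illustrated with a minimal Python sketch; the toy process \(X(t,\omega)=Z(\omega)\,t\) and the representation of \(\omega\) as a random seed are my own choices for illustration:

```python
import random

# Toy process X(t, omega) = Z(omega) * t, where Z is +1/-1 with prob 1/2.
# Here "omega" is represented by a seed that fixes the outcome of the experiment.
def Z(omega):
    random.seed(omega)
    return random.choice([-1, 1])

def X(t, omega):
    return Z(omega) * t

# Fixed t: X_t(.) is a random variable -- its value varies with omega.
samples_at_t2 = [X(2.0, omega) for omega in range(5)]

# Fixed omega: X_omega(.) is a deterministic sample path t -> X(t, omega).
path_omega0 = [X(t, 0) for t in [0.0, 1.0, 2.0, 3.0]]

print(samples_at_t2)
print(path_omega0)
```

For a fixed time the printed values differ across outcomes \(\omega\), while along one fixed \(\omega\) the path is an ordinary deterministic function of \(t\).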
Types of stochastic processes: There are four types of stochastic processes.
When \(T\) is discrete space and \(S\) is continuous, the stochastic process is called a discrete time, continuous state process.
When \(T\) is discrete space and \(S\) is discrete as well, the stochastic process is called a discrete time, discrete state process.
When \(T\) is continuous space and \(S\) is discrete, the stochastic process is called a continuous time, discrete state process.
When both \(T\) and \(S\) are continuous, the stochastic process is called a continuous time, continuous state process.
Example 1.1 Specify types of the stochastic process:
Let \(Z\) be a random variable such that \(P(Z=1)=P(Z=-1)=\frac{1}{2}\). Define the stochastic process \(X=\{X(t,\omega):t\geq 0,\omega\in\Omega\}\) such that \(X(t,\omega)=Z(\omega)\). This is a continuous time, discrete state stochastic process, where \(S=\{-1,1\}\) and \(T=\mathbb{R}^+\).
Let \(X_1,X_2,\cdots\) be i.i.d. with \(P(X_n=1)=P(X_n=-1)=\frac{1}{2}\) and let \(H_n=X_1+\cdots+X_n\) for \(n\in\mathbb{N}\). Define the stochastic process \(H=\{H(n,\omega):n\in\mathbb{N},\omega\in\Omega\}\) such that \(H(n,\omega)=H_n(\omega)\). This is a discrete time, discrete state process, where \(S=\mathbb{Z}\) and \(T=\mathbb{N}\).
Let \(Z\) be a random variable such that \(Z\sim N(0,1)\). Define the stochastic process to be \(X=\{X(n,\omega):n\in\mathbb{N},\omega\in\Omega\}\) such that \(X(n,\omega)=Z(\omega)\sin(\frac{2\pi}{n})\). This is a discrete time, continuous state process, where \(S=\mathbb{R}\) and \(T=\mathbb{N}\).
Consider a collection \(\{X(t,\omega):t\geq 0,\omega\in\Omega\}\) such that
- \(X_0(\omega)=X(0,\omega)=0\).
- For \(0\leq s<t<\infty\), \(X_t-X_s\sim N(0,t-s)\).
- For \(0\leq s<t<\infty\), \(X_t-X_s\) is independent of \(X_s\).
- The sample path \(t\to X_t\) is a continuous function.

This is a continuous time, continuous state process (standard Brownian motion), where \(S=\mathbb{R}\) and \(T=\mathbb{R}^+\).
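The random walk from Example 1.1 and the process defined by the four properties above can both be simulated by summing independent increments. A minimal sketch, assuming NumPy is available (the grid size and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(42)

# Simple random walk H_n = X_1 + ... + X_n with P(X_i = 1) = P(X_i = -1) = 1/2
# (discrete time, discrete state).
steps = rng.choice([-1, 1], size=1000)
H = np.cumsum(steps)

# The process above, approximated on a grid of [0, 1]: X_0 = 0 and
# independent increments X_t - X_s ~ N(0, t - s)
# (continuous time, continuous state).
n, T = 1000, 1.0
dt = T / n
increments = rng.normal(0.0, np.sqrt(dt), size=n)
B = np.concatenate(([0.0], np.cumsum(increments)))

print(H[-1], B[-1])
```

Plotting `H` or `B` against the index shows one sample path; rerunning with a different seed (a different \(\omega\)) produces a different path.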
Can two distinct stochastic processes have the same finite dimensional distributions (f.d.d.s.)? Yes!
Example 1.2 (Two stochastic processes with the same f.d.d.s.) Define \(X=\{X(t,\omega):t\in\mathbb{R},\omega\in\Omega\}\) such that \(X(t,\omega)=0\) for all \(t\in\mathbb{R}\) and \(\omega\in\Omega\), so \(X\) is identically 0. Then define \(Y=\{Y(t,\omega):t\in\mathbb{R},\omega\in\Omega\}\) as \(Y(t,\omega)=\left\{\begin{aligned} &0 & |Z(\omega)|\neq t\\ &1 & o.w. \end{aligned}\right.\) where \(Z\sim N(0,1)\).
Let \(t_1,\cdots,t_n\in T\), then we have \[\begin{equation} \begin{split} P((X_{t_1},\cdots,X_{t_n})\neq (Y_{t_1},\cdots,Y_{t_n}))&= P(\exists i, X_{t_i}\neq Y_{t_i})\\ &\leq \sum_{i=1}^nP(X_{t_i}\neq Y_{t_i})=\sum_{i=1}^nP(|Z|=t_i)=0 \end{split} \tag{1.1} \end{equation}\]
Therefore, \((X_{t_1},\cdots,X_{t_n})\) has the same distribution as \((Y_{t_1},\cdots,Y_{t_n})\) for any \(n\). The two stochastic processes \(X\) and \(Y\) have the same finite dimensional distributions.
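A quick Monte Carlo check of Example 1.2 (a sketch assuming NumPy; the time grid and number of draws are arbitrary choices): for any fixed time points, a simulated draw of \(Z\) essentially never hits them exactly, so \(Y\) agrees with the zero process \(X\) on the grid in every simulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed time points t_1, ..., t_n and many draws of Z ~ N(0, 1).
t = np.array([0.5, 1.0, 1.5, 2.0])
Z = rng.normal(0.0, 1.0, size=10_000)

# Y_{t_i}(omega) = 1 if |Z(omega)| == t_i, else 0.
Y = (np.abs(Z)[:, None] == t[None, :]).astype(int)

# The event |Z| = t_i has probability 0, so Y matches the zero process
# at (t_1, ..., t_n) for (almost) every draw.
print(Y.sum())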
However, the two stochastic processes are not identical: the second process has a discontinuous sample path, while the first has a continuous sample path. People wanted to define a stochastic process through its f.d.d.s., and they found that by adding additional restrictions to the f.d.d.s., a stochastic process can be uniquely extended from its finite dimensional distributions. This is known as the Kolmogorov consistency theorem.
Theorem 1.1 (Kolmogorov Consistency Theorem) Assume that for each \(n\) and for each set of indexing points \(\mathbf{t}=(t_1,\cdots,t_n)\), we define a f.d.d. \(F_{\mathbf{t}}\). If the collection of all such \(F_{\mathbf{t}}\) satisfies
\(F_{t_1,\cdots,t_n,t_{n+1}}(x_1,\cdots,x_n,x_{n+1})\to F_{t_1,\cdots,t_n}(x_1,\cdots,x_n)\) as \(x_{n+1}\to\infty\).
For all \(n\), \(\mathbf{x}=(x_1,\cdots,x_n),\mathbf{t}=(t_1,\cdots,t_n)\) and any permutation \(\boldsymbol{\pi}=(\pi(1),\cdots,\pi(n))\) of \(\{1,\cdots,n\}\), \(F_{\boldsymbol{\pi}(\mathbf{t})}(\boldsymbol{\pi}(\mathbf{x}))=F_{\mathbf{t}}(\mathbf{x})\).
If these two conditions are satisfied, then these finite dimensional distributions determine a stochastic process. If we add the additional condition that the sample paths are right continuous, then the resulting stochastic process is unique.
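As an illustration (my own worked example, not from the lecture), the Gaussian f.d.d.s. of the independent-increment process defined earlier satisfy both conditions: \((X_{t_1},\cdots,X_{t_n})\sim N_n(\mathbf{0},\Sigma)\) with \(\Sigma_{ij}=\min(t_i,t_j)\), and marginals and permutations of a multivariate normal behave exactly as required.

```latex
% Condition 1: letting x_{n+1} -> infinity integrates out the last coordinate,
% and a marginal of N_{n+1}(0, Sigma') is N_n(0, Sigma) with the corresponding
% sub-covariance matrix, so
\lim_{x_{n+1}\to\infty} F_{t_1,\cdots,t_n,t_{n+1}}(x_1,\cdots,x_n,x_{n+1})
  = P(X_{t_1}\le x_1,\cdots,X_{t_n}\le x_n)
  = F_{t_1,\cdots,t_n}(x_1,\cdots,x_n).
% Condition 2: permuting the indices permutes the rows and columns of Sigma
% in the same way, so
F_{\boldsymbol{\pi}(\mathbf{t})}(\boldsymbol{\pi}(\mathbf{x}))
  = F_{\mathbf{t}}(\mathbf{x}).
```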
Definition 1.3 (Useful Functions of Stochastic Process) Using the information provided by the set of f.d.d.s., we can define several useful functions for a stochastic process.
- For any \(t\in T\), the mean function of \(X\) is \(\mu(t)=\mu_X(t)=E(X_t)\).
- For any \(t_i,t_j\in T\), the (auto)covariance function is given by \[\begin{equation} C(t_i,t_j)=Cov(X_{t_i},X_{t_j})=E(X_{t_i}X_{t_j})-\mu_X(t_i)\mu_X(t_j) \tag{1.2} \end{equation}\] and the (auto)correlation function is given by \[\begin{equation} r(t_i,t_j)=Corr(X_{t_i},X_{t_j})=\frac{Cov(X_{t_i},X_{t_j})}{\sqrt{Var(X_{t_i})Var(X_{t_j})}} \tag{1.3} \end{equation}\]
If \(C(t_i,t_j)=0\) for all \(t_i\neq t_j\), the stochastic process is called a white noise process. For example, if \(\epsilon_1,\cdots,\epsilon_n\stackrel{i.i.d.}{\sim} N(0,\sigma^2)\), then \(Cov(\epsilon_i,\epsilon_j)=0\) for all \(i\neq j\). If we define the stochastic process \(X=\{X(n,\omega):n\in \mathbb{N},\omega\in\Omega\}\) such that \(X(n,\omega)=\epsilon_n(\omega)\), then \(X\) is a white noise process.
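The vanishing autocovariance of white noise can be checked empirically with the sample autocovariance at a nonzero lag. A minimal sketch, assuming NumPy (the value of \(\sigma\), the lag, and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(7)

# White noise: X_n = epsilon_n with epsilon_n i.i.d. N(0, sigma^2), so the
# autocovariance C(i, j) vanishes whenever i != j.
sigma = 2.0
eps = rng.normal(0.0, sigma, size=100_000)

# Sample autocovariance at lag 3 (should be near 0) and the sample
# variance (should be near sigma^2 = 4).
lag = 3
sample_cov = np.mean((eps[:-lag] - eps.mean()) * (eps[lag:] - eps.mean()))
sample_var = eps.var()

print(sample_cov, sample_var)
```

With \(10^5\) draws the lagged sample covariance is close to 0 while the sample variance stays close to \(\sigma^2\), matching \(C(t_i,t_j)=0\) for \(t_i\neq t_j\) and \(C(t_i,t_i)=\sigma^2\).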