Chapter 2 Stationarity, Spectral Theorem, Ergodic Theorem (Lecture on 01/07/2021)

Definition 2.1 (Uncorrelated increments) We say that a stochastic process $X$ has uncorrelated (orthogonal) increments if for any $t_i<t_j<t_k<t_l\in T$, $Cov(X_{t_j}-X_{t_i},X_{t_l}-X_{t_k})=0$.

Definition 2.2 (Cross-covariance Function) A useful function for the study of the coevolution of two stochastic processes, say $X$ and $Y$, defined on the same probability space and with the same index set, is the cross-covariance function, defined as $\begin{equation} C_{X,Y}(t_i,t_j)=Cov(X_{t_i},Y_{t_j})=E(X_{t_i}Y_{t_j})-E(X_{t_i})E(Y_{t_j}) \tag{2.1} \end{equation}$ for $t_i,t_j\in T$. The cross-covariance function measures how correlated the two processes are.

If $X$ and $Y$ are completely uncorrelated, then $C_{X,Y}(t_i,t_j)=0$ for all $t_i,t_j\in T$.
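As a small illustration of (2.1), the sketch below estimates the cross-covariance at one fixed pair of index points from simulated draws. The two toy pairs are assumptions made only for this example: in the first, $Y_{t_j}=2X_{t_i}+\text{noise}$, so the true value is $2$; in the second, the two variables are independent, so the true value is $0$.

```python
import random

def cross_covariance(xs, ys):
    """Sample version of (2.1): mean(x*y) - mean(x)*mean(y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    return mean_xy - mean_x * mean_y

rng = random.Random(0)
n_draws = 20000

# Hypothetical dependent pair: X = A, Y = 2A + noise, so Cov(X, Y) = 2 Var(A) = 2.
x_vals = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
y_vals = [2.0 * v + rng.gauss(0.0, 1.0) for v in x_vals]

# Hypothetical independent pair: the cross-covariance should be close to 0.
w_vals = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]

c_dependent = cross_covariance(x_vals, y_vals)
c_independent = cross_covariance(x_vals, w_vals)
print(c_dependent, c_independent)
```

The estimates are Monte Carlo approximations, so they only match the true values up to sampling error.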

Why are the finite dimensional distributions (f.d.d.s.) important?

One major use of a stochastic process is to serve as the prior for the unknown function $f(\mathbf{x})$ in a regression problem. In real applications we only have a finite number of data points. Therefore, what we really need is a prior on the finite dimensional random vector $(f(\mathbf{x}_1),\cdots,f(\mathbf{x}_n))$, which is exactly what the f.d.d.s. of the stochastic process provide.
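To make this concrete, here is a minimal sketch of drawing $(f(x_1),\cdots,f(x_n))$ from a zero-mean Gaussian process prior at five input points. The squared-exponential kernel, the length scale, the input grid and the small diagonal jitter are all assumptions chosen for illustration; the lecture does not fix a particular prior.

```python
import math
import random

def sq_exp_kernel(x1, x2, length_scale=1.0):
    """Assumed covariance function: squared-exponential kernel."""
    return math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

def kernel_matrix(xs, jitter=1e-9):
    """Covariance matrix of (f(x_1), ..., f(x_n)); jitter aids numerical stability."""
    n = len(xs)
    return [[sq_exp_kernel(xs[i], xs[j]) + (jitter if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def cholesky(a):
    """Lower-triangular factor L with L L^T = a (a must be positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

rng = random.Random(0)
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
cov = kernel_matrix(xs)
low = cholesky(cov)

# If z ~ N(0, I), then L z ~ N(0, L L^T) = N(0, cov): one draw from the f.d.d.
z = [rng.gauss(0.0, 1.0) for _ in xs]
draw = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(xs))]
print(draw)
```

Only the $5\times 5$ covariance matrix is ever needed, which is the point of the paragraph above: the infinite-dimensional prior enters the computation solely through one finite dimensional distribution.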

Definition 2.3 (Strong Stationarity) A stochastic process $X$ is called strongly stationary if the f.d.d.s. are invariant under index shift. That is, for any (finite) $n$, any $t_0$ and all $t_1,\cdots,t_n\in T$, $(X_{t_1},\cdots,X_{t_n})$ has the same distribution as $(X_{t_1+t_0},\cdots,X_{t_n+t_0})$.

Definition 2.4 (Weak Stationarity) A stochastic process $X$ is called weakly stationary if its mean function is constant and its covariance function is invariant under index shift. That is, $E(X_t)=\mu$ for all $t\in T$, and $Cov(X_{t_i},X_{t_j})=c(t_i-t_j)$ for all $t_i,t_j\in T$, so the covariance is a function of $t_i-t_j$ only.
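As a quick worked example (an illustration assumed here, not taken from the lecture), let $A$ and $B$ be uncorrelated random variables with mean $0$ and variance $1$, fix a frequency $\lambda>0$, and set $X_t=A\cos(\lambda t)+B\sin(\lambda t)$ for $t\in\mathbb{R}$. Then $E(X_t)=0$ for every $t$, and since $E(AB)=0$ and $E(A^2)=E(B^2)=1$, $\begin{equation*} Cov(X_{t_i},X_{t_j})=\cos(\lambda t_i)\cos(\lambda t_j)+\sin(\lambda t_i)\sin(\lambda t_j)=\cos(\lambda(t_i-t_j)). \end{equation*}$ The mean is constant and the covariance depends on $t_i-t_j$ only, so this $X$ is weakly stationary with $c(t)=\cos(\lambda t)$.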

Stationarity is a simplifying assumption on the f.d.d.s. The theory and methods for a stochastic process are considerably simplified under the assumption of (either strong or weak) stationarity, which imposes structure on the finite dimensional distributions.
Strong stationarity implies weak stationarity, provided the second moments are finite. The converse is not true in general. For a Gaussian process, however, the converse also holds: its f.d.d.s. are all multivariate normal, so they are completely characterized by the mean function and the covariance function.

From the theory of Fourier analysis, any function $f:\mathbb{R}\to\mathbb{R}$ with certain properties (including periodicity and continuity) has a unique Fourier representation $\begin{equation} f(x)=0.5a_0+\sum_{n=1}^{\infty}(a_n\cos(nx)+b_n\sin(nx)) \tag{2.2} \end{equation}$ which expresses $f$ as a sum of varying proportions of regular oscillations.

The Fourier representation transforms the uncountably infinite dimensional problem of estimating an unknown function $f(x)$ into a countably infinite dimensional problem of estimating the coefficients $a_n$ and $b_n$.
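The following sketch shows the idea numerically for one assumed test function, $f(x)=|x|$ on $[-\pi,\pi)$ extended periodically. It approximates $a_n=\frac{1}{\pi}\int_{-\pi}^{\pi}f(x)\cos(nx)dx$ and $b_n=\frac{1}{\pi}\int_{-\pi}^{\pi}f(x)\sin(nx)dx$ with a midpoint rule and then evaluates a truncated version of (2.2). The grid size and the truncation level are arbitrary choices.

```python
import math

def fourier_coefficients(f, n_max, n_grid=4000):
    """Midpoint-rule approximation of a_0..a_n_max and b_0..b_n_max on [-pi, pi)."""
    h = 2.0 * math.pi / n_grid
    grid = [-math.pi + (k + 0.5) * h for k in range(n_grid)]
    a = [sum(f(x) * math.cos(n * x) for x in grid) * h / math.pi
         for n in range(n_max + 1)]
    b = [sum(f(x) * math.sin(n * x) for x in grid) * h / math.pi
         for n in range(n_max + 1)]
    return a, b

def partial_sum(a, b, x):
    """Truncated Fourier series 0.5*a_0 + sum_{n>=1} (a_n cos(nx) + b_n sin(nx))."""
    total = 0.5 * a[0]
    for n in range(1, len(a)):
        total += a[n] * math.cos(n * x) + b[n] * math.sin(n * x)
    return total

# Assumed test function: f(x) = |x| on [-pi, pi), extended periodically.
a_coef, b_coef = fourier_coefficients(abs, n_max=50)
approx = partial_sum(a_coef, b_coef, 1.0)
print(a_coef[0], a_coef[1], approx)
```

For this particular $f$ the exact coefficients are $a_0=\pi$, $a_1=-4/\pi$ and $b_n=0$, so the whole function is summarized by a countable list of numbers.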

We will express the covariance function of a stationary process using the Fourier transform. Before doing that, let us fix some notation. By weak stationarity we have $E(X_t)=\mu$ for all $t$ and $Cov(X_{t_i},X_{t_j})=c(t)$, where $t=t_i-t_j$; since the covariance is symmetric, $c(t)=c(-t)$, so only $|t_i-t_j|$ matters. Then $Var(X_{t_i})=c(0)=\sigma^2$ for all $t_i\in\mathbb{R}$. Without loss of generality, we assume $\mu=0$ and $\sigma^2=1$. Under this assumption, the autocorrelation function and the autocovariance function coincide, i.e. $Corr(X_{t_i},X_{t_j})=c(t)$.

We are going to express (using the idea of the Fourier transform) the covariance function as the characteristic function of some random variable. That is, $\begin{equation} c(t)=\int \exp(itx)dF(x)=E(\exp(itW)) \tag{2.3} \end{equation}$ for some distribution function $F$, where $W$ denotes a random variable with distribution function $F$. This distribution is called the spectral distribution of the stochastic process. If $F$ has a density $f$, then $c(t)=\int \exp(itx)f(x)dx$ and $f$ is called the spectral density.
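As a numerical sanity check of (2.3), the sketch below takes the standard normal density as an assumed spectral density and integrates $\exp(itx)f(x)$ with a midpoint rule. The characteristic function of $N(0,1)$ is $\exp(-t^2/2)$, so the resulting covariance function should be the squared-exponential one. The integration limits and grid size are arbitrary choices.

```python
import cmath
import math

def normal_density(x):
    """Assumed spectral density: the N(0, 1) density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def covariance_from_density(t, density, lower=-12.0, upper=12.0, n_grid=4000):
    """Midpoint-rule approximation of c(t) = integral of exp(i t x) f(x) dx."""
    h = (upper - lower) / n_grid
    total = 0j
    for k in range(n_grid):
        x = lower + (k + 0.5) * h
        total += cmath.exp(1j * t * x) * density(x)
    return total * h

c_values = {t: covariance_from_density(t, normal_density)
            for t in (0.0, 0.5, 1.0, 2.0)}
for t, c in c_values.items():
    print(t, c.real, math.exp(-0.5 * t * t))
```

The imaginary part vanishes up to rounding because the assumed density is symmetric about zero, which is what makes $c(t)$ real-valued.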

The spectral distribution gives us another paradigm for studying a stochastic process. Instead of following the usual paradigm of studying the mean and covariance functions, one can work in the spectral domain and model the spectral density with density estimation methods. Working with the spectral density is sometimes much easier (especially in time series analysis), but when the dimension is higher, working in the complex domain becomes harder.

Theorem 2.1 (Bochner Theorem in Stochastic Process) A function $\phi$ is the characteristic function of a random variable if and only if

(1) $\phi(0)=1$ and $|\phi(t)|\leq 1$ for all $t$.
(2) $\phi$ is uniformly continuous on $\mathbb{R}$.
(3) $\phi$ is a non-negative definite function, i.e. for all $n$, all real $t_1,\cdots,t_n$ and all complex constants $z_1,\cdots,z_n$, $\sum_{i=1}^n\sum_{j=1}^nz_i\bar{z}_j\phi(t_i-t_j)\geq 0$.
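The non-negative definiteness condition can be probed numerically. The sketch below evaluates the quadratic form $\sum_i\sum_jz_i\bar{z}_j\phi(t_i-t_j)$ for two assumed candidates: $\phi(t)=\exp(-t^2/2)$, which is the characteristic function of $N(0,1)$, and the boxcar $\phi(t)=1\{|t|\leq 1\}$, which is not a characteristic function. A random search of this kind can only expose a violation; it cannot prove that the condition holds.

```python
import math
import random

def quadratic_form(phi, ts, zs):
    """sum_i sum_j z_i * conj(z_j) * phi(t_i - t_j)."""
    total = 0j
    for t_i, z_i in zip(ts, zs):
        for t_j, z_j in zip(ts, zs):
            total += z_i * z_j.conjugate() * phi(t_i - t_j)
    return total

def gaussian_cf(t):
    """Characteristic function of N(0, 1)."""
    return math.exp(-0.5 * t * t)

def boxcar(t):
    """Bounded by 1 with value 1 at 0, but neither continuous nor non-negative definite."""
    return 1.0 if abs(t) <= 1.0 else 0.0

rng = random.Random(1)
gaussian_forms = []
for _ in range(200):
    ts = [rng.uniform(-3.0, 3.0) for _ in range(4)]
    zs = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(4)]
    gaussian_forms.append(quadratic_form(gaussian_cf, ts, zs).real)

# Hand-picked points and weights for which the boxcar quadratic form is 4 - 4*sqrt(2) < 0.
boxcar_zs = [complex(1.0, 0.0), complex(-math.sqrt(2.0), 0.0), complex(1.0, 0.0)]
boxcar_form = quadratic_form(boxcar, [0.0, 1.0, 2.0], boxcar_zs).real
print(min(gaussian_forms), boxcar_form)
```

The boxcar satisfies the first condition of the theorem, so the example also shows that boundedness alone is not enough.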

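For the autocorrelation function $c(t)$ of a weakly stationary process, conditions (1) and (3) hold automatically: $c(0)=1$ and $|c(t)|\leq 1$ by the Cauchy-Schwarz inequality, and $\sum_i\sum_jz_i\bar{z}_jc(t_i-t_j)=E|\sum_iz_iX_{t_i}|^2\geq 0$. Thus, only (2) needs to be assumed for $c(t)$ to have a spectral distribution. If, in addition, the spectral density $f$ exists and $c$ is integrable, then from (2.3) and the Fourier inversion theorem we can express $f$ as $\begin{equation} f(x)=\frac{1}{2\pi}\int_{-\infty}^{\infty}\exp(-itx)c(t)dt. \tag{2.4} \end{equation}$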

The spectral density and the covariance function are in one-to-one correspondence.
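A minimal numerical sketch of the inversion formula (2.4), assuming the covariance function $c(t)=\exp(-|t|)$ as an example. Its spectral density is the standard Cauchy density $\frac{1}{\pi(1+x^2)}$, which the midpoint rule below should reproduce. The truncation point `t_max` and the grid size are arbitrary choices.

```python
import cmath
import math

def spectral_density(x, cov, t_max=40.0, n_grid=80000):
    """Midpoint-rule approximation of (2.4), truncated to [-t_max, t_max]."""
    h = 2.0 * t_max / n_grid
    total = 0j
    for k in range(n_grid):
        t = -t_max + (k + 0.5) * h
        total += cmath.exp(-1j * t * x) * cov(t)
    return (total * h / (2.0 * math.pi)).real

def exp_cov(t):
    """Assumed covariance function c(t) = exp(-|t|)."""
    return math.exp(-abs(t))

def cauchy_density(x):
    return 1.0 / (math.pi * (1.0 + x * x))

f_at_0 = spectral_density(0.0, exp_cov)
f_at_1 = spectral_density(1.0, exp_cov)
print(f_at_0, cauchy_density(0.0))
print(f_at_1, cauchy_density(1.0))
```

Going in the other direction, plugging the Cauchy density into (2.3) recovers $\exp(-|t|)$, which is one instance of the correspondence stated above.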

The above representation has been stated for $T=\mathbb{R}$. There are two special cases in which the index set is not $\mathbb{R}$.

If the index is multivariate, i.e. $T=\mathbb{R}^d$, then $\begin{equation} c(\mathbf{t})=\int_{x_1}\cdots\int_{x_d}\exp(i\mathbf{t}^{\top}\mathbf{x})f(\mathbf{x})d\mathbf{x}. \tag{2.5} \end{equation}$

If the index set of the stochastic process is $\mathbb{Z}$, i.e. $X=\{X(n,\omega):n\in\mathbb{Z},\omega\in\Omega\}$, then the above representation is not unique, since $\exp(inx)=\exp(in(x+2\pi))$ whenever $n$ is an integer. However, we can restrict the spectral distribution to the interval $(-\pi,\pi]$ and define $\begin{equation} c(n)=\int_{-\pi}^{\pi}\exp(inu)dF(u)\Longrightarrow f(u)=\frac{1}{2\pi}\sum_{n=-\infty}^{\infty}\exp(-inu)c(n), \tag{2.6} \end{equation}$ for $u\in(-\pi,\pi]$, where the expression for $f$ applies when the density exists (for example when $\sum_n|c(n)|<\infty$). Then the correspondence is still one-to-one.
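The discrete-index pair can be checked on an assumed example, the correlation sequence $c(n)=\rho^{|n|}$ with $\rho=0.6$ (the autocorrelation of a stationary AR(1) process). Summing the series gives the closed form $f(u)=\frac{1-\rho^2}{2\pi(1-2\rho\cos u+\rho^2)}$. The sketch compares a truncated sum with that closed form and then integrates the density back to recover $c(n)$. Because this $f$ is even in $u$, the sign in the exponent does not affect the recovered values.

```python
import cmath
import math

RHO = 0.6  # assumed lag-one correlation, chosen for illustration

def cov_seq(n):
    """Assumed correlation sequence c(n) = RHO^|n|."""
    return RHO ** abs(n)

def spectral_density(u, n_max=200):
    """Truncated sum f(u) ~ (1 / 2 pi) * sum_{|n| <= n_max} exp(-i n u) c(n)."""
    total = sum(cmath.exp(-1j * n * u) * cov_seq(n)
                for n in range(-n_max, n_max + 1))
    return (total / (2.0 * math.pi)).real

def closed_form(u):
    """Closed form of the full sum for c(n) = RHO^|n|."""
    return (1.0 - RHO ** 2) / (2.0 * math.pi * (1.0 - 2.0 * RHO * math.cos(u) + RHO ** 2))

def cov_from_density(n, n_grid=2000):
    """Midpoint-rule approximation of the integral of exp(i n u) f(u) over (-pi, pi]."""
    h = 2.0 * math.pi / n_grid
    total = 0j
    for k in range(n_grid):
        u = -math.pi + (k + 0.5) * h
        total += cmath.exp(1j * n * u) * closed_form(u)
    return (total * h).real

recovered = [cov_from_density(n) for n in range(4)]
print(recovered)
print([cov_seq(n) for n in range(4)])
```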

Now consider a stochastic process with a discrete index set, that is, $X=\{X(n,\omega):n\in\mathbb{Z},\omega\in\Omega\}$. We are interested in the partial sums $S_n=\sum_{j=1}^nX_j$. When the $X_j$ are i.i.d., $S_n$ is simple to study for large $n$: the LLN and the CLT give its asymptotic behavior. For a general stochastic process, however, the i.i.d. assumption does not hold. Still, analogous results are available for stationary processes; these are known as ergodic theorems.

Theorem 2.2 (Ergodic Theorem for Weakly Stationary Stochastic Process) If $X=\{X(n,\omega):n\in\mathbb{Z},\omega\in\Omega\}$ is a weakly stationary process, then there exists a random variable $Y$ such that $E(Y)=E(X_1)$ and $\frac{S_n}{n}\to Y$ in mean square, meaning that $\lim_{n\to\infty}E[(\frac{S_n}{n}-Y)^2]=0$.
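The limit $Y$ in the theorem is a random variable and need not be a constant. A minimal simulation sketch, assuming the toy process $X_n=Z+W_n$ with $Z\sim N(0,1)$ drawn once per path and $W_n$ i.i.d. $N(0,1)$: this process has mean $0$ and $Cov(X_i,X_j)=1+1\{i=j\}$, so it is weakly stationary, and $\frac{S_n}{n}-Z$ is the average of the $W_j$, so $E[(\frac{S_n}{n}-Z)^2]=\frac{1}{n}$ and $Y=Z$. The number of paths and the checkpoints are arbitrary choices.

```python
import random

def mean_square_errors(n_paths, checkpoints, rng):
    """Monte Carlo estimate of E[(S_n / n - Z)^2] at each n in checkpoints."""
    totals = {n: 0.0 for n in checkpoints}
    n_max = max(checkpoints)
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)           # shared level of the path; plays the role of Y
        s = 0.0
        for n in range(1, n_max + 1):
            s += z + rng.gauss(0.0, 1.0)  # X_n = Z + W_n
            if n in totals:
                totals[n] += (s / n - z) ** 2
    return {n: totals[n] / n_paths for n in checkpoints}

mse = mean_square_errors(n_paths=500, checkpoints=(10, 100, 1000),
                         rng=random.Random(2))
for n, value in mse.items():
    print(n, value, 1.0 / n)
```

In this example $\frac{S_n}{n}$ does not converge to the constant $E(X_1)=0$; it converges to the path-specific level $Z$, whose mean is $0$.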

Theorem 2.3 (Ergodic Theorem for Strongly Stationary Stochastic Process) If $X=\{X(n,\omega):n\in\mathbb{Z},\omega\in\Omega\}$ is a strongly stationary process such that $E(|X_1|)<\infty$, then there exists a random variable $Y$ such that $E(Y)=E(X_1)$ and $\frac{S_n}{n}\to Y$ almost surely and in mean, meaning that $P(\lim_{n\to\infty}\frac{S_n}{n}=Y)=1$ and $\lim_{n\to\infty}E|\frac{S_n}{n}-Y|=0$. In particular, $\lim_{n\to\infty}P[|\frac{S_n}{n}-Y|>\epsilon]=0$ for any $\epsilon>0$.

Definition 2.5 (Brownian Motion) A stochastic process $B=\{B(t,\omega):t\geq 0,\omega\in\Omega\}$ is called a Brownian motion if it satisfies

(1) $B_0=0$.
(2) $B_t-B_s\sim N(0,t-s)$ for all $t>s\geq 0$.
(3) $B_t-B_s$ is independent of $\{B_u:0\leq u\leq s\}$ for all $t>s\geq 0$.
(4) The function $t\mapsto B_t$ is continuous.

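Brownian motion is not a stationary stochastic process. From (1), (2) and (3) in Definition 2.5, for $s\leq t$ we have $Cov(B_t,B_s)=Cov(B_t-B_s,B_s)+Var(B_s)=0+s$, so $Cov(B_t,B_s)=\min(t,s)$, which is not a function of $t-s$ alone.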
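The covariance formula can be checked by simulation. The sketch below builds Brownian motion at a few time points from independent $N(0,\Delta t)$ increments and estimates the covariance for two pairs of times with the same lag $0.5$. The time points, the number of paths and the seed are arbitrary choices.

```python
import random

def sample_brownian(times, rng):
    """Brownian motion at increasing times, built from independent N(0, dt) increments."""
    value, previous, path = 0.0, 0.0, []
    for t in times:
        value += rng.gauss(0.0, (t - previous) ** 0.5)
        path.append(value)
        previous = t
    return path

def sample_cov(pairs):
    """Sample covariance of a list of (a, b) pairs."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    return sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n

rng = random.Random(3)
times = [0.5, 1.0, 1.5]
paths = [sample_brownian(times, rng) for _ in range(20000)]

cov_early = sample_cov([(p[0], p[1]) for p in paths])  # s = 0.5, t = 1.0, min = 0.5
cov_late = sample_cov([(p[1], p[2]) for p in paths])   # s = 1.0, t = 1.5, min = 1.0
print(cov_early, cov_late)
```

Both pairs are separated by the same lag, yet the estimated covariances differ (about $0.5$ versus about $1.0$), so the covariance cannot be a function of the lag alone.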