1.6 Limit Theorems

1.6.1 Central Limit Theorem

https://en.wikipedia.org/wiki/Galton_board

Theorem 1.2 Let \(X_1,X_2,\cdots\) be a sequence of independent and identically distributed (i.i.d.) random variables with finite mean \(\mu\) and finite variance \(\sigma^2\).

Let \(\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i\) be the sample mean and \(Z_n = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}\) the standardized sample mean.

Then, as \(n\) approaches infinity, the distribution of \(Z_n\) converges to a standard normal distribution, i.e.,

\[ \lim_{n \to \infty} P(Z_n \leq z) = \Phi(z) \]

where \(\Phi(z)\) is the cumulative distribution function of the standard normal distribution.
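
As an illustration, here is a minimal simulation sketch (assuming NumPy is available; the choice of the Exponential(1) distribution, with \(\mu = \sigma = 1\), is just for concreteness). It compares the empirical distribution of \(Z_n\) at a few points \(z\) with \(\Phi(z)\).

```python
import numpy as np
from math import erf, sqrt

# Simulation sketch of the CLT: standardize sample means of an Exponential(1)
# distribution (mu = sigma = 1) and compare the empirical CDF of Z_n at a few
# points z with the standard normal CDF Phi(z).
rng = np.random.default_rng(0)
n, reps = 200, 20_000
mu, sigma = 1.0, 1.0

samples = rng.exponential(scale=1.0, size=(reps, n))
z_n = (samples.mean(axis=1) - mu) / (sigma / np.sqrt(n))

phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF

for z in (-1.0, 0.0, 1.0, 2.0):
    print(f"z={z:+.1f}  empirical={np.mean(z_n <= z):.4f}  Phi(z)={phi(z):.4f}")
```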

Special case: the De Moivre–Laplace theorem, i.e., the CLT applied to i.i.d. Bernoulli random variables, where the standardized Binomial\((n,p)\) count converges in distribution to the standard normal.

https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem

https://en.wikipedia.org/wiki/Illustration_of_the_central_limit_theorem
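
A short numerical sketch of this special case (assuming SciPy is available): the Binomial\((n,p)\) CDF is compared with the normal approximation \(\Phi\big((k + 0.5 - np)/\sqrt{np(1-p)}\big)\), using a continuity correction.

```python
import numpy as np
from scipy.stats import binom, norm

# De Moivre-Laplace sketch: for X ~ Binomial(n, p), P(X <= k) is approximated
# by Phi((k + 0.5 - n*p) / sqrt(n*p*(1-p))), with a continuity correction.
n, p = 100, 0.3
mean, sd = n * p, np.sqrt(n * p * (1 - p))

for k in (20, 25, 30, 35, 40):
    exact = binom.cdf(k, n, p)
    approx = norm.cdf((k + 0.5 - mean) / sd)
    print(f"k={k}: exact={exact:.4f}  normal approx={approx:.4f}")
```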

1.6.2 Probability inequalities

1.6.2.1 Markov inequality

Let \(X\) be a non-negative random variable. Then for any \(a>0\),

\[\begin{equation*} P(X \geq a) \leq \frac{E[X]}{a} \end{equation*}\]

Let \(I\) be the indicator random variable for the event \(\{X \geq a\}\), defined as: \[ I = \begin{cases} 1 & \text{if } X \geq a \\ 0 & \text{if } X < a \end{cases} \] Since \(X\) is non-negative, we have \(X \geq aI\) for all possible values of \(X\). Taking the expectation of both sides:

\[ E[X] \geq E[aI] = aE[I] = aP(X \geq a) \] Dividing both sides by \(a\) gives the desired result:

\[ P(X \geq a) \leq \frac{E[X]}{a} \]
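
A quick numerical check (a sketch, assuming NumPy; the Exponential(1) distribution is an arbitrary choice of non-negative random variable with \(E[X] = 1\)):

```python
import numpy as np

# Numerical check of Markov's inequality for a non-negative variable:
# here X ~ Exponential(1), so E[X] = 1 and P(X >= a) = exp(-a).
rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=1_000_000)

for a in (0.5, 1.0, 2.0, 4.0):
    print(f"a={a}: P(X >= a) ~ {np.mean(x >= a):.4f}  <=  E[X]/a = {x.mean()/a:.4f}")
```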

1.6.2.2 Chebyshev inequality

Let \(X\) be a random variable with finite mean \(\mu\) and variance \(\sigma^2\). Then for any \(k>0\),

\[ P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2} \]

Apply Markov’s inequality to the non-negative random variable \((X-\mu)^2\) with threshold \(a = k^2\sigma^2\):

\[ P((X - \mu)^2 \geq k^2\sigma^2) \leq \frac{E[(X - \mu)^2]}{k^2\sigma^2} \]

Since \(E[(X-\mu)^2] = \sigma^2\) and the event \(\{(X-\mu)^2 \geq k^2\sigma^2\}\) is the same as \(\{|X-\mu| \geq k\sigma\}\), we have:

\[ P(|X - \mu| \geq k\sigma) \leq \frac{\sigma^2}{k^2\sigma^2} = \frac{1}{k^2} \]
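
A quick numerical check (a sketch, assuming NumPy; again the Exponential(1) distribution, with \(\mu = \sigma = 1\), is just a convenient example):

```python
import numpy as np

# Numerical check of Chebyshev's inequality for X ~ Exponential(1),
# which has mu = 1 and sigma = 1.
rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=1_000_000)
mu, sigma = x.mean(), x.std()

for k in (1.5, 2.0, 3.0):
    p = np.mean(np.abs(x - mu) >= k * sigma)
    print(f"k={k}: P(|X - mu| >= k*sigma) ~ {p:.4f}  <=  1/k^2 = {1/k**2:.4f}")
```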

1.6.2.3 Jensen inequality

Let \(f\) be a convex function and \(X\) be a random variable. Then:

\[ E[f(X)] \geq f(E[X]) \]

For a convex function \(f\), there exists a supporting line at any point \(x_0\); its slope is \(f'(x_0)\) when \(f\) is differentiable (otherwise a subgradient can be used in its place). This means that for all \(x\):

\[ f(x) \geq f(x_0) + f'(x_0)(x - x_0) \]

Let \(x_0 = E[X]\). Then:

\[ f(X) \geq f(E[X]) + f'(E[X])(X - E[X]) \]

Taking the expectation of both sides:

\[ \begin{split} E[f(X)] &\geq E[f(E[X])] + E[f'(E[X])(X - E[X])] \\ &= f(E[X]) + f'(E[X])E[X - E[X]] \\ &= f(E[X]) \end{split} \]
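
A quick numerical check (a sketch, assuming NumPy; the convex function \(f(x) = e^x\) and \(X \sim \text{Uniform}(0,1)\) are arbitrary choices):

```python
import numpy as np

# Numerical check of Jensen's inequality with the convex function f(x) = exp(x)
# and X uniform on [0, 1]: E[exp(X)] = e - 1 ~ 1.718 >= exp(E[X]) = exp(0.5) ~ 1.649.
rng = np.random.default_rng(3)
x = rng.uniform(0.0, 1.0, size=1_000_000)

print("E[f(X)] ~", np.exp(x).mean())   # ~ 1.718
print("f(E[X]) ~", np.exp(x.mean()))   # ~ 1.649
```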

1.6.3 Law of large numbers

1.6.3.1 Weak Law of Large Numbers (WLLN)

Theorem 1.3 Let \(X_1,X_2,\cdots\) be a sequence of independent and identically distributed (i.i.d.) random variables with finite mean \(\mu=E[X_i]\).

Then, for any \(\varepsilon > 0\),

\[ \lim_{n \to \infty} P\left(\left|\frac{1}{n}\sum_{i=1}^{n} X_i - \mu\right| \geq \varepsilon\right) = 0 \]
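
A simulation sketch of the WLLN (assuming NumPy; \(X_i \sim \text{Bernoulli}(0.5)\) and \(\varepsilon = 0.05\) are arbitrary choices): the estimated probability that the sample mean deviates from \(\mu\) by at least \(\varepsilon\) shrinks toward 0 as \(n\) grows.

```python
import numpy as np

# WLLN sketch: for X_i ~ Bernoulli(0.5) (mu = 0.5), estimate
# P(|X_bar_n - mu| >= eps) for increasing n; it should tend to 0.
rng = np.random.default_rng(4)
mu, eps, reps = 0.5, 0.05, 20_000

for n in (10, 100, 1_000, 10_000):
    means = rng.binomial(n, mu, size=reps) / n
    print(f"n={n}: P(|X_bar - mu| >= {eps}) ~ {np.mean(np.abs(means - mu) >= eps):.4f}")
```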

1.6.3.2 Strong Law of Large Numbers (SLLN)

Theorem 1.4 Let \(X_1,X_2,\cdots\) be a sequence of i.i.d. random variables with finite mean \(\mu=E[X_i]\).

Then,

\[ P\left(\lim_{n \to \infty} \frac{1}{n}\sum_{i=1}^{n} X_i = \mu\right) = 1 \]
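
A simulation sketch of the SLLN (assuming NumPy; the Exponential(1) distribution with \(\mu = 1\) is an arbitrary choice): a single long sample path of running means settles down near \(\mu\).

```python
import numpy as np

# SLLN sketch: one sample path of running means of X_i ~ Exponential(1)
# (mu = 1); the running mean stabilizes near mu as n grows.
rng = np.random.default_rng(5)
x = rng.exponential(scale=1.0, size=1_000_000)
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)

for n in (100, 10_000, 1_000_000):
    print(f"n={n}: running mean = {running_mean[n - 1]:.4f}")
```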