1.6 Limit Theorems

1.6.1 Central Limit Theorem

https://en.wikipedia.org/wiki/Galton_board

Theorem 1.2 Let X_1, X_2, \ldots, X_n be a sequence of independent and identically distributed (i.i.d.) random variables with finite mean \mu and finite variance \sigma^2.

Let \bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i be the sample mean and Z_n = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} be the standardized sample mean.

Then, as n approaches infinity, the distribution of Zn converges to a standard normal distribution, i.e.,

\lim_{n \to \infty} P(Z_n \leq z) = \Phi(z)

where \Phi(z) is the cumulative distribution function of the standard normal distribution.

Special Case: when the X_i are Bernoulli, Theorem 1.2 reduces to the De Moivre–Laplace theorem, which approximates the binomial distribution by a normal distribution.

https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem

https://en.wikipedia.org/wiki/Illustration_of_the_central_limit_theorem
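The convergence in Theorem 1.2 is easy to check by simulation. The sketch below is a minimal illustration (numpy and scipy are assumed to be available; the Exponential(1) distribution, the sample sizes, and the replication count are arbitrary choices): it standardizes the sample means of a skewed distribution and compares an empirical tail probability with \Phi(1).

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(1)

for n in [5, 50, 500]:
    # draw 100,000 independent samples of size n and standardize each sample mean
    x = rng.exponential(scale=1.0, size=(100_000, n))
    z = (x.mean(axis=1) - mu) / (sigma / np.sqrt(n))
    # the empirical P(Z_n <= 1) should approach Phi(1) ~ 0.8413 as n grows
    print(n, np.mean(z <= 1.0), norm.cdf(1.0))
```

Even though the underlying Exponential(1) distribution is heavily skewed, the standardized sample mean is already close to normal for moderate n.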

1.6.2 Probability inequalities

1.6.2.1 Markov's inequality

Let X be a non-negative random variable. Then for any a>0,

\begin{equation*} P(X \geq a) \leq \frac{E[X]}{a} \end{equation*}

Let I be the indicator random variable for the event \{X \geq a\}, defined as:

I = \begin{cases} 1 & \text{if } X \geq a \\ 0 & \text{if } X < a \end{cases}

Since X is non-negative, we have X \geq aI for all possible values of X. Taking the expectation of both sides:

E[X] \geq E[aI] = aE[I] = aP(X \geq a)

Dividing both sides by a gives the desired result:

P(X \geq a) \leq \frac{E[X]}{a}
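As a numerical sanity check, the bound E[X]/a can be compared with the exact tail probability for a simple non-negative distribution. The sketch below uses an Exponential(1) variable and a few thresholds, all arbitrary choices for illustration; numpy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=1_000_000)  # non-negative, E[X] = 1

for a in [1.0, 2.0, 5.0]:
    empirical = np.mean(x >= a)  # Monte Carlo estimate of P(X >= a)
    bound = x.mean() / a         # Markov bound E[X]/a
    print(a, empirical, bound)   # empirical tail never exceeds the bound
```

The bound is loose (for a = 5 the true tail is about 0.007 while the bound is 0.2), which is typical: Markov's inequality uses only the mean.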

1.6.2.2 Chebyshev's inequality

Let X be a random variable with finite mean \mu and variance \sigma^2. Then for any k>0,

P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}

Apply Markov's inequality to the non-negative random variable (X - \mu)^2 with a = k^2\sigma^2:

P((X - \mu)^2 \geq k^2\sigma^2) \leq \frac{E[(X - \mu)^2]}{k^2\sigma^2}

Since E[(X - \mu)^2] = \sigma^2 and the event \{(X - \mu)^2 \geq k^2\sigma^2\} is the same as \{|X - \mu| \geq k\sigma\}, we have:

P(|X - \mu| \geq k\sigma) \leq \frac{\sigma^2}{k^2\sigma^2} = \frac{1}{k^2}
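The same kind of check works for Chebyshev's inequality; the sketch below (an Exponential(1) sample and a few values of k, chosen arbitrarily for illustration) estimates the two-sided tail and compares it with 1/k^2.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=1_000_000)  # mean 1, variance 1
mu, sigma = x.mean(), x.std()

for k in [1.5, 2.0, 3.0]:
    empirical = np.mean(np.abs(x - mu) >= k * sigma)  # P(|X - mu| >= k*sigma)
    print(k, empirical, 1 / k**2)                     # always bounded by 1/k^2
```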

1.6.2.3 Jensen's inequality

Let f be a convex function and X be a random variable. Then:

E[f(X)] \geq f(E[X])

For a convex function f, there exists a supporting line at any point x_0 (written here with the derivative f'(x_0), assuming f is differentiable; a subgradient works in general). This means that for all x:

f(x) \geq f(x_0) + f'(x_0)(x - x_0)

Let x_0 = E[X]. Then:

f(X) \geq f(E[X]) + f'(E[X])(X - E[X])

Taking the expectation of both sides:

\begin{split} E[f(X)] &\geq E[f(E[X])] + E[f'(E[X])(X - E[X])] \\ &= f(E[X]) + f'(E[X])E[X - E[X]] \\ &= f(E[X]) \end{split}

Application:

Log-transformed OLS: take the logarithm of Y and fit OLS, which models E[\log(Y)] = X \beta

Poisson GLM: \log(E[Y]) = X \beta
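Since \log is concave, Jensen's inequality applies with the direction reversed: E[\log(Y)] \leq \log(E[Y]), so the two models above target different quantities. The sketch below makes this concrete with a log-normal outcome (an arbitrary illustrative choice; numpy is assumed to be available).

```python
import numpy as np

rng = np.random.default_rng(3)
y = rng.lognormal(mean=0.0, sigma=1.0, size=1_000_000)  # positive outcome

# log-then-average (what log-transformed OLS targets)
print(np.log(y).mean())   # approx 0.0 = E[log Y]
# average-then-log (what a Poisson / log-link GLM targets)
print(np.log(y.mean()))   # approx 0.5 = log E[Y], larger since log is concave
```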

1.6.3 Law of large numbers

1.6.3.1 Weak Law of Large Numbers (WLLN)

Theorem 1.3 Let X_1,X_2,\cdots be a sequence of independent and identically distributed (i.i.d.) random variables with finite mean \mu=E[X_i].

Then, for any \varepsilon > 0,

\lim_{n \to \infty} P\left(\left|\frac{1}{n}\sum_{i=1}^{n} X_i - \mu\right| \geq \varepsilon\right) = 0
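The WLLN can be visualized by estimating the probability inside the limit for increasing n. The sketch below (Uniform(0,1) variables, tolerance \varepsilon = 0.02, and the replication counts are arbitrary illustrative choices) shows this probability shrinking toward zero.

```python
import numpy as np

rng = np.random.default_rng(4)
mu, eps = 0.5, 0.02  # true mean of Uniform(0,1) and the tolerance epsilon

for n in [10, 100, 1_000]:
    # estimate P(|X_bar_n - mu| >= eps) from 20,000 replications
    xbar = rng.uniform(size=(20_000, n)).mean(axis=1)
    print(n, np.mean(np.abs(xbar - mu) >= eps))  # decreases toward 0
```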

1.6.3.2 Strong Law of Large Numbers (SLLN)

Theorem 1.4 Let X_1,X_2,\cdots be a sequence of i.i.d. random variables with finite mean \mu=E[X_i].

Then,

P\left(\lim_{n \to \infty} \frac{1}{n}\sum_{i=1}^{n} X_i = \mu\right) = 1
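While the WLLN is a statement about probabilities at each fixed n, the SLLN says that almost every single realized trajectory of running means converges. A minimal sketch (Uniform(0,1) draws and an arbitrary sample size, for illustration) tracks one such trajectory:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(size=100_000)  # i.i.d. Uniform(0,1), mu = 0.5

# running mean X_bar_n for n = 1, ..., 100000 along a single trajectory
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)
print(running_mean[[9, 99, 999, 9_999, 99_999]])  # settles near 0.5 and stays there
```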