18 Reference Distributions

In this chapter, you will find a list of the distributions we are using in this course.

18.1 Bernoulli Distribution

A Bernoulli random variable is a discrete random variable such that \(P(X=1) = p\) and \(P(X=0) = 1-p = q\). We get \[E[X]=p \text{ and } VAR[X]=pq.\] Example: Bernoulli random variable with p=0.6.
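To make this concrete, here is a minimal Python sketch (an illustration using scipy.stats, not part of the course materials) for the example above with p=0.6:

```python
from scipy.stats import bernoulli

# Frozen Bernoulli distribution with success probability p = 0.6
X = bernoulli(0.6)

print(X.pmf(1))   # P(X = 1) = p = 0.6
print(X.pmf(0))   # P(X = 0) = q = 0.4
print(X.mean())   # E[X] = p = 0.6
print(X.var())    # VAR[X] = p*q = 0.24
```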

18.2 Binomial Distribution

Assume that you have a binomial distribution with N trials, x number of successes, probability of success = p. Then:

\[P(X=x) = {N \choose x} p^x (1-p)^{N-x} = \frac{N!}{x! (N-x)!} p^x q^{N-x}\] You can use the binomial distribution when the following conditions hold:

  • You have a fixed number of trials, N;
  • Each trial has two outcomes, success and failure;
  • Each trial has the same probability p of success as the others;
  • All the trials are independent.

Note that we can interpret a binomial random variable with N trials as the sum of N independent and identically distributed Bernoulli random variables. Similarly, we can think of a Bernoulli r.v. as a special case of a binomial r.v. with N=1. We thus get the expected value and variance of a binomial random variable as \[E[X]= Np,\] \[VAR[X] = Npq.\] Example: Binomial distribution with N=20, p=0.3
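As a quick numerical check (a sketch assuming scipy.stats is available; it is not part of the course materials), the example N=20, p=0.3 gives:

```python
from scipy.stats import binom

N, p = 20, 0.3
X = binom(N, p)

print(X.pmf(6))   # P(X = 6) = C(20,6) * 0.3^6 * 0.7^14 ≈ 0.1916
print(X.mean())   # E[X] = N*p = 6.0
print(X.var())    # VAR[X] = N*p*q = 4.2
```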

18.3 Discrete Uniform Distribution

All potential values of the distribution have the same probability. For \(n\) possible values \(x_i\) this gives us \(P(X=x_i) = \frac{1}{n}\). The mean and variance depend on the “spacing” of the values. If we have a uniform distribution of consecutive integers on an interval \([a,b]\), i.e. the possible values are a, a+1, a+2, …, b, then \[n=(b-a)+1\] \[ E[X]=\mu = \frac{a+b}{2}\] \[VAR[X] = \sigma^2 = \frac{n^2-1}{12}\]
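For illustration, here is a short scipy.stats sketch for consecutive integers on \([1,6]\) (a fair die; the choice of interval is mine, not from the text). Note that scipy's randint excludes the upper endpoint, so we pass b+1:

```python
from scipy.stats import randint

a, b = 1, 6            # consecutive integers 1, 2, ..., 6 (a fair die)
n = (b - a) + 1
X = randint(a, b + 1)  # scipy's upper bound is exclusive

print(X.pmf(3))        # P(X = 3) = 1/n ≈ 0.1667
print(X.mean())        # E[X] = (a+b)/2 = 3.5
print(X.var())         # VAR[X] = (n^2 - 1)/12 ≈ 2.9167
```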

18.4 Continuous Uniform Distribution

Assume that you have a uniform distribution on an interval \([a,b]\), where \(-\infty < a < b < \infty\). Then

\[\begin{equation*} f(x) = \left\{ \begin{array}{ll} \frac{1}{b-a} & x \in [a,b] \\ 0 & x \notin [a,b] \end{array} \right . \end{equation*}\] \[\begin{equation*} F(x) = \left\{ \begin{array}{ll} 0 & x <a\\ \frac{x-a}{b-a} & x \in [a,b]\\ 1 & x>b \end{array} \right. \end{equation*}\] \[\begin{equation*} E[X] = \frac{a+b}{2} \end{equation*}\] \[\begin{equation*} VAR[X] = \frac{(b-a)^2}{12} \end{equation*}\]

Example: Uniform distribution on \([2,5]\)
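A minimal sketch of this example in Python (again assuming scipy.stats; note that scipy parameterizes the interval as [loc, loc + scale]):

```python
from scipy.stats import uniform

a, b = 2, 5
X = uniform(loc=a, scale=b - a)  # uniform on [2, 5]

print(X.pdf(3))    # f(3) = 1/(b-a) ≈ 0.3333
print(X.cdf(4))    # F(4) = (4-a)/(b-a) ≈ 0.6667
print(X.mean())    # E[X] = (a+b)/2 = 3.5
print(X.var())     # VAR[X] = (b-a)^2/12 = 0.75
```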

18.5 Exponential Distribution

An exponential distribution is characterized by a single parameter \(\lambda\). In the parameterization used below, \(\lambda\) is the mean of the distribution (its scale parameter), so events occur at a constant average rate of \(1/\lambda\). The exponential distribution describes the time between events in a process in which events occur continuously and independently at a constant average rate. A random variable \(X\) with an exponential distribution is continuous and often measures a passage of time, although it can be used in other applications.

Example: Assume there are on average four customers every hour. What is the probability that the next customer arrives within the next 20 minutes? Here our time scale would be in hours, so the mean waiting time is \(\lambda = \frac{1}{4}\). We find \(P(X\le \frac{1}{3}) = 1 - e^{-(1/3)/\lambda} = 1 - e^{-4/3} \approx 0.74.\)

The density function of an exponential distribution is given by
\[f(x) = \frac{1}{\lambda} e^{-\frac{x}{\lambda}},\text{ } x\ge 0\] We have \[E[X]=\lambda \] \[VAR[X] = \lambda^2\]

Example: Exponential distribution with \(\lambda=4\) and \(\lambda = 1\).
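The customer example can be checked with scipy.stats (illustrative only; scipy's expon uses the same mean/scale convention as the density above):

```python
from scipy.stats import expon

lam = 1 / 4           # mean waiting time in hours (4 customers per hour)
X = expon(scale=lam)  # scipy's scale parameter is the mean

print(X.cdf(1 / 3))   # P(X <= 1/3 hour) = 1 - e^(-4/3) ≈ 0.7364
print(X.mean())       # E[X] = lambda = 0.25
print(X.var())        # VAR[X] = lambda^2 = 0.0625
```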

18.6 Normal Distribution, a.k.a. Gaussian Distribution

A normal distribution is determined by its mean \(\mu\) and standard deviation \(\sigma\). For a random variable X, we write \(X \sim \mathcal{N} (\mu, \sigma)\). If \(\mu = 0\) and \(\sigma = 1\) we have a standard normal distribution. The density function is given by \[f(x) = \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{(x-\mu)^2}{2 \sigma ^2}}.\] Note that this function is symmetric about the mean and has inflection points at \(\mu \pm \sigma.\)

Example: Normal distribution with mean = 2, standard deviation = 0.5
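A short sketch of this example (assuming scipy.stats) that also checks the symmetry about the mean:

```python
from scipy.stats import norm

mu, sigma = 2, 0.5
X = norm(loc=mu, scale=sigma)

print(X.pdf(mu))                              # density peak at the mean ≈ 0.7979
print(X.pdf(mu - 1) == X.pdf(mu + 1))         # symmetric about mu: True
print(X.cdf(mu + sigma) - X.cdf(mu - sigma))  # ≈ 0.6827 within one sigma
```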

18.7 Student t-distribution

Assume you draw many samples, each consisting of n independent observations from the same normal distribution with mean \(\mu\) and standard deviation \(\sigma\). For each sample, compute the associated t-value \(t=\frac{\bar{x}-\mu}{s/\sqrt{n}}\). A t-distribution with n-1 degrees of freedom is the sampling distribution of those t-values. Note that the t-distribution changes shape depending on the size of the samples used. As \(n \rightarrow \infty\), the t-distribution approaches the standard normal distribution (see Central Limit Theorem). We have \[E[X] = 0 \text{ for }n>2 \text{, else undefined}\] \[VAR[X] = \frac{n-1}{n-3} \text{ for }n>3, \infty \text{ for }n=3, \text{ else undefined} \] The density function is given by \[f(x) = \frac{\Gamma(\frac{n}{2})}{\sqrt{(n-1)\pi}\Gamma(\frac{n-1}{2})} \left(1+\frac{x^2}{n-1}\right)^{-n/2},\] where the Gamma function \(\Gamma(x)\) is defined for positive x as \(\Gamma(x) = \int_0^{\infty} t^{x-1}e^{-t}dt.\) It can be shown (challenge exercise) that \[\lim_{n \rightarrow \infty} \frac{\Gamma(\frac{n}{2})}{\sqrt{(n-1)\pi}\Gamma(\frac{n-1}{2})} \left(1+\frac{x^2}{n-1}\right)^{-n/2} = \frac{1}{\sqrt{2 \pi}}e^{-x^2/2}.\]
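To see the moment formulas and the normal limit numerically, here is a brief scipy.stats sketch (the sample size n=10 is an arbitrary choice for illustration):

```python
from scipy.stats import t, norm

n = 10
X = t(df=n - 1)  # t-distribution with n-1 = 9 degrees of freedom

print(X.mean())  # E[X] = 0 for n > 2
print(X.var())   # VAR[X] = (n-1)/(n-3) = 9/7 ≈ 1.2857

# For large n, the t density approaches the standard normal density:
print(t(df=1000).pdf(0), norm.pdf(0))  # both ≈ 0.3989
```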

18.8 Chi-square Distribution with \(k\) degrees of freedom

The chi-square distribution (\(\chi^2\)) arises as the distribution of the sum of squares of \(k\) independent standard normal random variables. If a random variable \(X \sim \chi^2(k)\), then \[E[X] = k \text{ and }\] \[VAR[X]=2k.\]

The density function is given by \[f(x) = \frac{1}{2^{\frac{k}{2}}\Gamma(\frac{k}{2})} x^{\frac{k}{2}-1}e^{-\frac{x}{2}},\text{ } x>0.\] The Gamma function \(\Gamma(x)\) is defined for positive x as \[\Gamma(x) = \int_0^{\infty} t^{x-1}e^{-t}dt.\] Fun fact: \(\Gamma(n) = (n-1)!\) for positive integers \(n\).
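The defining property, that a sum of squares of k independent standard normals is \(\chi^2(k)\), can be checked by simulation (a sketch assuming numpy and scipy.stats; k=5 is an arbitrary choice):

```python
import numpy as np
from scipy.stats import chi2

k = 5
X = chi2(k)
print(X.mean(), X.var())  # E[X] = k = 5, VAR[X] = 2k = 10

# Simulate sums of squares of k independent standard normals
rng = np.random.default_rng(0)
samples = (rng.standard_normal((100_000, k)) ** 2).sum(axis=1)
print(samples.mean(), samples.var())  # ≈ 5 and ≈ 10
```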

18.9 F-distribution

Assume you have two independent random variables \(X_1\) and \(X_2\) with \(\chi^2\) distributions with \(k_1\) and \(k_2\) degrees of freedom, respectively. Compute the ratio \[X = \frac{X_1 / k_1}{X_2 / k_2}.\] The distribution of \(X\) is called the F-distribution with \(k_1\) and \(k_2\) degrees of freedom. The formulas for the density and cumulative distribution functions are too complicated for this course, so we will skip them and just look at the graphs below. However, you should convince yourself that, given the definition of the ratio \(X\), the density and cdf graphs make sense.

Furthermore, we have \[E[X] = \frac{k_2}{k_2-2} \text{ for } k_2>2\] \[VAR[X] = \frac{2k_2^2(k_1+k_2-2)}{k_1(k_2-2)^2(k_2-4)} \text{ for } k_2 > 4\] Strangely, the expected value does not depend on \(k_1\).

Note that, if \(k_1 = k_2 \rightarrow \infty\), \(E[X] \rightarrow 1\) and \(VAR[X] \rightarrow 0\).
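These limits, and the moment formulas, can be checked with scipy.stats (the degrees of freedom below are arbitrary illustration values):

```python
from scipy.stats import f

k1, k2 = 5, 12
X = f(k1, k2)

print(X.mean())  # k2/(k2-2) = 12/10 = 1.2 (independent of k1)
print(X.var())   # 2*k2^2*(k1+k2-2) / (k1*(k2-2)^2*(k2-4)) = 1.08

# As k1 = k2 grows, E[X] -> 1 and VAR[X] -> 0:
Y = f(1000, 1000)
print(Y.mean(), Y.var())  # ≈ 1.002 and ≈ 0.004
```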