# Chapter 4 Probability Distribution

## 4.1 Introduction to Probability

A probability model consists of a nonempty set called the sample space S; a collection of events, which are subsets of S; and a probability measure P that assigns each event a probability between 0 and 1, with $$P(\varnothing)=0$$, $$P(S)=1$$, and P additive.

• Sample space S: the set of all possible outcomes.
  • Example: for a single coin flip, $$S=\{H,T\}$$.
• Event E: a subset of the sample space.
  • Example: in a single coin flip, $$\{H\}$$ (the coin lands heads) is an event.
• Probability: for each event E, $$P(E)$$ is the probability that E occurs. P has the following properties:
  • $$P(E)$$ is a real number between 0 and 1 inclusive: $$0 \le P(E) \le 1$$.
  • $$P(\varnothing)=0$$, i.e., if E is the empty set, then $$P(E) = 0$$.
  • $$P(S)=1$$, i.e., if E is the entire sample space S, then $$P(E)=1$$.
  • P is (countably) additive, meaning that if $$A_1, A_2, \ldots$$ is a finite or countable sequence of disjoint events, then $$P(A_1 \cup A_2 \cup \cdots) = \sum_{i} P(A_i)$$.
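
The properties above can be checked numerically on a small finite model. The sketch below assumes a fair six-sided die, which is not an example from the text, and uses a hypothetical helper `prob()` that computes P(E) by counting equally likely outcomes.

```r
# Sample space for one roll of a fair six-sided die (illustrative assumption)
S <- 1:6

# Hypothetical helper: P(E) when all outcomes in S are equally likely
prob <- function(E) length(intersect(E, S)) / length(S)

A <- c(2, 4, 6)   # event "the roll is even"
B <- c(1, 3)      # event "the roll is 1 or 3", disjoint from A

prob(integer(0))  # probability of the empty event
prob(S)           # probability of the whole sample space

# Additivity for disjoint events; all.equal() avoids exact
# floating-point comparison of the two sides
all.equal(prob(union(A, B)), prob(A) + prob(B))
```

Comparing with `all.equal()` rather than `==` is deliberate: sums of fractions such as 1/2 + 1/3 are not always represented exactly in floating-point arithmetic.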

## 4.2 Random Variables

A random variable X is a function from the sample space S to the real numbers $$\mathbb{R}$$. The distribution of X is the collection of probabilities $$P(X \in B)$$ for subsets B of $$\mathbb{R}$$.
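
To make the idea of a random variable as a function concrete, the sketch below assumes two flips of a fair coin (an illustrative example, not taken from the text) and defines X as the number of heads. The names `S`, `X`, and `dist_X` are chosen here for illustration.

```r
# Sample space for two flips of a fair coin (illustrative assumption)
S <- c("HH", "HT", "TH", "TT")

# Random variable X: maps each outcome to its number of heads
# (remove every "T" and count the characters that remain)
X <- function(outcome) nchar(gsub("T", "", outcome))

X(S)  # value of X for each outcome in S

# Distribution of X: P(X = x) for each attainable x,
# assuming all four outcomes are equally likely
dist_X <- table(X(S)) / length(S)
dist_X
```

From `dist_X`, a probability such as $$P(X \in \{1,2\})$$ is obtained by adding the entries for 1 and 2.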

### 4.2.1 Discrete Distribution

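A discrete random variable X assumes values in a discrete (finite or countably infinite) subset of $$\mathbb{R}$$.

The distribution of a discrete random variable X is described by its probability mass function (PMF), $$p_x(x) = P(X=x)$$, which satisfies $$\sum_{x} P(X=x) = 1$$, where the sum runs over all possible values x of X.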

### 4.2.2 Continuous Distribution

A continuous random variable X assumes values in $$\mathbb{R}$$.

The distribution of a continuous random variable X is described by its probability density function (PDF) $$f_x$$, which satisfies:

• $$P(X=x) = 0$$ for every x
• $$\int_{-\infty}^{\infty} f_x(x)\,dx = 1$$
• $$P(a \le X \le b) = \int_{a}^{b} f_x(x)\,dx$$
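
The density properties can be checked by numerical integration. The sketch below uses the standard normal density `dnorm()` as an example density and base R's `integrate()`; the interval endpoints `a` and `b` are arbitrary values chosen for illustration.

```r
# Example density: standard normal (mean 0, standard deviation 1)
f <- function(x) dnorm(x, mean = 0, sd = 1)

# The total area under a density should be (numerically close to) 1
total <- integrate(f, lower = -Inf, upper = Inf)$value
total

# P(a <= X <= b) as the area under f between a and b
a <- -1
b <- 1
p_ab <- integrate(f, lower = a, upper = b)$value

# The same probability from the cumulative distribution function
p_cdf <- pnorm(b) - pnorm(a)

c(integral = p_ab, cdf = p_cdf)
```

Because $$P(X=x)=0$$ for a continuous variable, the result is the same whether the endpoints are included or not.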

## 4.3 Probability Distribution

### 4.3.1 Bernoulli Distribution

A random variable X has the Bernoulli distribution if it takes only two possible values, 0 and 1, where the value 1 (a "success") occurs with probability p.

$$X \sim Ber(p)$$ if $$X \in \{0,1\}$$ and $$p_x(1)=1-p_x(0)=p$$.
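
Base R has no dedicated Bernoulli functions, but a Bernoulli variable is a binomial variable with a single trial, so the binomial functions with `size = 1` can be used. The sketch below assumes an illustrative success probability of 0.3 and an arbitrary seed.

```r
p <- 0.3  # success probability (illustrative value)

# Bernoulli(p) is Binomial(1, p)
dbinom(1, size = 1, prob = p)  # P(X = 1), equal to p
dbinom(0, size = 1, prob = p)  # P(X = 0), equal to 1 - p

# Simulate 1000 Bernoulli trials; the sample mean estimates p
set.seed(1)  # arbitrary seed, only for reproducibility
x <- rbinom(1000, size = 1, prob = p)
table(x)
mean(x)
```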

### 4.3.2 Binomial Distribution

The binomial distribution models the number of successes k in a fixed number n of independent trials, each with the same probability of success p.

$$X \sim Bin(n,p)$$ if $$X \in \{0,1,2,\ldots,n\}$$ and $$p_x(k)= \binom{n}{k} p^k (1-p)^{n-k}$$.
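
As a worked check of the formula, take n = 5 trials, k = 3 successes, and p = 0.5 (values chosen here for illustration):

$$p_x(3) = \binom{5}{3} (0.5)^3 (1-0.5)^{5-3} = 10 \times 0.125 \times 0.25 = 0.3125$$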

R provides the following functions for working with the binomial distribution:

• dbinom(x, size, prob) : evaluates the probability mass function, i.e. the probability of getting exactly x successes in size trials, each with success probability prob.
## Probability of getting exactly 3 successes in 5 trials with a success probability of 0.5 :
dbinom(3, size = 5, prob = 0.5)
## [1] 0.3125
• pbinom(q, size, prob) : evaluates the cumulative distribution function, i.e. the probability of getting at most q successes in size trials, each with success probability prob.
## Probability of getting up to 3 successes in 5 trials with a success probability of 0.5 :
pbinom(3, size = 5, prob = 0.5)
## [1] 0.8125
• qbinom(p, size, prob) : evaluates the quantile function, i.e. the smallest number of successes k such that the probability of getting k or fewer successes is at least p, in size trials with success probability prob.
## Find the smallest number of successes such that the probability of getting that number or fewer successes is at least 0.8 :
qbinom(0.8, size = 5, prob = 0.5) 
## [1] 3
• rbinom(n, size, prob) : generates n random numbers from a binomial distribution with size trials and success probability prob.
## Generate 10 random numbers following a binomial distribution with 5 trials and a success probability of 0.5 :
rbinom(10, size = 5, prob = 0.5) 
##  [1] 4 4 1 2 4 2 4 1 4 3
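
The four functions are tied together by the binomial formula. The sketch below, using the same n = 5 and p = 0.5 as the examples above, recomputes the PMF by hand and checks it against `dbinom()`, `pbinom()`, and `qbinom()`. The variable names are chosen here for illustration.

```r
n <- 5
p <- 0.5
k <- 0:n

# PMF computed directly from the formula, compared with dbinom()
pmf_manual <- choose(n, k) * p^k * (1 - p)^(n - k)
all.equal(pmf_manual, dbinom(k, size = n, prob = p))

# The CDF is the running sum of the PMF
all.equal(cumsum(pmf_manual), pbinom(k, size = n, prob = p))

# The quantile is the smallest k whose cumulative probability reaches the target
min(k[pbinom(k, size = n, prob = p) >= 0.8])
qbinom(0.8, size = n, prob = p)
```

Since the distribution is discrete, `pbinom(qbinom(0.8, n, p), n, p)` is generally larger than 0.8 rather than equal to it.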

### 4.3.3 Poisson Distribution

The Poisson distribution models the probability of a certain number of events occurring within a fixed interval of time or space, given a known average rate of occurrence.

$$X \sim Pois(\lambda)$$ if $$X \in \{0,1,2,\ldots\}$$ and $$p_x(k)= \frac{\lambda^k e^{-\lambda}}{k!}$$.
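
As a worked check of the formula, take an average rate of $$\lambda = 2$$ and k = 3 events (values chosen here for illustration):

$$p_x(3) = \frac{2^3 e^{-2}}{3!} = \frac{8}{6}\, e^{-2} \approx 1.3333 \times 0.1353 \approx 0.1804$$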

R provides the following functions for working with the Poisson distribution:

• dpois(x, lambda) : evaluates the probability mass function, i.e. the probability of observing exactly x events in a given interval with an average rate of occurrence lambda.
## Probability of observing exactly 3 events in an interval with an average rate of occurrence of 2 :
dpois(3, lambda = 2) 
## [1] 0.180447
• ppois(q, lambda) : evaluates the cumulative distribution function, i.e. the probability of observing at most q events in a given interval with an average rate of occurrence lambda.
## Probability of observing up to 3 events in an interval with an average rate of occurrence of 2 :
ppois(3, lambda = 2) 
## [1] 0.8571235
• qpois(p, lambda) : evaluates the quantile function, i.e. the smallest number of events k such that the probability of observing k or fewer events is at least p, in a given interval with an average rate of occurrence lambda.
## Find the smallest number of events such that the probability of observing that number or fewer events is at least 0.8 :
qpois(0.8, lambda = 2) 
## [1] 3
• rpois(n, lambda) : generates n random numbers following a Poisson distribution with an average rate of occurrence lambda
## Generate 10 random numbers following a Poisson distribution with an average rate of occurrence of 2 :
rpois(10, lambda = 2) 
##  [1] 3 2 1 1 2 4 5 2 2 3
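
The sketch below checks the Poisson formula against `dpois()` and uses simulation to illustrate that both the mean and the variance of a Poisson variable are close to lambda. The sample size and seed are arbitrary choices, not values from the text.

```r
lambda <- 2
k <- 0:10

# PMF computed directly from the formula, compared with dpois()
pmf_manual <- lambda^k * exp(-lambda) / factorial(k)
all.equal(pmf_manual, dpois(k, lambda = lambda))

# Simulated draws: sample mean and sample variance should both be near lambda
set.seed(42)  # arbitrary seed, only for reproducibility
x <- rpois(10000, lambda = lambda)
c(mean = mean(x), variance = var(x))
```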

### 4.3.4 Normal Distribution

The normal distribution, also known as the Gaussian distribution, is a probability distribution that is symmetric, bell-shaped, and characterized by its mean and standard deviation, with the majority of observations clustered around the mean.

$$X \sim Normal(\mu, \sigma^2)$$ if $$-\infty<X<\infty$$ and $$f_x(x)= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}$$.
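
As a worked check, consider the standard normal case with $$\mu = 0$$ and $$\sigma = 1$$, evaluated at x = 0 (values chosen here for illustration). The exponent vanishes, leaving only the normalizing constant:

$$f_x(0) = \frac{1}{\sqrt{2\pi \cdot 1^2}} \exp\{0\} = \frac{1}{\sqrt{2\pi}} \approx 0.3989$$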

R provides the following functions for working with the normal distribution:

• dnorm(x, mean, sd) : calculates the probability density at a given point x for a normal distribution with mean mean and standard deviation sd.
# Probability density at x = 0 for a normal distribution with mean 0 and standard deviation 1
dnorm(0, mean = 0, sd = 1)
## [1] 0.3989423
• pnorm(x, mean, sd) : calculates the cumulative distribution function, probability of observing a value less than or equal to x in a normal distribution with mean mean and standard deviation sd.
# Cumulative probability up to x = 1 for a normal distribution with mean 0 and standard deviation 1
pnorm(1, mean = 0, sd = 1)
## [1] 0.8413447
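• qnorm(p, mean, sd) : evaluates the quantile function, i.e. finds the value such that the probability of observing a value less than or equal to it is p, in a normal distribution with mean mean and standard deviation sd.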
# Find the value such that the cumulative probability is 0.8 for a normal distribution with mean 0 and standard deviation 1
qnorm(0.8, mean = 0, sd = 1)
## [1] 0.8416212
• rnorm(n, mean, sd) : generates n random numbers following a normal distribution with mean mean and standard deviation sd.
# Generate 10 random numbers following a normal distribution with mean 0 and standard deviation 1
rnorm(10, mean = 0, sd = 1)
##  [1] -1.0378804  1.1208806  0.7059895  0.3188912  0.2439931 -0.2477641
##  [7] -0.8954654  1.1203232 -0.1612854 -1.4704093
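
The sketch below relates these functions to the density formula, using a non-standard normal distribution so that the role of the standard deviation is visible. The values of `mu`, `sigma`, and `x` are illustrative assumptions.

```r
mu <- 1
sigma <- 2
x <- c(-2, 0, 1.5)

# Density computed directly from the formula, compared with dnorm()
dens_manual <- 1 / sqrt(2 * pi * sigma^2) * exp(-(x - mu)^2 / (2 * sigma^2))
all.equal(dens_manual, dnorm(x, mean = mu, sd = sigma))

# qnorm() inverts pnorm(): this recovers the original points x
qnorm(pnorm(x, mean = mu, sd = sigma), mean = mu, sd = sigma)

# Probability of falling within one standard deviation of the mean (about 0.683)
pnorm(mu + sigma, mean = mu, sd = sigma) - pnorm(mu - sigma, mean = mu, sd = sigma)
```

Note that `dnorm()`, `pnorm()`, `qnorm()`, and `rnorm()` take the standard deviation `sd`, not the variance.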

## 4.4 Probability Distribution Summary

Type: D = discrete, C = continuous.

| Distribution | Type | PMF/PDF | Support | E(X) | Var(X) | R functions |
|---|---|---|---|---|---|---|
| Bernoulli | D | $$p_x(k)= p^k (1-p)^{1-k}$$ | $$k \in \{0,1\}$$ | $$p$$ | $$p(1-p)$$ | - |
| Binomial | D | $$p_x(k)= \binom{n}{k} p^k (1-p)^{n-k}$$ | $$k \in \{0,1,2,\ldots,n\}$$ | $$np$$ | $$np(1-p)$$ | dbinom, pbinom, qbinom, rbinom |
| Poisson | D | $$p_x(k)= \frac{\lambda^k e^{-\lambda}}{k!}$$ | $$k \in \{0,1,2,\ldots\}$$ | $$\lambda$$ | $$\lambda$$ | dpois, ppois, qpois, rpois |
| Geometric | D | $$p_x(k)= (1-p)^{k-1}p$$ | $$k \in \{1,2,3,\ldots\}$$ | $$\frac{1}{p}$$ | $$\frac{1-p}{p^2}$$ | dgeom, pgeom, qgeom, rgeom |
| Uniform | C | $$f_x(x)= \frac{1}{b-a}$$ | $$a \le x \le b$$ | $$\frac{a+b}{2}$$ | $$\frac{(b-a)^2}{12}$$ | dunif, punif, qunif, runif |
| Normal | C | $$f_x(x)= \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$ | $$-\infty<x<\infty$$ | $$\mu$$ | $$\sigma^2$$ | dnorm, pnorm, qnorm, rnorm |
| Exponential | C | $$f_x(x)=\lambda e^{-\lambda x}$$ | $$x>0$$ | $$\frac{1}{\lambda}$$ | $$\frac{1}{\lambda^2}$$ | dexp, pexp, qexp, rexp |
| Beta | C | $$f_x(x)=\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha-1}(1-x)^{\beta-1}$$ | $$0 \le x \le 1$$ | $$\frac{\alpha}{\alpha+\beta}$$ | $$\frac{\alpha\beta}{(\alpha+\beta)^2 (\alpha+\beta+1)}$$ | dbeta, pbeta, qbeta, rbeta |

The geometric row counts the trial on which the first success occurs. R's dgeom, pgeom, qgeom, and rgeom instead count the number of failures before the first success (support 0, 1, 2, ...), so the PMF above at k corresponds to dgeom(k - 1, prob = p).
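
The mean and variance formulas in the summary can be spot-checked by simulation. The sketch below does this for the uniform, exponential, and geometric distributions; the parameter values, sample size, and seed are arbitrary choices, and `compare()` is a hypothetical helper defined here. Because `rgeom()` returns the number of failures before the first success, 1 is added to obtain the trial number of the first success, which is the quantity with mean 1/p.

```r
set.seed(123)      # arbitrary seed, only for reproducibility
n_sim <- 100000    # number of simulated draws per distribution

# Hypothetical helper: sample mean/variance next to the theoretical values
compare <- function(x, mean_theory, var_theory) {
  c(mean_sim = mean(x), mean_theory = mean_theory,
    var_sim = var(x), var_theory = var_theory)
}

# Uniform on [a, b]
a <- 2
b <- 5
u <- runif(n_sim, min = a, max = b)
res_unif <- compare(u, (a + b) / 2, (b - a)^2 / 12)

# Exponential with rate lambda
lambda <- 0.5
e <- rexp(n_sim, rate = lambda)
res_exp <- compare(e, 1 / lambda, 1 / lambda^2)

# Geometric: shift rgeom() by 1 to count the trial of the first success
p <- 0.25
g <- rgeom(n_sim, prob = p) + 1
res_geom <- compare(g, 1 / p, (1 - p) / p^2)

round(rbind(uniform = res_unif, exponential = res_exp, geometric = res_geom), 3)
```

The simulated and theoretical columns should agree to within sampling error, which shrinks as `n_sim` grows.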