Chapter 4 Probability Distribution

4.1 Introduction to Probability

A probability model consists of three components: a nonempty set S called the sample space; a collection of events, which are subsets of S; and a probability measure P that assigns each event a probability between 0 and 1, with \(P(\varnothing)=0\), \(P(S)=1\), and P additive over disjoint events.

  • Sample space S: the set of possible outcomes.
    • example: the sample space of a single coin flip is \(S=\{H,T\}\)
  • Event E: a subset of the sample space.
    • example: in a single coin flip, \(\{H\}\) (the coin lands heads) is an event.
  • Probability: for each event E, \(P(E)\) is the probability that E occurs. The probability measure P has the following properties:
    • \(P(E)\) is always a real number between 0 and 1 inclusive: \(0 \le P(E) \le 1\)
    • \(P(\varnothing)=0\), i.e., if E is the empty set, then \(P(E) = 0\)
    • \(P(S)=1\), i.e., if E is the entire sample space S, then \(P(E)=1\)
    • P is (countably) additive, meaning that if \(A_1, A_2, \ldots\) is a finite or countable sequence of disjoint events, then \(P(A_1 \cup A_2 \cup \cdots) = \sum_{i} P(A_i)\)
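The properties above can be checked on a small finite example. The sketch below assumes one roll of a fair six-sided die with equally likely outcomes. The die, the helper `prob`, and the events `A` and `B` are illustrative choices, not part of the definitions.

```r
# Sample space of one roll of a fair six-sided die (illustrative assumption)
S <- 1:6

# Probability measure for equally likely outcomes: P(E) = |E| / |S|
prob <- function(E) length(intersect(E, S)) / length(S)

A <- c(2, 4, 6)   # event "the roll is even"
B <- c(1, 3)      # event "the roll is 1 or 3", disjoint from A

prob(integer(0))  # P(empty set) = 0
prob(S)           # P(S) = 1
prob(A)           # P(A) = 0.5

# Additivity for disjoint events: P(A u B) = P(A) + P(B)
all.equal(prob(union(A, B)), prob(A) + prob(B))
```

Because `A` and `B` share no outcomes, the probability of their union equals the sum of their probabilities. For overlapping events the sum would count the shared outcomes twice.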

4.2 Random Variables

A random variable X is a function from the sample space S to the real numbers, \(X: S \to \mathbb{R}\). The distribution of X is the collection of probabilities \(P(X \in B)\) for all (measurable) subsets B of \(\mathbb{R}\).

4.2.1 Discrete Distribution

A discrete random variable X takes values in a discrete (finite or countably infinite) subset of \(\mathbb{R}\).

The distribution of a discrete random variable X is described by its probability mass function (PMF) \(p_x(x) = P(X=x)\), where \[\sum_{x} p_x(x) = 1\] and the sum runs over all possible values x of X.
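To make the idea of a random variable as a function on the sample space concrete, here is a minimal sketch that assumes two flips of a fair coin and takes X to be the number of heads. The experiment and the names `S`, `X`, and `pmf` are chosen only for illustration.

```r
# Sample space of two fair coin flips (illustrative assumption: outcomes equally likely)
S <- expand.grid(flip1 = c("H", "T"), flip2 = c("H", "T"),
                 stringsAsFactors = FALSE)

# Random variable X: maps each outcome (row of S) to its number of heads
X <- rowSums(S == "H")

# PMF: P(X = x) is the share of outcomes that X maps to x
pmf <- table(X) / nrow(S)
pmf

# The PMF sums to 1 over all possible values of X
sum(pmf)
```

The four outcomes map to the values 0, 1, and 2, with probabilities 0.25, 0.5, and 0.25.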

4.2.2 Continuous Distribution

A continuous random variable X takes values in an uncountable subset of \(\mathbb{R}\), such as an interval.

The distribution of a continuous random variable X is described by its probability density function (PDF) \(f_x\), a nonnegative function satisfying \[ P(X=x) = 0 \] \[ \int_{-\infty}^{\infty} f_x(x)\, dx = 1 \] \[ P(a \le X \le b) = \int_{a}^{b} f_x(x)\, dx \]
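The integral properties of a density can be checked numerically with base R's `integrate`. The sketch below uses a hypothetical density, \(f(x) = 2x\) on \([0, 1]\) and 0 elsewhere, chosen only because its integrals are easy to verify by hand.

```r
# Hypothetical density for illustration: f(x) = 2x on [0, 1], 0 elsewhere
f <- function(x) ifelse(x >= 0 & x <= 1, 2 * x, 0)

# The total area under the density is 1
total_area <- integrate(f, lower = 0, upper = 1)$value
total_area

# P(0.2 <= X <= 0.5) is the area under f between 0.2 and 0.5
# By hand: 0.5^2 - 0.2^2 = 0.21
p_interval <- integrate(f, lower = 0.2, upper = 0.5)$value
p_interval
```

Since probabilities are areas, shrinking the interval \([a, b]\) to a single point shrinks the area to 0, which is why \(P(X=x) = 0\) for a continuous random variable.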

4.3 Probability Distribution

4.3.1 Bernoulli Distribution

A random variable X has the Bernoulli distribution if it takes only two possible values, 1 (success) and 0 (failure), where p is the probability of success.

\[X \sim Ber(p)\] if \(X \in \{0,1\}\) and \(p_x(1)=1-p_x(0)=p\)
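Base R has no dedicated Bernoulli functions, but a Bernoulli trial is a binomial experiment with a single trial, so the binomial functions with `size = 1` can stand in. A minimal sketch, with `p = 0.3` as an arbitrary illustrative value:

```r
p <- 0.3  # illustrative success probability (an assumption for this example)

# Bernoulli(p) is the same as Binomial(1, p), so use size = 1
dbinom(1, size = 1, prob = p)  # P(X = 1) = p
dbinom(0, size = 1, prob = p)  # P(X = 0) = 1 - p

# Simulate 10 Bernoulli trials; the draws vary from run to run
rbinom(10, size = 1, prob = p)
```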

4.3.2 Binomial Distribution

The binomial distribution models the number of successes k in a fixed number n of independent trials, each with the same probability of success p.

\[X \sim Bin(n,p)\] if \(X \in \{0,1,2, \ldots ,n\}\) and \(p_x(k)= \binom{n}{k} p^k (1-p)^{n-k}\)

R provides the following functions for working with the binomial distribution:

  • dbinom(x, size, prob) : evaluates the probability mass function, i.e., the probability of getting exactly x successes in size trials, each with success probability prob
## Probability of getting exactly 3 successes in 5 trials with a success probability of 0.5 : 
dbinom(3, size = 5, prob = 0.5)
## [1] 0.3125
  • pbinom(q, size, prob) : evaluates the cumulative distribution function, i.e., the probability of getting at most q successes in size trials, each with success probability prob
## Probability of getting up to 3 successes in 5 trials with a success probability of 0.5 : 
pbinom(3, size = 5, prob = 0.5)
## [1] 0.8125
  • qbinom(p, size, prob) : evaluates the quantile function, i.e., the smallest number of successes k such that the probability of getting k or fewer successes is at least p, in size trials each with success probability prob
## Find the smallest number of successes k such that P(X <= k) is at least 0.8 in 5 trials with a success probability of 0.5 : 
qbinom(0.8, size = 5, prob = 0.5) 
## [1] 3
  • rbinom(n, size, prob) : generates n random numbers from a binomial distribution with size trials and success probability prob
## Generate 10 random numbers following a binomial distribution with 5 trials and a success probability of 0.5 : 
rbinom(10, size = 5, prob = 0.5) 
##  [1] 2 1 4 5 2 4 2 3 4 2
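As a cross-check, the values returned by these functions can be reproduced from the binomial formula. The sketch below reuses n = 5, p = 0.5, and k = 3 from the examples above. The variable names are illustrative, and `set.seed(1)` is an arbitrary seed that makes the random draws reproducible.

```r
n <- 5; p <- 0.5; k <- 3

# PMF from the formula: choose(n, k) * p^k * (1 - p)^(n - k)
manual_pmf <- choose(n, k) * p^k * (1 - p)^(n - k)
all.equal(manual_pmf, dbinom(k, size = n, prob = p))

# CDF as the sum of the PMF over 0, 1, ..., k
manual_cdf <- sum(dbinom(0:k, size = n, prob = p))
all.equal(manual_cdf, pbinom(k, size = n, prob = p))

# qbinom returns the smallest k whose CDF value reaches the target probability
cdf <- pbinom(0:n, size = n, prob = p)
manual_q <- min(which(cdf >= 0.8)) - 1   # subtract 1 because k starts at 0
manual_q == qbinom(0.8, size = n, prob = p)

# Fixing the seed makes the simulated values repeatable across runs
set.seed(1)
rbinom(10, size = n, prob = p)
```

Because the binomial distribution is discrete, the CDF jumps from 0.5 at k = 2 to 0.8125 at k = 3, so no k has a cumulative probability of exactly 0.8. This is why the quantile is defined as the smallest k that reaches it.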

4.3.3 Poisson Distribution

The Poisson distribution models the probability of a certain number of events occurring within a fixed interval of time or space, given a known average rate of occurrence.

\[X \sim Pois(\lambda)\] if \(X \in \{0,1,2, \ldots \}\) and \(p_x(k)= \frac{\lambda^k e^{-\lambda}}{k!}\)

R provides the following functions for working with the Poisson distribution:

  • dpois(x, lambda) : evaluates the probability mass function, i.e., the probability of observing exactly x events in a given interval with an average rate of occurrence lambda.
## Probability of observing exactly 3 events in an interval with an average rate of occurrence of 2 : 
dpois(3, lambda = 2) 
## [1] 0.180447
  • ppois(q, lambda) : evaluates the cumulative distribution function, i.e., the probability of observing at most q events in a given interval with an average rate of occurrence lambda.
## Probability of observing up to 3 events in an interval with an average rate of occurrence of 2 : 
ppois(3, lambda = 2) 
## [1] 0.8571235
  • qpois(p, lambda) : evaluates the quantile function, i.e., the smallest number of events k such that the probability of observing k or fewer events is at least p, in a given interval with an average rate of occurrence lambda.
## Find the smallest number of events k such that P(X <= k) is at least 0.8 with an average rate of occurrence of 2 : 
qpois(0.8, lambda = 2) 
## [1] 3
  • rpois(n, lambda) : generates n random numbers following a Poisson distribution with an average rate of occurrence lambda
## Generate 10 random numbers following a Poisson distribution with an average rate of occurrence of 2 : 
rpois(10, lambda = 2) 
##  [1] 1 1 1 3 0 3 1 3 4 3
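The Poisson functions can be checked against the formula in the same way. This sketch reuses lambda = 2 and k = 3 from the examples above and adds the upper-tail probability, which is often what is needed in practice. The variable names are illustrative.

```r
lambda <- 2; k <- 3

# PMF from the formula: lambda^k * exp(-lambda) / k!
manual_pmf <- lambda^k * exp(-lambda) / factorial(k)
all.equal(manual_pmf, dpois(k, lambda = lambda))

# CDF as the sum of the PMF over 0, 1, ..., k
manual_cdf <- sum(dpois(0:k, lambda = lambda))
all.equal(manual_cdf, ppois(k, lambda = lambda))

# P(X > 3) = 1 - P(X <= 3), computed in two equivalent ways
upper_a <- 1 - ppois(k, lambda = lambda)
upper_b <- ppois(k, lambda = lambda, lower.tail = FALSE)
all.equal(upper_a, upper_b)
```

The `lower.tail = FALSE` form is generally preferable for very small tail probabilities, where subtracting from 1 can lose precision.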

4.3.4 Normal Distribution

The normal distribution, also known as the Gaussian distribution, is a probability distribution that is symmetric, bell-shaped, and characterized by its mean and standard deviation, with the majority of observations clustered around the mean.

\[X \sim Normal(\mu, \sigma^2)\] if \(-\infty<X<\infty\) and \(f_x(x)= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}\)

R provides the following functions for working with the normal distribution:

  • dnorm(x, mean, sd) : calculates the probability density at a given point x for a normal distribution with mean mean and standard deviation sd.
# Probability density at x = 0 for a normal distribution with mean 0 and standard deviation 1
dnorm(0, mean = 0, sd = 1)
## [1] 0.3989423
  • pnorm(x, mean, sd) : calculates the cumulative distribution function, probability of observing a value less than or equal to x in a normal distribution with mean mean and standard deviation sd.
# Cumulative probability up to x = 1 for a normal distribution with mean 0 and standard deviation 1
pnorm(1, mean = 0, sd = 1)
## [1] 0.8413447
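  • qnorm(p, mean, sd) : evaluates the quantile function, i.e., finds the value x such that the probability of observing a value less than or equal to x is p, for a normal distribution with mean mean and standard deviation sd.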
# Find the value such that the cumulative probability is 0.8 for a normal distribution with mean 0 and standard deviation 1
qnorm(0.8, mean = 0, sd = 1)
## [1] 0.8416212
  • rnorm(n, mean, sd) : generates n random numbers following a normal distribution with mean mean and standard deviation sd.
# Generate 10 random numbers following a normal distribution with mean 0 and standard deviation 1
rnorm(10, mean = 0, sd = 1)
##  [1] -1.4662273  0.3501215  0.9431867  0.4434739  0.8454432  0.2389213
##  [7] -1.1977786  0.4258555  0.3626680  0.1183605
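The sketch below checks `dnorm` against the normal density written with normalizing constant \(1/\sqrt{2\pi\sigma^2}\), and shows how `pnorm` and `qnorm` relate to each other. The values mu = 1, sigma = 2, and x = 0.5 are arbitrary illustrative choices.

```r
mu <- 1; sigma <- 2; x <- 0.5   # illustrative values (assumptions for this example)

# Density from the formula: 1 / sqrt(2 * pi * sigma^2) * exp(-(x - mu)^2 / (2 * sigma^2))
manual_pdf <- 1 / sqrt(2 * pi * sigma^2) * exp(-(x - mu)^2 / (2 * sigma^2))
all.equal(manual_pdf, dnorm(x, mean = mu, sd = sigma))

# qnorm is the inverse of pnorm: applying one after the other recovers the input
round_trip <- pnorm(qnorm(0.8, mean = mu, sd = sigma), mean = mu, sd = sigma)
round_trip

# Probability of falling within 1, 2, and 3 standard deviations of the mean
# for the standard normal: roughly 0.68, 0.95, and 0.997
coverage <- pnorm(1:3) - pnorm(-(1:3))
coverage

# P(a <= X <= b) as a difference of CDF values, cross-checked by integrating the density
area <- integrate(dnorm, lower = -1, upper = 1)$value
all.equal(area, coverage[1], tolerance = 1e-6)
```

Note that `dnorm` takes the standard deviation `sd`, not the variance, so a distribution written as \(Normal(\mu, \sigma^2)\) is passed to R as `sd = sigma`.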

4.4 Probability Distribution Summary

Type: D = discrete, C = continuous.

| Distribution | Type | PMF/PDF | Support | E(X) | Var(X) | R Function |
|---|---|---|---|---|---|---|
| Bernoulli | D | \(p_x(k)= p^k (1-p)^{1-k}\) | \(X \in \{0,1\}\) | \(p\) | \(p(1-p)\) | - |
| Binomial | D | \(p_x(k)= \binom{n}{k} p^k (1-p)^{n-k}\) | \(X \in \{0,1,2, \ldots ,n\}\) | \(np\) | \(np(1-p)\) | dbinom, pbinom, qbinom, rbinom |
| Poisson | D | \(p_x(k)= \frac{\lambda^k e^{-\lambda}}{k!}\) | \(X \in \{0,1,2, \ldots \}\) | \(\lambda\) | \(\lambda\) | dpois, ppois, qpois, rpois |
| Geometric | D | \(p_x(k)= (1-p)^{k-1}p\) | \(X \in \{1,2,3, \ldots \}\) | \(\frac{1}{p}\) | \(\frac{1-p}{p^2}\) | dgeom, pgeom, qgeom, rgeom |
| Uniform | C | \(f_x(x)= \frac{1}{b-a}\) | \(a \le x \le b\) | \(\frac{a+b}{2}\) | \(\frac{(b-a)^2}{12}\) | dunif, punif, qunif, runif |
| Normal | C | \(f_x(x)= \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\) | \(-\infty<x<\infty\) | \(\mu\) | \(\sigma^2\) | dnorm, pnorm, qnorm, rnorm |
| Exponential | C | \(f_x(x)=\lambda e^{-\lambda x}\) | \(x>0\) | \(\frac{1}{\lambda}\) | \(\frac{1}{\lambda^2}\) | dexp, pexp, qexp, rexp |
| Beta | C | \(f_x(x)=\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha-1}(1-x)^{\beta-1}\) | \(0 \le x \le 1\) | \(\frac{\alpha}{\alpha+\beta}\) | \(\frac{\alpha\beta}{(\alpha+\beta)^2 (\alpha+\beta+1)}\) | dbeta, pbeta, qbeta, rbeta |

Note: the geometric PMF above counts the number of trials up to and including the first success. R's dgeom, pgeom, qgeom, and rgeom instead count the number of failures before the first success, with support \(\{0,1,2, \ldots \}\), so \(p_x(k)\) corresponds to dgeom(k - 1, prob = p).
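The expectation and variance formulas in the summary can be checked roughly by simulation: with many draws, the sample mean and sample variance should land close to the theoretical values. The sketch below does this for three of the distributions. The parameter values, the seed, and the number of draws are arbitrary illustrative choices, and the simulated figures match the formulas only approximately.

```r
set.seed(42)      # arbitrary seed, for reproducibility
n_draws <- 1e5    # more draws give a closer match

# Uniform(a, b): E(X) = (a + b) / 2, Var(X) = (b - a)^2 / 12
a <- 2; b <- 5
u <- runif(n_draws, min = a, max = b)
c(simulated = mean(u), theory = (a + b) / 2)
c(simulated = var(u),  theory = (b - a)^2 / 12)

# Exponential(rate): E(X) = 1 / rate, Var(X) = 1 / rate^2
rate <- 0.5
e <- rexp(n_draws, rate = rate)
c(simulated = mean(e), theory = 1 / rate)
c(simulated = var(e),  theory = 1 / rate^2)

# Geometric, counting trials up to the first success: E(X) = 1 / p, Var(X) = (1 - p) / p^2
# rgeom counts failures before the first success, so add 1 to count trials
p <- 0.25
g <- rgeom(n_draws, prob = p) + 1
c(simulated = mean(g), theory = 1 / p)
c(simulated = var(g),  theory = (1 - p) / p^2)
```

Adding 1 to the `rgeom` draws shifts the mean from \((1-p)/p\) to \(1/p\) but leaves the variance unchanged, since adding a constant does not affect spread.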