Chapter 4 Probability Distribution
4.1 Introduction to Probability
A probability model consists of a nonempty set called the sample space $S$; a collection of events that are subsets of $S$; and a probability measure $P$ assigning a probability between 0 and 1 to each event, with $P(\emptyset) = 0$ and $P(S) = 1$ and with $P$ additive.
- Sample space $S$: the set of possible outcomes.
  - example: the sample space of a single coin flip is $S = \{H, T\}$.
- Event $E$: a subset of the sample space.
  - example: in a single coin flip, $\{H\}$ (the coin lands heads) is an event.
- Probability: for each event $E$, $P(E)$ is the probability of event $E$ occurring. $P$ has the following properties:
  - $P(E)$ is always a nonnegative real number between 0 and 1 inclusive: $0 \le P(E) \le 1$.
  - $P(\emptyset) = 0$, i.e., if $E$ is the empty set, then $P(E) = 0$.
  - $P(S) = 1$, i.e., if $E$ is the entire sample space $S$, then $P(E) = 1$.
  - $P$ is (countably) additive, meaning that if $A_1, A_2, \ldots$ is a finite or countable sequence of disjoint events, then $P(A_1 \cup A_2 \cup \ldots) = \sum_i P(A_i)$.
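The properties above can be checked on a small finite model. The sketch below assumes a fair six-sided die, which is not part of the original text, and `prob()` is a hypothetical helper introduced only for illustration.

```r
# Assumed example: a fair six-sided die (not from the original text).
S <- 1:6                          # sample space

# Hypothetical helper: probability of an event E (a subset of S)
# when all outcomes are equally likely.
prob <- function(E) length(intersect(E, S)) / length(S)

A <- c(1, 2)                      # event "roll a 1 or a 2"
B <- c(5, 6)                      # event "roll a 5 or a 6", disjoint from A

prob(integer(0))                  # empty set: 0
prob(S)                           # whole sample space: 1
prob(union(A, B))                 # probability of the union of A and B
prob(A) + prob(B)                 # additivity: same value as the line above
```

Counting outcomes only works because every outcome is assumed equally likely; a biased die would need explicit per-outcome weights.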
4.2 Random Variables
A random variable $X$ is a function from the sample space $S$ to the real numbers. The distribution of $X$ is the collection of probabilities $P(X \in B)$ for all subsets $B$ of the real numbers.
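As a concrete illustration of this definition, the sketch below assumes two flips of a fair coin (an example not taken from the original text) and takes $X$ to be the number of heads.

```r
# Assumed example: two flips of a fair coin (not from the original text).
S <- c("HH", "HT", "TH", "TT")            # sample space, outcomes equally likely

# The random variable X maps each outcome to a real number: the number of heads.
X <- c(HH = 2, HT = 1, TH = 1, TT = 0)

# Distribution of X: P(X = x) for each possible value x.
dist_X <- table(X) / length(S)
dist_X

# P(X in B) for the subset B = {1, 2}, i.e. at least one head.
mean(X %in% c(1, 2))                      # 0.75
```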
4.3 Probability Distribution
4.3.1 Bernoulli Distribution
A random variable $X$ has the Bernoulli distribution if it takes only two possible values, 0 and 1, where the value 1 occurs with probability $p$.
$X \sim \mathrm{Ber}(p)$ if $X \in \{0, 1\}$ and $p_X(1) = 1 - p_X(0) = p$.
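Base R has no dedicated Bernoulli functions. A Bernoulli variable is the special case of the binomial distribution (Section 4.3.2) with a single trial, so one possible approach is to call the binomial functions with `size = 1`. The value `p = 0.3` and the seed below are arbitrary assumptions.

```r
p <- 0.3                                  # assumed success probability

dbinom(1, size = 1, prob = p)             # P(X = 1) = p
dbinom(0, size = 1, prob = p)             # P(X = 0) = 1 - p

# Simulate 10 Bernoulli trials (each value is 0 or 1).
set.seed(1)                               # arbitrary seed, for reproducibility
x <- rbinom(10, size = 1, prob = p)
x
```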
4.3.2 Binomial Distribution
The binomial distribution models the number of successes $k$ in a fixed number $n$ of independent trials, each with the same probability of success $p$.
$X \sim \mathrm{Bin}(n, p)$ if $X \in \{0, 1, 2, \ldots, n\}$ and $p_X(k) = \binom{n}{k} p^k (1-p)^{n-k}$.
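The probability mass function can be evaluated directly from the formula with `choose()`. The sketch below uses $n = 5$, $k = 3$ and $p = 0.5$ as illustrative values.

```r
n <- 5; k <- 3; p <- 0.5

# PMF evaluated directly from the formula: choose(n, k) * p^k * (1 - p)^(n - k)
manual <- choose(n, k) * p^k * (1 - p)^(n - k)
manual                                     # 10 * 0.5^5 = 0.3125

# The probabilities over the whole support 0, 1, ..., n sum to 1.
ks <- 0:n
sum(choose(n, ks) * p^ks * (1 - p)^(n - ks))
```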
R provides the following functions for working with the binomial distribution:
dbinom(x, size, prob)
: calculates the probability mass function, i.e. the probability of getting exactly x successes in size trials, each with success probability prob.
## Probability of getting exactly 3 successes in 5 trials with a success probability of 0.5 :
dbinom(3, size = 5, prob = 0.5)
## [1] 0.3125
pbinom(q, size, prob)
: calculates the cumulative distribution function, i.e. the probability of getting at most q successes in size trials, each with success probability prob.
## Probability of getting up to 3 successes in 5 trials with a success probability of 0.5 :
pbinom(3, size = 5, prob = 0.5)
## [1] 0.8125
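The cumulative probability is the sum of the probability mass function over all values up to and including the given one. A minimal sketch of that relationship, using the same parameters as the example above:

```r
# The CDF at 3 is the sum of the PMF over 0, 1, 2, 3.
cdf_3 <- sum(dbinom(0:3, size = 5, prob = 0.5))
cdf_3                                                  # 0.8125

# Upper tail: probability of more than 3 successes.
upper <- pbinom(3, size = 5, prob = 0.5, lower.tail = FALSE)
upper                                                  # 1 - 0.8125 = 0.1875
```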
qbinom(p, size, prob)
: calculates the quantile function, i.e. the smallest number of successes k such that the probability of getting k or fewer successes in size trials, each with success probability prob, is at least p.
## Find the smallest number of successes such that the probability of getting that number or fewer successes in 5 trials with a success probability of 0.5 is at least 0.8 :
qbinom(0.8, size = 5, prob = 0.5)
## [1] 3
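Because the binomial distribution is discrete, usually no number of successes has a cumulative probability of exactly 0.8. The sketch below lists the cumulative probabilities and picks the smallest count that reaches the target, which is how the result 3 arises.

```r
k <- 0:5
cdf <- pbinom(k, size = 5, prob = 0.5)
round(cdf, 5)                       # 0.03125 0.18750 0.50000 0.81250 0.96875 1.00000

# Smallest k whose cumulative probability reaches 0.8.
smallest <- min(k[cdf >= 0.8])
smallest                            # 3
```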
rbinom(n, size, prob)
: generates n random numbers following a binomial distribution with size trials and success probability prob.
## Generate 10 random numbers following a binomial distribution with 5 trials and a success probability of 0.5 :
rbinom(10, size = 5, prob = 0.5)
## [1] 2 1 4 5 2 4 2 3 4 2
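Because the numbers are random, the output above will differ from run to run. One common way to make a simulation reproducible is to fix the random seed with `set.seed()` first. The seed value 123 below is an arbitrary assumption.

```r
set.seed(123)                            # arbitrary seed value
a <- rbinom(10, size = 5, prob = 0.5)

set.seed(123)                            # same seed again
b <- rbinom(10, size = 5, prob = 0.5)

identical(a, b)                          # TRUE: same seed, same draws
```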
4.3.3 Poisson Distribution
The Poisson distribution models the probability of a certain number of events occurring within a fixed interval of time or space, given a known average rate of occurrence.
$X \sim \mathrm{Pois}(\lambda)$ if $X \in \{0, 1, 2, \ldots\}$ and $p_X(k) = \frac{\lambda^k e^{-\lambda}}{k!}$.
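As with the binomial case, the Poisson probability mass function can be evaluated directly from its formula. The sketch below uses $\lambda = 2$ and $k = 3$ as illustrative values.

```r
lambda <- 2                      # average rate of occurrence
k <- 3

# PMF evaluated directly from the formula: lambda^k * exp(-lambda) / k!
manual <- lambda^k * exp(-lambda) / factorial(k)
manual                           # approximately 0.180447

# Summing the PMF over a long enough range of k gives (numerically) 1.
ks <- 0:100
total <- sum(lambda^ks * exp(-lambda) / factorial(ks))
total
```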
R provides the following functions for working with the Poisson distribution:
dpois(x, lambda)
: calculates the probability mass function, i.e. the probability of observing exactly x events in a given interval with an average rate of occurrence lambda.
## Probability of observing exactly 3 events in an interval with an average rate of occurrence of 2 :
dpois(3, lambda = 2)
## [1] 0.180447
ppois(q, lambda)
: calculates the cumulative distribution function, i.e. the probability of observing at most q events in a given interval with an average rate of occurrence lambda.
## Probability of observing up to 3 events in an interval with an average rate of occurrence of 2 :
ppois(3, lambda = 2)
## [1] 0.8571235
qpois(p, lambda)
: calculates the quantile function, i.e. the smallest number of events such that the probability of observing that number or fewer events is at least p, in a given interval with an average rate of occurrence lambda.
## Find the smallest number of events such that the probability of observing that number or fewer events is at least 0.8 :
qpois(0.8, lambda = 2)
## [1] 3
rpois(n, lambda)
: generates n random numbers following a Poisson distribution with an average rate of occurrence lambda.
## Generate 10 random numbers following a Poisson distribution with an average rate of occurrence of 2 :
rpois(10, lambda = 2)
## [1] 1 1 1 3 0 3 1 3 4 3
4.3.4 Normal Distribution
The normal distribution, also known as the Gaussian distribution, is a probability distribution that is symmetric, bell-shaped, and characterized by its mean and standard deviation, with the majority of observations clustered around the mean.
$X \sim \mathrm{Normal}(\mu, \sigma^2)$ if $-\infty < X < \infty$ and $f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}$.
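The density can be evaluated by hand as a check. The sketch below uses the normalizing constant $1/\sqrt{2\pi\sigma^2}$ and compares the result with R's built-in `dnorm()`. The values of `mu`, `sigma` and `x` are arbitrary assumptions, chosen with `sigma` different from 1 so that the normalizing constant actually matters.

```r
mu <- 1; sigma <- 2; x <- 0.5    # assumed illustrative values

# Density from the formula, with normalizing constant 1 / sqrt(2 * pi * sigma^2)
manual <- 1 / sqrt(2 * pi * sigma^2) * exp(-(x - mu)^2 / (2 * sigma^2))
manual

# Agrees with R's built-in normal density
all.equal(manual, dnorm(x, mean = mu, sd = sigma))    # TRUE
```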
R provides the following functions for working with the normal distribution:
dnorm(x, mean, sd)
: calculates the probability density at a given point x for a normal distribution with mean mean and standard deviation sd.
# Probability density at x = 0 for a normal distribution with mean 0 and standard deviation 1
dnorm(0, mean = 0, sd = 1)
## [1] 0.3989423
pnorm(x, mean, sd)
: calculates the cumulative distribution function, probability of observing a value less than or equal to x in a normal distribution with mean mean and standard deviation sd.
# Cumulative probability up to x = 1 for a normal distribution with mean 0 and standard deviation 1
pnorm(1, mean = 0, sd = 1)
## [1] 0.8413447
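Cumulative probabilities combine to give the probability of an interval: $P(a < X \le b) = F(b) - F(a)$, where $F$ is the cumulative distribution function. A short sketch for the standard normal distribution follows. The cut-off values are chosen for illustration only.

```r
# P(-1 < X <= 1): probability of falling within one standard deviation of the mean
within_1sd <- pnorm(1) - pnorm(-1)           # mean = 0 and sd = 1 are the defaults
within_1sd                                   # approximately 0.6827

# P(X > 1.96): upper-tail probability
upper <- pnorm(1.96, lower.tail = FALSE)
upper                                        # approximately 0.025
```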
qnorm(p, mean, sd)
: calculates the quantile function, i.e. the value x such that the probability of observing a value less than or equal to x is p, in a normal distribution with mean mean and standard deviation sd.
# Find the value such that the cumulative probability is 0.8 for a normal distribution with mean 0 and standard deviation 1
qnorm(0.8, mean = 0, sd = 1)
## [1] 0.8416212
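For a continuous distribution such as the normal, the quantile function is the exact inverse of the cumulative distribution function. The sketch below checks this round trip and also computes the 0.975 quantile, which is included here only as a familiar illustration.

```r
p <- 0.8
x <- qnorm(p)                    # mean = 0 and sd = 1 are the defaults
back <- pnorm(x)
back                             # 0.8: pnorm() undoes qnorm()

# Quantile often used for two-sided 95% intervals
q975 <- qnorm(0.975)
q975                             # approximately 1.959964
```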
rnorm(n, mean, sd)
: generates n random numbers following a normal distribution with mean mean and standard deviation sd.
# Generate 10 random numbers following a normal distribution with mean 0 and standard deviation 1
rnorm(10, mean = 0, sd = 1)
## [1] -1.4662273 0.3501215 0.9431867 0.4434739 0.8454432 0.2389213
## [7] -1.1977786 0.4258555 0.3626680 0.1183605
4.4 Probability Distribution Summary
Distribution | Type | PMF/PDF | E(X) | Var(X) | R Functions |
---|---|---|---|---|---|
Bernoulli | D | $p_X(k) = p^k (1-p)^{1-k}$ for $k \in \{0, 1\}$ | $p$ | $p(1-p)$ | - |
Binomial | D | $p_X(k) = \binom{n}{k} p^k (1-p)^{n-k}$ for $k \in \{0, 1, 2, \ldots, n\}$ | $np$ | $np(1-p)$ | dbinom, pbinom, qbinom, rbinom |
Poisson | D | $p_X(k) = \frac{\lambda^k e^{-\lambda}}{k!}$ for $k \in \{0, 1, 2, \ldots\}$ | $\lambda$ | $\lambda$ | dpois, ppois, qpois, rpois |
Uniform | C | $f_X(x) = \frac{1}{b-a}$ for $a \le x \le b$ | $\frac{a+b}{2}$ | $\frac{(b-a)^2}{12}$ | dunif, punif, qunif, runif |
Normal | C | $f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ for $-\infty < x < \infty$ | $\mu$ | $\sigma^2$ | dnorm, pnorm, qnorm, rnorm |
Exponential | C | $f_X(x) = \lambda e^{-\lambda x}$ for $x > 0$ | $\frac{1}{\lambda}$ | $\frac{1}{\lambda^2}$ | dexp, pexp, qexp, rexp |
Geometric | D | $p_X(k) = (1-p)^{k-1} p$ for $k \in \{1, 2, 3, \ldots\}$ | $\frac{1}{p}$ | $\frac{1-p}{p^2}$ | dgeom, pgeom, qgeom, rgeom |
Beta | C | $f_X(x) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha-1}(1-x)^{\beta-1}$ for $0 \le x \le 1$ | $\frac{\alpha}{\alpha+\beta}$ | $\frac{\alpha\beta}{(\alpha+\beta)^2 (\alpha+\beta+1)}$ | dbeta, pbeta, qbeta, rbeta |

In the Type column, D = discrete and C = continuous. The geometric PMF above counts the trial on which the first success occurs ($k = 1, 2, \ldots$), whereas R's dgeom, pgeom, qgeom and rgeom count the number of failures before the first success ($0, 1, 2, \ldots$), so $p_X(k)$ corresponds to dgeom(k - 1, prob = p).
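The E(X) and Var(X) columns can be checked numerically from the probability mass function. The sketch below does this for the binomial row, with assumed parameter values $n = 10$ and $p = 0.3$. It also shows how the geometric PMF $(1-p)^{k-1} p$ in the table relates to `dgeom()`, which counts failures before the first success rather than the trial on which the first success occurs.

```r
n <- 10; p <- 0.3                 # assumed parameter values
k <- 0:n
pmf <- dbinom(k, size = n, prob = p)

m <- sum(k * pmf)                 # E(X), should equal n * p = 3
v <- sum((k - m)^2 * pmf)         # Var(X), should equal n * p * (1 - p) = 2.1
c(mean = m, var = v)

# Geometric: first success on trial 3 means 2 failures before it,
# so the table's p_X(3) corresponds to dgeom(3 - 1, prob = p).
geom_3 <- dgeom(3 - 1, prob = p)
geom_3                            # (1 - p)^2 * p = 0.147
```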