# Chapter 4 Probability Distribution

## 4.1 Introduction to Probability

A probability model consists of a nonempty set called the sample space `S`; a collection of events that are subsets of `S`; and a probability measure `P` assigning a probability between 0 and 1 to each event, with \(P(\varnothing)=0\) and \(P(S)=1\), and with `P` additive.

- Sample space `S`: the set of possible outcomes. *Example:* the sample space of a single coin flip is \(S=\{H,T\}\).
- Event `E`: a subset of the sample space. *Example:* in a single coin flip, \(\{H\}\) (the coin lands heads) is an event.
- Probability: for each event `E`, \(P(E)\) is the probability of event `E` occurring. `P` has the following properties:
    - \(P(E)\) is always a real number between 0 and 1 inclusive: \(0 \le P(E) \le 1\).
    - \(P(\varnothing)=0\), *i.e.*, if `E` is the empty set, then \(P(E)=0\).
    - \(P(S)=1\), *i.e.*, if `E` is the entire sample space `S`, then \(P(E)=1\).
    - `P` is (countably) additive, meaning that if \(A_1, A_2, \ldots\) is a finite or countable sequence of disjoint events, then \(P(A_1 \cup A_2 \cup \cdots) = \sum_{i} P(A_i)\).
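
The properties above can be checked on a small example. The sketch below assumes a fair six-sided die, which is not part of the text, and uses a hypothetical helper `P()` that assigns each event the fraction of outcomes it contains.

```r
# Sample space of one roll of a fair six-sided die (an assumed example).
S <- 1:6

# Probability measure for equally likely outcomes:
# P(E) = (number of outcomes in E) / (number of outcomes in S).
P <- function(E) length(intersect(E, S)) / length(S)

A <- c(2, 4, 6)  # event "the roll is even"
B <- c(1, 3)     # event "the roll is 1 or 3", disjoint from A

P(integer(0))    # empty set: 0
P(S)             # whole sample space: 1
P(union(A, B))   # probability of "A or B"
P(A) + P(B)      # same value, as additivity requires for disjoint events
```

Additivity only applies because `A` and `B` share no outcomes; for overlapping events the sum would count the shared outcomes twice.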

## 4.2 Random Variables

A random variable `X` is a function from a sample space `S` to the real numbers. The distribution of the random variable `X` is the collection of probabilities \(P(X \in B)\) over all subsets `B` of the real numbers.
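
As an illustration, the sketch below assumes two flips of a fair coin and takes `X` to be the number of heads. The named vector `X`, the vector `p_outcome`, and the helper `P_X_in()` are hypothetical names introduced only for this example.

```r
# Random variable X on the sample space {HH, HT, TH, TT}:
# each outcome is mapped to its number of heads.
X <- c(HH = 2, HT = 1, TH = 1, TT = 0)

# Assumption: a fair coin, so all four outcomes are equally likely.
p_outcome <- rep(1 / length(X), length(X))

# Distribution of X: add up the outcome probabilities for each value of X.
pmf <- tapply(p_outcome, X, sum)
pmf                       # values 0, 1, 2 with probabilities 0.25, 0.5, 0.25

# P(X in B) for an arbitrary set B of real numbers.
P_X_in <- function(B) sum(p_outcome[X %in% B])
P_X_in(c(1, 2))           # P(at least one head) = 0.75
```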

## 4.3 Probability Distribution

### 4.3.1 Bernoulli Distribution

The random variable `X` is said to have the Bernoulli distribution if it takes only two possible values, 0 and 1, and takes the value 1 with probability `p`:

\[X \sim Ber(p)\] if \(X \in \{0,1\}\) and \(p_x(1)=1-p_x(0)=p\)
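
Base R has no dedicated Bernoulli functions, but a Bernoulli variable is a binomial variable with a single trial, so the binomial functions with `size = 1` can be used. The value `p = 0.3` and the seed below are assumptions chosen only for illustration.

```r
p <- 0.3  # assumed success probability

# PMF of a Bernoulli(p) variable via the binomial functions with one trial.
dbinom(1, size = 1, prob = p)  # P(X = 1) = p
dbinom(0, size = 1, prob = p)  # P(X = 0) = 1 - p

# Ten simulated Bernoulli draws; every value is either 0 or 1.
set.seed(1)  # arbitrary seed, only so that the draws can be reproduced
x <- rbinom(10, size = 1, prob = p)
x
```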

### 4.3.2 Binomial Distribution

The binomial distribution models the number of successes `k` in a fixed number of independent trials `n`, each with the same probability of success `p`:

\[X \sim Bin(n,p)\] if \(X \in \{0,1,2,\ldots,n\}\) and \(p_x(k)= \binom{n}{k} p^k (1-p)^{n-k}\)
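
The formula can be evaluated directly with `choose()`. The sketch below uses `n = 5`, `k = 3`, and `p = 0.5`, and compares the hand computation with the built-in `dbinom()`.

```r
n <- 5; k <- 3; p <- 0.5

# Binomial PMF written out term by term.
by_hand <- choose(n, k) * p^k * (1 - p)^(n - k)
by_hand                                            # 0.3125

# The built-in PMF gives the same value.
all.equal(by_hand, dbinom(k, size = n, prob = p))  # TRUE

# The probabilities over k = 0, 1, ..., n sum to 1.
ks <- 0:n
pmf <- choose(n, ks) * p^ks * (1 - p)^(n - ks)
sum(pmf)                                           # 1
```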

R provides the following functions for working with the binomial distribution:

`dbinom(x, size, prob)`: calculates the probability mass function, i.e. the probability of getting exactly `x` successes in `size` trials with a success probability of `prob`.

```
## Probability of getting exactly 3 successes in 5 trials with a success probability of 0.5 :
dbinom(3, size = 5, prob = 0.5)
```

`## [1] 0.3125`

`pbinom(q, size, prob)`: calculates the cumulative distribution function, i.e. the probability of getting up to `q` successes in `size` trials with a success probability of `prob`.

```
## Probability of getting up to 3 successes in 5 trials with a success probability of 0.5 :
pbinom(3, size = 5, prob = 0.5)
```

`## [1] 0.8125`

`qbinom(p, size, prob)`: finds the smallest number of successes `k` such that the probability of getting `k` or fewer successes is at least `p` in `size` trials with a success probability of `prob`.

```
## Find the number of successes such that the probability of getting that number or fewer successes is 0.8
qbinom(0.8, size = 5, prob = 0.5)
```

`## [1] 3`

`rbinom(n, size, prob)`: generates `n` random numbers following a binomial distribution with `size` trials and a success probability of `prob`.

```
## Generate 10 random numbers following a binomial distribution with 5 trials and a success probability of 0.5 :
rbinom(10, size = 5, prob = 0.5)
```

`## [1] 4 4 1 2 4 2 4 1 4 3`
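
The four functions are linked: `pbinom()` is the running sum of `dbinom()`, and `qbinom()` inverts `pbinom()`. A minimal sketch of these relationships follows, with the same 5 trials and success probability 0.5 as above; the seed value 42 is an arbitrary assumption.

```r
size <- 5; prob <- 0.5

# The CDF is the cumulative sum of the PMF.
pmf <- dbinom(0:size, size = size, prob = prob)
cdf <- pbinom(0:size, size = size, prob = prob)
all.equal(cumsum(pmf), cdf)                    # TRUE

# qbinom() returns the smallest k whose cumulative probability reaches 0.8.
k <- qbinom(0.8, size = size, prob = prob)
pbinom(k - 1, size = size, prob = prob) < 0.8  # TRUE: k - 1 is not enough
pbinom(k, size = size, prob = prob) >= 0.8     # TRUE: k is enough

# rbinom() output changes from run to run unless the seed is fixed.
set.seed(42); a <- rbinom(10, size = size, prob = prob)
set.seed(42); b <- rbinom(10, size = size, prob = prob)
identical(a, b)                                # TRUE
```

Because the draws are random, rerunning `rbinom()` without `set.seed()` will generally not reproduce the sample printed above.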

### 4.3.3 Poisson Distribution

The Poisson distribution models the probability of a certain number of events occurring within a fixed interval of time or space, given a known average rate of occurrence.

\[X \sim Pois(\lambda)\] if \(X \in \{0,1,2,\ldots\}\) and \(p_x(k)= \frac{\lambda^k e^{-\lambda}}{k!}\)

R provides the following functions for working with the Poisson distribution:

`dpois(x, lambda)`: calculates the probability mass function, i.e. the probability of observing exactly `x` events in a given interval with an average rate of occurrence `lambda`.

```
## Probability of observing exactly 3 events in an interval with an average rate of occurrence of 2 :
dpois(3, lambda = 2)
```

`## [1] 0.180447`

`ppois(q, lambda)`: calculates the cumulative distribution function, i.e. the probability of observing up to `q` events in a given interval with an average rate of occurrence `lambda`.

```
## Probability of observing up to 3 events in an interval with an average rate of occurrence of 2 :
ppois(3, lambda = 2)
```

`## [1] 0.8571235`

`qpois(p, lambda)`: finds the smallest number of events `k` such that the probability of observing `k` or fewer events is at least `p` in a given interval with an average rate of occurrence `lambda`.

```
## Find the number of events such that the probability of observing that number or fewer events is 0.8 :
qpois(0.8, lambda = 2)
```

`## [1] 3`

`rpois(n, lambda)`: generates `n` random numbers following a Poisson distribution with an average rate of occurrence `lambda`.

```
## Generate 10 random numbers following a Poisson distribution with an average rate of occurrence of 2 :
rpois(10, lambda = 2)
```

`## [1] 3 2 1 1 2 4 5 2 2 3`
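
The Poisson formula can be checked against `dpois()` and `ppois()` in the same way. The sketch below reuses `lambda = 2`; truncating the infinite sums at 100 is an assumption that is harmless here because the remaining tail probability is negligible.

```r
lambda <- 2
k <- 3

# Poisson PMF written out term by term.
by_hand <- lambda^k * exp(-lambda) / factorial(k)
all.equal(by_hand, dpois(k, lambda))                  # TRUE

# ppois() adds up the PMF from 0 to k.
all.equal(sum(dpois(0:k, lambda)), ppois(k, lambda))  # TRUE

# Mean and variance computed from the PMF; both are close to lambda.
ks <- 0:100
m <- sum(ks * dpois(ks, lambda))
v <- sum((ks - m)^2 * dpois(ks, lambda))
c(mean = m, variance = v)
```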

### 4.3.4 Normal Distribution

The normal distribution, also known as the Gaussian distribution, is a probability distribution that is symmetric, bell-shaped, and characterized by its mean and standard deviation, with the majority of observations clustered around the mean.

\[X \sim Normal(\mu, \sigma^2)\] if \(-\infty<X<\infty\) and \(f_x= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}\)

R provides the following functions for working with the normal distribution:

`dnorm(x, mean, sd)`: calculates the probability density at a given point `x` for a normal distribution with mean `mean` and standard deviation `sd`.

```
# Probability density at x = 0 for a normal distribution with mean 0 and standard deviation 1
dnorm(0, mean = 0, sd = 1)
```

`## [1] 0.3989423`

`pnorm(q, mean, sd)`: calculates the cumulative distribution function, i.e. the probability of observing a value less than or equal to `q` in a normal distribution with mean `mean` and standard deviation `sd`.

```
# Cumulative probability up to x = 1 for a normal distribution with mean 0 and standard deviation 1
pnorm(1, mean = 0, sd = 1)
```

`## [1] 0.8413447`

`qnorm(p, mean, sd)`: finds the value such that the probability of observing a value less than or equal to that value is `p` in a normal distribution with mean `mean` and standard deviation `sd`.

```
# Find the value such that the cumulative probability is 0.8 for a normal distribution with mean 0 and standard deviation 1
qnorm(0.8, mean = 0, sd = 1)
```

`## [1] 0.8416212`

`rnorm(n, mean, sd)`: generates `n` random numbers following a normal distribution with mean `mean` and standard deviation `sd`.

```
# Generate 10 random numbers following a normal distribution with mean 0 and standard deviation 1
rnorm(10, mean = 0, sd = 1)
```

```
## [1] -1.0378804 1.1208806 0.7059895 0.3188912 0.2439931 -0.2477641
## [7] -0.8954654 1.1203232 -0.1612854 -1.4704093
```
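
A short check ties the density formula to these functions. The values `mu = 1`, `sigma = 2`, and `x = 0.5` are assumptions chosen so that the normalising constant differs from the standard normal case. Note that `dnorm()` takes the standard deviation, not the variance.

```r
mu <- 1; sigma <- 2; x <- 0.5

# Normal density written out from the formula; the constant uses sigma^2.
by_hand <- 1 / sqrt(2 * pi * sigma^2) * exp(-(x - mu)^2 / (2 * sigma^2))
all.equal(by_hand, dnorm(x, mean = mu, sd = sigma))  # TRUE

# pnorm() is the area under the density up to a point.
area <- integrate(dnorm, lower = -Inf, upper = 1)$value
all.equal(area, pnorm(1), tolerance = 1e-6)          # TRUE

# qnorm() inverts pnorm().
pnorm(qnorm(0.8))                                    # 0.8
```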

## 4.4 Probability Distribution Summary

Distribution | Type (D = discrete, C = continuous) | PMF/PDF | E(X) | Var(X) | R Function |
---|---|---|---|---|---|
Bernoulli | D | \(p_x(k)= p^k (1-p)^{1-k}\) for \(X \in \{0,1\}\) | \(p\) | \(p(1-p)\) | `dbinom` , `pbinom` , `qbinom` , `rbinom` with `size = 1` |
Binomial | D | \(p_x(k)= \binom{n}{k} p^k (1-p)^{n-k}\) for \(X \in \{0,1,2,\ldots,n\}\) | \(np\) | \(np(1-p)\) | `dbinom` , `pbinom` , `qbinom` , `rbinom` |
Poisson | D | \(p_x(k)= \frac{\lambda^k e^{-\lambda}}{k!}\) for \(X \in \{0,1,2,\ldots\}\) | \(\lambda\) | \(\lambda\) | `dpois` , `ppois` , `qpois` , `rpois` |
Uniform | C | \(f_x= \frac{1}{b-a}\) for \(a \le x \le b\) | \(\frac{a+b}{2}\) | \(\frac{(b-a)^2}{12}\) | `dunif` , `punif` , `qunif` , `runif` |
Normal | C | \(f_x= \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\) for \(-\infty<X<\infty\) | \(\mu\) | \(\sigma^2\) | `dnorm` , `pnorm` , `qnorm` , `rnorm` |
Exponential | C | \(f_x=\lambda e^{-\lambda x}\) for \(x>0\) | \(\frac{1}{\lambda}\) | \(\frac{1}{\lambda^2}\) | `dexp` , `pexp` , `qexp` , `rexp` |
Geometric | D | \(p_x(k)= (1-p)^{k-1}p\) for \(X \in \{1,2,\ldots\}\) | \(\frac{1}{p}\) | \(\frac{1-p}{p^2}\) | `dgeom` , `pgeom` , `qgeom` , `rgeom` |
Beta | C | \(f_x=\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha-1}(1-x)^{\beta-1}\) for \(0 \le x \le 1\) | \(\frac{\alpha}{\alpha+\beta}\) | \(\frac{\alpha\beta}{(\alpha+\beta)^2 (\alpha+\beta+1)}\) | `dbeta` , `pbeta` , `qbeta` , `rbeta` |
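
Two points in the table are worth checking numerically. First, R's `dgeom()` counts the number of failures before the first success, so the trial-counting formula \((1-p)^{k-1}p\) in the table corresponds to `dgeom(k - 1, prob = p)`. Second, the mean and variance entries can be recovered by numerical integration. The parameter values below (`p = 0.25`, `a = 2`, `b = 5`) are assumptions chosen only for illustration.

```r
# Geometric: the table's PMF for trial number k versus R's failure count.
p <- 0.25
k <- 1:5
by_hand <- (1 - p)^(k - 1) * p
all.equal(by_hand, dgeom(k - 1, prob = p))  # TRUE

# Uniform on [a, b]: mean and variance by numerical integration.
a <- 2; b <- 5
m <- integrate(function(x) x * dunif(x, min = a, max = b), a, b)$value
v <- integrate(function(x) (x - m)^2 * dunif(x, min = a, max = b), a, b)$value
all.equal(m, (a + b) / 2)                   # TRUE
all.equal(v, (b - a)^2 / 12)                # TRUE
```

The same pattern, integrating `x * d<name>(x, ...)` over the support, should work for the other continuous rows.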