4.2 Key Concepts and Definitions

4.2.1 Random Sample

A random sample of size $n$ consists of $n$ independent observations, each drawn from the same underlying population distribution. Independence ensures that no observation influences another, and identical distribution guarantees that all observations are governed by the same probability rules.

4.2.2 Sample Statistics

4.2.2.1 Sample Mean

The sample mean is a measure of central tendency:

$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$

Example: Suppose we measure the heights of 5 individuals (in cm): $170, 165, 180, 175, 172$. The sample mean is:

$\bar{X} = \frac{170 + 165 + 180 + 175 + 172}{5} = 172.4 \, \text{cm}.$
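
As a minimal sketch, the same calculation can be reproduced in Python with the standard library alone. The variable names (`heights_cm`, `sample_mean`) are illustrative and not part of the text above.

```python
import statistics

# Heights from the example above, in cm.
heights_cm = [170, 165, 180, 175, 172]

# Direct translation of the formula: sum of the observations divided by n.
n = len(heights_cm)
sample_mean = sum(heights_cm) / n

print(sample_mean)                  # 172.4
print(statistics.mean(heights_cm))  # same value from the standard library
```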

4.2.2.2 Sample Median

The sample median is the middle value of ordered data:

$\tilde{x} = \begin{cases} \text{Middle observation,} & \text{if } n \text{ is odd}, \\ \text{Average of two middle observations,} & \text{if } n \text{ is even}. \end{cases}$
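
The two cases can be sketched in Python as follows. The function name `sample_median` is hypothetical, and the sixth height (168 cm) added for the even case is made up for illustration only.

```python
def sample_median(data):
    """Return the middle value of the ordered data (odd n),
    or the average of the two middle values (even n)."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# Odd n: the five heights from the sample mean example.
odd_sample = [170, 165, 180, 175, 172]
print(sample_median(odd_sample))   # 172

# Even n: the same heights plus one made-up observation.
even_sample = odd_sample + [168]
print(sample_median(even_sample))  # 171.0
```

Sorting first matters: the median is defined on the ordered data, not on the order in which the observations were recorded.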

4.2.2.3 Sample Variance

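The sample variance measures how far the observations spread around the sample mean. Dividing by $n-1$ rather than $n$ makes $S^2$ an unbiased estimator of the population variance: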

$S^2 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}$

4.2.2.4 Sample Standard Deviation

The sample standard deviation is the square root of the sample variance, and is expressed in the same units as the data:

$S = \sqrt{S^2}$
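
A short sketch of both formulas, applied to the five heights from the sample mean example. The variable names are illustrative. The cross-check relies on `statistics.variance` and `statistics.stdev`, which also use the $n-1$ divisor.

```python
import math
import statistics

heights_cm = [170, 165, 180, 175, 172]
n = len(heights_cm)
x_bar = sum(heights_cm) / n

# Sum of squared deviations from the sample mean, divided by n - 1.
sample_variance = sum((x - x_bar) ** 2 for x in heights_cm) / (n - 1)

# The standard deviation is the square root of the variance.
sample_sd = math.sqrt(sample_variance)

print(round(sample_variance, 2))  # 31.3  (cm^2)
print(round(sample_sd, 2))        # 5.59  (cm)

# Cross-check against the standard library.
lib_variance = statistics.variance(heights_cm)
lib_sd = statistics.stdev(heights_cm)
```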

4.2.2.5 Sample Proportions

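The sample proportion is used for categorical data, where $X$ counts the observations that fall in the category of interest (the "successes"):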

$\hat{p} = \frac{X}{n} = \frac{\text{Number of successes}}{\text{Sample size}}$
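
As an illustration, the proportion can be computed by counting successes in a list of categorical responses. The survey data below are made up, and treating "yes" as the success category is an assumption of this sketch.

```python
# Hypothetical survey responses; "yes" is treated as a success.
responses = ["yes", "no", "yes", "yes", "no", "no", "yes", "no", "no", "no"]

n = len(responses)                   # sample size
successes = responses.count("yes")   # X, the number of successes
p_hat = successes / n

print(p_hat)  # 0.4
```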

4.2.2.6 Estimators

Point Estimator: A statistic ($\hat{\theta}$) used to estimate a population parameter ($\theta$).
Point Estimate: The numerical value taken by $\hat{\theta}$ when evaluated for a given sample.
Unbiased Estimator: A point estimator $\hat{\theta}$ is unbiased if $E(\hat{\theta}) = \theta$.

Examples of unbiased estimators:

$\bar{X}$ for $\mu$ (population mean).
$S^2$ for $\sigma^2$ (population variance).
$\hat{p}$ for $p$ (population proportion).
$\hat{p}_1 - \hat{p}_2$ for $p_1 - p_2$ (population proportion difference).
$\bar{X}_1 - \bar{X}_2$ for $\mu_1 - \mu_2$ (population mean difference).

Note: While $S^2$ is unbiased for $\sigma^2$, $S$ is a biased estimator of $\sigma$. The square root is a concave function, so $E(S) \le \sigma$ and $S$ tends to underestimate $\sigma$.
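
The note on $S^2$ and $S$ can be checked exactly on a toy case. The sketch below assumes a hypothetical population in which the values 1, 2, 3, 4 are equally likely, and enumerates every ordered sample of size $n = 2$ drawn with replacement (that is, independent draws). Because each such sample is equally likely, averaging a statistic over all of them gives its expected value.

```python
import itertools
import math
import statistics

# Hypothetical population: each value is equally likely on every draw.
population = [1, 2, 3, 4]
n = 2

sigma2 = statistics.pvariance(population)  # population variance
sigma = math.sqrt(sigma2)                  # population standard deviation

# All equally likely ordered samples of size n (sampling with replacement).
samples = list(itertools.product(population, repeat=n))

# Expected values of S^2 and S, obtained by averaging over every sample.
mean_of_s2 = sum(statistics.variance(s) for s in samples) / len(samples)
mean_of_s = sum(statistics.stdev(s) for s in samples) / len(samples)

print(sigma2, mean_of_s2)                    # both 1.25: S^2 is unbiased here
print(round(sigma, 4), round(mean_of_s, 4))  # 1.118 vs 0.8839: S is too small on average
```

This is a check on one small population, not a proof. The general results follow from taking expectations of the formulas for $S^2$ and $S$.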

4.2.3 Distribution of the Sample Mean

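The sampling distribution of the mean $\bar{X}$ depends on the population distribution and on the sample size $n$:

Population Distribution:
- If $X \sim N(\mu, \sigma^2)$, then $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ exactly, for any sample size $n$.
Central Limit Theorem:
- For large $n$, $\bar{X}$ approximately follows $N\left(\mu, \frac{\sigma^2}{n}\right)$ regardless of the population’s shape, provided the population variance $\sigma^2$ is finite.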
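
The Central Limit Theorem can be illustrated by simulation. The sketch below makes several assumptions that are not in the text: the population is exponential with rate 1 (strongly skewed, with $\mu = 1$ and $\sigma = 1$), the sample size is $n = 30$, and 5000 samples are drawn with a fixed seed so that reruns are reproducible.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so reruns give the same numbers

n = 30               # size of each sample (assumed for illustration)
replications = 5000  # number of samples drawn
mu, sigma = 1.0, 1.0 # mean and standard deviation of Exponential(rate=1)

# Draw many samples and record the mean of each one.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(replications)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_sd = sigma / math.sqrt(n)

# Share of sample means within one theoretical standard deviation of mu;
# for a normal distribution this share is about 0.68.
within_one_sd = sum(
    abs(m - mu) <= theoretical_sd for m in sample_means
) / replications

print(mean_of_means)                # close to mu = 1
print(sd_of_means, theoretical_sd)  # both close to 0.18
print(within_one_sd)                # close to 0.68
```

Even though individual exponential observations are far from normal, the simulated sample means centre on $\mu$ with spread close to $\sigma / \sqrt{n}$, which is what the theorem predicts.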

4.2.3.1 Standard Error of the Mean

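The standard error of the mean is the standard deviation of the sampling distribution of $\bar{X}$. It quantifies how much $\bar{X}$ varies from sample to sample: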

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

Example: Suppose $\sigma = 10$ and $n = 25$. Then: $\sigma_{\bar{X}} = \frac{10}{\sqrt{25}} = 2.$

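The smaller the standard error, the more precise our estimate of the population mean. Because $\sigma_{\bar{X}}$ shrinks in proportion to $1/\sqrt{n}$, quadrupling the sample size halves the standard error.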
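
A small sketch of the standard error formula, using the values from the example ($\sigma = 10$, $n = 25$) and one additional, assumed sample size ($n = 100$) to show the effect of a larger sample. The function name `standard_error` is hypothetical.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean for population SD sigma and sample size n."""
    return sigma / math.sqrt(n)

print(standard_error(10, 25))   # 2.0, as in the example
print(standard_error(10, 100))  # 1.0, four times the sample size halves the error
```

In practice $\sigma$ is usually unknown; this sketch assumes it is given, as the example does.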