4.2 Key Concepts and Definitions

4.2.1 Random Sample

A random sample of size n consists of n independent observations, each drawn from the same underlying population distribution. Independence ensures that no observation influences another, and identical distribution guarantees that all observations are governed by the same probability rules.

4.2.2 Sample Statistics

4.2.2.1 Sample Mean

The sample mean is a measure of central tendency:

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

  • Example: Suppose we measure the heights of 5 individuals (in cm): 170, 165, 180, 175, 172. The sample mean is:

$$\bar{X} = \frac{170 + 165 + 180 + 175 + 172}{5} = 172.4 \text{ cm}.$$
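The calculation above can be reproduced in a few lines of Python. This is a minimal sketch: the names `heights_cm` and `sample_mean` are illustrative choices, not part of the text.

```python
# Heights (in cm) from the worked example above.
heights_cm = [170, 165, 180, 175, 172]


def sample_mean(xs):
    """Return the sum of the observations divided by the sample size n."""
    if not xs:
        raise ValueError("sample must contain at least one observation")
    return sum(xs) / len(xs)


x_bar = sample_mean(heights_cm)
print(x_bar)  # 172.4
```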

4.2.2.2 Sample Median

The sample median is the middle value of ordered data:

$$\tilde{x} = \begin{cases} \text{Middle observation}, & \text{if } n \text{ is odd}, \\ \text{Average of two middle observations}, & \text{if } n \text{ is even}. \end{cases}$$
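The two cases of the definition map directly onto code. The sketch below assumes the data fit in a list and uses the hypothetical helper name `sample_median`. The first call uses the five heights from the sample-mean example, and the second drops one value to show the even case.

```python
def sample_median(xs):
    """Return the middle value of the ordered data (mean of the two middle values if n is even)."""
    if not xs:
        raise ValueError("sample must contain at least one observation")
    ordered = sorted(xs)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: a single middle observation exists.
        return ordered[mid]
    # Even n: average the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


med_odd = sample_median([170, 165, 180, 175, 172])  # ordered: 165, 170, 172, 175, 180
med_even = sample_median([170, 165, 180, 175])      # ordered: 165, 170, 175, 180
print(med_odd)   # 172
print(med_even)  # 172.5
```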

4.2.2.3 Sample Variance

The sample variance measures data spread:

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

4.2.2.4 Sample Standard Deviation

The sample standard deviation is the square root of the variance:

$$S = \sqrt{S^2}$$
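As a rough illustration, the sample variance and standard deviation of the height data from the sample-mean example can be computed as follows. The function name `sample_variance` is an assumption made for this sketch. Note the divisor n - 1 rather than n.

```python
import math


def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)


heights_cm = [170, 165, 180, 175, 172]
s2 = sample_variance(heights_cm)  # squared deviations sum to 125.2; 125.2 / 4 = 31.3
s = math.sqrt(s2)                 # sample standard deviation
print(round(s2, 4), round(s, 4))  # 31.3 5.5946
```

Python's standard library offers `statistics.variance` and `statistics.stdev`, which use the same n - 1 divisor.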

4.2.2.5 Sample Proportions

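The sample proportion is used for categorical data. It is the fraction of observations in the sample that fall in the category of interest (the "successes"):

$$\hat{p} = \frac{X}{n} = \frac{\text{Number of successes}}{\text{Sample size}}$$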
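A small sketch of the computation, using made-up survey responses (the data and the names `responses` and `sample_proportion` are hypothetical):

```python
def sample_proportion(outcomes):
    """Fraction of observations that are successes (True values)."""
    if not outcomes:
        raise ValueError("sample must contain at least one observation")
    return sum(outcomes) / len(outcomes)  # True counts as 1, False as 0


# Hypothetical yes/no responses; True marks a "success".
responses = [True, False, True, True, False, True, False, True]
p_hat = sample_proportion(responses)  # 5 successes out of 8 observations
print(p_hat)  # 0.625
```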

4.2.2.6 Estimators

  • Point Estimator: A statistic ($\hat{\theta}$) used to estimate a population parameter ($\theta$).
  • Point Estimate: The numerical value assumed by $\hat{\theta}$ when evaluated for a given sample.
  • Unbiased Estimator: A point estimator $\hat{\theta}$ is unbiased if $E(\hat{\theta}) = \theta$.

Examples of unbiased estimators:

  • $\bar{X}$ for $\mu$ (population mean).

  • $S^2$ for $\sigma^2$ (population variance).

  • $\hat{p}$ for $p$ (population proportion).

  • $\hat{p}_1 - \hat{p}_2$ for $p_1 - p_2$ (population proportion difference).

  • $\bar{X}_1 - \bar{X}_2$ for $\mu_1 - \mu_2$ (population mean difference).

Note: While $S^2$ is unbiased for $\sigma^2$, $S$ is a biased estimator of $\sigma$.
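One way to see both claims concretely is to enumerate every possible sample from a tiny population and average the estimators exactly, with no simulation. The sketch below assumes a hypothetical population {1, 2, 3} with equally likely values and independent draws (sampling with replacement), consistent with the definition of a random sample in 4.2.1.

```python
import itertools
import math

population = [1, 2, 3]  # hypothetical population, each value equally likely
n = 2                   # sample size

mu = sum(population) / len(population)
# Population variance: divide by the population size, not by N - 1.
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)


def sample_variance(xs):
    x_bar = sum(xs) / len(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1)


# All 9 equally likely ordered samples of size 2 drawn with replacement.
samples = list(itertools.product(population, repeat=n))

# Expected values are plain averages because every sample has the same probability.
mean_x_bar = sum(sum(s) / n for s in samples) / len(samples)
mean_s2 = sum(sample_variance(s) for s in samples) / len(samples)
mean_s = sum(math.sqrt(sample_variance(s)) for s in samples) / len(samples)

print(mean_x_bar, mu)              # equal: the sample mean is unbiased for mu
print(mean_s2, sigma2)             # equal: S^2 is unbiased for sigma^2
print(mean_s, math.sqrt(sigma2))   # E(S) is smaller than sigma: S is biased
```

In this example E(S) is about 0.63 while σ is about 0.82, so S underestimates σ on average even though S² is exactly right on average.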


4.2.3 Distribution of the Sample Mean

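The sampling distribution of the mean $\bar{X}$ depends on:

  1. Population Distribution:
    • If $X \sim N(\mu, \sigma^2)$, then $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ for any sample size $n$.
  2. Central Limit Theorem:
    • For large $n$, $\bar{X}$ approximately follows $N\left(\mu, \frac{\sigma^2}{n}\right)$, regardless of the population’s shape, provided the population variance $\sigma^2$ is finite.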
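The Central Limit Theorem can be checked empirically with a short simulation. The sketch below is illustrative only: the exponential population (mean 1, standard deviation 1, strongly right-skewed), the sample size `n = 25`, the number of repetitions `reps`, and the seed are all assumptions chosen for the demonstration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
n = 25                  # size of each sample
reps = 20_000           # number of simulated samples

mu, sigma = 1.0, 1.0    # mean and SD of an exponential population with rate 1

# Draw `reps` samples of size n and record each sample mean.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)  # should be close to mu
sd_of_means = statistics.stdev(sample_means)    # should be close to sigma / sqrt(n) = 0.2

# Under a normal curve, about 68% of values lie within one SD of the mean.
within_one_se = sum(abs(m - mu) <= sigma / n ** 0.5 for m in sample_means) / reps

print(mean_of_means, sd_of_means, within_one_se)
```

Even though individual exponential observations are far from normal, the sample means cluster around μ with spread close to σ/√n, and roughly 68% of them fall within one such spread of μ, as the normal approximation predicts.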

4.2.3.1 Standard Error of the Mean

The standard error quantifies variability in $\bar{X}$:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

Example: Suppose $\sigma = 10$ and $n = 25$. Then $\sigma_{\bar{X}} = \frac{10}{\sqrt{25}} = 2$.

The smaller the standard error, the more precise our estimate of the population mean.
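The effect of sample size on precision can be sketched with a small helper. The function name `standard_error` is a hypothetical choice, and the population standard deviation σ is assumed known, as in the example above.

```python
import math


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma divided by the square root of n."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / math.sqrt(n)


se_25 = standard_error(10, 25)    # the example above
se_100 = standard_error(10, 100)  # four times as many observations
print(se_25)   # 2.0
print(se_100)  # 1.0
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.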