## 2.6 Stratified sampling

In stratified sampling, the population is split into a small number of large (usually homogeneous) groups called strata, then cases are selected using a simple random sample from each stratum.

The strata must be unrelated to the variables being studied in the RQ (the response and explanatory variables).

For example, if the RQ compares the percentages of females and males who wear hats at midday, selecting 50 females and 50 males does not produce a stratified sample of size 100. This is merely selecting people from each level of the explanatory variable.

The sex of the person is the explanatory variable; it does not define the strata.

Example 2.6 (Stratified sampling) To select students from a large course at a particular university, 20 of the females and 20 of the males could be randomly selected. The sample is then stratified by the sex of the person.

At the university where I work, about 67% of the students are female. So, for a sample of size 40, I could instead ensure that about two-thirds of the sample is female (around 26.7, say 27) and about one-third is male (around 13.3, say 13).
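The arithmetic behind these stratum sizes can be sketched in a few lines of Python. This is only an illustration: the function name `proportional_allocation` is hypothetical, and the population shares of two-thirds and one-third are the approximate values used above.

```python
def proportional_allocation(sample_size, proportions):
    """Split a total sample size across strata in proportion to
    each stratum's (assumed) share of the population."""
    # Exact, possibly fractional, number of cases per stratum
    exact = {name: sample_size * share for name, share in proportions.items()}
    # Round to whole people; in general the rounded sizes may not sum
    # exactly to sample_size, so the total should always be checked
    rounded = {name: round(value) for name, value in exact.items()}
    return exact, rounded


# Assumed shares: about two-thirds female, one-third male
exact, rounded = proportional_allocation(40, {"female": 2 / 3, "male": 1 / 3})

for stratum in exact:
    print(f"{stratum}: {exact[stratum]:.1f} -> {rounded[stratum]}")
print("Total:", sum(rounded.values()))
```

Here the rounded sizes happen to add to 40. With other proportions or sample sizes, one stratum may need adjusting by one case.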

The animation below shows how a stratified random sample of size 40 might be selected, by randomly selecting 20 female and 20 male students.

Similarly, the second animation below shows how a stratified random sample of size 40 might be selected, by randomly selecting 27 female and 13 male students.
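The two selections described above can also be sketched in code. The following is a minimal illustration, not a definitive implementation: the course of 300 students (200 female, 100 male), the function name `stratified_sample`, and the seed are all assumptions made for the example. Within each stratum, a simple random sample is taken using Python's standard `random` module.

```python
import random


def stratified_sample(population, stratum_of, sizes, seed=None):
    """Take a simple random sample of the requested size from each stratum.

    population: list of cases
    stratum_of: function returning the stratum label for a case
    sizes:      dict mapping stratum label -> number of cases to select
    """
    rng = random.Random(seed)

    # Split the population into strata
    strata = {}
    for case in population:
        strata.setdefault(stratum_of(case), []).append(case)

    # Simple random sample (without replacement) within each stratum
    sample = []
    for label, n in sizes.items():
        sample.extend(rng.sample(strata[label], n))
    return sample


# Hypothetical course: students 0-199 are female, 200-299 are male
students = [(i, "female" if i < 200 else "male") for i in range(300)]


def sex(student):
    return student[1]


# Equal numbers from each stratum: 20 female and 20 male
equal = stratified_sample(students, sex, {"female": 20, "male": 20}, seed=1)

# Sizes roughly proportional to the population: 27 female and 13 male
proportional = stratified_sample(students, sex, {"female": 27, "male": 13}, seed=1)

print(len(equal), len(proportional))
```

Both samples contain 40 students. Only the number taken from each stratum differs.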