5.3 Lab: Sampling & randomizing

Sooner or later we may find ourselves with a sampling frame, i.e., a list of all units in a population that can be sampled. For budget reasons we cannot run our experiment with all units, so we first draw a random sample. Subsequently, we randomly assign the units in our sample to treatment or control.

Below we create an artificial population, i.e., a dataframe that contains all units in the population. We call this dataframe population.
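A minimal sketch of how such a dataframe could be built in base R. The construction code is not shown in this section, so the placeholder names, the age range, the id range, and the example.org addresses are assumptions made purely for illustration; only the column names and the population size of 2000 come from the text.

```r
# Hypothetical stand-in for the population dataframe (values are made up).
set.seed(42)  # make the example reproducible

N <- 2000  # population size stated in the text

population <- data.frame(
  id        = sample(1:100000, size = N),            # unique ids (drawn without replacement)
  firstname = paste0("First", seq_len(N)),           # placeholder first names
  lastname  = paste0("Last", seq_len(N)),            # placeholder last names
  age       = sample(18:90, size = N, replace = TRUE) # assumed age range
)

# Build a placeholder email address from the name columns
population$email <- paste0(population$firstname, ".",
                           population$lastname, "@example.org")

head(population)
```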

The population, i.e., the sampling frame, contains 2000 units. First, we draw a sample of 200 individuals among whom we conduct our experiment. Below we do so and look at the first six units in our sample.

id     firstname   lastname      age  email
97102  Maaiz       Park          81
6365   Morganna    al-Pour       65
78890  Gedion      al-Fahmy      38
82562  Sirmichael  Braud         41
98332  Reanne      Rodriguez     35
89963  Sareena     Saracay Pena  50   Sareena.Saracay
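The sampling step above can be sketched in base R as follows. The population here is a reduced, hypothetical stand-in (ids and ages only), and the name experiment_sample is an assumption, so the rows printed will differ from the table shown.

```r
set.seed(123)  # make the draw reproducible

# Hypothetical sampling frame with 2000 units
population <- data.frame(
  id  = sample(1:100000, size = 2000),
  age = sample(18:90, size = 2000, replace = TRUE)
)

# Draw 200 row indices without replacement, so no unit is sampled twice
sampled_rows <- sample(nrow(population), size = 200)

# Keep only the sampled rows
experiment_sample <- population[sampled_rows, ]

# Look at the first six sampled units
head(experiment_sample)
```

With dplyr loaded, slice_sample(population, n = 200) would be an equivalent way to draw the sample.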

“Simple random assignment assigns all subjects to treatment with an equal probability by flipping a (weighted) coin for each subject.” (Source). To do so we need a variable that contains units’ assignment status. Accordingly, we create a variable treatment in which we store that information.

We can do that for two treatment groups.

id     firstname   lastname      age  email            treatment
97102  Maaiz       Park          81                    1
6365   Morganna    al-Pour       65                    0
78890  Gedion      al-Fahmy      38                    1
82562  Sirmichael  Braud         41                    1
98332  Reanne      Rodriguez     35                    0
89963  Sareena     Saracay Pena  50   Sareena.Saracay  1
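A minimal sketch of the two-group case using simple_ra() from the randomizr package. The sample dataframe is a hypothetical stand-in, and the default assignment probability of 0.5 is assumed here, since the section does not state which probability was used.

```r
library(randomizr)

set.seed(123)  # make the assignment reproducible

# Hypothetical sample of 200 units (only an id column, for illustration)
experiment_sample <- data.frame(id = 1:200)

# Flip a fair coin for each unit: 1 = treatment, 0 = control
experiment_sample$treatment <- simple_ra(N = nrow(experiment_sample))

# Group sizes are random under simple random assignment,
# so they need not be exactly 100 and 100
table(experiment_sample$treatment)
```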

…or for three treatment groups (below, the group names are assigned automatically).

id     firstname   lastname      age  email            treatment
97102  Maaiz       Park          81                    T3
6365   Morganna    al-Pour       65                    T3
78890  Gedion      al-Fahmy      38                    T2
82562  Sirmichael  Braud         41                    T1
98332  Reanne      Rodriguez     35                    T2
89963  Sareena     Saracay Pena  50   Sareena.Saracay  T3
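For three groups, a sketch along the same lines passes num_arms = 3 to simple_ra(), which then labels the groups T1, T2 and T3 itself, as in the table above. The sample dataframe is again a hypothetical stand-in.

```r
library(randomizr)

set.seed(123)  # make the assignment reproducible

# Hypothetical sample of 200 units
experiment_sample <- data.frame(id = 1:200)

# Three arms with equal probability; labels T1, T2, T3 are generated automatically
experiment_sample$treatment <- simple_ra(N = nrow(experiment_sample), num_arms = 3)

table(experiment_sample$treatment)
```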

Finally, an example with four treatment groups, where we set the assignment probabilities ourselves and name the values of the treatment variable (as a character vector).

id     firstname   lastname      age  email            treatment
97102  Maaiz       Park          81                    control
6365   Morganna    al-Pour       65                    treatment3
78890  Gedion      al-Fahmy      38                    control
82562  Sirmichael  Braud         41                    treatment1
98332  Reanne      Rodriguez     35                    control
89963  Sareena     Saracay Pena  50   Sareena.Saracay  control
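The four-group case can be sketched with the prob_each and conditions arguments of simple_ra(). The probabilities below are illustrative assumptions, not the values used for the table above, and the label treatment2 is inferred from the naming pattern. In older randomizr releases the conditions argument was, as far as I know, called condition_names.

```r
library(randomizr)

set.seed(123)  # make the assignment reproducible

# Hypothetical sample of 200 units
experiment_sample <- data.frame(id = 1:200)

# Group labels as a character vector, in the same order as the probabilities
group_names <- c("control", "treatment1", "treatment2", "treatment3")

# Assumed assignment probabilities; they must sum to 1
group_probs <- c(0.4, 0.2, 0.2, 0.2)

experiment_sample$treatment <- simple_ra(
  N          = nrow(experiment_sample),
  prob_each  = group_probs,
  conditions = group_names
)

# Realized shares vary around the chosen probabilities
prop.table(table(experiment_sample$treatment))
```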

Once we have randomly assigned our sample units to treatment groups, we can conduct our experiment and administer the actual treatments based on our dataframe. For instance, we might send each group a different version of a questionnaire.

The randomizr package contains further useful functions and arguments, and there are other forms of random assignment. See the randomizr vignette.
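As one illustration of such an alternative, randomizr also provides complete_ra(), which fixes the number of treated units in advance instead of flipping a coin per unit. This is a sketch with a hypothetical sample and an assumed split of 100 treated units out of 200.

```r
library(randomizr)

set.seed(123)  # make the assignment reproducible

# Hypothetical sample of 200 units
experiment_sample <- data.frame(id = 1:200)

# Complete random assignment: exactly m units receive treatment (1), the rest control (0)
experiment_sample$treatment <- complete_ra(N = nrow(experiment_sample), m = 100)

table(experiment_sample$treatment)
```

Unlike simple random assignment, the group sizes here are the same in every run; only which units end up in each group changes.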