Chapter 7 Controlling for selection bias: randomized assignment to intervention
In this chapter we consider how to select people for the experimental and control groups of an intervention study. This is a key element of a randomized controlled trial (RCT), which is widely regarded as a gold standard approach to the evaluation of interventions. The core idea is that once a target population has been identified for study, the experimenter allocates participants to intervention and control groups in a way that minimizes bias. This turns out to be much harder than you might imagine.
There is a big literature on methods of allocation to intervention. The main principle behind modern intervention studies is that allocation should be completely random, i.e. not predictable from any characteristics of participants. There are many approaches to treatment allocation that are designed to avoid bias but which are not random. For the time being we are ignoring the question of who does the randomisation: this will be discussed below. We also defer until later (Chapter 9) the question of what to do when people drop out of a trial. We assume we have a population of potential participants and have to decide who will be in the intervention group and who will be in the control group. Consider the following methods of allocation and note your judgement as to whether they are (a) reasonable ways of avoiding bias and (b) random.
- Allocate people to the intervention group until the desired sample size is reached; then allocate all subsequent cases to the control group
- Allocate the more severe cases to the intervention group and the less severe to the control group
- Alternate between intervention and control: thus the first person on the list is an intervention case and the last one is a control
- For each potential participant, toss a coin and allocate those with ‘heads’ to the intervention group and those with ‘tails’ to the control group
- Create pairs of people matched on a key variable such as age, and then toss a coin to decide which member of each pair will be in the intervention group and which in the control group
If you have read this far, it is hoped that you will identify that B is clearly a flawed approach. Not only does it fail to equate the intervention and control groups at the outset, it also is likely to result in regression to the mean, so that greater gains will be seen for the intervention vs. the control group, for purely spurious statistical reasons. Despite its obvious flaws, this approach is sometimes seen in published literature – often taken to the extreme where an intervention group with a disorder is compared to a ‘control’ group with no disorder. This makes no sense at all – you are not controlling for anything unless the groups are equivalent at the outset, and so cannot evaluate the intervention this way.
Approach A is an improvement, but it is still prone to bias, because there may be systematic differences between people who are identified early in the recruitment phase of a study and later on. For instance, those who sign up immediately may be the most enthusiastic. Approach A also creates problems for allocation concealment, which is discussed further in Chapter 12.
Most people think that Approach C as a reasonable way to achieve equal group sizes and avoid the kinds of “time of recruitment” issues that arise with Approach A. But this is not a random approach and would therefore be regarded as problematic by modern standards of trials design. One reason, which we will discuss at more length in Chapter 12, is that this method could allow the researcher to influence who gets into the study. Suppose the experimenter allocates aphasic patients to a study to improve word-finding according to some systematic method such as their serial order on a list, but notices that the next person destined for the intervention group has particularly profound difficulties with comprehension. It may be sorely tempting to ensure that he had an “odd” number rather than an “even” number and so ended up in the control group. Similar objections arise to methods such as using a person’s case number or date of birth as the basis for intervention assignment: although this may be a fixed characteristic of the person and so appear to avoid bias, it raises the possibility that the experimenter could simply decide not to include someone in the study if they were destined for the intervention group and looked unpromising.
Approach D is clearly random, but it runs the risk that the two groups may be of unequal size and they may also be poorly matched on the pre-intervention measures, especially if sample sizes are small.
Approach E would seem like the logical solution: it avoids unequal sample sizes, ensures groups are comparable at the outset, yet includes randomisation. This method does, however, have some disadvantages: it can be difficult to apply if one does not have information about the full sample in advance, and can be problematic if participants drop out of the trial, so the matching is disrupted.
7.1 Randomization methods
We have seen the drawbacks of some of the approaches that might at first glance seem appropriate to avoid bias. Here we will briefly describe randomization methods that are used in contemporary medical trials, to give an idea of the range of possible approaches. Randomization is a fairly technical subject, and anyone designing a randomized controlled trial is advised to seek expert advice to ensure that the most effective method is used.
7.1.1 Simple randomization
In our earlier examples from the start of this chapter, we saw various examples of simple randomization (D-E). This takes the idea of the “coin toss” but adds in some degree of control of other variables which might affect the result. Typically, a coin is not used, as the random nature of the coin toss means that balanced numbers in each group will not be guaranteed. A randomization list is created by an individual who is unrelated to the trial. This list contains an equal number of allocations for a fixed number of participants which will have been pre-specified according to a sample size calculation.
In theory this might seem sufficient, but it ignores one aspect of real life: people will sign up for a trial but then drop out before assignment of intervention. This means that we potentially can have unbalanced groups despite following the list.
Furthermore as we saw earlier, simple randomization does not control key variables which may exert an effect on the outcome. For example, suppose we randomize children with a language difficulty into two groups, but we find that there are more boys than girls in one group. If we find an intervention effect as predicted, we can’t be sure whether this is just because boys and girls respond differently.
Simple randomization is rarely (if at all) used in contemporary medical trials, but other approaches have been developed to counter the drawbacks encountered.
7.1.2 Random permuted blocks
The permuted blocks approach tries to overcome the problem of unbalanced groups by allocating individuals into blocks. The idea is that patients are allocated to intervention or control in blocks, so that at certain equally spaced points across the trial, equal numbers are ensured. Let’s see how this works using an example:
Say we have chosen out block size to be four. So, in each block we have two cases assigned to intervention and two to the control group. The order of assignment of each block is random, for example, ABAB, AABB, ABBA, BBAA, BAAB, and so on. This means at any point in the trial, the randomization is only ever unbalanced by two individuals if the trial stops short or fails to recruit sufficient numbers of individuals, so the imbalance is minimal and statistical power is maintained.
However, the permuted blocks design has a downside; it is relatively easy for the researcher to see a pattern emerge in the allocation, so the randomization can become predictable. This means we then have to rely on the researcher to continue with correct allocation and not deviate, even if they have a strong belief in the intervention. It may seem remarkably cynical to assume that we need to take steps against researchers departing from random allocation to achieve a better result, but people do have unconscious biases that may lead them to do this, even if there is no conscious intent to deceive, and it is better to adopt an approach that makes it impossible for this to occur. Ideally, the randomisation is done by someone other than the person running the trial.
7.1.3 Unequal randomization
Financial and practical constraints may limit the number of participants in a trial, raising the question of whether we might improve efficiency by having more people in the treatment than the control arm. The approach of unequal randomization allows for this: a fixed ratio is decided on before allocation of individuals begins, e.g., 2:1 intervention vs control allocation. Use of this method should be considered carefully as there are both pros and cons. On the positive side, we may obtain a more accurate estimate of the effects of intervention, but we are prone to a reduction in power (Hey & Kimmelman, 2014); see also Chapter 10.
7.1.4 Stratification
We discussed above the possibility that randomized groups might differ in the proportion of males and females, making it difficult to disentangle the effect of the intervention from an effect of sex. There are usually other factors that could influence response to intervention: social and family background, presence of specific medical conditions, or educational attainment. If these are likely to be important, we can try to measure them in advance, but we cannot rule out the possibility that there may be other, unknown, variables that influence the treatment effect.
For characteristics that we can measure, stratification is one method that gives us some control, by splitting the allocation according to those variables. For example, suppose we have an intervention for school-aged children aged between 5 and 11 years. We recruit from three different cities: one with a high rate of deprivation, one with a good mix of social backgrounds, and one where most citizens are affluent home-owners. We would anticipate that there might be significant outcome differences in age and potentially differences between cities, so we use stratification to allow for this when we randomize children to intervention and control groups. The scheme would be as shown in Table 7.1.
Intervention | Control | |
---|---|---|
City 1, ages 5-7 | 23 | 24 |
City 2, ages 8-9 | 31 | 31 |
City 3, ages 10-11 | 28 | 27 |
City 1, ages 8-9 | 19 | 20 |
Cityl 2, ages 10-11 | 25 | 24 |
City 3, ages 5-7 | 35 | 35 |
City 1, ages 10-11 | 22 | 27 |
City 2, ages 5-7 | 24 | 23 |
City 3, ages 8-9 | 31 | 31 |
7.1.5 Adaptive randomisation
A problem with stratification is that it becomes weaker when the number of strata increases, particularly in small-scale trials. Suppose, for instance, we decided it was important to have six age-bands rather than three, and to separate boys and girls: we would soon find it difficult to achieve a balanced allocation.
An alternative to stratification, which has become popular in recent years, is adaptive randomisation (Hoare et al., 2013). For each new individual, the allocation to treatment ratio changes based on:
- the number of individuals already assigned to each treatment arm.
- the number of individuals in each stratification levels.
- the number of individuals in the stratum.
This method ensures that the allocation to groups remains balanced both overall and with any relevant stratifying variables or covariates, for example, gender or age.
7.1.6 Minimization
A specialized type of adaptive allocation called minimization is also very popular in clinical RCTs. Minimization was first proposed by Pocock & Simon (1975) and is a form of dynamic or adaptive randomization. Each participant’s allocation is dependent on the characteristics and allocation of participants already allocated to groups. The idea is to minimize the imbalance by considering a range of prognostic factors for each allocation. This prevents the allocation of people with particular combinations of prognostic factors into one group. Table 7.2 illustrates this method to allocate 16 individuals to a trial using an example from Scott et al. (2002).
Prognostic factor | Intervention | Control |
---|---|---|
Sex | ||
Male | 3 | 5 |
Female | 5 | 3 |
Age band | ||
21-30 | 4 | 4 |
31-40 | 2 | 3 |
41-50 | 2 | 1 |
Risk factor | ||
High Risk | 4 | 5 |
Low Risk | 4 | 3 |
We now consider allocation of the 17th participant who is Male, aged 31-40, and High Risk. We can use Pocock & Simon (1975)’s method to work out which group they should be assigned to, to minimize imbalance:
- If allocated to the intervention group, total imbalance is \[|(3+1)-5|+|(2+1)-3|+|(4+1)-5|=1\].
- If allocated to the control group, total imbalance is \[|3-(5+1)|+|2-(3+1)|+|4-(5+1)|=7\].
Since allocation to the intervention group gives imbalance of 1, whereas allocation to the control group would give imbalance of 7, this participant would be allocated to the intervention group.
7.2 Units of analysis: Individuals vs clusters
In the examples given above, allocation of intervention is done at the level of the individual. There are contexts, however, where it makes sense to use larger units containing groups of people such as classrooms, schools or clinics. Research in schools, in particular, may not readily lend itself to individualized methods, because that could involve different children in the same class receiving different interventions. This can make children more aware of which group they are in, and it can also lead to “spill-over” effects, whereby the intervention received by the first group affects other children who interact with them. To reduce spillover, it may make sense for children in classroom A to have one intervention and those in classroom B to have a control intervention. This is what happens in a clustered trial.
This method, however, raises new problems, because we now have a confound between classroom and intervention. Suppose the teacher in classroom A is a better teacher, or it just so happens that classroom B contains a higher proportion of disruptive pupils. In effect, we can no longer treat the individual as the unit of analysis. In addition, we may expect the children within a classroom to be more similar to one another than those in different classrooms, and this has an impact on the way in which the study needs to be analysed. Clustered studies typically need bigger sample sizes than non-clustered ones. We will discuss this issue further in Chapter 16.
7.3 Class exercise
Perform a literature search for an intervention/disorder of interest using the search term “randomized controlled trial” or “randomized controlled trial” or “RCT”, and download the article. (We are looking for a standard RCT, so do not pick a clustered RCT).
- Is the randomization process clearly described?
- Is it adequate?
- Did the researchers use stratification? If so, was the choice of variables for stratification sensible? If not, would stratification have helped?