7 Internal validity and experimental studies

So far, you have learnt to ask a RQ, identify different ways of obtaining data, and design the study.

In this chapter, you will learn how to ensure that the conclusions we can make are logical and sound in experimental studies. You will learn to:

  • maximise internal validity in experimental studies.
  • manage confounding in experimental studies.
  • explain, identify and manage the carry-over effect in experimental studies.
  • explain, identify and manage the Hawthorne effect in experimental studies.
  • explain, identify and manage the placebo effect in experimental studies.
  • explain, identify and manage the observer effect in experimental studies.
  • explain different descriptions of blinding.

7.1 Introduction

The conclusions drawn from a study are only as good as the data that the conclusions are based on, and the data are only as good as the study design from which the data emerge.

A good study requires high internal validity: When studying the relationship between the response and explanatory variables, we would like to be able to rule out---as much as possible---any other reason for the changes in the values of the response variable, so any remaining changes we see can be attributed to just the explanatory variable of interest.

That is, we should design studies to have high internal validity to reduce bias.

Remember the goal of study design is to maximise internal validity: to design a study to isolate the relationship of interest, by eliminating, as well as possible, all other possible explanations.

Many aspects of the design must be considered to achieve this goal, some of which are discussed in this chapter.

Data collection is often tedious, time consuming and expensive.

You usually get one chance to collect your data, but you can analyse your data as many times as you like. So design the study properly the first time!

Example 7.1 (Importance of internal validity) A group of researchers150 describe an experiment where free fertilizer was provided to a sample of female farmers in Mali (at the recommended amount per hectare; or at half the recommended amount per hectare).

Since all the farmers knew they were being provided with fertilizer (that is, they were not blinded), the farmers changed their farm management: they employed more hired labour and used more herbicide. Consequently, the yields for all farmers changed.

However, it is difficult to know whether this change in yield was due to the amount of fertilizer applied, the change in labour, the change in herbicides, or a combination of these. That is, the study had poor internal validity.

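Specific design strategies that we consider for maximising internal validity are:

  • managing confounding;
  • managing the carry-over effect;
  • managing the Hawthorne effect;
  • managing the placebo effect; and
  • managing the observer effect.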

Not all of these will be relevant to every study.

Some relevant design issues are discussed in this chapter for experimental studies. The next chapter considers design issues for observational studies.

TABLE 7.1: Different design biases studied in this book related to the researchers and the individuals

| Name | Who/what is aware of what | What changes or is compromised |
|---|---|---|
| Hawthorne effect | Individuals are aware of being in a study; individuals are aware of explanatory variable value | Behaviour of, or reporting by, individuals |
| Placebo effect | Individuals think they are in a study; individuals think they have a specific value for explanatory variable | Behaviour of, or reporting by individuals |
| Observer effect | Researcher is aware of explanatory variable value | Behaviour of researchers changes (unconsciously) to conform to expectations, and perhaps communicated to individuals who may also act or report differently |

In general, making the individuals unaware (blinding) that they are in a study, or unaware of what explanatory variable values apply to them, reduces or eliminates bias.

Likewise, making the researchers unaware (blinding) of what explanatory variable values apply to the individuals reduces or eliminates bias.

In this chapter, we will work with this RQ (based on Anthony R. Bird et al.151):

Among Australians, does eating provided food made from wholegrain Himalaya 292 increase average faecal weight compared to eating provided food made from refined cereal?

For the Himalaya 292 study:

  1. Determine P, O, C and I.
  2. What are the variables?
  3. What type of study is this?

Answers:

  1. P: Australians. O: average faecal weight. C: Between those eating food made using refined cereal, and those eating food made from Himalaya 292.
  2. The information that must be gathered from each individual: the faecal weight; the type of grain consumed (refined or Himalaya 292).
  3. True experiment.

Example 7.2 (Exclusion criteria) In the Himalaya study,152 the exclusion criteria were:

[...] a history of diabetes, gastrointestinal, renal, hepatic and cardiovascular disease, an intolerance to cereal-based foods, fasting plasma glucose concentrations > 6.1 mmol/l and medications or supplements likely to affect experimental endpoints

--- Bird et al.153, p. 1033

To answer this RQ, a study must be designed to collect the data. However, careful thought must be given to how the study is designed.

7.2 Managing confounding

Confounding has the potential to compromise the internal validity of the study and hence the interpretation of the results, so managing the impact of confounding is important. Suppose, for example, that the researchers created two groups:

  • Group A: Women recruited at a female-only gym.
  • Group B: Men recruited at a local nursing home.

The researchers then gave Himalaya 292 to Group A, and the refined cereal to Group B. If a difference in faecal weight was found between the two groups, the difference may be because:

  • The diet (the explanatory variable) was different in each group;
  • The sex of the participants was different in each group, since Group A was all women and Group B was all men;
  • The age of the participants was different in each group, since Group A is likely to be younger on average, and Group B is likely to be older on average;
  • The health and fitness levels were different in each group: those in Group A would generally be far healthier than those in Group B.

If a difference is found between the Himalaya 292 and refined cereal groups, it may not be because of the cereal (Table 7.2). That is, the study has very poor internal validity due to poor study design.

For example, the age of the subject may be related to faecal weight (as older people tend to eat less, and eat differently, than younger people), and the study design means that older people are more likely to consume the refined cereal.

This is an extreme case of confounding; usually, confounding is more subtle (and hence more difficult to detect) than in this example.

TABLE 7.2: Comparing Groups A and B: An extreme example of confounding

| Characteristic | Group A | Group B |
|---|---|---|
| Sex | Women | Men |
| Age | Younger (in general) | Older (in general) |
| Diet | Himalaya 292 | Refined cereal |
| Fitness | Very fit | Less fit |


The key point is that the groups being compared should be as similar as possible, apart from the difference being studied (in the Himalaya 292 example, the diet that they are given).

Example 7.3 (Comparing groups) An experiment to study the effect of using ginkgo to enhance memory154 compared two groups: one using ginkgo (\(n=111\)), and one using a pretend, non-active supplement (\(n=108\)).

The authors randomly allocated participants to each group, but also compared the two groups to ensure that no obvious differences initially existed between the two groups that might explain any differences in the response variable (Table 7.3).

The table shows that the two groups are very similar in terms of age, education and gender distribution. Hence, any difference in the response variable between the groups is unlikely to be due to existing differences in the age, the percentage of men, or the years of education in the two groups.

TABLE 7.3: Comparing the two groups in the ginkgo-memory study

| Characteristic | Group A (Ginkgo) | Group B (Pretend) |
|---|---|---|
| Average age (in years) | 68.7 | 69.9 |
| Men (number; percentage) | 46 (41) | 45 (42) |
| Average years of education | 14.4 | 14.0 |
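The percentages of men reported in Table 7.3 can be checked from the counts and the group sizes. The following is a minimal sketch only; the dictionary layout and the function name `percent_men` are chosen here for illustration and do not come from the study.

```python
# Counts of men and group sizes, as reported in Example 7.3 and Table 7.3.
groups = {
    "Group A": {"men": 46, "n": 111},
    "Group B": {"men": 45, "n": 108},
}


def percent_men(men: int, n: int) -> int:
    """Return the percentage of men in a group, rounded to a whole number."""
    return round(100 * men / n)


percentages = {name: percent_men(g["men"], g["n"]) for name, g in groups.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct}% men")
```

The two percentages agree with the values in the table, and differ by only about one percentage point, which supports the claim that the groups are similar in terms of sex.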

Researchers explored the use of dominant and non-dominant hands for chest compression in student paramedics in an experimental study.155

Students were randomly divided into two groups: DHOC (dominant hand on chest) and NDHOC (non-dominant hand on chest).

The two groups were then compared:

| Demographic | All participants (\(n = 75\)) | DHOC (\(n = 37\)) | NDHOC (\(n = 38\)) |
|---|---|---|---|
| Average age (years) | 23.4 | 22.5 | 24.3 |
| Gender: percentage female | 51% | 53% | 47% |

The two groups appear to be very similar in terms of average age of participants, and the percentage of female participants.

This means that, if differences are observed in the study between the DHOC and NDHOC groups, it is unlikely to be because the groups themselves are different in terms of age and sex of participants. The study should have reasonable internal validity.

Potentially, many extraneous variables exist. To demonstrate, we will consider just one: age. How can we make sure that the age of the participants does not cause confounding?

Confounding can be managed by:

  • Restricting the study to a certain group (for example, only people under 30).
  • Blocking. Analyse the data separately for different groups (for example, analyse the data separately for people under 30, and 30 and over).
  • Analysing using special methods (after measuring the age of each subject).
  • Randomly allocating people to groups: Older and younger people would be spread approximately evenly between groups.

The first two approaches (restricting; blocking) are useful if one or two variables are known, or thought likely, to cause confounding.

The third approach (analysing) requires recording all the variables suspected of being confounders.

The fourth approach (randomly allocating) is superior if it is possible, because it reduces the chance of confounding even for variables that are not suspected of being confounders.

Notice that a common theme is measuring, observing, assessing or recording any variables of potential concern, to ensure no lurking variables exist to compromise the results.

Of course, more than one of these approaches can be used, such as randomly allocating individuals to groups, but also measuring, observing, assessing or recording many other variables that can be managed through analysis (Example 7.3).
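To see why random allocation tends to spread a possible confounder such as age evenly between groups, a small simulation can help. This is a sketch under stated assumptions: the ages are randomly generated (hypothetical, not data from any study), and the group sizes and the seed are arbitrary choices.

```python
import random
from statistics import mean

rng = random.Random(2024)  # fixed seed so the sketch is repeatable

# Hypothetical ages for 2000 participants (not data from any real study).
ages = [rng.randint(18, 80) for _ in range(2000)]

# Random allocation: shuffle the participants, then split them into two halves.
rng.shuffle(ages)
group_a, group_b = ages[:1000], ages[1000:]

mean_a, mean_b = mean(group_a), mean(group_b)
print(f"Mean age, Group A: {mean_a:.1f}")
print(f"Mean age, Group B: {mean_b:.1f}")
```

With groups this large, the two mean ages are expected to be close. The same reasoning applies to variables that were never measured, which is why random allocation is preferred when it is possible.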

7.2.1 Restrictions

Sometimes the impact of confounding is managed by restricting the study to some groups, based on potential confounding variables, or keeping some variables constant. These variables are called control variables. If possible, a reason for this restriction should be given.

Example 7.4 (Restricting) In the Himalaya study,156 the study might be restricted to subjects aged under 30. The control variable is 'age'.

7.2.2 Blocking

Sometimes blocking is used to minimise the impacts of confounding. Blocking refers to separating the units of analysis into a small number of groups that are similar to one another, then studying those groups separately. The Himalaya study might be blocked on age (Fig. 7.1).

Definition 7.1 (Blocking) Blocking is when units of analysis are arranged in groups (called blocks) that are similar to one another.

FIGURE 7.1: Blocking in the Himalaya study, based on age
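Blocking on age can be sketched in a few lines of code. The participants below are hypothetical, and the cut-off of 30 years simply follows the example used earlier in this chapter; a real study would justify its own blocks.

```python
from collections import defaultdict

# Hypothetical participants: (identifier, age in years).
participants = [("P1", 24), ("P2", 45), ("P3", 29), ("P4", 30), ("P5", 61), ("P6", 19)]


def age_block(age: int) -> str:
    """Assign an age to one of two blocks, using an under-30 cut-off."""
    return "under 30" if age < 30 else "30 and over"


# Arrange the units of analysis into blocks of similar individuals.
blocks = defaultdict(list)
for pid, age in participants:
    blocks[age_block(age)].append(pid)

for name, members in blocks.items():
    print(name, members)
```

Each block would then be studied separately, so that comparisons are made between individuals of similar age.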

7.2.3 Analysis

Confounding variables can be accommodated in the analysis (using analysis methodology beyond what is in this book), provided those variables have been measured, observed, assessed or recorded. Because of this, measuring, observing, assessing or recording all the information likely to be important for understanding the data is important.

Measure, observe, assess or record all the information that is likely to be important for understanding the data. This may include information about

  • the individuals in the study; and
  • the circumstances of the study.

For this reason, most studies involving people record the participants' age and sex, as these two variables are common confounders. Once a sample is obtained, recording this extra information usually requires little extra effort.

Example 7.5 (Analysis) In the Himalaya 292 study, the sex, age, pre-study weight and pre-study BMI were also recorded for each individual.

Example 7.6 (Analysis) An experimental study157 compared nitrogen (N) and phosphorus (P) concentrations in maize, for evenly-injected liquid manure and band-injected liquid manure.

As potential confounding variables, the researchers also recorded the average temperature and the precipitation (between May 1 and September 30) at each site.

7.2.4 Random allocation

One way to minimise confounding is to randomly allocate individuals in the study to the treatment groups. (Remember that the word "random" has a special meaning.) The advantage of random allocation is that it should approximately evenly distribute potential confounding variables that have been identified (such as age) but also those variables that may not have even been considered as confounders, or are hard to measure or observe (such as genetic conditions).

In the Himalaya study, the units of analysis (the people in the sample) could be allocated to a group at random, and then the groups allocated a diet through a toss of a coin (Fig. 7.2).

Example 7.7 (Random allocation) In the Himalaya 292 study, the article reports that 'Subjects were allocated randomly to [...] dietary treatments...' (Bird et al.158, p. 1033).

FIGURE 7.2: Random allocation can occur in two places for the Himalaya study

Random allocation may occur when randomly allocating individuals to groups (true experiment), and/or when randomly allocating treatments to groups (true or quasi-experiment). Random allocation can be shown, in general, as in Fig. 7.3.

FIGURE 7.3: Random allocation in general
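The two places where random allocation can occur might be sketched as follows. This is an illustration only: the participant identifiers, the group sizes and the seed are assumptions, and the code is not the procedure used in the Himalaya 292 study.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical participant identifiers (not from the Himalaya 292 study).
participants = [f"P{i}" for i in range(1, 21)]

# Stage 1: randomly allocate individuals to two groups of equal size.
shuffled = participants[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
group_1, group_2 = shuffled[:half], shuffled[half:]

# Stage 2: randomly allocate the diets to the groups (the 'coin toss').
diets = ["Himalaya 292", "refined cereal"]
rng.shuffle(diets)
allocation = {diets[0]: group_1, diets[1]: group_2}

for diet, members in allocation.items():
    print(diet, sorted(members))
```

In a true experiment both stages are under the researchers' control. In a quasi-experiment the groups already exist, so only the second stage is available.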

7.2.5 Random allocation vs random sampling

Random sampling and random allocation are two different concepts that serve two different purposes (Fig. 7.4), but are often confused:

  • Random sampling allows results to be generalised to a larger population, and impacts external validity. It concerns how the sample is found to study.
  • Random allocation tries to eliminate confounding issues, by evening-out possible confounders across treatment groups. Random allocation of treatments helps establish cause-and-effect, and impacts internal validity. It concerns how the members of the chosen sample get the treatments.

FIGURE 7.4: Comparing random allocation and random sampling
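The difference between the two ideas can also be shown in code. In this sketch the population, the sample size and the group sizes are all hypothetical: random sampling chooses who is studied, and random allocation then chooses which treatment each sampled individual receives.

```python
import random

rng = random.Random(11)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 people, identified by number.
population = list(range(10_000))

# Random sampling: decides WHO is studied (relates to external validity).
sample = rng.sample(population, k=40)

# Random allocation: decides WHICH TREATMENT each sampled person receives
# (relates to internal validity).
allocated = sample[:]
rng.shuffle(allocated)
treatment_group, control_group = allocated[:20], allocated[20:]

print(len(sample), len(treatment_group), len(control_group))
```

A study can use either step without the other: a randomly chosen sample can be studied observationally, and a conveniently chosen sample can still be randomly allocated to treatments.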

7.3 Carry-over effect and washout periods

In the Himalaya study, what if patients spent two weeks on the Himalaya 292 diet, then the next two weeks on the refined cereal diet?

Potentially, the influence of the first diet could still be impacting the subjects' faecal weight for a little while after stopping the first diet. This could compromise the internal validity of the study.

This is called the carry-over effect.

Definition 7.2 (Carry-over effect) The carry-over effect is when the influence of past experience(s) of the individuals carries over to influence future experience(s) of the individuals.

In the context of experiments, this may mean that the influence of one treatment carries over into the influence of the next treatment.

Sometimes, researchers can randomly allocate the order in which the treatments (i.e., the diets) are used. That is, some participants start by spending four weeks on the Himalaya 292 diet, then (after a washout period) four weeks on the refined cereal diet; meanwhile, other participants start by spending four weeks on the refined cereal diet, then (after a washout period) four weeks on the Himalaya 292 diet.
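Randomly allocating the order of the treatments might be sketched like this. The participant identifiers and the seed are assumptions, and the schedule shown is the hypothetical two-diet design described in the paragraph above, including a washout period.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable

participants = [f"P{i}" for i in range(1, 13)]  # hypothetical identifiers

# The two possible orders of treatment, each with a washout in between.
orders = [
    ("Himalaya 292", "washout", "refined cereal"),
    ("refined cereal", "washout", "Himalaya 292"),
]

# Shuffle, then give the first half one order and the second half the other,
# so that each order is used equally often.
shuffled = participants[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
schedule = {pid: orders[0] for pid in shuffled[:half]}
schedule.update({pid: orders[1] for pid in shuffled[half:]})

for pid in participants:
    print(pid, " -> ".join(schedule[pid]))
```

Because each order is used equally often, any carry-over from the first diet to the second is not tied to one particular diet.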

Example 7.8 (Washout periods) A study of paramedics159 required paramedics to conduct eight different tasks (such as electrical defibrillation and intravenous cannulation).

The order in which each of the 16 paramedics performed the eight tasks was arranged so that not every paramedic started with Task 1, followed by Task 2, etc. to "control for possible effects of practice" (p. 255); that is, to mitigate the carry-over effect.

The impact of the carry-over effect may be minimised by using a washout period or similar; for example, after finishing one diet, the participants spend four weeks on their usual (before study) diet, and then move to the second diet being used.

In some studies, a washout is used. For example, after tasting a food sample, participants may rinse their mouth with water before tasting another food sample.

Example 7.9 (Carry-over effect) In the Himalaya 292 study, the authors report:

Subjects were allocated randomly to [...] dietary treatments according to a cross-over study design with each intervention phase lasting 4 weeks. There was no washout period between phases.

--- Bird et al.160, p. 1033

That is, subjects were randomly allocated to a diet: some subjects began the study on the Himalaya 292 diet while others started on the refined cereal diet. No washout period was used; however, since the response variable was recorded after four weeks on the diets, no washout period was considered necessary.

Example 7.10 (Washout) An engineering study161 examined the effect of drivers' exposure to a lane-keeping system on their driving performance. Subjects were exposed to a driving simulation that used a lane-keeping system, and then to a driving simulation without a lane-keeping system.

The researchers found that there was a carry-over effect when drivers moved from a simulation with a lane-keeping system to one without a lane-keeping system.

FIGURE 7.5: Using a 'washout' period to minimize the carry-over effect

7.4 Hawthorne effect and blinding individuals

What if the patients in the Himalaya 292 study were being watched (or waited for) while defecating? Could this lead to a misleading conclusion?

People often behave differently (either positively or negatively) if they know (or think) they are in a study or are being watched. This is called the Hawthorne effect.162 This could compromise the internal validity of the study.

Definition 7.3 (Hawthorne effect) The Hawthorne effect is the tendency of individuals to change their behaviour if they know (or think) they are being observed.

Example 7.11 (Hawthorne effect) People are more health-conscious if they know they will be followed up on a regular basis.

For example, a study aiming to increase fruit and vegetable intake in young adults163 noted that

The changes that did occur could be explained by the Hawthorne effect [...] the intervention [...] can inherently cause participants to change behavior because they know they are being observed...

--- Clark et al.164

The impact of the Hawthorne effect can be minimized by blinding the individuals in the experiment so that they do not know:

  • that they are in a study;
  • the aims of the study, and/or
  • which treatment they are receiving.

For example, if the individuals do not know which treatment they are receiving, they cannot behave differently according to the treatment they know they are receiving.

Blinding people to knowing they are involved in a study is often difficult, as ethics usually requires individuals' informed consent.

Example 7.12 (Hawthorne effect) In the Himalaya 292 study, the authors report:

The study was explained fully to the subjects, both verbally and in writing, and each gave their written, informed consent before participating.

--- Bird et al.165, p. 1033

That is, the subjects knew they were in a study. As is usual, this was an ethics requirement (in this case, from the Ethics Committee of the CSIRO). The Hawthorne effect may influence the results.

However, the subjects did not know which diet they were on:

Volunteers were not told the identity of the test cereal in the foods provided to them.

--- Bird et al.166, p. 1033

Example 7.13 (Hawthorne effect) In an experimental study167 to compare the efficacy of a new type of toothpaste, participants were given two types of toothpaste to use (a new type, and an existing type), and evaluations of plaque remaining on the teeth were taken. The authors state that:

... a plaque-reducing effect was seen not only in the test group but also in the control group. This phenomenon is due to the so-called Hawthorne effect that can lead to an overestimation of the effect and false positive results.

--- Lorenz et al.168, p. 5

That is, since all participants knew they were being assessed after brushing their teeth, there may have been a tendency to brush their teeth better than usual. The authors then state:

To minimize the Hawthorne effect, longer study durations of more than 6 months were suggested.

--- Lorenz et al.169, p. 6

7.5 Placebo effect and using controls

What if people thought they were on the wholegrain diet, but they weren't? Could this lead to a misleading conclusion?

Perhaps surprisingly, individuals in a study may report effects of a treatment (either positive or negative), even if they have not received an active treatment. This could compromise the internal validity of the study.

This is called the placebo effect.

Definition 7.4 (Placebo effect) The placebo effect is when individuals report perceived or actual effects without having received the treatment.

Managing the placebo effect is difficult! However, the impact of the placebo effect can be minimised using a control group: units of analysis without the treatment applied, but as similar as possible in every other way to those units of analysis receiving the treatment. This allows the effect of the treatment to be assessed, over and above the placebo effect.

Definition 7.5 (Control) A control is a unit of analysis without the treatment applied (but as similar as possible in every other way to other units of analysis).

Sometimes the control group receives a placebo. A placebo is a non-effective treatment. Those who receive the placebo should be selected through random allocation when possible. Sometimes, using a placebo is unethical. The Wikipedia entry about placebos is intriguing.

Definition 7.6 (Placebo) A placebo is a treatment with no intended effect or active ingredient.

Example 7.14 (Placebo effect) In the Himalaya 292 study, the authors report

On each day of the intervention periods, volunteers were asked to consume a combination of bread, breakfast cereal, muffins and crackers that would supply in total 103g of the test cereal. The aim was for each volunteer to consume 60g cereal flakes (or puffed rice for the refined cereal diet), two slices of bread, one muffin and six savoury crackers each day. Volunteers were not told the identity of the test cereal in the foods provided to them

--- Bird et al.170, p. 1033

That is, the subjects were blinded to the diet they were exposed to. However, some may think they are on the refined cereal or Himalaya diet, and respond accordingly (perhaps unconsciously).

To test the effectiveness of a new drug, patients are to report to a GP to receive injections of the drug. We wish to compare these patients to people who do not get the injection.

What is the control?

The controls are not just people who don't get the injections.

Ideally, controls would be people who, like the treatment group, report to a GP and receive an injection... however, they just receive an injection that will do nothing.

Example 7.15 (Placebo effect) Three active analgesics (pain relievers) were compared to a placebo.171

Four different coloured placebos were used. The most pain relief was experienced by those taking red placebos (Fig. 7.6), who experienced even more pain relief than those given true pain relievers.

FIGURE 7.6: Pain relief, for various pain relief medicine

Example 7.16 (Placebo effect) A study of placebos172 gave half the subjects a placebo, but told them that the pill was an expensive (implying 'very effective') pain killer ($2.50 per tablet).

The other half were also given a placebo, but were told that the pill was a discount (implying 'less effective') pain killer ($0.10 per tablet).

About 85% of participants in the first group reported a pain reduction, yet only 61% in the second group reported a pain reduction. Remember that both groups actually received a placebo!

7.6 Observer effect and blinding researchers

What if the researchers assessing the outcomes knew the diet allocated to each patient, and were hoping that the new diet performed better than the refined cereal diet? Could this lead to a misleading conclusion?

Perhaps surprisingly, the researchers' expectations or hopes for how the new diet will perform may (unconsciously) influence how the researchers interact with the individuals, and perhaps (unconsciously) influence the behaviour of the individuals in the study.

This is called the observer effect. (In experiments, it is sometimes called the experimenter effect.)

This could compromise the internal validity of the study.

Definition 7.7 (Observer effect) The observer effect occurs when the researchers (unconsciously) change their behaviour to conform to expectations because they know what values of the explanatory variable apply to the individuals. This may cause the individuals to change their behaviour or reporting also.

The impact of the observer effect can be minimised by blinding the researchers so that they do not know which treatments the individuals are receiving. That is, the people giving the treatment and the people evaluating the treatment do not know what treatment has been given. Instead, a third party can be used.

For example, the researchers may give an assistant two drugs labelled A and B. The assistant then administers the drug and evaluates the participants' response to the treatments. Later, the assistant tells the researchers whether Drug A or Drug B performed better, but only the researchers know what drugs the labels A and B refer to.
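The idea of coded labels can be sketched in code. Everything here is hypothetical: the treatment names, the outcome scores and the variable names are made up for illustration. The point is only that the key linking labels to treatments stays with the researchers until the evaluation is finished.

```python
import random

rng = random.Random(5)  # fixed seed so the sketch is repeatable

# The key is created by, and known only to, the researchers.
treatments = ["new drug", "existing drug"]  # hypothetical treatment names
rng.shuffle(treatments)
key = dict(zip(["A", "B"], treatments))

# The assistant sees only the labels, and records outcomes against them.
# These scores are made-up illustrative values, not real data.
assistant_records = {"A": [4, 5, 3, 4], "B": [2, 3, 3, 2]}
mean_by_label = {label: sum(s) / len(s) for label, s in assistant_records.items()}
better_label = max(mean_by_label, key=mean_by_label.get)

# Only after the evaluation is complete do the researchers unblind the result.
better_treatment = key[better_label]
print(f"Label {better_label} performed better; the researchers decode this as: {better_treatment}")
```

The assistant's part of the code never uses `key`, so the assistant's evaluation cannot be influenced by knowing which drug is which.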

Example 7.17 (Observer effect) In an experimental study173 that examined the impact of an injection to alleviate post-operative umbilical pain, the authors stated:

Although this study was not double-blinded, the postoperative pain scores were gathered by a nurse practitioner who was blinded to the usage of bupivacaine to avoid observer-expectancy bias [i.e., the observer effect].

--- Seo et al.174, p. 392

The observer effect does not just apply to situations where people are used as participants.

Example 7.18 (Observer effect) 'Clever Hans' was a horse that seemed to be able to perform simple mental arithmetic.

After much study, Carl Stumpf realised that the horse was responding to involuntary (and unconscious) cues from the trainer. This was discovered, in part, by using an experiment where the people interacting with the horse were blinded.

The same effect has been observed in narcotic sniffer dogs,175 who may respond to their handlers' unconscious cues.

The observer effect is about the observer unconsciously influencing the individuals; that is, the researchers are not aware that it is occurring.

If the researchers are intentionally influencing the individuals, this is called fraud.

7.7 Describing blinding

Blinding is when those involved in the study do not know certain information about the study.

Those involved in the study may not know:

  • that they are in a study at all;
  • the purpose of the study; and/or
  • which comparison or connection value(s) apply to them.

When participants are blinded to as much as possible, the internal validity of the study is increased. However, when people are the individuals, ethics requirements often mean that they need to know they are in a study, and the purpose of the study.

Different individuals involved in the study can be blinded:

  • A study can blind the participants to knowing what comparison group they are in.
  • A study can blind the researcher to knowing what comparison group the study individuals are in.
  • A study can blind the analysts to knowing what comparison group the individuals are in during analysis.

When as many of those involved in the study are blinded as possible, the internal validity of the study is increased.

If only the participants are blinded, the study is called single blind.

If both the researchers and participants are blinded, the study is called double blind.

If the researchers, participants and the analyst are blinded, the study is called triple blind.

For clarity, we strongly recommend explicitly stating who or what is blinded. Blinding should be considered in all studies, when possible (and it is not always possible).

Blinding of participants does not just apply to people; it is also relevant with animals (Example 7.18 about Clever Hans).

Why might it be necessary to blind the analyst to the treatments being used?

Example 7.19 (Blinding) In a study comparing chest compressions with dominant and non-dominant hands of student paramedics,176 the article states that:

Participants were asked to participate in a study exploring general CPR performance but were blinded to the specific research question at any stage to reduce the chance of performance bias...

--- Cross et al.177, p. 2

Participants could not, however, be blinded to which group they were in (dominant hand on chest; non-dominant hand on chest). In this case, participants were only partially blinded.

Later, the article reports that:

Data were analysed by a biostatistician blinded to group allocation.

--- Cross et al.178, p. 3

This means that the analyst was blinded to the treatments.

Example 7.20 (Double-blinding) In a cropping study comparing yields from modern and traditional cowpea crops in Tanzania, the researchers wanted to use a double-blind study.

To do so:

...it was important that the traditional and modern seed looked exactly the same---the seed types must be indistinguishable in terms of size and color.

While information about seed type may be gradually revealed as the crop matures in the field, this does not invalidate our design because key inputs were already provided.

Since the modern seed was treated with purple powder, we also dusted the traditional type, and clearly communicated this to the farmers---they knew that seed type could not be inferred from the color.

--- Erwin Bulte et al.179, p. 817--818; line breaks added

7.8 Design issues: Overview

In summary, issues to consider when designing a study, when possible, include:

  • Minimising confounding (and lurking variables);
  • Minimising the carry-over effect;
  • Minimising the Hawthorne effect;
  • Minimising the placebo effect;
  • Minimising the observer effect.

Ways to minimise the impact of these have been discussed (Fig. 7.7), but minimising them is not always possible. These effects are important to understand, so studies can be designed to manage or minimise their influence (to maximise internal validity). Understanding these effects also helps ensure that the results and conclusions from our studies are correctly interpreted (that is, noting, for example, how the Hawthorne effect may have influenced the conclusions).

Often, however, some (or all) of these issues cannot be well managed. For instance, individuals often know they are involved in an experimental study (Hawthorne effect). In these cases, the impacts should be minimised as far as possible, and then the likely impact that these issues have on our conclusions discussed. The impact of these issues is often reported as limitations in a journal article (Chap. 9), perhaps as part of the Discussion section.

Example 7.21 (Study limitations) A study of alcohol use in college females reported these limitations of their study:

The present study has several limitations. First, data were collected over 15 years ago [...] Second, only college females were assessed and findings may not generalize to college males or to broader groups of young adults [...] Third, alcohol and caffeine consumption variables were all self-reported...

--- Sydney S. Kelpin et al.180, p. 3

FIGURE 7.7: Design considerations. Note: Lurking variables become confounding variables when measured, observed, assessed or recorded in the study, and then they can be managed. The arrows mean that the design issue can be partially managed by the indicated means

Example 7.22 (Study design) In a study of student paramedics comparing chest compressions with dominant and non-dominant hands,181 as discussed in Example 7.19, the participants were partially blinded: they were blinded to the purpose of the study, but not to which group they were allocated.

The analyst was also blinded to the group allocations.

Later, the article reports that:

...participants were allocated randomly to one of two groups: 'dominant hand on chest' or 'non-dominant hand on chest'. Group allocation was determined by a computer-generated randomisation schedule...

--- Cross et al.182, p. 3

This study used a number of good design features.
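A computer-generated randomisation schedule, like the one mentioned in the quote, can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure the study's authors used: the function name `randomisation_schedule`, the 20 participant IDs and the seed are all hypothetical.

```python
import random


def randomisation_schedule(participant_ids, groups, seed=None):
    """Randomly allocate participants to groups of (nearly) equal size.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # A fixed seed makes the schedule reproducible
    ids = list(participant_ids)
    rng.shuffle(ids)  # Put the participants in a random order
    # Deal the shuffled participants to the groups in turn
    return {pid: groups[i % len(groups)] for i, pid in enumerate(ids)}


# Assumed values: 20 participants, numbered 1 to 20, and an arbitrary seed
groups = ["dominant hand on chest", "non-dominant hand on chest"]
schedule = randomisation_schedule(range(1, 21), groups, seed=2024)

for pid in sorted(schedule):
    print(pid, schedule[pid])
```

Because chance (not the researchers or the participants) decides each allocation, the groups should be similar, on average, in terms of potential confounding variables.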

7.9 Summary

Designing effective experimental studies requires researchers to manage or minimise confounding where possible, by:

  • restricting the study to certain groups;
  • blocking;
  • using special analysis methods; and/or
  • randomly allocating treatments.

Well-designed experimental studies also try to manage:

  • the carry-over effect (for example, using a washout period, or randomly allocating the order of treatments);
  • the Hawthorne effect (for example, by blinding participants to the treatment);
  • the placebo effect (for example, by blinding participants to the treatments and by using controls); and
  • the observer effect (for example, by blinding the researchers to the treatments being applied).
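Randomly allocating the order of treatments, one way of managing the carry-over effect, can also be done by computer. The following Python code is a minimal sketch under assumed values: the function name `allocate_treatment_orders`, the participant labels and the treatment names are hypothetical.

```python
import random


def allocate_treatment_orders(participant_ids, treatments, seed=None):
    """Give each participant every treatment, in an independently random order.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # A fixed seed makes the orders reproducible
    orders = {}
    for pid in participant_ids:
        order = list(treatments)  # Copy, so the original list is unchanged
        rng.shuffle(order)        # Random treatment order for this participant
        orders[pid] = order
    return orders


# Assumed values: four participants, each receiving two treatments
orders = allocate_treatment_orders(
    ["P1", "P2", "P3", "P4"], ["Treatment A", "Treatment B"], seed=1
)

for pid, order in orders.items():
    print(pid, "receives", " then ".join(order))
```

Randomising the order does not remove the carry-over effect for any one participant, but it means that neither treatment is systematically applied first. A washout period between treatments may still be needed.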

7.10 Quick review questions

A study on the bruising of apples183 aimed to determine the relationship between the recorded surface temperature of the apple, and the depth of bruising.

The researchers purposefully hit apples with three different forces (200, 700 and 1200 mJ) to inflict bruises.

The researchers then recorded the depth of the bruising, and recorded the surface temperature at each bruise location.

The study was conducted separately for three different regions of the apple (lower; middle; upper), and each apple was only used once.

  1. The response variable is
  2. The explanatory variable is
  3. What is the best description for the variable 'The location of the bruising'?
  4. True or false: The researchers could minimise the effects of confounding by incorporating potential confounding variables in the analysis.
  5. True or false: The researchers could use random allocation of the treatments to the apples to minimise confounding.
  6. True or false: The carry-over effect is likely to be a big problem in this study.
  7. True or false: The Hawthorne effect is likely to be a big problem in this study.
  8. True or false: The placebo effect is likely to be a big problem in this study.
  9. True or false: The observer effect is likely to be a big problem in this study.

  1. Which of the following statements are true?

    • Experimental studies must use random samples.
    • An experimental study must blind the researchers.
    • An experimental study must blind the participants.
    • Experimental studies must use a control group.
    • In experimental studies, the treatments must be allocated by the researchers.
  2. Extraneous variables are variables that are related to the response variable.
    Which of the following types of variables are special types of extraneous variables?

    • Lurking variables.
    • Explanatory variables.
    • Confounding variables.

  1. Consider a study comparing the average weight loss for patients who do at least 30 minutes of exercise a day (Group A), to patients who do less than 30 minutes of exercise a day (Group B).
    Which of the following are true?

    • The extraneous variable is the amount of exercise per day (in hours).
    • The response variable is the weight loss for each person.
    • The explanatory variable is whether or not the patient performs at least 30 minutes of exercise per day.
    • The response variable is the average weight loss.
    • The explanatory variable is the amount of exercise the patient does per day (in hours).
    • Age is likely to be a lurking variable.
    • Age is an extraneous variable.
    • Age is likely to be a confounding variable.
  2. Which of the following are possible confounding variables?

    • The sex of the patients.
    • The initial weight of the patients.
    • The names of the patients.

7.11 Exercises

Selected answers are available in Sect. D.7.

Exercise 7.1 A scientist is comparing the effects of two types of fertiliser on the yield of tomatoes (based on Mariel Gullian Klanian et al.184). He plants tomato seedlings, fertilises them with Fertiliser I, and later measures the yield of tomatoes. He then immediately plants more tomato seedlings in the same field, fertilises them with Fertiliser II, and measures the yield of tomatoes.

What potential problems can you identify with the study design?

Exercise 7.2 A scientist is expecting that tap water will taste the same as bottled water in a taste test (based on Eric Teillet et al.185). The scientist provides people with a plastic cup of either bottled or tap water, and she asks them to give a rating of the taste on a scale of 1 (terrible) to 5 (fantastic).

What potential problems can you identify with the study design?

Exercise 7.3 Consider this RQ (based on Teillet et al.186):

Among university students, is the taste of tap water different than the taste of bottled water?

This RQ needs some clarification, but you decide to answer this question using an experiment. How would you manage:

  1. Random allocation?
  2. Blinding?
  3. Double blinding?
  4. Finding a control?
  5. Finding a random sample?

Exercise 7.4 In a study of time spent applying sunscreen,187 the aim was to 'determine whether time spent on sunscreen application is related to the amount of sunscreen used' (Heerfordt et al.188, p. 117). The authors state this about the study design:

The volunteers were asked to apply the provided sunscreen [...] the way they would normally do on a sunny day at the beach in Denmark [...] The volunteers wore swimwear during the whole session. No other information was given. Participants applied sunscreen behind a curtain and were not observed during application. Measurements of time and sunscreen weight were made without the subjects' being aware of this.

--- Heerfordt et al.189, p. 118

  1. What are the response and explanatory variables?
  2. The researchers also recorded age, height, weight and body surface area of each participant. Why would they have done this?
  3. The researchers also compared the mean values of the response variable for males and females, and the mean values of the explanatory variable for males and females. Why would they have done this?
  4. What design features are being used in the second quote?