3.6 External and internal validity

All studies should be designed to be externally valid (Chap. 5) and internally valid (Chaps. 7 and 8) as far as possible.

A study is externally valid if the results are likely to generalise to other groups in the population, apart from those studied in the sample.

For a study to be externally valid, it first needs to be internally valid. Using a random sample helps ensure external validity. In addition, the use of inclusion and exclusion criteria (Sect. 2.3.1) helps clarify to whom or what the results may apply outside of the sample being studied.

Definition 3.8 (External validity) External validity refers to the ability to generalise the results to other groups in the population, apart from the sample studied.

For a study to be truly externally valid, the sample must be a random sample.

A study is externally valid if the results from the sample studied are likely to apply to the intended population. It does not mean that the results apply more widely than the intended population.

Example 3.9 Suppose the population in a study is USC students. The sample would be the students studied. The study is externally valid if the sample is a random sample from the population of students.

The results will not necessarily apply to Sunshine Coast residents, but this has nothing to do with external validity. External validity concerns how well the sample represents the intended population in the RQ, which is USC students. The study is not concerned with Sunshine Coast residents.
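As a rough illustration of the random sampling described in Example 3.9, the sketch below draws a simple random sample from a list of student ID numbers. The ID list, the sample size, and the function name `draw_simple_random_sample` are all assumptions made for illustration; a real study would need an actual sampling frame (such as an enrolment list).

```python
import random


def draw_simple_random_sample(population_ids, sample_size, seed=None):
    """Return `sample_size` distinct IDs chosen without replacement.

    Every subset of the given size is equally likely to be selected,
    which is what makes this a simple random sample.
    """
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population_ids, sample_size)


# Hypothetical sampling frame: ID numbers for 10 000 students
# (assumed for illustration only).
student_ids = list(range(1, 10001))

sample = draw_simple_random_sample(student_ids, sample_size=200, seed=1)
print(len(sample))  # number of students selected
```

Because the selection depends only on chance, no part of the intended population is systematically favoured, which is why random sampling supports external validity.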

Internal validity refers to how reasonable and logical the results from the study are: that is, the strength of the inferences made from the study.

High internal validity means that changes in the response variable can confidently be related to changes in the explanatory variable in the group that was studied; the possibility of other explanations for changes in the response variable has been minimised.

Definition 3.9 (Internal validity) Internal validity refers to the strength of the inferences that can be made from the sample under study.

One of many threats to internal validity is that the groups being compared may be different to begin with: for example, the group receiving echinacea may be younger, on average, than the group receiving no medication.

To check this, the baseline characteristics of the individuals in the groups can be compared: the groups being compared should be as similar as possible, so that any differences in the outcome cannot be attributed to pre-existing differences between the groups.

Example 3.10 (Baseline characteristics) In a study of treating depression in adults (Danielsson et al. 2014), three treatments were compared: exercise, basic body awareness therapy, or advice.

If any differences between the treatments were found, the researchers needed to be confident that the differences were due to the treatments.

For this reason, the three groups were compared to ensure they were similar in terms of average age, percentage of women, use of anti-depressants, and many other aspects.
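A baseline comparison of this kind can be sketched in a few lines of code. The records below are invented purely for illustration and are not data from Danielsson et al. (2014); the field names and the function `baseline_summary` are likewise assumptions.

```python
from statistics import mean

# Hypothetical participant records (made up for illustration only;
# NOT data from Danielsson et al. 2014).
participants = [
    {"group": "exercise", "age": 40, "female": True},
    {"group": "exercise", "age": 44, "female": False},
    {"group": "body awareness", "age": 39, "female": True},
    {"group": "body awareness", "age": 45, "female": True},
    {"group": "advice", "age": 41, "female": False},
    {"group": "advice", "age": 43, "female": True},
]


def baseline_summary(records):
    """Summarise baseline characteristics separately for each group."""
    groups = {}
    for record in records:
        groups.setdefault(record["group"], []).append(record)

    summary = {}
    for name, members in groups.items():
        summary[name] = {
            "n": len(members),
            "mean_age": mean(m["age"] for m in members),
            "pct_female": 100 * sum(m["female"] for m in members) / len(members),
        }
    return summary


for group, stats in baseline_summary(participants).items():
    print(group, stats)
```

If the summaries are similar across the groups, pre-existing differences in those characteristics are a less plausible explanation for any difference in the outcome. Characteristics that were not recorded cannot be checked this way.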

Internal validity requires a study to be carefully designed; this is discussed at length later (Chaps. 7 and 8). In general, well-designed experimental studies are more likely to be internally valid than observational studies (Fig. 3.8).


FIGURE 3.8: Well-designed true experiments are more likely to have high internal validity


Danielsson L, Papoulias I, Petersson E-L, Carlsson J, Waern M. Exercise or basic body awareness therapy as add-on treatment for major depression: A controlled study. Journal of Affective Disorders. 2014;168:98–106.