4.2 Standardisation

Standardisation is a very useful and commonly used technique in statistics. In the case of the Normal distribution, it involves converting values from a given Normal distribution into values from the standard normal distribution. Recall that \(X\sim N(\mu, \sigma^2)\), and that when \(\mu = 0\) and \(\sigma = 1\), we have a special case of the normal distribution called the standard normal distribution. By convention, instead of \(X\), we use \(Z\) to denote a random variable that follows this standard normal distribution. That is,

  • \(Z \sim N(0, 1)\).

Let us again consider the continuous random variable \(X\) that denotes the height in cm of university students and is normally distributed such that \(X\sim N(172.38,9.85^2)\). Values from this distribution can be 'standardised' to the standard normal distribution by taking a given value, subtracting the mean, and then dividing by the standard deviation. The result is then called a '\(z\)-score'. The \(z\)-score formula can be summarised as

\[z = \displaystyle \frac{x - \mu}{\sigma},\]

where \(x\) is the value we wish to standardise. Let's look at a few example heights, and what their corresponding \(z\)-scores would be.

  • For a height of 172.38cm, we have \(z = (172.38 - 172.38) / 9.85 = 0 / 9.85 = 0\). This makes sense, because 172.38 is in fact the average height according to this distribution, so its corresponding value in the standard normal distribution is 0, since the standard normal distribution has a mean of 0.
  • For a height of 182.23cm, we have \(z = (182.23 - 172.38) / 9.85 = 9.85 / 9.85 = 1\). This means that 182.23 is exactly one standard deviation above the mean. (Do you know how to check this?)
  • For a height of 165cm, we have \(z = (165 - 172.38) / 9.85 = -7.38 / 9.85 \approx -0.7492\). A \(z\)-score of -0.7492 tells us that the value is below average (because it is negative) but within one standard deviation of the mean (because it is between 0 and -1).
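The \(z\)-score formula and the three worked examples above can be reproduced in a few lines of Python (a minimal sketch; the function name `z_score` is just for illustration):

```python
def z_score(x, mu, sigma):
    """Standardise a value x from N(mu, sigma^2) to the standard normal."""
    return (x - mu) / sigma

# The mean and standard deviation of the heights distribution.
mu, sigma = 172.38, 9.85

print(round(z_score(172.38, mu, sigma), 4))  # 0.0      (the mean itself)
print(round(z_score(182.23, mu, sigma), 4))  # 1.0      (one sd above the mean)
print(round(z_score(165.0, mu, sigma), 4))   # -0.7492  (below average, within one sd)
```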

In general, \(z\)-scores close to zero tell us that the value is close to average. \(z\)-scores larger than 2 or 3 (or smaller than -2 or -3) can be considered more extreme.
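One way to see why values of \(|z|\) beyond 2 or 3 count as extreme is to compute how much of the standard normal distribution lies within \(k\) standard deviations of the mean. This can be sketched with only Python's standard library, using the identity \(P(|Z| \le k) = \operatorname{erf}(k/\sqrt{2})\):

```python
import math

def prob_within(k):
    """P(|Z| <= k) for Z ~ N(0, 1), via the error function."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

So only about 4.6% of values fall more than two standard deviations from the mean, and about 0.3% fall more than three, which is why such \(z\)-scores are regarded as extreme.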

Let's see how the above three examples can be depicted in the picture below:

Hopefully you can see that although the scale of the x-axis differs between the two distributions, the shape of the distribution, and the values' positions relative to each other, stay exactly the same. Standardisation is useful for many different reasons, one of which is that a standardised score (\(z\)-score) immediately gives us an indication of how extreme a particular value may or may not be.
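The claim that standardisation preserves the shape of the distribution and the relative positions of the values can also be checked empirically, by simulating heights and standardising them. A rough sketch in Python (the sample size and seed are arbitrary choices):

```python
import random
import statistics

random.seed(42)
mu, sigma = 172.38, 9.85

# Simulate heights from N(172.38, 9.85^2), then standardise each value.
heights = [random.gauss(mu, sigma) for _ in range(100_000)]
z = [(x - mu) / sigma for x in heights]

# The standardised values have mean approximately 0 and sd approximately 1 ...
print(round(statistics.mean(z), 2), round(statistics.stdev(z), 2))

# ... and the ordering of the values is unchanged: the tallest simulated
# height also has the largest z-score.
print(heights.index(max(heights)) == z.index(max(z)))  # True
```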

Your turn:

  1. Consider a variable \(X\) that is normally distributed with a mean of 172.38 and standard deviation of 9.85. For a value of \(x = 162.53\), what is the corresponding \(z\)-score? (Where relevant, round your answer to 4 decimal places)

  2. Consider a variable \(X\) that is normally distributed with a mean of 172.38 and standard deviation of 9.85. For a value of \(x = 180\), what is the corresponding \(z\)-score? (Where relevant, round your answer to 4 decimal places)

  3. Consider a variable \(X\) that is normally distributed with a mean of 172.38 and standard deviation of 9.85. For a value of \(x = 200\), what is the corresponding \(z\)-score? (Where relevant, round your answer to 4 decimal places)

Answers:

  1. -1
  2. 0.7736
  3. 2.8041
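The answers above can be verified with the same \(z\)-score calculation, for example in Python:

```python
mu, sigma = 172.38, 9.85

# Standardise each exercise value, rounding to 4 decimal places.
for x in (162.53, 180, 200):
    print(x, round((x - mu) / sigma, 4))
# 162.53 -1.0
# 180 0.7736
# 200 2.8041
```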