17 Distributions and models

So far, you have learnt to ask a RQ, identify different ways of obtaining data, design the study, collect the data, describe the data, summarise data graphically and numerically, and understand the decision-making process.

In this chapter, you will learn about distributions and models to describe the distribution of populations and samples. You will learn to:

  • describe distributions.
  • describe populations using normal distributions.
  • use \(z\)-scores to compute probabilities related to normal distributions.
  • use \(z\)-scores to 'work backwards' from probabilities for normal distributions.

17.1 Introduction

In the decision-making process used in statistics, an assumption is made about the population parameter, and then, based on this assumption, the values expected from the sample statistics can be described.

The expectations about the sample statistic are based around how the statistic (such as a sample mean, or a sample proportion, or a sample odds ratio) is distributed: what values it can take in various samples, and how often.

A model is used to describe this sampling distribution. For example, if I deal 15 cards, the statistic could be 'the proportion of red cards in a hand of 15'. The model would describe how often we would see 0 red cards in 15, 1 red card in 15, 2 red cards in 15, ... up to 15 red cards in 15 (Sect. 15.4).

Under certain circumstances, many different statistics have a similarly-shaped distribution: a bell-shaped (or normal) distribution. We now study this distribution, as it often is the basis for describing what values the statistic can be expected to take, based on the assumption about the population that we begin with.

17.2 Distributions: An example

To begin, consider the heights of all Australian adult males. Clearly, the height of all Australian adult males is unknown: no-one has ever, or could ever realistically, measure the height of all Australian adult males. The Australian Bureau of Statistics (ABS), however, takes samples of Australians to compute estimates of the heights and other measurements.

A model could be assumed for the heights of all Australian adult males. This is a theoretical idea that might be a useful description of the heights of Australian adult males in the population. Suppose a model for the heights of Australian adult males is adopted that has:

  • a symmetric distribution,
  • with a mean height of 175 cm, and
  • a standard deviation of 7 cm.

Then, the distribution of the heights of Australian adult males may look like Fig. 17.1. That is, most Australian adult males are between about 168 and 182cm, and very few are taller than 196cm or shorter than 154cm.

FIGURE 17.1: A model for the heights of Australian adult males

This model represents an idealised, or assumed, picture of the histogram of the heights of all Australian adult males in the population. If this model is accurate, the distribution of heights in any sample may be shaped a bit like this, but sampling variation will exist.

Any one sample will look a bit different than this model, but this model captures the general feel of the histogram from many of these samples. For example, see the animation below, where many samples of \(n=100\) men are taken.
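The sampling idea can also be sketched in code. The following is a minimal Python illustration, not part of the original study: it assumes the model above (mean 175cm, standard deviation 7cm), and the seed, the number of samples, and the names `draw_sample` and `samples` are arbitrary choices made for this sketch.

```python
import random
from statistics import mean, stdev

MU, SIGMA = 175, 7   # model parameters from the text (cm)
N = 100              # sample size used in the text

def draw_sample(rng, n=N):
    """Simulate the heights of n men drawn from the normal model."""
    return [rng.gauss(MU, SIGMA) for _ in range(n)]

rng = random.Random(2024)  # arbitrary seed, only so the run is repeatable
samples = [draw_sample(rng) for _ in range(5)]

# Each sample has a slightly different mean and standard deviation:
# this is sampling variation around the model values 175 and 7.
for i, heights in enumerate(samples, start=1):
    print(f"Sample {i}: mean = {mean(heights):.1f} cm, "
          f"sd = {stdev(heights):.1f} cm")
```

Each simulated sample gives a mean near 175cm and a standard deviation near 7cm, but none matches the model exactly.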

The model of heights has approximately a bell-shape: that is, most values are near the average height, but a small number of men are very tall or very short. A bell-shaped distribution is formally called a normal distribution or a normal model. A normal distribution is a way of modelling the population.

A model is a theoretical or ideal concept. A model skeleton isn't 100% accurate (it has wire joins, for example) and is certainly not exactly like your skeleton, yet it suitably approximates reality. Probably none of us has a skeleton exactly like the model, but the model is still useful and helpful.

Likewise, no variable has exactly a normal distribution, but the model is still useful and helpful. The model is a theoretical way of describing the distribution in the population.

17.3 Normal distributions

A suitable model for the heights of all Australian adult males may be described (Fig. 17.1) as having:

  • An approximately normal shape,
  • With a mean height of \(\mu = 175\) cm, and
  • A standard deviation of \(\sigma = 7\) cm.

This model for the heights of Australian adult males is a theoretical idea about the unknown population: it does not represent any particular sample of data. The model can be thought of as an 'average' of the histograms of the data from many samples.

Indeed, if this model turns out to be poor at describing what appears in these many samples, the parameters of the model (that is, the values of \(\mu\) and \(\sigma\)) can be adjusted so the model does describe the sample data well.

In fact, sample evidence suggests that the average height of Australians has been increasing, and so the mean of the model may need to be changed at various times to remain a good model for heights of Australian adult males.

17.4 Standardising (\(z\)-scores)

Since many statistics have a normal distribution (under certain circumstances), the 68--95--99.7 rule can be used to understand the distribution of sample statistics.

Recall that the 68--95--99.7 rule states that, for any normal distribution (Fig. 13.10):

  • 68% of values lie within 1 standard deviation of the mean;
  • 95% of values lie within 2 standard deviations of the mean; and
  • 99.7% of values lie within 3 standard deviations of the mean.

These percentages only depend on how many standard deviations (\(\sigma\)) a value (\(x\)) is from the mean (\(\mu\)). This information can be used to learn about how values are distributed.
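The three percentages in the rule are rounded values. As a rough check, the sketch below uses `NormalDist` from Python's standard `statistics` module to compute the area within \(k\) standard deviations of the mean; the helper name `within` is an assumption made for this illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"Within {k} standard deviation(s): {100 * within(k):.2f}%")
```

The computed areas are close to, but not exactly, 68%, 95% and 99.7%, which is why the rule gives approximate answers only.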

Example 17.1 (The 68--95--99.7 rule) Suppose heights of Australian adult males have a mean of \(\mu = 175\)cm, and a standard deviation of \(\sigma = 7\)cm, and (approximately) follow a normal distribution. Using this model, what proportion of Australian adult men are taller than 182cm?

Drawing the situation is helpful (Fig. 17.2). Notice that \(175 + 7 = 182\)cm is one standard deviation above the mean. We know that 68% of values are within one standard deviation of the mean, so that 32% are outside that range (smaller or larger) (Fig. 17.2). Since the distribution is symmetric, half of this 32% lies in each tail. Hence, 16% are taller than one standard deviation above the mean, so the answer is about 16%. (Another 16% are more than one standard deviation below the mean; that is, less than \(175 - 7 = 168\)cm in height.)

Again, the percentages only depend on how many standard deviations (\(\sigma\)) the value (\(x\)) is from the mean (\(\mu\)), and not the actual values of \(\mu\) and \(\sigma\).

FIGURE 17.2: What proportion of Australian adult males are taller than 182cm?

Example 17.2 (The 68--95--99.7 rule) Suppose heights of Australian adult males have a mean of \(\mu = 175\)cm, and a standard deviation of \(\sigma = 7\)cm, and (approximately) follow a normal distribution. Using this model, what proportion are shorter than 161cm? Again, drawing the situation is helpful (Fig. 17.3).

Since \(175 - (2\times 7) = 161\), then 161cm is two standard deviations below the mean. Since 95% of values are within two standard deviations of the mean, 5% are outside that range (half smaller, half larger; see Fig. 17.3), so that 2.5% are shorter than 161cm. (Another 2.5% are taller than \(175 + 14 = 189\)cm.)

FIGURE 17.3: What proportion of Australian adult males are shorter than 161cm?

Again, the percentages only depend on how many standard deviations (\(\sigma\)) the value (\(x\)) is from the mean (\(\mu\)). The number of standard deviations that an observation is from the mean is called a \(z\)-score.

A \(z\)-score is computed using

\[ z = \frac{ x - \mu}{\sigma}. \] Converting values to \(z\)-scores is called standardising.

Definition 17.1 (z-score) A \(z\)-score measures how many standard deviations a value is from the mean. In symbols:

\[\begin{equation} z = \frac{x - \mu}{\sigma}, \tag{17.1} \end{equation}\] where \(x\) is the value, \(\mu\) is the mean of the distribution, and \(\sigma\) is the standard deviation of the distribution.

Example 17.3 (z-scores) In Example 17.1, the \(z\)-score for a height of 182cm is

\[ z = \frac{x-\mu}{\sigma} = \frac{182 - 175}{7} = 1, \] one standard deviation above the mean.

In Example 17.2, the \(z\)-score for a height of 161cm is

\[ z = \frac{x-\mu}{\sigma} = \frac{161 - 175}{7} = -2, \] two standard deviations below the mean (a negative \(z\)-score means the value is below the mean).
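The standardising calculation in Equation (17.1) can be written as a small function. This is a minimal Python sketch; the function name `z_score` is an assumption made for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("The standard deviation must be positive")
    return (x - mu) / sigma

# Heights model from the text: mean 175 cm, standard deviation 7 cm
print(z_score(182, mu=175, sigma=7))  # 1.0: one standard deviation above the mean
print(z_score(161, mu=175, sigma=7))  # -2.0: two standard deviations below the mean
```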

The \(z\)-score is the number of standard deviations the observation is away from the mean. The \(z\)-score is also called the standardised value or standard score, and is calculated using Equation (17.1). Note that:

  • \(z\)-scores are negative for observations below the mean, and positive for observations above the mean.
  • \(z\)-scores are numbers without units (that is, they are not in kg, or cm, etc.).

Example 17.4 (The 68--95--99.7 rule) Consider the model for the heights of Australian adult males: a normal distribution, mean \(\mu = 175\), standard deviation \(\sigma = 7\) (Fig. 17.1).

Using this model:

  • The mean is zero standard deviations from the mean: \(z = 0\).
  • 168cm and 182cm are one standard deviation from the mean: \(z = -1\) and \(z = 1\) respectively.
  • 161cm and 189cm are two standard deviations from the mean: \(z = -2\) and \(z = 2\) respectively.
  • 154cm and 196cm are three standard deviations from the mean: \(z = -3\) and \(z = 3\) respectively.

17.5 Approximating areas using the 68--95--99.7 rule

Suppose again that heights of Australian adult males have a mean of \(\mu = 175\)cm, and a standard deviation of \(\sigma = 7\)cm, and (approximately) follow a normal distribution (Fig. 17.4).

Example 17.5 (Normal distribution areas) Using this model, what proportion of men are shorter than 160cm?

Again, drawing the situation is helpful (Fig. 17.5).

FIGURE 17.4: The empirical rule and heights of Australian adult males

FIGURE 17.5: What proportion of Australian adult males are shorter than 160cm?

Proceeding as before, we need to ask 'How many standard deviations below the mean is 160cm?' Using Equation (17.1) to compute the \(z\)-score, \(160\)cm corresponds to a \(z\)-score of

\[ z = \frac{160 - 175}{7} = -2.14; \] that is, \(2.14\) standard deviations below the mean.

What percentage of observations are less than this \(z\)-score? This case is not covered by the 68--95--99.7 rule, though we can use the 68--95--99.7 rule to make some rough estimates.

About 2.5% of observations are more than 2 standard deviations below the mean (Example 17.2); that is, about 2.5% of men are shorter than 161cm.

So the percentage of males shorter than 160cm (that is, further into the tail of the distribution) will be less than 2.5%. While we don't know the probability exactly, it will be smaller than 2.5%.

Estimates in this way are crude, but often serviceable. However, better estimates of 'areas under the normal curve' are found using tables compiled for this very purpose.

These tables are in Appendix B.2. 'Percentages' under a normal curve are also called 'areas' under the normal curve. The total area under a normal curve is one (or 100%), since it represents all possible values that could be observed.

We now learn how to use these tables, then come back to Example 17.5.

17.6 Exact areas from normal distributions

Areas under normal distributions can be found using either online tables or hard-copy tables. The online tables are easier to use.

17.6.1 Using the online tables

The online tables (which work differently to the hard-copy tables) can be found in Appendix B.2. To demonstrate their use, consider the percentage of observations smaller than \(z = -2\) (that is, two standard deviations below the mean) in a normal distribution.

The online tables (Appendix B.2) work with two decimal places, so consider the \(z\)-score as \(z = -2.00\).

In the tables, enter the value -2.00 in the search region just under the column labelled z.score (see the animation below). After pressing Enter, the answer is shown in the column headed Area.to.left: the probability of finding a \(z\)-score less than \(-2\) is 0.0228, or about 2.28%.

Using either the hard-copy or online tables gives an answer of about 2.28%. Using the 68--95--99.7 rule, the answer we obtained was \(2.5\)%. Recall that the 68--95--99.7 rule is an approximation only.

17.6.2 Using the hard-copy tables

To demonstrate the use of the hard-copy normal distribution tables, consider again the percentage of observations smaller than \(z = -2\) (that is, two standard deviations below the mean) in a normal distribution.

Like the online tables, the hard-copy tables work with \(z\)-scores to two decimal places, so consider the \(z\)-score as \(z=-2.00\).

On the tables, find \(-2.0\) in the left margin of the table, and find the second decimal place (in this case, 0) in the top margin of the table (Fig. 17.6): where these intersect is the area (or probability) less than the \(z\)-score. So the probability of finding a \(z\)-score less than \(z = -2\) is 0.0228, or about 2.28%. (The online tables work differently.)

FIGURE 17.6: Using the hard-copy tables to compute the probability that \(z\) is less than \(-2\)

The tables give the area to the left of the \(z\)-score that is looked up.
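Software can stand in for the tables. The sketch below is an illustration only, not the book's method: it uses `NormalDist().cdf` from Python's standard `statistics` module, which returns the area to the left of a \(z\)-score, and then rebuilds the row of the hard-copy table labelled \(-2.0\). The variable names are assumptions made for this example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area to the left of z = -2.00 (compare with the table value, 0.0228)
area_left = std_normal.cdf(-2.00)
print(f"{area_left:.4f}")

# Rebuild the table row labelled -2.0: each column supplies the second
# decimal place of z, so the row covers z = -2.00, -2.01, ..., -2.09.
row = [std_normal.cdf(-2.0 - j / 100) for j in range(10)]
print(" ".join(f"{area:.4f}" for area in row))
```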

17.7 Comparing exact and approximate areas

Armed with knowledge of obtaining exact areas, let's return to Example 17.5:

Example 17.6 (Using normal distributions) Suppose heights of Australian adult males have a mean of \(\mu = 175\)cm, and a standard deviation of \(\sigma = 7\)cm, and (approximately) follow a normal distribution. Using this model, what proportion are shorter than 160cm?

The general approach to computing probabilities from normal distributions is:

  • Draw a diagram: Mark on 160 cm (Fig. 17.5).
  • Shade the required region of interest: 'less than 160 cm tall' (Fig. 17.5).
  • Compute the \(z\)-score using Equation (17.1).
  • Use the \(z\) tables in Appendix B.2.
  • Compute the answer.

The number of standard deviations that 160cm is from the mean is found using Equation (17.1):

\[\begin{align*} z &= \frac{x-\mu}{\sigma} \\[3pt] &= \frac{160-175}{7} = \frac{-15}{7} = -2.14. \end{align*}\] That is, 160cm is 2.14 standard deviations below the mean, so use \(z = -2.14\) in the tables. The diagram at the top of the tables reminds us that this is the probability (area) that the value of \(z\) is less than \(z = -2.14\) (Fig. 17.5). The probability of finding an Australian man less than 160cm tall is about 1.6%.
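The same steps can be sketched in Python, assuming the heights model above. The sketch computes the area twice: once with the \(z\)-score rounded to two decimal places (as the tables require), and once without rounding. The variable names are assumptions made for this illustration.

```python
from statistics import NormalDist

heights = NormalDist(mu=175, sigma=7)  # model for heights (cm)

# Table-style calculation: round the z-score to two decimal places first
z = round((160 - 175) / 7, 2)          # -2.14
p_table = NormalDist().cdf(z)          # area to the left of z = -2.14

# Direct calculation, with no rounding of the z-score
p_exact = heights.cdf(160)

print(f"z = {z}")
print(f"Using the rounded z-score: {p_table:.4f}")
print(f"Without rounding:          {p_exact:.4f}")
```

The two answers differ only slightly, and both are about 1.6%.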

More complicated questions can be asked too, as shown in the next section.

17.8 Examples using \(z\)-scores

Example 17.7 (Normal distributions) Dario M. Aedo-Ortiz, Eldon D. Olsen, and Loren D. Kellogg simulated mechanized forest harvesting systems.

As part of their study, they assumed that the specific trees in their study would vary in diameter, with

  • a normal distribution; with
  • a mean of \(\mu = 8.8\) inches; and
  • a standard deviation of \(\sigma = 2.7\) inches.

Using this model, what is the probability that a tree has a diameter greater than 6 inches?

Follow the steps identified earlier:

  • Draw a normal curve, and mark on 6 inches (Fig. 17.7, top panel).
  • Shade the region corresponding to 'greater than 6 inches' (Fig. 17.7, bottom panel).
  • Compute the \(z\)-score using Eq. (17.1). Here, \(x = 6\), \(\mu = 8.8\), \(\sigma = 2.7\), so \(\displaystyle z = (6 - 8.8)/2.7 = -2.8/2.7 = -1.04\) to two decimal places.
  • Use tables: The probability of a tree diameter less than 6 inches is \(0.1492\). (The tables always give the area less than the value of \(z\) that is looked up.)
  • Compute the answer: Since the total area under the normal distribution is one, the probability of a tree diameter greater than 6 inches is \(1 - 0.1492 = 0.8508\), or about 85%.

FIGURE 17.7: What proportion of tree diameters are greater than 6 inches?

The normal-distribution tables in the Appendix always provide the area to the left of the \(z\)-score that is looked up. Drawing a picture of the situation is important: it helps visualise how to get the answer from what the tables give us.

Remember: The total area under the normal distribution is one.
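Because the tables give only the area to the left, an area to the right is found by subtracting from one. A minimal Python sketch of this step for the tree-diameter model in Example 17.7, using `NormalDist` from the standard `statistics` module in place of the tables (the variable names are assumptions for illustration):

```python
from statistics import NormalDist

mu, sigma = 8.8, 2.7                 # tree-diameter model (inches)

z = round((6 - mu) / sigma, 2)       # z-score for 6 inches, to two decimals
area_left = NormalDist().cdf(z)      # what the tables provide
area_right = 1 - area_left           # total area is one, so subtract

print(f"z = {z}")
print(f"P(diameter < 6 inches) is about {area_left:.4f}")
print(f"P(diameter > 6 inches) is about {area_right:.4f}")
```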

Match the diagram in Fig. 17.8 with the meaning for the tree-diameter model (recall: \(\mu=8.8\) inches):

  1. Tree diameters greater than 11 inches.
  2. Tree diameters between 6 and 11 inches.
  3. Tree diameters less than 11 inches.
  4. Tree diameters between 3 and 6 inches.

1: matches B.

2: matches C.

3: matches D.

4: matches A.

FIGURE 17.8: Match the diagram with the description

Example 17.8 (Normal distributions) Using the model for tree diameters in Example 17.7, what is the probability that a tree has a diameter between 6 and 11 inches?

First, draw the situation, and shade 'between 6 and 11 inches' (Fig. 17.9). Then, compute the \(z\)-scores for both tree diameters:

\[\begin{align*} \text{6 inches: } &z = \displaystyle \frac{6 - 8.8}{2.7} = -1.04;\\[6pt] \text{11 inches: } &z = \displaystyle \frac{11 - 8.8}{2.7} = 0.81. \end{align*}\] The tables (Appendix B.2) can then be used to find the area to the left of \(z = -1.04\), and also the area to the left of \(z = 0.81\). However, neither of these provides the area between \(z = -1.04\) and \(z = 0.81\) (Fig. 17.10).

FIGURE 17.9: What proportion of tree diameters are between 6 and 11 inches?

FIGURE 17.10: What proportion of tree diameters are between 6 and 11 inches? The two shaded areas given are what we find by using the tables with \(z=-1.04\) and \(z=0.81\), but neither gives us the area we are seeking

Looking carefully at the areas from the tables and the area sought, that area between the two \(z\)-scores is

\[ 0.7910 - 0.1492 = 0.6418; \] see the animation below. The probability that a tree has a diameter between 6 and 11 inches is about 0.6418, or about 64%.
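The subtraction of the two left-hand areas can be wrapped in a small function. This is a sketch only: the names `z_score` and `area_between` are assumptions, and `NormalDist().cdf` from Python's standard `statistics` module stands in for the tables. The \(z\)-scores are rounded to two decimal places to mimic the tables, so the result may differ from the hand calculation in the fourth decimal place.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

def area_between(lower, upper, mu, sigma):
    """Area between two values: the left area of upper minus the left area of lower."""
    std_normal = NormalDist()
    z_lower = round(z_score(lower, mu, sigma), 2)  # two decimals, as in the tables
    z_upper = round(z_score(upper, mu, sigma), 2)
    return std_normal.cdf(z_upper) - std_normal.cdf(z_lower)

p = area_between(6, 11, mu=8.8, sigma=2.7)
print(f"P(6 < diameter < 11) is about {p:.2f}")
```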

17.9 Unstandardising: Working backwards

Using the model for tree diameters in Example 17.7 again, suppose now the diameters of the smallest 10% of trees need to be identified. What are these diameters?

Example 17.9 (Normal distributions backwards) Consider again the trees study. The tree diameters can be modelled with

  • a normal distribution; with
  • a mean of \(\mu = 8.8\) inches; and
  • a standard deviation of \(\sigma = 2.7\) inches.

Identify the diameters of the smallest 10% of trees.

This is a different problem than before; previously, the tree diameter was known, so a \(z\)-score could be computed, and hence a probability (Fig. 17.11, top panel).

This time, the probability is known, and a tree diameter is sought. That is, working 'backwards' is needed (Fig. 17.11, bottom panel), so the \(z\)-tables need to be used 'backwards' too.

FIGURE 17.11: Working with \(z\)-scores

17.9.1 Using the hard-copy tables

To identify the diameters of the smallest 10% of trees, the \(z\)-score that has an area to the left of 10% (or 0.10) needs to be found (at least, as close as possible to 0.10).

When the \(z\)-scores (in the margins of the tables) were known, the areas were found in the body of the table. If the area (or probability) is known (found in the body of the table), the corresponding \(z\)-score can be found (in the margins of the table), and hence the observation \(x\); see the animation below. The closest area to 10% in the tables is 0.1003, or 10.03%.

17.9.2 Using the online tables

When the area (or probability) is known, special online tables can be used (Appendix B.3). In these tables, enter the area to the left in the search box under Area.to.left, and the corresponding \(z\)-score appears under the z.score column (see the animation below).

Using either the hard-copy or online tables, the appropriate \(z\)-score is \(z = -1.28\); that is, \(1.28\) standard deviations below the mean (Fig. 17.12). Then, the \(z\)-score can be converted to an observation value \(x\) using the unstandardising formula:

\[ x = \mu + z\sigma. \] Using this unstandardising formula:

\[\begin{align*} x &= \mu + (z\times\sigma) \\ &= 8.8 + (-1.28 \times 2.7) = 5.344; \end{align*}\] that is, about 10% of trees have diameters less than about 5.3 inches.

FIGURE 17.12: Tree diameters: The smallest 10%

Definition 17.2 (Unstandardising formula) When the \(z\)-score is known, the corresponding value of the observation \(x\) is

\[\begin{equation} x = \mu + z\sigma. \tag{17.2} \end{equation}\] This is called the unstandardising formula.
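Working backwards can also be sketched in Python. In this illustration, the function name `unstandardise` is an assumption, and `NormalDist().inv_cdf` (from the standard `statistics` module) plays the role of using the tables 'backwards': it returns the \(z\)-score with a given area to its left.

```python
from statistics import NormalDist

def unstandardise(z, mu, sigma):
    """Convert a z-score back to an observation: x = mu + z * sigma."""
    return mu + z * sigma

# Tree-diameter model: mean 8.8 inches, standard deviation 2.7 inches
x_table = unstandardise(-1.28, mu=8.8, sigma=2.7)   # z read from the tables

# Software gives the z-score for an area of exactly 0.10 to the left
z_exact = NormalDist().inv_cdf(0.10)
x_exact = unstandardise(z_exact, mu=8.8, sigma=2.7)

print(f"Using z = -1.28:       x = {x_table:.3f} inches")
print(f"Using z = {z_exact:.4f}:    x = {x_exact:.3f} inches")
```

Both approaches give a diameter of about 5.3 inches; the small difference arises because the tables only list \(z\)-scores to two decimal places.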

Ball bearings labelled as "50mm bearings" actually have diameters that follow a normal distribution with mean 50mm and standard deviation 0.1mm.

The smallest 15% of bearings are too small for sale. What size bearings cannot be sold?

The closest area from the tables is 0.1492, corresponding to \(z = -1.04\). Using the unstandardising formula, \(x = 50 + (-1.04\times 0.1) = 49.896\).

This means that bearings less than about 49.9 mm in diameter cannot be sold.

Example 17.10 (Normal distributions backwards) Using the model for tree diameters in Example 17.7 again, suppose now the diameters of the largest 25% of trees need to be identified. What are these diameters?

The tree diameters can be modelled with

  • a normal distribution; with
  • a mean of \(\mu=8.8\) inches; and
  • a standard deviation of \(\sigma=2.7\) inches.

Again, we need to work 'backwards' (Fig. 17.13, bottom panel), so the \(z\)-tables need to be used 'backwards' too. The largest 25% implies large trees, so we would expect a diameter larger than the mean.

Using a diagram is important (Fig. 17.13): the tables work with the area to the left of the value of interest, which is 75%.

Using either the hard-copy or online tables, the appropriate \(z\)-value is \(z = 0.674\). Then, the \(z\)-score can be converted to an observation value \(x\) using the unstandardising formula:

\[\begin{align*} x &= \mu + (z\times\sigma) \\ &= 8.8 + (0.674 \times 2.7) = 10.621. \end{align*}\] That is, about 25% of trees have diameters larger than about 10.6 inches.
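The key step for an upper-tail question is converting 'the largest 25%' into an area to the left (75%) before working backwards. A minimal Python sketch of this conversion, assuming the tree-diameter model; the function name `upper_tail_cutoff` is hypothetical, and `inv_cdf` from the standard `statistics` module replaces the table look-up.

```python
from statistics import NormalDist

def upper_tail_cutoff(upper_fraction, mu, sigma):
    """Value above which the given fraction of the distribution lies."""
    area_left = 1 - upper_fraction           # the tables work with left areas
    return NormalDist(mu, sigma).inv_cdf(area_left)

cutoff = upper_tail_cutoff(0.25, mu=8.8, sigma=2.7)
print(f"The largest 25% of trees have diameters above about {cutoff:.1f} inches")
```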

FIGURE 17.13: Tree diameters: The largest 25% is the same as the smallest 75%

Example 17.11 (Normal distributions) A study by Pekka Huhtanen, Mohammad Ramin, and E. H. Cabezas-Garcia of methane produced by animals modelled the retention time of food in sheep using a normal distribution, where the model used:

  • the mean retention time as \(\mu = 42.5\) hours, and
  • the standard deviation of the retention time as \(\sigma = 3.68\) hours.

We can draw this normal distribution (Fig. 17.14). We can then apply the 68--95--99.7 rule:

  • About 68% of retention times are between 38.82 and 46.18 hrs;
  • About 95% of retention times are between 35.14 and 49.86 hrs;
  • About 99.7% of retention times are between 31.46 and 53.54 hrs.

Using this model, what proportion of sheep have a retention time less than 40 hours?

A retention time of 40 hours corresponds to a \(z\)-score of:

\[ z = \frac{40 - 42.5}{3.68} = -0.68. \] This is a negative number, since 40 hours is below the mean.

We can then use the normal distribution tables that give us the area to the left of the \(z\)-score that we look up.

The area to the left of \(z = -0.68\) is 0.2483, or about 24.8%: about 24.8% of sheep have a retention time less than 40 hours.


What proportion of sheep have a retention time greater than 48 hours (two days)?

  • A retention time of 48 hours corresponds to what \(z\)-score?
  • Using the normal distribution tables, what is the area to the left of this \(z\)-score?
  • Then, what is the area to the right of this \(z\)-score?

What proportion of sheep have a retention time between 36 and 48 hours?

  • A retention time of 36 hours corresponds to what \(z\)-score?
  • Using the normal distribution tables, what is the area to the left of \(z = -1.77\)?

This is not the area that we are seeking...

From earlier, the area to the left of \(z = 1.49\) is 0.9319. But this is not the area that we are seeking either...

From the two areas that we know, we can find the area that we are seeking:

  • 48 hours corresponds to \(z = 1.49\). The area to the left of this \(z\)-score is 0.9319.
  • 36 hours corresponds to \(z = -1.77\). The area to the left of this \(z\)-score is 0.0384.
  • The difference between these two areas is what we are seeking: \(0.9319 - 0.0384 = 0.8935\).

So the proportion is about 0.894 (or 89.4%).


Consider the 35% of sheep with the shortest retention times. What are these retention times?

  • Will it be greater than, or smaller than, the mean?

We don't know exactly where to draw the retention time that this corresponds to on the diagram; it's just somewhere to the left of the mean...

This time, we know the area to the left, but we do not know the value (or \(z\)-score).

Previously, we knew the retention value (and hence the \(z\)-score), but not the area. This is like a 'backwards problem'.

We need to look up the area to the left in the body of the table, and hence determine the corresponding \(z\)-score.

From the tables, a \(z\)-score of \(z = -0.39\) has an area to the left of 0.3483... which is as close as we can get.

We know the \(z\)-score, so we can find the retention value, using the unstandardising formula: \(x = \mu + (z \times \sigma)\).

  • What is the retention time?

FIGURE 17.14: Retention times

17.10 Summary

A model is a way of theoretically describing the distribution of some quantitative variable in a population. One common model is a normal model or normal distribution, which is a bell-shaped distribution with a theoretical mean \(\mu\) and a theoretical standard deviation \(\sigma\). Probabilities can be computed from normal distributions using \(z\)-scores.

17.11 Quick revision questions

  1. Consider again the model for tree diameters in Example 17.7: a normal distribution with \(\mu = 8.8\) inches, and \(\sigma = 2.7\) inches.

    1. A tree diameter of 7.9 inches corresponds to a \(z\)-score (to two decimal places) of:
    2. The probability that a tree has a diameter less than 7.9 inches is (as a decimal value):
    3. The probability that a tree has a diameter greater than 7.9 inches is (as a decimal value):
    4. A tree diameter of 9 inches corresponds to a \(z\)-score (to two decimal places) of:
    5. The probability that a tree has a diameter less than 9 inches is (as a decimal value):
    6. The probability that a tree has a diameter greater than 9 inches is (as a decimal value):
  2. In a simulation of methods to coat corn seeds (with fertilizer and crop protection chemicals, etc.), Pasha et al. (2016) modelled the seed diameter as having a normal distribution, with mean 7.5mm and standard deviation of 0.225mm.

    1. What is the probability that a seed has a diameter of more than 8mm?
    2. What is the probability that a seed has a diameter less than 7.1mm?
    3. What is the probability that a seed has a diameter between 7.5 and 8mm?
    4. What is the diameter of the smallest 30% of seeds?
    5. What is the diameter of the largest 90% of the seeds?
  3. Are the following statements true or false?

    • The unstandardising formula can be used to compute probabilities.
    • About 68% of observations are within two standard deviations of the mean.
    • Positive \(z\)-scores correspond to values larger than the mean.
    • A \(z\)-score tells us how many standard deviations a value is away from the mean.
    • A \(z\)-score larger than 4 is impossible.
    • A \(z\)-score of zero is located at the mean value.

17.12 Exercises

Selected answers are available in Sect. D.17.

Exercise 17.1 Consider again the study by Aedo-Ortiz, Olsen, and Kellogg, who studied the diameter of trees in certain forests. The tree diameters can be modelled with

  • a normal distribution; with
  • a mean of \(\mu=8.8\) inches; and
  • a standard deviation of \(\sigma=2.7\) inches.

For these trees:

  1. What is the probability that a tree will have a diameter less than 8 inches?
  2. What is the probability that a tree will have a diameter greater than 9 inches?
  3. What is the probability that a tree will have a diameter between 7 and 10 inches?
  4. The largest 15% of trees have what diameters?
  5. The smallest 25% of trees have what diameters?

Exercise 17.2 In a study to help understand factors influencing preterm births, the researchers modelled the gestation length of healthy babies as having a normal distribution with a mean of 40 weeks, and a standard deviation of 1.64 weeks. Using this model:

  1. What proportion of births are longer than 39 weeks (that is, nine months)?
  2. In Australia, a premature birth is defined as a birth occurring before 37 weeks. What proportion of births are expected to be premature?
  3. According to Health Direct, 'Babies born between 32 and 37 weeks may need care in a special care nursery'. What proportion of healthy births would be expected to be born between 32 and 37 weeks gestation?
  4. How long is the gestation length for the longest 5% of pregnancies?
  5. How long is the gestation length for the shortest 5% of pregnancies?

Exercise 17.3 IQ scores are designed to have a mean of 100 and a standard deviation of 15. Mensa is a society for people with a high IQ:

Membership of Mensa is open to persons who have attained a score within the upper two percent of the general population on an approved intelligence test that has been properly administered and supervised.

--- Mensa webpage

What IQ score is needed to join Mensa?

Exercise 17.4 IQ scores are designed to have a mean of 100 and a standard deviation of 15. Jay L. Zagorsky reports that

...Congress requires the Pentagon to reject all military recruits whose IQ is in the bottom 10% of the population...

--- Zagorsky, p. 403

What IQ scores lead to a rejection from the US military?

Exercise 17.5 IQ scores are designed to have a mean of 100 and a standard deviation of 15. Match the diagram in Fig. 17.15 with the meaning.

  1. IQs greater than 110.
  2. IQs between 90 and 115.
  3. IQs less than 110.
  4. IQs greater than 85.

FIGURE 17.15: Match the diagram with the description

Exercise 17.6 IQ scores are designed to have a mean of 100 and a standard deviation of 15. Match the diagram in Fig. 17.16 with the meaning.

  1. The largest 25% of IQ scores.
  2. The smallest 10% of IQ scores.
  3. The largest 70% of IQ scores.
  4. The smallest 60% of IQ scores.

FIGURE 17.16: Match the diagram with the description

Exercise 17.7 A study of the impact of charging electric vehicles (EVs) on electricity demands modelled the time at which people began charging their EVs at home. Based on a survey, they modelled the time at which EVs began charging as having a mean of 5:30pm, with a standard deviation of 2.28 hrs. For this model:

  1. What is the probability that an EV will begin charging after 9pm?
  2. What is the probability that an EV will begin charging before 5pm?
  3. What is the probability that an EV will begin charging between 5pm and 6pm?
  4. 30% of the EVs begin charging after what time?
  5. The earliest 15% of charging begins when?

Hint: This question is much easier if you convert times into 'minutes after midnight'!