31 Tests for one proportion
So far, you have learnt to ask a RQ, design a study, and classify and summarise the data. You have also learnt to construct confidence intervals. In this chapter, you will learn to:
 conduct hypothesis tests for one sample proportion, using a \(z\)-test.
 determine whether the conditions for using these methods apply in a given situation.
31.1 Introduction: rolling dice
In a toy store one day (for my children...), I saw 'loaded dice' for sale. The packaging claimed 'One loaded & one normal'. I bought two packs! However, there was no indication as to which die was loaded. How could I determine which was 'loaded'? I guess I had to roll the dice...
Suppose I selected one die to roll. If that die happened to be the fair die, I'd expect that each face would appear approximately (but not exactly) one-sixth of the time (using classical probability; Sect. 19.4). So, I could roll one die, and see how often a ⚀ actually appeared. Using the decision-making process discussed earlier (Sect. 20.3), I could then decide if that die was the fair die.
I could ask the decision-making RQ:
For this die, is the population proportion of rolls that show a ⚀ equal to \(1/6\)?
31.2 Statistical hypotheses and notation
If the die was fair, about one-sixth of rolls would produce a ⚀ ... but not necessarily exactly one-sixth, due to sampling variation. Sometimes the value of \(\hat{p}\) would be a little smaller than \(1/6\), and sometimes a little larger, even if \(p\) really was \(1/6\).
By initially assuming the population proportion of ones is \(1/6\), the possible values of the sample proportion from all possible rolls of the fair die could be determined; that is, the sampling distribution could be described. Then, reasonably expected values of the sample proportion can be compared to the observed value of \(\hat{p}\) from the single sample.
If the sample proportion of rolls that are ⚀ is not exactly \(1/6\), two explanations exist:
 The population proportion really is \(1/6\), and the sample proportion \(\hat{p}\) is not exactly \(1/6\) due to sampling variation; or
 The population proportion really is not \(1/6\); that is, the sample proportion \(\hat{p}\) is not exactly \(1/6\) because the die is not fair.
These two possible explanations are called statistical hypotheses. If \(p\) is defined as the population proportion of ones, then the hypotheses above are:
 \(H_0\): \(p = 1/6\), called the null hypothesis; and
 \(H_1\): \(p \ne 1/6\), called the alternative hypothesis.
The hypotheses propose values for the unknown population proportion (the parameter \(p\)). Proposing values for the sample proportion (i.e., the statistic \(\hat{p}\)) is silly: we know the observed value of \(\hat{p}\) after rolling the die.
The alternative hypothesis can take different forms, depending on the research question. Here, the RQ is open to the value of \(p\) being smaller or larger than \(1/6\); that is, two possibilities are considered (since the value of \(p\) may be higher or lower than \(1/6\) if the die is loaded). Hence, we write \(p\ne 1/6\), which is called a two-tailed alternative hypothesis. An alternative hypothesis like \(p > 1/6\) or \(p < 1/6\) is a one-tailed hypothesis.
31.3 Describing the sampling distribution
When the population proportion of rolls that are a ⚀ really is \(p = 1/6\), what values of the sample proportion are reasonable to expect, given sampling variation? The answer depends on the sample size. In one roll of a die, rolling a ⚀, and hence finding a sample proportion of \(\hat{p} = 1\), is not unreasonable. However, in \(20\ 000\) rolls, a sample proportion of \(\hat{p} = 1\) would be incredibly unlikely for a fair die.
Hypothesis testing always begins by assuming the null hypothesis is true. Here, that means initially assuming that \(p = 1/6\). In Chap. 24, the sampling distribution of a sample proportion was given when \(p\) is a known value (see Sect. 24.1). Hence, if I decide to use \(n = 100\) rolls of the die, the sampling distribution for this die situation can be described as:
 an approximate normal distribution,
 with mean of \(1/6\),
 with a standard deviation of \(\displaystyle \text{s.e.}(\hat{p}) = \sqrt{\frac{ (1/6) \times \left(1 - (1/6)\right)}{100}} = 0.037267\).
This is how the values of \(\hat{p}\) would vary if \(p\) really was \(1/6\), and if certain conditions are met (Sect. 31.9).
The mean of this distribution is the mean of all possible values of \(\hat{p}\); the value of that mean is \(p\). Similarly, the standard deviation of this distribution is the standard error, denoted \(\text{s.e.}(\hat{p})\), the standard deviation of all possible values of the statistic \(\hat{p}\).
The notation \(\text{s.e.}(\hat{p})\) denotes the standard error of the sample proportion. Its value is the standard deviation of the proportions computed from all possible samples of a given size \(n\).
When computing the standard error for a proportion, take care!
 The formula for a confidence interval uses the sample proportion \(\hat{p}\) (see Eq. (24.4)), since we only have sample information to work with when forming a confidence interval.
 The formula for a hypothesis test uses the population proportion \(p\) from the null hypothesis (see Eq. (24.2)), since hypothesis testing assumes the null hypothesis is true, and hence the value of \(p\) is known.
In both cases, use a proportion in the formula, not a percentage (i.e., \(0.16666...\) rather than \(16.666...\)%). Don't forget to take the square root!
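The two uses of the standard-error formula can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `standard_error` is made up for this example, and the sample proportion of \(0.20\) is a hypothetical value chosen purely to show the confidence-interval version.

```python
import math

def standard_error(proportion: float, n: int) -> float:
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n).

    `proportion` must be a proportion (e.g., 0.1667), not a percentage.
    """
    return math.sqrt(proportion * (1 - proportion) / n)

n = 100

# Hypothesis test: use the value of p assumed in the null hypothesis.
p_null = 1 / 6
se_test = standard_error(p_null, n)

# Confidence interval: use the sample proportion.
# (0.20 is a hypothetical sample proportion, for illustration only.)
p_hat_example = 0.20
se_ci = standard_error(p_hat_example, n)

print(f"s.e. for the test (uses p from H0): {se_test:.6f}")
print(f"s.e. for a CI (uses p-hat):         {se_ci:.6f}")
```

The first value agrees with the standard error of about \(0.0373\) given above for \(n = 100\) rolls.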
A picture of this sampling distribution (Fig. 31.1) shows how the sample proportion varies when \(n = 100\) across all possible samples, simply due to sampling variation, when \(p = 1/6 = 0.1666...\). Values of \(\hat{p}\) between about \(13\)% and \(20\)% would seem to occur reasonably frequently when \(p = 1/6\). Values of \(\hat{p}\) larger than \(0.25\) look unlikely when \(n = 100\); values less than \(0.10\) also appear unlikely, but not impossible. A value above \(0.30\) looks almost impossible.
In my \(100\) rolls of one die, \(41\) showed a ⚀, a sample proportion of \(\hat{p} = 41/100 = 0.41\). From Fig. 31.1 (the values of \(\hat{p}\) from all possible samples), this is practically impossible if the die was fair. What I observed was almost impossible... but I really did observe it. A reasonable conclusion is that the assumption I was making (that the die is fair) is not tenable, nor supported by the evidence (i.e., the data).
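One way to get a feel for this sampling distribution is to simulate it. The sketch below is not part of the original study: it assumes a fair die, uses an arbitrary random seed, and the number of simulated samples (\(10\ 000\)) is an arbitrary choice. It repeatedly 'rolls' a fair die \(100\) times and records the sample proportion of ones each time.

```python
import random

random.seed(1)  # arbitrary seed, so the run is repeatable

n_rolls = 100       # rolls in each sample
n_samples = 10_000  # number of simulated samples (arbitrary choice)

def sample_proportion_of_ones(n: int) -> float:
    """Roll a fair six-sided die n times; return the proportion of ones."""
    ones = sum(1 for _ in range(n) if random.randint(1, 6) == 1)
    return ones / n

p_hats = [sample_proportion_of_ones(n_rolls) for _ in range(n_samples)]

mean_p_hat = sum(p_hats) / n_samples
share_at_least_041 = sum(p >= 0.41 for p in p_hats) / n_samples

print(f"mean of simulated p-hats: {mean_p_hat:.4f}")
print(f"largest simulated p-hat:  {max(p_hats):.2f}")
print(f"share with p-hat >= 0.41: {share_at_least_041}")
```

The simulated proportions should cluster near \(1/6\), and a value as large as \(0.41\) should essentially never appear for a fair die.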
31.4 Computing the value of the test statistic
One way to measure how far the sample proportion \(\hat{p} = 0.41\) is from the population proportion \(p = 1/6\) in \(100\) rolls is to use a \(z\)-score, since the sampling distribution (Fig. 31.1) has an approximate normal distribution.
Since the mean is \(p\) and standard deviation is \(\text{s.e.}(\hat{p})\), the \(z\)-score is
\[\begin{align*}
z
&= \frac{\text{sample statistic} - \text{mean of the distribution}}{\text{standard deviation of the distribution}}\\
&= \frac{\hat{p} - p }{\text{s.e.}(\hat{p})}
= \frac{0.41 - 0.1666...}{0.037267} = 6.53.
\end{align*}\]
In this context, the \(z\)-score is called a test statistic.
It means that the observed sample proportion is more than six standard deviations from the mean, which is highly unusual according to the \(68\)-\(95\)-\(99.7\) rule (or Tables).
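The arithmetic for the test statistic can be checked with a short Python sketch (variable names are illustrative only):

```python
import math

p_null = 1 / 6    # population proportion assumed in the null hypothesis
n = 100           # number of rolls
p_hat = 41 / 100  # observed sample proportion of ones

# Standard error uses the hypothesised p, since H0 is assumed true.
se = math.sqrt(p_null * (1 - p_null) / n)

# z-score: how many standard errors p-hat lies from the hypothesised p.
z = (p_hat - p_null) / se

print(f"s.e. = {se:.6f}")
print(f"z    = {z:.2f}")  # z = 6.53
```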
31.5 Determining \(P\)-values
The value of the \(z\)-score shows that the value of \(\hat{p}\) is highly unusual... but how unusual? How unusual is quantified using a \(P\)-value, which is used widely in scientific research.
\(P\)-values refer to the area more extreme than the calculated \(z\)-score in the normal distribution; that is, in the tails of the distribution. For two-tailed \(P\)-values, the \(P\)-value is the combined area in the lower and upper tails. For one-tailed \(P\)-values, the \(P\)-value is the area in one tail only. Clearly, since the \(P\)-value is a probability, its value is always between \(0\) and \(1\).
\(P\)-values can be approximated using the \(68\)-\(95\)-\(99.7\) rule and a diagram (Sect. 22.5; Sect. 31.5.1), or more precisely using the \(z\)-tables in App. B.1 (Sect. 22.7; Sect. 31.5.2). \(P\)-values are also reported by software for most statistical tests.
31.5.1 Approximating \(P\)-values: the \(68\)-\(95\)-\(99.7\) rule
The \(68\)-\(95\)-\(99.7\) rule can be used to determine approximate \(P\)-values only:
 If the calculated \(z\)-score was \(z = 1\), the two-tailed \(P\)-value would be the shaded area in Fig. 31.2 (left panel): about \(32\)%, based on the \(68\)-\(95\)-\(99.7\) rule. The two-tailed \(P\)-value would be the same if \(z = -1\). The one-tailed \(P\)-value would be the area in one tail: about \(16\)%, based on the \(68\)-\(95\)-\(99.7\) rule.
 If the calculated \(z\)-score was \(z = 2\), the two-tailed \(P\)-value would be the shaded area shown in Fig. 31.2 (right panel): about \(5\)%, based on the \(68\)-\(95\)-\(99.7\) rule. The two-tailed \(P\)-value would be the same if \(z = -2\). The one-tailed \(P\)-value would be the area in one tail: about \(2.5\)%, based on the \(68\)-\(95\)-\(99.7\) rule.
If the \(z\)-score is a little larger than \(z = 1\), say \(z = 1.2\), then the tail area will be a little smaller than the tail area when \(z = 1\) (Fig. 31.3, left panel). The two-tailed \(P\)-value is a little smaller than \(0.32\).
Similarly, when the \(z\)-score is a bit smaller than \(z = 2\), say \(z = 1.9\), the tail area will be a little larger than the tail area when \(z = 2\) (Fig. 31.3, right panel). The two-tailed \(P\)-value is a little larger than \(0.05\).
31.5.2 More precise \(P\)-values: using tables
Using the tables of areas under normal distributions (Appendix B.1), more precise \(P\)-values can be found using the ideas from Sect. 22.6. For instance (see Fig. 31.3):
 For \(z = 1.2\): the area to the left of \(z = -1.2\) is \(0.1151\), and the area to the right of \(z = 1.2\) is \(0.1151\), so the two-tailed \(P\)-value is \(0.1151 + 0.1151 = 0.2302\). This is a little smaller than \(0.32\), as estimated above.
 For \(z = 1.9\): the area to the left of \(z = -1.9\) is \(0.0287\), and the area to the right of \(z = 1.9\) is \(0.0287\), so the two-tailed \(P\)-value is \(0.0287 + 0.0287 = 0.0574\). This is a little larger than \(0.05\), as estimated above.
In this die-rolling example, where the \(z\)-score is \(6.53\), the tail area is very small (using Appendix B.1), and zero to four decimal places (Fig. 31.1). Since \(P\)-values are never exactly zero, we write \(P < 0.001\) (that is, the \(P\)-value is less than \(0.001\)).
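Software finds these tail areas directly. As a minimal sketch (using `NormalDist` from Python's standard `statistics` module; the helper name `two_tailed_p` is made up for this example), the two-tailed \(P\)-values for the \(z\)-scores discussed above could be computed as:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def two_tailed_p(z: float) -> float:
    """Combined area in both tails beyond |z| under the standard normal curve."""
    return 2 * standard_normal.cdf(-abs(z))

for z in (1.0, 1.2, 1.9, 2.0, 6.53):
    print(f"z = {z:>4}: two-tailed P = {two_tailed_p(z):.4g}")
```

The values for \(z = 1.2\) and \(z = 1.9\) should agree with the table-based values above to about three decimal places; small differences arise because the tables are rounded. A one-tailed \(P\)-value would be half the two-tailed value.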
31.6 Making decisions with \(P\)-values
A \(P\)-value tells us the probability of observing the sample statistic (or one even more extreme), assuming the null hypothesis is true. In the die-rolling example, the \(P\)-value is the probability of observing the value of \(\hat{p} = 0.41\) (or more extreme), just through sampling variation (chance) if \(p = 1/6\). Since the \(P\)-value is a probability (of something quite specific), it is a value between \(0\) and \(1\). Then:
 'Big' \(P\)-values mean that the sample statistic (i.e., \(\hat{p}\)) could reasonably have occurred through sampling variation in one of the many possible samples, if the assumption made about the parameter (stated in \(H_0\)) was true: The data do not contradict the assumption in \(H_0\).
 'Small' \(P\)-values mean that the sample statistic (i.e., \(\hat{p}\)) is unlikely to have occurred through sampling variation in one of the many possible samples, if the assumption made about the parameter (stated in \(H_0\)) was true: The data do contradict the assumption in \(H_0\).
What is meant by 'small' and 'big' in this context? This is arbitrary: no definitive rules exist. A \(P\)-value smaller than \(1\)% (that is, smaller than \(0.01\)) is usually considered 'small', and a \(P\)-value larger than \(10\)% (that is, larger than \(0.10\)) is usually considered 'big'. Between the values of \(1\)% and \(10\)% is often a 'grey area'. Commonly, however, a \(P\)-value less than \(0.05\) is considered 'small'.
In this die-rolling example, where the \(P\)-value is very small, the data contradict the null hypothesis (that \(p = 1/6\)), suggesting that the die is probably not fair.
Be careful interpreting the results! We cannot be sure that the die is unfair. A small \(P\)-value is not proof that the die is loaded. The die may be fair but, due to sampling variation, the sample we observed simply produced an unusually high proportion of ⚀ rolls.
Hence, the result is interpreted as 'there is evidence that the die is unfair'. The onus is on the data to refute the null hypothesis, the initial assumption.
31.7 Writing conclusions
In \(100\) rolls of the other die, I found a ⚀ on \(15\) rolls, so that \(\hat{p} = 0.15\). Following the procedures above (check!) and using the same hypotheses, \(z = -0.45\) and (using tables) the two-tailed \(P\)-value is \(2\times 0.3264 = 0.6528\). This means that the sample result was not unusual if \(p = 1/6\).
Be careful interpreting the results! A large \(P\)-value does not necessarily mean that the die is fair! It only means that the proportion of rolls that produce a ⚀ is not unusual... but perhaps the die is loaded in some other way (e.g., to produce more-than-expected numbers of rolls showing a different face).
Be careful interpreting the results! A large \(P\)-value does not necessarily mean that the die is fair! The die may indeed be loaded to produce larger-than-expected numbers of ⚀ rolls... but, due to sampling variation, the sample we observed simply did not provide evidence to make that conclusion.
Hence, the result is interpreted as 'there is no evidence that the die is unfair'. The onus is on the data (i.e., evidence) to refute the assumption made in the null hypothesis.
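The calculation for this second die can be checked with a short Python sketch (names are illustrative; `NormalDist` is from Python's standard `statistics` module). Because software does not round the \(z\)-score to two decimal places before finding the tail area, its \(P\)-value differs slightly from the table-based value.

```python
import math
from statistics import NormalDist

p_null = 1 / 6  # population proportion assumed in the null hypothesis
n = 100         # number of rolls of the second die
ones = 15       # rolls showing a one

p_hat = ones / n
se = math.sqrt(p_null * (1 - p_null) / n)
z = (p_hat - p_null) / se

# Two-tailed P-value: combined area in both tails beyond |z|.
p_value = 2 * NormalDist().cdf(-abs(z))

print(f"z = {z:.2f}")
print(f"two-tailed P = {p_value:.3f}")
```

Either way, the \(P\)-value is about \(0.65\): large, so the data do not contradict the assumption that \(p = 1/6\).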
In general, communicating the results of any hypothesis test requires:
 An answer to the RQ, worded in terms of how much evidence exists to support the alternative hypothesis.
 A summary of the evidence used to reach that conclusion (such as the \(z\)-score and \(P\)-value, including if the \(P\)-value is one- or two-tailed).
 Sample summary information, including a CI (see Chap. 24), summarising the data used to make the decision.
So for the die-rolling example, write:
The sample provides very strong evidence (\(z = 6.53\); two-tailed \(P < 0.001\)) that the proportion of ones is not \(1/6\) (\(\hat{p} = 0.41\); approx. \(95\)% CI: \(0.312\) to \(0.508\); \(n = 100\) rolls) in the population.
The components are:
 An answer to the RQ: 'The sample provides very strong evidence... that the population proportion is not \(1/6\)'; notice the wording states how much evidence exists in the sample to support the alternative hypothesis.
 The evidence used to reach the conclusion: '\(z = 6.53\); two-tailed \(P < 0.001\)'.
 Sample summary information (including a CI).
Since the null hypothesis is initially assumed to be true, the onus is on the evidence to refute the null hypothesis. Hence, conclusions are worded in terms of how strongly the evidence (i.e., sample data) supports the alternative hypothesis.
The alternative hypothesis may or may not be true... but we report how much evidence (data) supports the alternative hypothesis.
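The approximate \(95\)% CI reported in the conclusion for the first die can be reproduced with the sketch below. It assumes the CI is formed as \(\hat{p} \pm 2\times\text{s.e.}(\hat{p})\), with the multiplier of \(2\) taken from the \(68\)-\(95\)-\(99.7\) rule; this assumption is consistent with the reported interval of \(0.312\) to \(0.508\). Note that the standard error here uses \(\hat{p}\), not the hypothesised \(p\).

```python
import math

n = 100
p_hat = 41 / 100  # sample proportion of ones for the first die

# For a CI, the standard error uses the sample proportion p-hat.
se_ci = math.sqrt(p_hat * (1 - p_hat) / n)

multiplier = 2  # approximate 95% multiplier (assumed; from the 68-95-99.7 rule)
lower = p_hat - multiplier * se_ci
upper = p_hat + multiplier * se_ci

print(f"approx. 95% CI: {lower:.3f} to {upper:.3f}")
```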
31.8 Process overview
Let's recap the decision-making process, in this context about rolling a ⚀:
 Assumption: Write the null hypothesis and alternative hypothesis about the parameter (based on the RQ), where \(p\) is the population proportion of rolls that are ⚀:
 \(H_0\): \(p = 1/6\), and
 \(H_1\): \(p \ne 1/6\) (this is a two-tailed alternative hypothesis).
 Expectation: The sampling distribution describes what values to reasonably expect from the sample statistic across all possible samples, if the null hypothesis is true. Under certain circumstances, the sample proportions will vary with an approximate normal distribution around a mean of \(p = 1/6\) with a standard deviation of \(\text{s.e.}(\hat{p}) = 0.0372678\).
 Observation: Compute the \(z\)-score: \(z = 6.53\), a measure of the distance between the assumed population value, and the observed sample value.
 Decision: Determine if the data are consistent with the assumption, by computing the \(P\)-value. Here, the \(P\)-value is (much) less than \(0.001\). The \(P\)-value can be computed by software, or approximated using the \(68\)-\(95\)-\(99.7\) rule. The conclusion is that very strong evidence exists that \(p\) is not \(1/6\).
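The four steps above can be gathered into one small function. This is a sketch only, not a replacement for statistical software: the function name `one_proportion_z_test` and its return format are invented for this illustration, and it handles only the two-tailed alternative used in this chapter's die example.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes: int, n: int, p_null: float) -> dict:
    """Two-tailed z-test for one proportion (normal approximation)."""
    # Expectation: standard error of p-hat, assuming H0 (p = p_null) is true.
    se = math.sqrt(p_null * (1 - p_null) / n)
    # Observation: sample proportion, and its distance from p_null in s.e. units.
    p_hat = successes / n
    z = (p_hat - p_null) / se
    # Decision: two-tailed P-value is the combined area in both tails.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return {"p_hat": p_hat, "se": se, "z": z, "p_value": p_value}

# Assumption: H0 is p = 1/6; H1 is p != 1/6 (two-tailed).
result = one_proportion_z_test(successes=41, n=100, p_null=1 / 6)
print(f"p-hat = {result['p_hat']:.2f}")
print(f"s.e.  = {result['se']:.7f}")
print(f"z     = {result['z']:.2f}")
print(f"P < 0.001? {result['p_value'] < 0.001}")
```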
31.9 Statistical validity conditions
All hypothesis tests have underlying conditions to be met so that the results are statistically valid; that is, \(P\)-values can be found accurately because the sampling distribution is an approximate normal distribution. For a hypothesis test for one proportion, these conditions are similar to those for the CI for one proportion (Sect. 24.6).
The statistical validity condition for a test for a single proportion is that the expected number of individuals in the group of interest (i.e., \(n\times p\)) and in the group not of interest (i.e., \(n\times (1 - p)\)) both exceed five; that is:
 \(n\times p > 5\), and \(n\times (1 - p) > 5\).
The value of \(5\) is a rough figure, and some books give other values (such as \(10\) or \(15\)). This condition ensures that the sampling distribution of the sample proportions has an approximate normal distribution (so that, for example, the \(68\)-\(95\)-\(99.7\) rule can be used).
Example 31.1 (Statistical validity) The hypothesis test regarding the dice is statistically valid, since \(n\times p = 100 \times (1/6) = 16.666\dots\) and \(n\times (1 - p) = 83.333\dots\), so both comfortably exceed five.
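The condition is easy to check in code. In this sketch, the function name `is_statistically_valid` is made up for illustration, the threshold of \(5\) follows the rough figure used above, and the final check (\(n = 20\), \(p = 0.1\)) is a hypothetical case included only to show a failure.

```python
def is_statistically_valid(n: int, p_null: float, threshold: float = 5) -> bool:
    """Check that n * p and n * (1 - p) both exceed the threshold.

    p_null is the proportion stated in the null hypothesis.
    """
    return n * p_null > threshold and n * (1 - p_null) > threshold

# Die example: n * p = 16.67 and n * (1 - p) = 83.33, so the condition holds.
print(is_statistically_valid(n=100, p_null=1 / 6))

# Hypothetical small study: n * p = 2, so the condition fails.
print(is_statistically_valid(n=20, p_null=0.1))
```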
31.10 Example: dominance of birds
A study (Barve and Dhondt 2017) compared two types of birds (male green-backed tits; male cinereous tits) to see which was more behaviourally dominant over winter. If the species were equally dominant, then about \(50\)% of the interactions would be won by each species (i.e., \(p = 0.50\)). However, in the \(45\) interactions observed between the two species, green-backed tits won \(37\) of these interactions (i.e., \(\hat{p} = 37/45 = 0.82222\)).
Of course, every sample of \(45\) interactions would produce a different sample proportion, so the difference between this sample proportion and \(p = 0.5\) could be due to sampling variation.
To test if the proportion of interactions were equally shared, the hypotheses are:
\[
\text{$H_0$: } p = 0.5\quad\text{and}\quad\text{$H_1$: } p \ne 0.5 \text{ (two-tailed)}.
\]
The test will be statistically valid, since \(n\times p = 45\times 0.5 = 22.5\) and \(n\times (1 - p) = 22.5\); both exceed five.
The standard error is
\[
\text{s.e.}(\hat{p})
= \sqrt{\frac{p (1 - p)}{n}}
= \sqrt{\frac{0.50 \times (1 - 0.50)}{45}}
= 0.0745356....
\]
Then, the value of the test statistic is:
\[
z
= \frac{\hat{p} - p}{\text{s.e.}(\hat{p})}
= \frac{0.82222 - 0.50}{0.0745356}
= 4.323.
\]
This is a very large \(z\)-score, so the \(P\)-value will be very small, using the \(68\)-\(95\)-\(99.7\) rule or tables.
The \(95\)% CI for the proportion requires the standard error computed using \(\hat{p}\):
\[
\text{s.e.}(\hat{p})
= \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}}
= \sqrt{\frac{0.82222 \times (1 - 0.82222)}{45}}
= 0.05699...
\]
An approximate \(95\)% CI is from \(0.708\) to \(0.936\).
We write:
There is very strong evidence in the sample (\(P < 0.001\); \(z = 4.323\)) that the interactions were not won equally between each species (\(\hat{p} = 0.8222\) won by green-backed tits; \(n = 45\); approximate \(95\)% CI: \(0.708\) to \(0.936\)) in the population.
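The numbers in this example can be reproduced with the following sketch. The variable names are illustrative, and the CI multiplier of \(2\) is an assumption (the approximate \(95\)% multiplier from the \(68\)-\(95\)-\(99.7\) rule) that is consistent with the interval reported above. Notice the two different standard errors: the test uses the hypothesised \(p\), while the CI uses \(\hat{p}\).

```python
import math
from statistics import NormalDist

n = 45        # interactions observed
wins = 37     # interactions won by green-backed tits
p_null = 0.5  # H0: interactions are won equally by each species

p_hat = wins / n

# Statistical validity: n * p and n * (1 - p) should both exceed five.
valid = n * p_null > 5 and n * (1 - p_null) > 5

# Hypothesis test: standard error uses the hypothesised p.
se_test = math.sqrt(p_null * (1 - p_null) / n)
z = (p_hat - p_null) / se_test
p_value = 2 * NormalDist().cdf(-abs(z))  # two-tailed

# Confidence interval: standard error uses p-hat; multiplier of 2 is assumed.
se_ci = math.sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - 2 * se_ci, p_hat + 2 * se_ci

print(f"valid: {valid}")
print(f"z = {z:.3f}; P < 0.001? {p_value < 0.001}")
print(f"approx. 95% CI: {lower:.3f} to {upper:.3f}")
```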
31.11 Chapter summary
To test a hypothesis about a population proportion \(p\):
 Initially assume the value of \(p\) in the null hypothesis to be true.
 Then, describe the sampling distribution, which describes what to expect from the sample statistic across all possible samples, based on this assumption: under certain statistical validity conditions, the sample proportion varies with:
 an approximate normal distribution,
 centered around the hypothesised value of \(p\),
 with a standard deviation of \(\displaystyle \text{s.e.}(\hat{p}) = \sqrt{\frac{p (1 - p)}{n}}\).
 The observations are summarised, and the value of the test statistic computed:
\[ z = \frac{ \hat{p} - p}{\text{s.e.}(\hat{p})}, \] where \(p\) is the hypothesised value given in the null hypothesis. An approximate \(P\)-value can be estimated using the \(68\)-\(95\)-\(99.7\) rule, or using tables.
31.12 Quick review questions
A study of diseases in Native Americans (Kizer et al. 2006) found \(381\) obese or overweight patients among \(449\) patients. In the USA general population, the percentage obese or overweight is \(65\)%. The researchers wanted to determine if the percentage of obese/overweight Native Americans was greater than that of the general population.
 True or false: The population proportion of overweight/obese Native Americans is \(0.65\).
 True or false: The sample size is \(n = 381\).
 The sample proportion \(\hat{p}\) is (to four decimal places):
 True or false: The null hypothesis is \(H_0\): \(p = 0.65\).
 True or false: The alternative hypothesis is one-tailed.
 True or false: In a one-sample test of proportion, the \(z\)-score is always large.
 For this test, the computed \(z\)-score is (to two decimal places):
 True or false? We always accept the null hypothesis.
31.13 Exercises
Selected answers are available in App. E.
Exercise 31.1 The study of herbal medicines is complicated, as blinding subjects is difficult: placebos are often easily identifiable by eye, by taste, or by smell.
One study (Loyeung et al. 2018) examined if subjects could identify potential placebos at a better rate than just guessing. The \(81\) subjects were each presented with a choice of five different supplements, four of which were placebos. Subjects were asked to select which one was the legitimate herbal supplement based on the taste. Of these, \(50\) correctly selected the true herbal supplement.
 If the subjects were selecting the true herbal supplement randomly, what proportion of subjects would be expected to select the correct supplement as the true herbal medicine?
 Write the hypotheses for addressing the aims of the study.
 Is this a one- or two-tailed test? Explain.
 Sketch the sampling distribution of the sample proportion, assuming the null hypothesis is correct.
 Is there evidence that people can identify the true supplement by taste?
Exercise 31.2 A study of the measles-rubella vaccination in Korea (Kim et al. 2004) wished to compare the proportion of children with measles antibodies to the World Health Organization (WHO) target proportion (for children aged \(5\) to \(9\) years old: \(10\)%).
The aim of the study was to test if the proportion of Korean children with the measles antibody in the population was \(10\)% or better (lower). In the study, \(55\) children out of \(972\) had the antibody present.
 Compute the sample proportion \(\hat{p}\) of children with measles antibodies.
 Write the hypotheses for the test. Is the test one- or two-tailed?
 Compute the standard error for the test.
 Compute the \(z\)-score and determine the \(P\)-value.
 Write a conclusion.
 Are the statistical validity conditions satisfied?
Exercise 31.3 In a study of western saw-shelled turtles (Streeting et al. 2022), eggs were incubated at \(27^\circ\)C, and \(29\) males and \(44\) females hatched. Are the proportions of male and female turtles that hatch at this temperature equal?
Exercise 31.4 In the 2019/2020 English Premier League (EPL), the home team won \(91\) out of \(208\) games, while the away team won \(67\). (\(50\) games were draws.) (Data from: https://sportsstatistics.com/sportsdata/soccerdatasets/) Ignoring draws, is there evidence of a home-side advantage (i.e., the home-side winning percentage is greater than \(50\)%)?
Exercise 31.5 In a study to increase activity in library users (Maeda 2013), pedal machines were introduced on the first floor of Joyner Library for use by students at East Carolina University, where \(60.2\)% of all students were females. Students were observed using the machines on \(589\) occasions, of which \(295\) were by females.
Is there evidence that the proportion of female users of the machines was lower than the overall female proportion at the university? What would you conclude?
Exercise 31.6 In a 1995 study (Koenen 1995), \(88\) of the \(357\) visitors to Las Vegas casinos were smokers. At the time, \(25.5\)% of the general U.S. population were smokers (based on data from the U.S. National Center for Health Statistics). Are casino-goers just as likely to be smokers as the general U.S. population?
Exercise 31.7 Researchers developed a gluten-free pasta made from breadfruit (Nochera and Ragone 2019). In the study sample, \(57\) of the \(71\) participants stated that they liked the pasta. Do the researchers have sufficient evidence to claim that the 'majority of people like breadfruit pasta'?
Exercise 31.8 A study of black spiny-tailed iguanas in Florida (an invasive species) compared the snout-vent length (SVL) for iguanas of various sizes (Avery et al. 2014). Of the \(275\) iguanas with a SVL between \(100\) and \(149\) mm, \(146\) were female.
Assuming female and male iguanas were equally present in the population, is there evidence that female and male iguanas were equally likely to be found with SVL in this range?
Exercise 31.9 Carpal Tunnel Syndrome (CTS) is a painful condition in the wrists. A study (Boltuch et al. 2020) was interested in whether 'a relationship exists between the palmaris tendon [and] carpal tunnel syndrome (CTS)' (Boltuch et al. (2020), p. 493). The palmaris longus (PL) tendon is visually absent in about \(15\)% of the population. The researchers found PL was visually absent in \(33\) of \(516\) CTS wrists in their sample.
Is there evidence to suggest that rate of PL absence is different in CTS cases?
Exercise 31.10 In a study of resistance of some commercial corn varieties to the European corn borer (Siegfried et al. 2014), borers were collected from corn in Iowa and Nebraska.
Researchers aimed to estimate the frequency of resistance to the toxin in the corn. By mating borers collected from the field with various resistant laboratory individuals, they could determine what proportion of resistant individuals to expect in the second-generation offspring. In one study of \(n = 172\) second-generation individuals, \(24\) were found to be resistant. The expectation was that \(1\)-in-\(16\) would be resistant if the field borers were resistant.
Perform a hypothesis test to determine if the data suggest that the borers were resistant (that is, if the population proportion is \(1/16\)) as expected.
Exercise 31.11 In a study of streetlight preferences of drivers (Davidovic et al. 2019), drivers were asked to conduct a series of manoeuvres under \(3000\)K LED light and then under \(4000\)K LED lights. They were then asked to decide which streetlight they preferred.
Out of the \(52\) subjects, \(29\) preferred the \(3000\)K LED lights. Is there evidence that the choice between the two streetlights is random, or is there evidence of a preference for one over the other?
Exercise 31.12 A study (Vanstreels et al. 2013) of Magellanic penguins found dead or stranded on the southern Brazilian coast included \(73\) adult penguins. Of these, \(47\) were female.
Assuming female and male penguins were equally present in the population, we would expect about half the dead or stranded penguins to be female. Is this what the data suggest?