32 Tests for comparing odds
So far, you have learnt to ask a RQ, identify different ways of obtaining data, design the study, collect the data, describe the data, summarise data graphically and numerically, understand the tools of inference, and form confidence intervals.
In this chapter, you will learn about hypothesis tests for odds ratios. You will learn to:
- conduct hypothesis tests for an OR (i.e., comparing two proportions, or comparing two odds) using chi-square tests, based on jamovi and SPSS output.
- determine whether the conditions for using these methods apply in a given situation.

32.1 Introduction: Meals on-campus

In Sect. 25.1 a study496 was introduced to examine the eating habits of university students.
Researchers cross-classified \(n=183\) students into groups according to two qualitative variables:
- Where they live: With their parents, or not with their parents;
- Whether they eat most of their meals off-campus, or most of their meals on-campus.
| | Lives with parents | Doesn't live with parents | Total |
|---|---|---|---|
| Most off-campus | 52 | 105 | 157 |
| Most on-campus | 2 | 24 | 26 |
| Total | 54 | 129 | 183 |
Since both variables observed on each student (the unit of analysis) are qualitative, means are not appropriate. However, the data can be compiled into a two-way table of counts (Table 32.1).
Since both qualitative variables have two levels, the table is a \(2\times 2\) table. A graphical summary is shown in Fig. 25.1, and a numerical summary in Table 32.2. (The details of the computations appear in Sect. 25.1).
| | Odds of having most meals off-campus | Percentage having most meals off-campus | Sample size |
|---|---|---|---|
| Living with parents | 26 | 96.3 | 54 |
| Not living with parents | 4.375 | 81.4 | 129 |
| Odds ratio | 5.943 | | |
The parameter is the population OR, comparing the odds of eating most meals off-campus for students living with their parents to students not living with their parents.
Understanding how software computes the odds ratio is very important for understanding the output.
In jamovi and SPSS, the odds ratio can be interpreted in either of these two ways (i.e., both ways are correct):
- The odds are the odds of eating most meals off-campus (Row 1 of Table 32.1). Then, the odds ratio compares these odds for students living with their parents (Column 1 of Table 32.1) to those not living with their parents (Column 2 of Table 32.1). That is, the odds are \(52/2= 26\) (for those living with parents) and \(105/24 = 4.375\) (for those not living with parents), so the OR is then \(26/4.375 = 5.943\), as in the output (jamovi: Fig. 32.1; SPSS: Fig. 32.2).
- The odds are the odds of living with parents (Column 1 of Table 32.1). Then, the odds ratio compares these odds for students eating most meals off-campus (Row 1 of Table 32.1) to the odds for students eating most meals on-campus (Row 2 of Table 32.1). That is, the odds of living with parents are \(52/105= 0.49524\) (for those eating most meals off-campus) and \(2/24 = 0.083333\) (for those eating most meals on-campus), so the OR is then \(0.49524/0.083333 = 5.943\), as in the output (jamovi: Fig. 32.1; SPSS: Fig. 32.2).
In other words, the odds and odds ratios are relative to the first row or first column.
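The two interpretations can be checked with a few lines of arithmetic. The following is a minimal sketch in Python (not part of the jamovi or SPSS workflow); the variable names are illustrative only, and the counts are those in Table 32.1.

```python
# Observed counts from Table 32.1
# Rows: most meals off-campus, most meals on-campus
# Columns: lives with parents, doesn't live with parents
observed = [
    [52, 105],
    [2, 24],
]

# Interpretation 1: odds of eating most meals off-campus, within each column
odds_with_parents = observed[0][0] / observed[1][0]       # 52 / 2
odds_without_parents = observed[0][1] / observed[1][1]    # 105 / 24
or_by_column = odds_with_parents / odds_without_parents

# Interpretation 2: odds of living with parents, within each row
odds_off_campus = observed[0][0] / observed[0][1]         # 52 / 105
odds_on_campus = observed[1][0] / observed[1][1]          # 2 / 24
or_by_row = odds_off_campus / odds_on_campus

print(round(or_by_column, 3), round(or_by_row, 3))
```

Both calculations give the same odds ratio, which is why either interpretation of the software output is correct.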
Unlike the previous decision-making RQs, this RQ does not concern means. Instead, the RQ can be written in terms of comparing proportions, odds, or odds ratios.
For reasons that we can't delve into, usually the odds ratio (OR) is used as the parameter. One important reason is that software produces output related to testing the OR. Using the OR, the RQ could be written as
Is the population odds ratio of eating most meals off-campus, comparing students who live with their parents to students not living with their parents, equal to one?
Alternatively, and probably easier to understand, the RQ can be written in terms of comparing the odds in the two groups explicitly:
Are the population odds of students eating most meals off-campus the same for students living with their parents and for students not living with their parents?
The RQ can also be worded as comparing the percentages (or proportions) of students eating meals off-campus in each group. This is equivalent to the RQs above, but is not directly related to the software output, which works with odds ratios.
Another alternative, which sounds less direct but is useful for two-way tables larger than \(2\times 2\) (see Sect. 32.10), is worded in terms of relationships or associations between the variables:
Is there a relationship (or association) between where students eat most of their meals and whether or not the student lives with their parents?
All of these are equivalent. Usually, for \(2\times2\) tables, working with odds or odds ratios is best, because most software (including jamovi and SPSS) readily produces CIs for the odds ratio.
32.2 Hypotheses and notation: Comparing odds
For two-way tables of counts, the parameter is the population odds ratio. As usual, the null hypothesis is the 'no difference, no change, no relationship' position. So in this context:
- \(H_0\): The population OR is one; or (equivalently): The population odds are the same in each group.
This hypothesis proposes that any difference between the sample odds is due to sampling variation. This is the initial assumption.
The alternative hypothesis is
- \(H_1\): The population OR is not one; or (equivalently): The population odds are not the same in each group.
The alternative hypothesis is always two-tailed for analysing two-way tables of counts.
The hypotheses can also be written in terms of differences in percentages (or proportions), though the software output is usually expressed in terms of odds. The hypotheses can also be written in terms of associations:
- \(H_0\): In the population, there is no association between the two variables
- \(H_1\): In the population, there is an association between the two variables
The RQ and hypotheses only need to be given in one of these ways. The RQ and hypotheses should be consistent (for example, if the RQ is written in terms of odds, the hypotheses should be written in terms of odds).
As usual, following the decision-making process, start by assuming that the null hypothesis is true: that the population odds ratio is one.
32.3 Expected values: Comparing odds
Assuming that the odds of having most meals off-campus is the same for both groups (that is, the population OR is one), how would the sample OR be expected to vary from sample to sample just because of sampling variation?
If the population OR was one, the odds are the same in both groups; equivalently, the percentages are the same in both groups. That is, the percentage of students eating most meals off-campus is the same for students living with and not living with their parents.
Let's consider the implication. From Table 32.1, 157 students out of 183 ate most meals off-campus; that is,
\[ \frac{157}{183} \times 100 = 85.79\% \] of the students in the entire sample ate most of their meals off-campus.
If the percentage of students who eat most of their meals off-campus is the same for those who live with their parents and those who don't, then we'd expect the percentage in both groups to be equal to this value. That is, we would expect
- 85.79% of the 54 students (that is, 46.33) who live with their parents to eat most meals off-campus; and
- 85.79% of the 129 students (that is, 110.67) who don't live with their parents to eat most meals off-campus.
That is, the percentage (and hence the odds) is the same in each group. Those are the numbers that are expected to appear if the percentage was exactly the same in each group (Table 32.3), if the null hypothesis (the assumption) was true.
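The expected counts can be reproduced directly from the row and column totals. The sketch below (plain Python, with illustrative names) applies the same reasoning as above: applying the overall row percentage to each column total is the same as computing (row total × column total) / \(n\) for each cell.

```python
# Observed counts from Table 32.1
# Rows: most meals off-campus, most meals on-campus
# Columns: lives with parents, doesn't live with parents
observed = [
    [52, 105],
    [2, 24],
]

row_totals = [sum(row) for row in observed]          # [157, 26]
col_totals = [sum(col) for col in zip(*observed)]    # [54, 129]
n = sum(row_totals)                                  # 183

# Under the null hypothesis, each column has the same row percentages as the
# whole sample, so each expected count is (row total x column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(value, 2) for value in row])

# The expected odds of eating most meals off-campus are the same in each column
expected_odds = [expected[0][j] / expected[1][j] for j in range(2)]
print([round(odds, 3) for odds in expected_odds])
```

The printed expected counts agree with Table 32.3, and the expected odds are identical in the two groups, as the null hypothesis requires.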
Consider the expected counts in Table 32.3.
Confirm that the odds of having most meals off-campus is the same for students living with their parents, and for students not living with their parents.
How do those expected values compare to what was observed? For example:
- 46.33 of the 54 students who live with their parents are expected to eat most meals off-campus; yet we observed 52.
- 110.67 of the 129 students who don't live with their parents are expected to eat most meals off-campus; yet we observed 105.
The observed and expected counts are similar, but not exactly the same. This is no surprise: each sample will produce slightly different observed counts (sampling variation). The difference between the observed and expected counts may be explained by sampling variation (that is, the null hypothesis explanation).
You do not have to compute the expected values when you answer one of these types of RQs (software does it for you).
However, seeing how the decision-making process works in this context is helpful.
When discussing previous hypothesis tests, the sampling distribution of the sample statistic (in this case, the sampling distribution of the sample odds ratio) was described, and this sampling distribution had an approximate normal distribution (whose standard deviation is called the standard error). However, the sampling distribution of the odds ratio is more involved497 so will not be presented.
| | Lives with parents | Doesn't live with parents | Total |
|---|---|---|---|
| Most off-campus | 46.33 | 110.67 | 157 |
| Most on-campus | 7.67 | 18.33 | 26 |
| Total | 54.00 | 129.00 | 183 |
32.4 The test statistic: Comparing odds
The decision-making process compares what is expected from the sample statistic if the null hypothesis about the parameter is true (Table 32.3) to what is observed in the sample (Table 32.1).
Previously, when the summary statistics were means, \(t\)-tests were used. However, these data are not summarised by means, and a different test statistic is used.
Rather than using a \(t\)-score as the test-statistic, the test-statistic here is a 'chi-squared' statistic, written \(\chi^2\). A \(\chi^2\) statistic measures the overall size of the differences between the expected counts and observed counts, over the entire table.
The Greek letter \(\chi\) is pronounced 'ki', as in kite.
The test statistic \(\chi^2\) is pronounced as 'chi-squared'.
From the software (jamovi: Fig. 32.1; SPSS: Fig. 32.2), \(\chi^2=6.934\).
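To see where this value comes from, the sketch below computes the statistic by hand. It assumes the software reports the usual Pearson chi-squared statistic (the sum of (observed − expected)² / expected over all cells, with no continuity correction); the function names are illustrative, not part of jamovi or SPSS.

```python
def expected_counts(observed):
    """Expected counts under the null hypothesis: (row total x column total) / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_squared(observed):
    """Pearson chi-squared statistic: sum of (O - E)^2 / E over every cell."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Observed counts from Table 32.1
meals = [
    [52, 105],
    [2, 24],
]
chi2 = chi_squared(meals)
print(round(chi2, 3))
```

Under this assumption, the hand computation agrees with the value in the software output to three decimal places.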
In a \(2\times 2\) table of counts (when the 'degrees of freedom', or df, is equal to 1, as shown in the computer output), the square root of the \(\chi^2\) value is approximately equivalent to a \(z\)-score.
So here, the equivalent \(z\)-score is about \(\sqrt{6.934} = 2.63\), which is fairly large: a small \(P\)-value is expected.
More generally, for two-way tables of any size,
\[ \sqrt{\frac{\chi^2}{\text{df}}} \] is like a \(z\)-score, where df is the 'degrees of freedom' (related to the size of the table498), as shown in the software output.
This allows a \(P\)-value to be estimated using the 68--95--99.7 rule from the value of the \(\chi^2\) statistic.

FIGURE 32.1: The jamovi output for computing a CI and conducting a test

FIGURE 32.2: The SPSS output for computing a CI and conducting a test
In a chi-squared test with a given number of 'degrees of freedom' (written df in the software output), the value of
\[ \sqrt{\frac{\chi^2}{\text{df}}} \] is like a \(z\)-score. This allows the \(P\)-value to be estimated using the 68--95--99.7 rule.
32.5 \(P\)-values: Comparing odds
The difference between the observed sample statistic (the sample OR) and the hypothesised population parameter (the population OR of one) is summarised by \(\chi^2=6.934\) (approximately equivalent to \(z=2.63\)). Using the 68--95--99.7 rule, a small \(P\)-value is expected.
The corresponding two-tailed \(P\)-value reported by jamovi (Fig. 32.1, under the p column) and SPSS (Fig. 32.2, in the Asymptotic Significance (2-sided) column and Pearson Chi-Square row) is very small (\(0.008\) to three decimals).
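As a rough check on the reported value, the sketch below converts \(\chi^2\) to its approximately-equivalent \(z\)-score and then to a two-tailed \(P\)-value. It assumes one degree of freedom, in which case the upper-tail area of the chi-squared distribution equals the two-tailed area of a standard normal distribution beyond \(\sqrt{\chi^2}\); the variable names are illustrative.

```python
import math

chi2 = 6.934   # from the software output (Fig. 32.1 or Fig. 32.2)
df = 1         # a 2 x 2 table has one degree of freedom

# Approximately-equivalent z-score
z_equiv = math.sqrt(chi2 / df)

# With df = 1 only: two-tailed normal area beyond z, using the
# complementary error function from the standard library
p_value = math.erfc(z_equiv / math.sqrt(2))

print(round(z_equiv, 2), round(p_value, 3))
```

The result agrees with the \(P\)-value in the output to three decimals. The `erfc` shortcut applies only when df is 1; for larger tables, rely on the software.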
Recall that, for two-way tables of counts, the alternative hypotheses are always two-tailed, so a two-tailed \(P\)-value is always reported.
32.6 Conclusions: Comparing odds
As usual, a small \(P\)-value (\(0.008\) to three decimals) means there is strong evidence supporting \(H_1\): the evidence suggests a difference in the population odds in the two groups. We write:
The sample provides strong evidence (\(\chi^2=6.934\); two-tailed \(P=0.008\)) that the odds in the population of having most meals off-campus is different for students living with their parents (odds: 26) and students not living with their parents (odds: 4.375; OR: \(5.94\); 95% CI from \(1.35\) to \(26.1\)).
Again, as seen in Sect. 29.7, the conclusion includes three components: the answer to the RQ; the evidence used to reach that conclusion ('\(\chi^2=6.934\); two-tailed \(P=0.008\)'); and some sample summary statistics (including the 95% CI for the odds ratio).
The conclusion also makes clear what the odds and the odds ratio mean. The odds are described as the 'odds... of having most meals off-campus', and the OR as then comparing these odds between 'students living with their parents... and students not living with their parents'.
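The CI in the conclusion is read from the software output. As a check only, the sketch below uses one common construction (a Wald interval computed on the log-odds-ratio scale with a multiplier of 1.96); that the software uses exactly this method is an assumption, and the variable names are illustrative.

```python
import math

# Observed counts from Table 32.1
a, b = 52, 105   # most meals off-campus: with parents, not with parents
c, d = 2, 24     # most meals on-campus:  with parents, not with parents

odds_ratio = (a / c) / (b / d)

# Assumed method: standard error of log(OR) is sqrt(1/a + 1/b + 1/c + 1/d)
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# Approximate 95% CI on the log scale, then transformed back to the OR scale
lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(round(odds_ratio, 2), round(lower, 2), round(upper, 1))
```

Under this assumption, the interval matches the limits quoted in the conclusion. The interval is wide because one observed count (2) is very small.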
For two-way tables, RQs are best framed in terms of ORs or odds (but can be framed in terms of proportions or percentages, or associations or relationships).
For consistency: if the RQ is about the odds ratio, the hypotheses and conclusion should be about the odds ratio; if the RQ is about odds, the hypotheses and conclusion should be about the odds; and so on.
32.7 Statistical validity conditions
As usual, these results hold under certain conditions. The test above is statistically valid if:
- All expected counts are at least five.
Some books may give other (but similar) conditions.
In addition to the statistical validity condition, the test will be
- internally valid if the study was well designed; and
- externally valid if the sample is a simple random sample and is internally valid.
The statistical validity condition refers to the expected (not the observed) counts. SPSS tells us if a problem exists with the expected count condition, underneath the first output table in Fig. 25.3. In jamovi, the expected counts must be explicitly requested to see if this condition is satisfied (Fig. 32.3).

FIGURE 32.3: The expected values, as computed in jamovi
For the student-eating data, the smallest observed count is 2 (living with parents; most meals on-campus), but the smallest expected count is 7.67, which is greater than five. The size of the expected counts is important for the statistical validity condition.
Example 32.1 (Statistical validity) For the university-student eating data, all the cells have an expected count of at least five so the statistical validity condition is satisfied.
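The check in Example 32.1 can be sketched in a few lines, using the 'at least five' condition stated above. This is illustrative Python only (the names are not from jamovi or SPSS); in practice the software reports the expected counts.

```python
# Observed counts from Table 32.1
observed = [
    [52, 105],
    [2, 24],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts if the null hypothesis is true
expected = [[r * c / n for c in col_totals] for r in row_totals]

# The condition concerns the expected counts, not the observed counts
smallest_observed = min(min(row) for row in observed)
smallest_expected = min(min(row) for row in expected)
condition_met = all(value >= 5 for row in expected for value in row)

print(smallest_observed, round(smallest_expected, 2), condition_met)
```

Although the smallest observed count is below five, every expected count is at least five, so the condition is satisfied.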
32.8 Example: Pet birds

A study examined people with lung cancer, and a matched set of similar controls who did not have lung cancer, and compared the proportion in each group that had pet birds.499
These data were studied in Sect. 25.6; the data are shown again in Table 32.4, and the numerical summary in Table 32.5 (the computations are shown in Sect. 25.6).
| | Adults with lung cancer | Adults without lung cancer | Total |
|---|---|---|---|
| Kept pet birds | 98 | 101 | 199 |
| Did not keep pet birds | 141 | 328 | 469 |
| Total | 239 | 429 | 668 |
One RQ in the study was:
Are the odds of having a pet bird the same for people with lung cancer (cases) and for people without lung cancer (controls)?
The parameter is the population OR, comparing the odds of keeping a pet bird, for adults with lung cancer to adults who do not have lung cancer.
The RQ could also be written as:
- Is the percentage of people having a pet bird the same for people with lung cancer (cases) and for people without lung cancer (controls)?
- Is the odds ratio of people having a pet bird, comparing people with lung cancer (cases) to people without lung cancer (controls), equal to one?
- Is there a relationship between having a pet bird and having lung cancer?
Of these, the first is probably the easiest to understand.
From this RQ (which is written in terms of odds), the hypotheses could be written as:
- \(H_0\): The odds of having a pet bird is the same for people with lung cancer (cases) and for people without lung cancer (controls).
- \(H_1\): The odds of having a pet bird is not the same for people with lung cancer (cases) and for people without lung cancer (controls).
The null hypothesis could also be written as:
- The percentage of people having a pet bird is the same for people with lung cancer (cases) and for people without lung cancer (controls).
- The odds ratio of people having a pet bird, comparing people with lung cancer (cases) to people without lung cancer (controls), is equal to one.
- There is no relationship between having a pet bird and having lung cancer.
Of these, the first is probably the easiest to understand.
Begin by assuming the null hypothesis is true: no difference exists between the odds in the population. Based on this assumption, the expected counts can be found.
From the data (Table 32.4), overall \(199\div 668 = 29.79\)% of people own a pet bird.
If there really was no difference in the odds (or the percentages) of owning a pet bird between those with and without lung cancer, about \(29.79\)% of the people in both lung cancer groups are expected to own a pet bird.
| | Odds of keeping pet bird | Percentage keeping pet bird | Sample size |
|---|---|---|---|
| With lung cancer: | 0.6950 | 41.0% | 239 |
| Without lung cancer: | 0.3079 | 23.5% | 429 |
| Odds ratio: | 2.26 | | |
About 29.79% of the 239 people with lung cancer (or 71.20) would be expected to have a pet bird, and about 29.79% of the 429 people without lung cancer (or 127.80) would be expected to have a pet bird.
A table of these expected counts (Table 32.6) shows that all expected counts are greater than five. In practice, you do not need to compute the expected counts; software does this automatically.
| | Adults with lung cancer | Adults without lung cancer | Total |
|---|---|---|---|
| Kept pet birds | 71.2 | 127.8 | 199 |
| Did not keep pet birds | 167.8 | 301.2 | 469 |
| Total | 239.0 | 429.0 | 668 |
The numbers in Table 32.6 are what is expected, if the percentage of people owning a pet bird is the same for lung cancer and non-lung cancer cases. How close are the expected and observed counts (in Table 32.4)?
To compare the sample statistic (what we observed) with the hypothesised population parameter, software is used to compute the value of \(\chi^2\) (jamovi: Fig. 32.4; SPSS: Fig. 32.5): \(\chi^2=22.374\), approximately equivalent to a \(z\)-score of
\[ \sqrt{22.374/1} = 4.730, \] which is very large. Hence, a small \(P\)-value is expected.
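The pet-birds calculations can be reproduced from the observed counts in Table 32.4. The sketch below assumes the software reports the Pearson chi-squared statistic without a continuity correction; the helper names are illustrative.

```python
import math


def expected_counts(observed):
    """Expected counts under the null hypothesis: (row total x column total) / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Observed counts from Table 32.4
# Rows: kept pet birds, did not keep pet birds
# Columns: with lung cancer, without lung cancer
birds = [
    [98, 101],
    [141, 328],
]

expected = expected_counts(birds)

# Pearson chi-squared: sum of (O - E)^2 / E over all four cells
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(birds, expected)
    for o, e in zip(obs_row, exp_row)
)

# A 2 x 2 table has df = 1, so the equivalent z-score is sqrt(chi2 / 1)
z_equiv = math.sqrt(chi2 / 1)

print([[round(e, 1) for e in row] for row in expected])
print(round(chi2, 3), round(z_equiv, 2))
```

The expected counts agree with Table 32.6, and the statistic agrees with the software output.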
The software shows that the \(P\)-value is very small (\(P<0.001\)). As usual, a very small \(P\)-value means that the sample results would be very unlikely if \(H_0\) were true, so there is very strong evidence supporting \(H_1\). That is, the evidence suggests there is a difference in the odds in the population. We write:
The sample provides very strong evidence (\(\chi^2=22.374\); two-tailed \(P<0.001\)) that the odds in the population of having a pet bird is not the same for people with lung cancer (odds: 0.695) and for people without lung cancer (odds: 0.308; OR: \(2.26\); 95% CI from \(1.6\) to \(3.2\)).

FIGURE 32.4: jamovi output for the pet-birds data

FIGURE 32.5: SPSS output for the pet-birds data
This doesn't imply that owning a pet bird causes lung cancer. Why not?
(Answer is here500.)
32.9 Example: B12 deficiency

A study in New Zealand501 asked:
Among a certain group of women, are the odds of being vitamin B12 deficient different for women on a vegetarian diet compared to women on a non-vegetarian diet?
The population was 'predominantly overweight/obese women of South Asian origin living in Auckland'. The RQ could be worded in terms of odds ratios or proportions, too.
To test the claim, the hypotheses are:
- \(H_0\): \(\text{population odds for vegetarians} = \text{population odds for non-vegetarians}\): The odds of B12 deficiency are the same in both groups.
- \(H_1\): \(\text{population odds for vegetarians} \ne \text{population odds for non-vegetarians}\): The odds of B12 deficiency are not the same in both groups.
The parameter is the population OR, comparing the odds of being B12 deficient, for vegetarians to non-vegetarians.
Here, the odds refer to the odds of a woman being B12 deficient. As with the RQ, the hypotheses could be worded in terms of odds ratios, proportions (or percentages), or relationships.
The data are shown in Table 32.7, and the numerical summary in Table 32.8.
Since the RQ is about odds, a side-by-side bar chart is produced (Fig. 32.6) as the graphical summary.
| | B12 deficient | Not B12 deficient | Total |
|---|---|---|---|
| Vegetarians | 8 | 26 | 34 |
| Non-vegetarians | 8 | 82 | 90 |
| Total | 16 | 108 | 124 |

| | Odds B12 deficient | Percentage B12 deficient | Sample size |
|---|---|---|---|
| Vegetarians: | 0.3077 | 23.5% | 34 |
| Non-vegetarians: | 0.0976 | 8.9% | 90 |
| Odds ratio: | 3.15 | | |

FIGURE 32.6: A side-by-side barchart comparing the number of women B12 deficient
The software output (jamovi: Fig. 32.7; SPSS: Fig. 32.8) shows that the OR (and 95% CI) is \(3.154\) (\(1.077\) to \(9.238\)). The chi-square value is \(4.707\), approximately equivalent to a \(z\)-score of
\[ \sqrt{\frac{4.707}{1}} = 2.17; \] a small \(P\)-value is expected using the 68--95--99.7 rule.
The software output shows that the two-tailed \(P\)-value is \(0.030\), which is indeed 'small'.

FIGURE 32.7: jamovi output for the B12 data

FIGURE 32.8: SPSS output for the B12 data
We conclude:
The sample provides moderate evidence (\(\chi^2 = 4.707\); \(P=0.030\)) that the odds in the population of being vitamin B12 deficient is different for vegetarian women (odds: 0.3077) compared to non-vegetarian women (odds: 0.0976; OR: \(3.2\); 95% CI: \(1.1\) to \(9.2\)).
The statistical validity should be checked. The jamovi output (Fig. 32.7) shows that the smallest expected count is 4.39. Likewise, the text under the first table of SPSS output in Fig. 32.8 says that
1 cells (25.0%) have expected count less than 5. The minimum expected count is 4.39.
The smallest expected count is smaller than five, so the results may be statistically invalid. Nonetheless, only one cell has an expected count less than five, and only just under 5, so we shouldn't be too concerned (but it should be noted).
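The expected-count note in the output can be reproduced from Table 32.7. This is a minimal sketch in plain Python with illustrative names; it is not how SPSS or jamovi are operated.

```python
# Observed counts from Table 32.7
# Rows: vegetarians, non-vegetarians
# Columns: B12 deficient, not B12 deficient
observed = [
    [8, 26],
    [8, 82],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts if the odds of B12 deficiency were the same in both groups
expected = [[r * c / n for c in col_totals] for r in row_totals]
cells = [value for row in expected for value in row]

n_small = sum(1 for value in cells if value < 5)
percent_small = 100 * n_small / len(cells)
smallest_expected = min(cells)

print(n_small, percent_small, round(smallest_expected, 2))
```

One of the four cells (25%) has an expected count below five, and the smallest expected count is 4.39, matching the note under the SPSS table.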
32.10 Example: Kerbside dumping

A study of dumping household goods on the kerbside in Brisbane502 asked people about their opinions on the dumping. All participants were from Brisbane suburbs where a high level of kerbside dumping occurred.
The data are summarised in Table 32.9. Notice that this is a \(2\times 3\) table of counts, so it is more difficult to define a parameter.
| | Acceptable | Not acceptable | Conditionally acceptable |
|---|---|---|---|
| Reusable | 22 | 18 | 15 |
| Non-reusable | 6 | 36 | 1 |
The software output is shown in Fig. 32.9 (for jamovi), and Fig. 32.10 (for SPSS); a graphical summary in Fig. 32.11.

FIGURE 32.9: jamovi output for the kerbside-dumping data

FIGURE 32.10: SPSS output for the kerbside-dumping data

FIGURE 32.11: A side-by-side bar chart for the kerbside-dumping data
Most of the numerical summary must be produced manually (Table 32.10), since software only produces odds ratios for \(2\times 2\) tables.
| | Odds | Odds ratio | Percentage | Sample size |
|---|---|---|---|---|
| Acceptable: | 3.667 | (Reference) | 78.6% | 28 |
| Not acceptable: | 0.5 | 0.136 | 33.3% | 54 |
| Conditionally acceptable: | 15 | 4.09 | 93.8% | 16 |
In Table 32.10, the odds are the odds that a given opinion refers to reusable goods. Here are some of the details of these calculations:
- For Acceptable goods: the odds that these are reusable goods is \(22/6 = 3.667\).
- For Not acceptable goods: the odds that these are reusable goods is \(18/36 = 0.5\).
- For Conditionally acceptable goods: the odds that these are reusable goods is \(15/1 = 15\).
Then the odds ratios can be computed:
- Comparing the odds of Not acceptable to Acceptable: \(0.5/3.667 = 0.136\).
- Comparing the odds of Conditionally acceptable to Acceptable: \(15/3.667 = 4.09\).
Note that Table 32.10 has three groups to compare, so three odds calculations.
However, the summary has \(3 - 1 = 2\) odds ratios, since odds ratios compare two odds. The level to which the other two are compared is called the Reference level. In Table 32.10, the reference level is 'Acceptable'.
(In a \(2\times 2\) table, with two groups to compare, the summary has only \(2 - 1 = 1\) odds ratio.)
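The manual calculations for Table 32.10 can be sketched as follows. The dictionary layout and names are illustrative; the counts are those in Table 32.9, and 'Acceptable' is the reference level, as in the text.

```python
# Counts from Table 32.9: opinion -> (reusable, non-reusable)
counts = {
    "Acceptable": (22, 6),
    "Not acceptable": (18, 36),
    "Conditionally acceptable": (15, 1),
}

# Odds that the goods are reusable, for each opinion
odds = {opinion: reusable / non_reusable
        for opinion, (reusable, non_reusable) in counts.items()}

# Odds ratios compare every other level to the reference level
reference = "Acceptable"
odds_ratios = {opinion: value / odds[reference]
               for opinion, value in odds.items() if opinion != reference}

for opinion, value in odds.items():
    print(opinion, round(value, 3), round(odds_ratios.get(opinion, 1.0), 3))
```

Three groups give three odds but only two odds ratios, since the reference level is compared with itself (an OR of one).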
The hypotheses can be expressed in many ways (in terms of odds, odds ratio, or percentages), but perhaps the easiest approach with two-way tables larger than \(2\times 2\) is worded in terms of relationships or associations between the two variables:
- \(H_0\): There is no association between the type of rubbish and the opinion of kerbside dumping.
- \(H_1\): There is an association between the type of rubbish and the opinion of kerbside dumping.
From the software output, the \(\chi^2\)-value is 26.318 and the degrees of freedom is two, so this \(\chi^2\) value is approximately equivalent to a \(z\)-score of
\[ \sqrt{\frac{26.318}{2}} = 3.63. \] This is a large \(z\)-score so, using the 68--95--99.7 rule, a very small \(P\)-value is expected; indeed, the software output reports \(P<0.001\). This suggests very strong evidence in the sample that opinions are not the same for reusable and non-reusable rubbish.
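The same steps work for tables larger than \(2\times 2\). The sketch below assumes the Pearson chi-squared statistic with df = (rows − 1) × (columns − 1); for the special case of two degrees of freedom, the upper-tail area of the chi-squared distribution is \(\exp(-\chi^2/2)\), which is used here to approximate the \(P\)-value. All names are illustrative.

```python
import math

# Observed counts from Table 32.9
# Rows: reusable, non-reusable
# Columns: acceptable, not acceptable, conditionally acceptable
observed = [
    [22, 18, 15],
    [6, 36, 1],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-squared: sum of (O - E)^2 / E over all six cells
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

z_equiv = math.sqrt(chi2 / df)    # 'like a z-score'
p_value = math.exp(-chi2 / 2)     # valid for df = 2 only

print(round(chi2, 3), df, round(z_equiv, 2), p_value < 0.001)
print(round(min(min(row) for row in expected), 2))  # smallest expected count
```

The statistic, degrees of freedom and approximate \(z\)-score agree with the values quoted above, and the \(P\)-value is far below 0.001.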
The conclusion could be written as
The sample provides very strong evidence (\(\chi^2=26.318\); \(\text{df}=2\); \(P<0.001\)) that there is a relationship in the population between opinions about kerbside dumping and the type of rubbish.
While sample summary information could be added, the conclusion statements then become cumbersome. Instead, pointing readers to the numerical summary (Table 32.10) is probably better. Furthermore, CIs are not reported since the software does not produce CIs for tables larger than \(2\times 2\).
All expected counts exceed five (as shown in the jamovi output in Fig. 32.9 and the SPSS output in Fig. 32.10), even though one observed count is less than five. The results are statistically valid.
32.11 Summary
To test a hypothesis about a population odds ratio, based on the value of the sample odds ratio, initially assume the value of the population odds ratio in the null hypothesis (usually one) to be true. Then, expected counts (Step 2) can be computed. Since the sample odds ratio varies from sample to sample, under certain statistical validity conditions a quantity closely related to the sample odds ratio varies with an approximate normal distribution. This distribution describes what values of the sample odds ratio could be expected in the sample if the value of the population odds ratio in the null hypothesis was true. The test statistic is a \(\chi^2\) statistic, which compares the expected and observed counts. (The value of \(\sqrt{\chi^2/\text{df}}\) is like a \(z\)-score, where 'df' is the 'degrees of freedom' reported by software, and so an approximate \(P\)-value can be estimated using the 68--95--99.7 rule.) Software reports the \(P\)-value to assess whether the data are consistent (Step 4) with the assumption.
32.12 Quick review questions

A study503 of the adoption of electric vehicles (EVs) by a certain group of professional Americans (Example 5.14) compiled the data in Table 32.11. Output from using jamovi is shown in Fig. 32.12.
| | Yes | No |
|---|---|---|
| No post-grad | 24 | 8 |
| Post-grad study | 51 | 29 |
- The \(\chi^2\) value is:
- The approximately-equivalent \(z\)-score (to two decimal places) is:
- Using the 68--95--99.7 rule, the \(P\)-value is:
- From the software output, the \(P\)-value is:
- The alternative hypothesis will be:
- True or false: There is no evidence of a difference in the odds of buying a car in the next 10 years, between those with and without post-graduate study.

FIGURE 32.12: jamovi output for the EV study
32.13 Exercises
Selected answers are available in Sect. D.30.
Exercise 32.1 Researchers504 studied the number of sandflies caught in light traps set at 3 and 35 feet above ground in eastern Panama. They asked:
In eastern Panama, are the odds of finding a male sandfly the same at 3 feet above ground as at 35 feet above ground?
The data are compiled into a table (Table 32.12), and summarised numerically (Table 32.13; partially edited) and graphically (Fig. 32.13). Use the jamovi output (Fig. 32.14) to evaluate the evidence, complete Table 32.13, and write a conclusion.
| | 3 feet above ground | 35 feet above ground |
|---|---|---|
| Males | 173 | 125 |
| Females | 150 | 73 |
| | Odds | Percentage | Sample size |
|---|---|---|---|
| 3 feet: | ?? | ?? | 323 |
| 35 feet: | 1.71 | 63.1% | 198 |
| Odds ratio: | 0.67 | | |

FIGURE 32.13: A side-by-side barchart of the sandflies data

FIGURE 32.14: Using jamovi to compute a CI for the sandflies data
Exercise 32.2 A prospective observational study in Western Australia compared the heights of scars from burns received.505 The data are shown in Table 32.14. SPSS was used to analyse the data (Fig. 32.15). (This study also appeared in Exercise 25.1, where the odds ratio, and the CI for the odds ratio, were computed.)
- Perform a hypothesis test to determine if the odds of having a smooth scar are the same for women and men.
- Write down the conclusion.
- Is the test statistically valid?
| | Women | Men |
|---|---|---|
| Scar height 0mm (smooth) | 99 | 216 |
| Scar height more than 0mm, less than 1mm | 62 | 115 |

FIGURE 32.15: Using jamovi to compute a CI for the scar-height data
Exercise 32.3 In a study of turbine failures,506 73 turbines were run for around 1800 hours, and seven developed fissures (small cracks). Forty-two different turbines were run for about 3000 hours, and nine developed fissures.

FIGURE 32.16: jamovi output for the turbine data

FIGURE 32.17: jamovi output for the turbine data: expected counts
Exercise 32.4 The Southern Oscillation Index (SOI) is a standardised measure of the air pressure difference between Tahiti and Darwin, and has been shown to be related to rainfall in some parts of the world,507 and especially Queensland.508
As an example,509 the rainfall at Emerald (Queensland) was recorded for Augusts from 1889 to 2002 inclusive, in Augusts when the monthly average SOI was positive, and when the SOI was non-positive (that is, zero or negative), as shown in Table 32.15. (This study also appeared in Exercise 25.4.)
- Using the jamovi output in Fig. 32.18, perform a hypothesis test to determine if the odds of having no rain are the same in Augusts with non-positive and positive SOI.
- Write down the conclusion.
- Is the test statistically valid?
| | Non-positive SOI | Positive SOI |
|---|---|---|
| No rainfall recorded | 14 | 7 |
| Rainfall recorded | 40 | 53 |

FIGURE 32.18: jamovi output for the Emerald-rain data
Exercise 32.5 A research study conducted in Brisbane510 recorded the number of people at the foot of the Goodwill Bridge, Southbank, who wore sunglasses and hats.
The data were recorded between 11:30am and 12:30pm. Table 32.16 records the number of females and males wearing hats.
- Compute the percentage of females wearing a hat.
- Compute the percentage of males wearing a hat.
- Compute the odds of a female wearing a hat.
- Compute the odds of a male wearing a hat.
- Compute the odds ratio of wearing a hat, comparing females to males.
- Compute the odds ratio of wearing a hat, comparing males to females.
- Find the 95% CI for the appropriate OR.
- Using the SPSS output in Fig. 32.19, perform a hypothesis test to determine if the odds of wearing a hat is the same for females and males.
- Write down the conclusion.
- Is the test statistically valid?
| | Not wearing hat | Wearing hat |
|---|---|---|
| Male | 307 | 79 |
| Female | 344 | 22 |

FIGURE 32.19: SPSS output for the hats data
Exercise 32.6 A study511 asked people about their mobile-phone interactions while crossing the road as pedestrians. Part of the data are summarised in Table 32.17.
- Compute the column percentages.
- Compute the odds of low exposure to each behaviour.
- Write the hypotheses for conducting a hypothesis test.
- Compute the expected counts.
- After analysis in jamovi, the value of \(\chi^2\) is 20.923 with two degrees of freedom. What is the approximately-equivalent \(z\)-score? Would you expect a large or small \(P\)-value?
- The \(P\)-value is given as \(P<0.001\). Write a conclusion.
| | Answer call | Respond to text | Reply to email |
|---|---|---|---|
| Low exposure | 263 | 259 | 302 |
| High exposure | 94 | 98 | 51 |